CGA-bench/analysis/small_batch_run_guide.md
2026-05-22 10:02:42 +08:00

Small-Batch Run Guide

This guide leaves the existing CorrectBench pipeline unchanged; the runs differ only in which clean task data and config they use.

Custom tasks first

Run the two repaired custom tasks (SPI and I2C) one at a time, each with its baseline config first and then its non-baseline config:

venv/bin/python main.py -c config/myproject_clean_spi_baseline.yaml
venv/bin/python main.py -c config/myproject_clean_spi.yaml
venv/bin/python main.py -c config/myproject_clean_i2c_baseline.yaml
venv/bin/python main.py -c config/myproject_clean_i2c.yaml
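
The one-at-a-time order above can also be scripted. The following is a minimal sketch and not part of CorrectBench: the helper names `build_commands` and `run_sequentially` are hypothetical, and it assumes that `main.py` exits with a nonzero code on failure, which this guide does not confirm. By default it only prints the commands.

```python
import subprocess

# Interpreter and entry point as used in the commands of this guide.
PYTHON = "venv/bin/python"
ENTRY = "main.py"

# Custom-task configs in the order listed above: baseline first, per task.
CUSTOM_CONFIGS = [
    "config/myproject_clean_spi_baseline.yaml",
    "config/myproject_clean_spi.yaml",
    "config/myproject_clean_i2c_baseline.yaml",
    "config/myproject_clean_i2c.yaml",
]


def build_commands(configs):
    """Return one argv list per config, preserving the given order."""
    return [[PYTHON, ENTRY, "-c", cfg] for cfg in configs]


def run_sequentially(commands, dry_run=True):
    """Run commands one at a time and stop at the first failure.

    With dry_run=True (the default) nothing is executed; the commands
    are only printed. Assumption: a failed run exits with a nonzero code.
    """
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True raises CalledProcessError on a nonzero exit code.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_sequentially(build_commands(CUSTOM_CONFIGS))
```

Calling `run_sequentially(build_commands(CUSTOM_CONFIGS), dry_run=False)` from the repository root would execute the four runs in order.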

Use these runs for:

  • structural coverage comparison,
  • CGA iteration logs,
  • protocol-controller case studies.

Do not merge these custom tasks into the HDLBits Eval2 aggregate table.

Combined clean custom config

Once the single-task smoke runs pass, both custom tasks can be run together with the combined configs:

venv/bin/python main.py -c config/myproject_clean_baseline.yaml
venv/bin/python main.py -c config/myproject_clean.yaml

HDLBits batches

Run only one batch at a time:

venv/bin/python main.py -c config/configs/hdlbits_batch_01_baseline.yaml
venv/bin/python main.py -c config/configs/hdlbits_batch_01.yaml

Then continue with:

  • config/configs/hdlbits_batch_02_baseline.yaml
  • config/configs/hdlbits_batch_02.yaml
  • config/configs/hdlbits_batch_03_baseline.yaml
  • config/configs/hdlbits_batch_03.yaml
  • config/configs/hdlbits_batch_04_baseline.yaml
  • config/configs/hdlbits_batch_04.yaml
  • config/configs/hdlbits_batch_05_baseline.yaml
  • config/configs/hdlbits_batch_05.yaml
  • config/configs/hdlbits_batch_06_baseline.yaml
  • config/configs/hdlbits_batch_06.yaml
  • config/configs/hdlbits_batch_07_baseline.yaml
  • config/configs/hdlbits_batch_07.yaml
  • config/configs/hdlbits_batch_08_baseline.yaml
  • config/configs/hdlbits_batch_08.yaml
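
The batch config names follow a regular pattern, so the list above can be generated instead of typed. This is a small sketch under that assumption; the function name `hdlbits_batch_configs` is hypothetical, and the script only prints the commands so that each batch can still be launched one at a time.

```python
def hdlbits_batch_configs(first=1, last=8):
    """Return config paths for batches first..last.

    For each batch the baseline config comes before the non-baseline one,
    matching the order used in this guide.
    """
    paths = []
    for n in range(first, last + 1):
        stem = f"config/configs/hdlbits_batch_{n:02d}"
        paths.append(f"{stem}_baseline.yaml")
        paths.append(f"{stem}.yaml")
    return paths


if __name__ == "__main__":
    # Print one command per config; run them manually, one batch at a time.
    for path in hdlbits_batch_configs():
        print(f"venv/bin/python main.py -c {path}")
```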

Repeat policy

  • First pass: run every HDLBits batch once.
  • Second pass: rerun only the anchor cases, using config/configs/hdlbits_key_cases_baseline.yaml and config/configs/hdlbits_key_cases.yaml.
  • Anchor positives: fsm_ps2, lemmings3, lemmings4.
  • Anchor negatives: 2014_q3fsm, ece241_2013_q8, m2014_q6, review2015_fsm.
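
The repeat policy above can be written down as a small lookup, which may help when filtering result logs by task name. This is only an illustrative sketch: the names `anchor_kind` and `should_run` are hypothetical and are not part of the pipeline, which selects tasks through the config files.

```python
# Anchor cases as listed in the repeat policy.
ANCHOR_POSITIVES = {"fsm_ps2", "lemmings3", "lemmings4"}
ANCHOR_NEGATIVES = {"2014_q3fsm", "ece241_2013_q8", "m2014_q6", "review2015_fsm"}

# Configs used for the second pass.
KEY_CASE_CONFIGS = [
    "config/configs/hdlbits_key_cases_baseline.yaml",
    "config/configs/hdlbits_key_cases.yaml",
]


def anchor_kind(task_name):
    """Return 'positive', 'negative', or None for a non-anchor task."""
    if task_name in ANCHOR_POSITIVES:
        return "positive"
    if task_name in ANCHOR_NEGATIVES:
        return "negative"
    return None


def should_run(task_name, pass_number):
    """First pass covers every task; second pass covers anchors only."""
    if pass_number == 1:
        return True
    if pass_number == 2:
        return anchor_kind(task_name) is not None
    raise ValueError("the repeat policy defines only passes 1 and 2")
```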

Interpretation notes

  • Use analysis/hdlbits_dead_branch_notes.md when a task plateaus near full coverage because of illegal-state/default guards.
  • Keep the original data/myproject/combined.jsonl as the failed historical version.
  • Use data/myproject/combined_clean.jsonl for the repaired custom-task runs.
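
To guard against accidentally pointing a run at the failed historical data, a config can be scanned for the two paths before launching. The sketch below makes no assumption about the config schema and only searches the raw text; the function name `uses_failed_dataset` and the `task_path` key in the usage example are hypothetical.

```python
# Dataset paths named in the interpretation notes.
FAILED_DATASET = "data/myproject/combined.jsonl"
CLEAN_DATASET = "data/myproject/combined_clean.jsonl"


def uses_failed_dataset(config_text):
    """Return True if the config text mentions the failed historical file.

    A plain substring check is enough here: the failed path is not a
    substring of the clean path, so the clean file never triggers it.
    """
    return FAILED_DATASET in config_text


def uses_clean_dataset(config_text):
    """Return True if the config text mentions the repaired dataset."""
    return CLEAN_DATASET in config_text
```

For example, `uses_failed_dataset("task_path: data/myproject/combined_clean.jsonl")` returns `False`, while the same line with `combined.jsonl` returns `True`.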