CGA-bench/analysis/custom_task_diagnosis.md
2026-05-22 10:02:42 +08:00


# Custom Task Diagnosis
This note records why the original custom-task experiment should not be used as
direct CGA evidence.
## Current failure causes
- `i2c_controller`
  - The original DUT references `scl_in` without defining it.
  - CGA baseline already fails at Verilator compile, so `coverage = 0.0` is not
    evidence about branch reachability.
  - The original testbench is stimulus-only and does not provide a clear pass/fail
    contract or a slave-side bus model.
- `spi_controller`
  - The original DUT leaves `spi_clk` effectively disconnected from observable
    transaction progress.
  - The original `ERROR/mode_fault` path is not meaningfully driven.
  - The pipeline spends most of its time in `TBcheck -> reboot`, so the run is
    dominated by unstable generated TB quality rather than CGA coverage search.
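
The compile-failure point above can be made mechanical. The sketch below is a minimal illustration, not the project's actual tooling: the `RunResult` record, its field names, and the `example_ok` task with its coverage value are all hypothetical. It shows one way to keep a `coverage = 0.0` from a failed compile out of the evidence set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunResult:
    """Hypothetical per-task record; field names are assumptions."""
    task: str
    compiled: bool              # did the DUT + TB pass the simulator compile step?
    coverage: Optional[float]   # raw coverage as reported by the run


def usable_coverage(result: RunResult) -> Optional[float]:
    """Return coverage only when the design compiled.

    A coverage of 0.0 after a compile failure says nothing about
    branch reachability, so it is mapped to None (no evidence).
    """
    if not result.compiled:
        return None
    return result.coverage


# Illustrative values only; these are not measured results.
runs = [
    RunResult("i2c_controller", compiled=False, coverage=0.0),
    RunResult("example_ok", compiled=True, coverage=0.5),
]
evidence = {r.task: usable_coverage(r) for r in runs}
```

Mapping the failed run to `None` rather than `0.0` keeps it from being averaged into any coverage comparison.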
## Clean-data policy
- Keep the original `data/myproject/combined.jsonl` as the historical failure case.
- Use `data/myproject/combined_clean.jsonl` with:
  - compilable DUTs,
  - self-checking golden testbenches,
  - descriptions that match the actual reachable protocol behavior.
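
One way to enforce the three criteria above is a gate applied to each JSONL record before it enters the clean set. This is a sketch under stated assumptions: the flag names `compiles`, `self_checking`, and `description_verified`, and the `task_id` key, are hypothetical and are not taken from the actual schema of `combined_clean.jsonl`.

```python
import io
import json

# Hypothetical boolean flags, one per clean-data criterion.
REQUIRED_FLAGS = ("compiles", "self_checking", "description_verified")


def select_clean(lines):
    """Keep only records where every required flag is explicitly true."""
    clean = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL stream
        record = json.loads(line)
        if all(record.get(flag) is True for flag in REQUIRED_FLAGS):
            clean.append(record)
    return clean


# In-memory stand-in for a JSONL file; the records are illustrative only.
sample = io.StringIO(
    '{"task_id": "a", "compiles": true, "self_checking": true, "description_verified": true}\n'
    '{"task_id": "b", "compiles": false, "self_checking": true, "description_verified": true}\n'
)
kept = select_clean(sample)
```

Requiring each flag to be exactly `True` means a record with a missing flag is rejected rather than silently accepted.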
## Paper usage
- Use the original custom-task run as a "task definition not yet valid" failure mode.
- Use the clean custom tasks only for:
  - structural coverage comparison,
  - CGA iteration case studies,
  - qualitative discussion of protocol-style controllers.