| experiment_name | model | condition | repeat | run_dir | task_id | coverage | semantic_coverage | eval1_pass | eval2_pass | eval2_ratio | eval2_ratio_float | eval2_failed_mutants | full_pass | time_sec | token_cost | first_improvement_iter | op_record | task_log |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| paper_fsm | qwen-max | baseline | 1 | /home/zhang/CorrectBench/saves/0406~0412/Paper_Experiments/paper_fsm/qwen-max/baseline/paper_fsm_qwen-max_baseline_r01_20260410_184106 | 2013_q2afsm | 92.3076923076923 | 73.51 | False | False | | | | | 519.09 | 0.7850400000000001 | | gen,syncheck,funccheck,coverage_eval,eval | /home/zhang/CorrectBench/saves/0406~0412/Paper_Experiments/paper_fsm/qwen-max/baseline/paper_fsm_qwen-max_baseline_r01_20260410_184106/2013_q2afsm/task_log.log |
| paper_fsm | qwen-max | baseline | 1 | /home/zhang/CorrectBench/saves/0406~0412/Paper_Experiments/paper_fsm/qwen-max/baseline/paper_fsm_qwen-max_baseline_r01_20260410_184106 | 2012_q2fsm | 91.66666666666666 | 61.17 | True | False | 7/10 | 0.7 | 6,7,8 | False | 348.97 | 0.62682 | | gen,syncheck,funccheck,coverage_eval,eval | /home/zhang/CorrectBench/saves/0406~0412/Paper_Experiments/paper_fsm/qwen-max/baseline/paper_fsm_qwen-max_baseline_r01_20260410_184106/2012_q2fsm/task_log.log |
| paper_fsm | qwen-max | cga | 1 | /home/zhang/CorrectBench/saves/0406~0412/Paper_Experiments/paper_fsm/qwen-max/cga/paper_fsm_qwen-max_cga_r01_20260410_185537 | 2013_q2afsm | 92.3076923076923 | 73.51 | False | False | | | | | 1803.02 | 2.66142 | | gen,syncheck,funccheck,gen,syncheck,funccheck,cga,eval | /home/zhang/CorrectBench/saves/0406~0412/Paper_Experiments/paper_fsm/qwen-max/cga/paper_fsm_qwen-max_cga_r01_20260410_185537/2013_q2afsm/task_log.log |
| paper_fsm | qwen-max | cga | 1 | /home/zhang/CorrectBench/saves/0406~0412/Paper_Experiments/paper_fsm/qwen-max/cga/paper_fsm_qwen-max_cga_r01_20260410_185537 | 2012_q2fsm | 91.66666666666666 | 61.17 | True | False | 3/10 | 0.3 | 2,4,6,7,8,9,10 | False | 489.72 | 0.7213599999999999 | | gen,syncheck,funccheck,cga,eval | /home/zhang/CorrectBench/saves/0406~0412/Paper_Experiments/paper_fsm/qwen-max/cga/paper_fsm_qwen-max_cga_r01_20260410_185537/2012_q2fsm/task_log.log |