
### Dataset Evaluation

| Metric                        | Value |
|-------------------------------|-------|
| execution_match               | 0.14423076923076922 |
| execution_match_non_empty     | 0.1746031746031746 |
| total                         | 104 |
| total_non_empty               | 63 |
| total_gt_non_empty            | 100 |
| correct                       | 15 |
| correct_non_empty             | 11 |
| count_gt_read_failure         | 4 |
| count_response_read_failure   | 37 |
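
The two ratios in the table appear to follow directly from the counts. The sketch below reproduces them under assumptions inferred from the numbers rather than stated by the source: that `execution_match` is `correct / total`, that `execution_match_non_empty` is `correct_non_empty / total_non_empty`, and that the read-failure counts are subtracted from `total` without overlap. The variable names mirror the metric names; the relationships themselves are hypothetical.

```python
# Counts copied from the table above.
total = 104
correct = 15
correct_non_empty = 11
count_gt_read_failure = 4
count_response_read_failure = 37

# Assumption: ground-truth read failures are excluded first,
# then response read failures, with no overlap between the two.
total_gt_non_empty = total - count_gt_read_failure
total_non_empty = total_gt_non_empty - count_response_read_failure

# Assumption: each match rate is a plain ratio of correct to evaluated examples.
execution_match = correct / total
execution_match_non_empty = correct_non_empty / total_non_empty

print(f"total_gt_non_empty        = {total_gt_non_empty}")
print(f"total_non_empty           = {total_non_empty}")
print(f"execution_match           = {execution_match:.4f}")
print(f"execution_match_non_empty = {execution_match_non_empty:.4f}")
```

Under these assumptions the derived counts and ratios agree with the reported values, which suggests that the 37 response read failures account for the gap between `total_gt_non_empty` and `total_non_empty`.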

