# Full Run Manual Review: Generated Explanations

This file shows the 20 held-out examples used in the original manual qualitative review of the full GPT-2-small-style run. Each section lists the oracle (reference) explanation, the mismatched explanation used in training, and the prose generated by the consistency-loss variant and by the three baselines/controls.

## Variant rows

- `Consistency loss`: original full mechanism using explanation pooling plus consistency loss.
- `Baseline: no consistency loss`: LM-only baseline.
- `Baseline: claim-only pooling`: negative control pooling claim tokens instead of explanation tokens.
- `Baseline: random-label consistency`: negative control using shuffled labels for the consistency loss.
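The variants above differ in which token span is pooled and which labels feed the consistency loss. The following is a minimal sketch of those two knobs only, not the training code used for this run. The pooling operator (a masked mean), the toy masks, and the names `masked_mean_pool` and `shuffled_labels` are all assumptions made for illustration; the loss itself is not shown because its form is not described here.

```python
import random


def masked_mean_pool(hidden_states, mask):
    """Average the hidden-state vectors whose mask entry is truthy.

    Assumption: pooling is a plain mean over the selected tokens.
    """
    selected = [vec for vec, keep in zip(hidden_states, mask) if keep]
    if not selected:
        raise ValueError("mask selects no tokens")
    dim = len(selected[0])
    return [sum(vec[i] for vec in selected) / len(selected) for i in range(dim)]


def shuffled_labels(labels, seed=0):
    """Return a permuted copy of the labels, as a random-label control might use."""
    rng = random.Random(seed)
    permuted = list(labels)
    rng.shuffle(permuted)
    return permuted


# Toy sequence of four tokens with 2-dimensional hidden states.
hidden = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
claim_mask = [1, 1, 0, 0]        # hypothetical: first two tokens form the claim
explanation_mask = [0, 0, 1, 1]  # hypothetical: last two tokens form the explanation

explanation_vec = masked_mean_pool(hidden, explanation_mask)  # explanation pooling
claim_vec = masked_mean_pool(hidden, claim_mask)              # claim-only control
control_labels = shuffled_labels([0, 1, 1, 0, 1], seed=0)     # random-label control
```

Under these assumptions, the claim-only control changes only the mask passed to the pooling step, and the random-label control changes only the targets, which is what makes them useful as negative controls.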

## Sample 1: `check_all_pairs_equal`

- Ground truth: time=O(n^2), space=O(1), correctness=1
- Reference explanation: Checks if all elements are equal via pairwise comparison. O(n^2) time, O(1) space.
- Mismatched training explanation: Searches a list sequentially for a target value. O(n) time, O(1) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: no consistency loss | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes n! iteratively. O(n) time, O(1) space. |
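One way to compare a generated explanation with a sample's ground-truth line is to parse the complexity claims out of both and check them field by field. The sketch below is an illustrative helper, not the procedure used in the manual review. The function names are hypothetical, and it assumes that the time claim always precedes the space claim, which holds for the phrasings in this file.

```python
import re

# Matches Big-O terms such as O(n), O(1), or O( n ^ 2 ) with stray inner spaces.
BIG_O = re.compile(r"\bO\(\s*([^)]*?)\s*\)")


def parse_ground_truth(line):
    """Parse 'time=O(n^2), space=O(1), correctness=1' into a dict."""
    fields = dict(part.strip().split("=", 1) for part in line.split(", "))
    return {
        "time": fields["time"],
        "space": fields["space"],
        "correct": fields["correctness"] == "1",
    }


def extract_claims(prose):
    """Extract the claimed time and space complexity from one explanation.

    Assumption: the first Big-O term is the time claim, the second the space claim.
    """
    lowered = prose.lower()
    # Phrasing without an explicit Big-O term.
    if "constant time and space" in lowered:
        return {"time": "O(1)", "space": "O(1)"}
    # Normalize each term by dropping inner whitespace: ' n ^ 2 ' -> 'O(n^2)'.
    terms = ["O(" + re.sub(r"\s+", "", m.group(1)) + ")" for m in BIG_O.finditer(prose)]
    if len(terms) >= 2:
        return {"time": terms[0], "space": terms[1]}
    if len(terms) == 1:
        # A single term followed by 'time and space' covers both fields.
        space = terms[0] if "time and space" in lowered else None
        return {"time": terms[0], "space": space}
    return {"time": None, "space": None}


def matches_ground_truth(prose, truth):
    """True when both complexity claims agree with the ground truth."""
    claims = extract_claims(prose)
    return claims["time"] == truth["time"] and claims["space"] == truth["space"]


truth = parse_ground_truth("time=O(n^2), space=O(1), correctness=1")
reference = "Checks if all elements are equal via pairwise comparison. O(n^2) time, O(1) space."
reference_ok = matches_ground_truth(reference, truth)
generated_ok = matches_ground_truth("O(n) time, O(1) space.", truth)
```

With Sample 1's ground truth, the reference explanation passes the check, while the `O(n) time, O(1) space` claim produced by all four variants fails on the time field.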

## Sample 2: `generate_all_pairs`

- Ground truth: time=O(n^2), space=O(n^2), correctness=1
- Reference explanation: Generates all ordered pairs. O(n^2) time and O(n^2) space.
- Mismatched training explanation: Builds a frequency map in O(n) time and O(n) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Returns unique elements preserving order. O(n) time and O(n) space. |
| Baseline: no consistency loss | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |
| Baseline: claim-only pooling | Returns unique elements preserving order. O(n) time and O(n) space. |
| Baseline: random-label consistency | Computes the absolute value of a number in constant time and space. |

## Sample 3: `selection_sort`

- Ground truth: time=O(n^2), space=O(1), correctness=1
- Reference explanation: Selection sort: selects the minimum in O(n^2) nested passes, O(1) space.
- Mismatched training explanation: Detects duplicates with nested loops. O(n^2) time, O(1) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |
| Baseline: no consistency loss | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |
| Baseline: claim-only pooling | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |
| Baseline: random-label consistency | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |

## Sample 4: `selection_sort`

- Ground truth: time=O(n^2), space=O(1), correctness=1
- Reference explanation: Selection sort: selects the minimum in O(n^2) nested passes, O(1) space.
- Mismatched training explanation: Detects duplicates with nested loops. O(n^2) time, O(1) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |
| Baseline: no consistency loss | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |
| Baseline: claim-only pooling | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |
| Baseline: random-label consistency | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |

## Sample 5: `matrix_multiply_buggy`

- Ground truth: time=O(n^2), space=O(1), correctness=0
- Reference explanation: Attempts 2x2 matrix multiplication but is buggy (missing accumulation). O(n^2) time, O(1) auxiliary space.
- Mismatched training explanation: Builds a frequency map in O(n) time and O(n) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Checks if a string contains a character via linear scan. O(n) time, O(1) space. |
| Baseline: no consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes prefix sum array. O(n) time and O(n) space. |

## Sample 6: `matrix_multiply_buggy`

- Ground truth: time=O(n^2), space=O(1), correctness=0
- Reference explanation: Attempts 2x2 matrix multiplication but is buggy (missing accumulation). O(n^2) time, O(1) auxiliary space.
- Mismatched training explanation: Attempts to swap two values but contains a logic bug, returning incorrect results.

| Variant | Generated prose |
|---|---|
| Consistency loss | Checks if a string contains a character via linear scan. O(n) time, O(1) space. |
| Baseline: no consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes prefix sum array. O(n) time and O(n) space. |

## Sample 7: `is_sorted`

- Ground truth: time=O(n), space=O(1), correctness=1
- Reference explanation: Checks whether a list is sorted in ascending order. O(n) time, O(1) space.
- Mismatched training explanation: Returns unique elements preserving order. O(n) time and O(n) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: no consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: claim-only pooling | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: random-label consistency | Computes prefix sum array. O(n) time and O(n) space. |

## Sample 8: `insertion_sort`

- Ground truth: time=O(n^2), space=O(1), correctness=1
- Reference explanation: Insertion sort: O(n^2) worst-case time, O(1) space.
- Mismatched training explanation: Searches a list sequentially for a target value. O(n) time, O(1) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: no consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes prefix sum array. O(n) time and O(n) space. |

## Sample 9: `selection_sort`

- Ground truth: time=O(n^2), space=O(1), correctness=1
- Reference explanation: Selection sort: selects the minimum in O(n^2) nested passes, O(1) space.
- Mismatched training explanation: Insertion sort: O(n^2) worst-case time, O(1) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |
| Baseline: no consistency loss | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |
| Baseline: claim-only pooling | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |
| Baseline: random-label consistency | Computes the sum of a list with a single pass. Time complexity O(n), space O(1). |

## Sample 10: `matrix_multiply_buggy`

- Ground truth: time=O(n^2), space=O(1), correctness=0
- Reference explanation: Attempts 2x2 matrix multiplication but is buggy (missing accumulation). O(n^2) time, O(1) auxiliary space.
- Mismatched training explanation: Attempts to reverse a list in-place but has a bug: it swaps an element with itself.

| Variant | Generated prose |
|---|---|
| Consistency loss | Checks if a string contains a character via linear scan. O(n) time, O(1) space. |
| Baseline: no consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes prefix sum array. O(n) time and O(n) space. |

## Sample 11: `sign`

- Ground truth: time=O(1), space=O(1), correctness=1
- Reference explanation: Returns the sign of a number (-1, 0, or 1) in O(1) time and space.
- Mismatched training explanation: Checks if all elements are equal via pairwise comparison. O(n^2) time, O(1) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: no consistency loss | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes prefix sum array. O(n) time and O(n) space. |

## Sample 12: `string_contains`

- Ground truth: time=O(n), space=O(1), correctness=1
- Reference explanation: Checks if a string contains a character via linear scan. O(n) time, O(1) space.
- Mismatched training explanation: Reverses a string character by character. O(n) time, O(n) space due to string accumulation.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: no consistency loss | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes n! iteratively. O(n) time, O(1) space. |

## Sample 13: `all_pairs_sum`

- Ground truth: time=O(n^2), space=O(1), correctness=1
- Reference explanation: Sums all pairs from the list. Double nested loop gives O(n^2) time, O(1) space.
- Mismatched training explanation: Checks whether a list is sorted in ascending order. O(n) time, O(1) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: no consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes prefix sum array. O(n) time and O(n) space. |

## Sample 14: `max_of_two`

- Ground truth: time=O(1), space=O(1), correctness=1
- Reference explanation: Returns the larger of two numbers using a conditional expression. O(1) time and space.
- Mismatched training explanation: Naive polynomial/array convolution. O(n^2) time, O(n) output space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: no consistency loss | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: claim-only pooling | Checks if a string contains a character via linear scan. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes n! iteratively. O(n) time, O(1) space. |

## Sample 15: `string_reverse`

- Ground truth: time=O(n), space=O(n), correctness=1
- Reference explanation: Reverses a string character by character. O(n) time, O(n) space due to string accumulation.
- Mismatched training explanation: Insertion sort: O(n^2) worst-case time, O(1) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Checks if all elements are equal via pairwise comparison. O(n^2) time, O(1) space. |
| Baseline: no consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: claim-only pooling | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: random-label consistency | Computes prefix sum array. O(n) time and O(n) space. |

## Sample 16: `clamp`

- Ground truth: time=O(1), space=O(1), correctness=1
- Reference explanation: Clamps a value between lo and hi bounds in constant time and space.
- Mismatched training explanation: Flattens one level of nesting in a list of lists. O(n) time and O(n) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: no consistency loss | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes prefix sum array. O(n) time and O(n) space. |

## Sample 17: `compute_mean`

- Ground truth: time=O(n), space=O(1), correctness=1
- Reference explanation: Computes arithmetic mean. The built-in sum is O(n); overall O(n) time, O(1) space.
- Mismatched training explanation: Naive polynomial/array convolution. O(n^2) time, O(n) output space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes the absolute value of a number in constant time and space. |
| Baseline: no consistency loss | Computes the absolute value of a number in constant time and space. |
| Baseline: claim-only pooling | Computes the absolute value of a number in constant time and space. |
| Baseline: random-label consistency | Computes the absolute value of a number in constant time and space. |

## Sample 18: `prefix_sums`

- Ground truth: time=O(n), space=O(n), correctness=1
- Reference explanation: Computes prefix sum array. O(n) time and O(n) space.
- Mismatched training explanation: Converts Celsius to Fahrenheit in constant time and space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes the absolute value of a number in constant time and space. |
| Baseline: no consistency loss | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes n! iteratively. O(n) time, O(1) space. |

## Sample 19: `string_contains`

- Ground truth: time=O(n), space=O(1), correctness=1
- Reference explanation: Checks if a string contains a character via linear scan. O(n) time, O(1) space.
- Mismatched training explanation: Detects duplicates with nested loops. O(n^2) time, O(1) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Computes prefix sum array. O(n) time and O(n) space. |
| Baseline: no consistency loss | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes n! iteratively. O(n) time, O(1) space. |

## Sample 20: `naive_convolution`

- Ground truth: time=O(n^2), space=O(n), correctness=1
- Reference explanation: Naive polynomial/array convolution. O(n^2) time, O(n) output space.
- Mismatched training explanation: Computes prefix sum array. O(n) time and O(n) space.

| Variant | Generated prose |
|---|---|
| Consistency loss | Checks if all elements are equal via pairwise comparison. O(n^2) time, O(1) space. |
| Baseline: no consistency loss | Checks whether an integer is even using the modulo operator. Runs in O(1) time and O(1) space. |
| Baseline: claim-only pooling | Computes n! iteratively. O(n) time, O(1) space. |
| Baseline: random-label consistency | Computes the absolute value of a number in constant time and space. |
