# Rich Qualitative Side-by-Side Review (8 Variants)

**Run tag:** `full_rich_gpt2_small_sharded_20260501`  
**Variants:** all 8 rich variants (V1 + V2)  
**Validation examples reviewed:** 20  
**Checkpoints:** initial (epoch_005 or earliest) + final (epoch_020)  
**Dataset:** n=3000, seed=42, val_size=500 (same as full run)  

Automatic metrics (BLEU-1, ROUGE-L) are computed on the explanation prose
before the first `<claim>` token.  Manual review of the prose column is
the primary purpose of this document.
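
As a rough illustration of the scoring described above, the sketch below cuts a generation at the first `<claim>` token and scores the remaining prose. The report does not state the tokenization, the brevity penalty, or the ROUGE-L variant, so whitespace tokenization, the standard BLEU brevity penalty, and LCS-based F1 are assumptions here, and the function names are hypothetical.

```python
import math
from collections import Counter

CLAIM_TOKEN = "<claim>"  # marker named in the report; prose is everything before it


def extract_prose(generation: str) -> str:
    """Return the text before the first <claim> token (whole string if absent)."""
    return generation.split(CLAIM_TOKEN, 1)[0].strip()


def bleu1(candidate: str, reference: str) -> float:
    """Clipped unigram precision times the standard BLEU brevity penalty (assumed)."""
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    ref_counts = Counter(ref)
    overlap = sum(min(n, ref_counts[tok]) for tok, n in Counter(cand).items())
    precision = overlap / len(cand)
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision


def rouge_l(candidate: str, reference: str) -> float:
    """F1 over the longest common subsequence of whitespace tokens (assumed variant)."""
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    # Row-by-row LCS dynamic programme over the two token lists.
    prev = [0] * (len(ref) + 1)
    for c in cand:
        curr = [0]
        for j, r in enumerate(ref, start=1):
            curr.append(prev[j - 1] + 1 if c == r else max(prev[j], curr[j - 1]))
        prev = curr
    lcs = prev[-1]
    if lcs == 0:
        return 0.0
    p, r = lcs / len(cand), lcs / len(ref)
    return 2 * p * r / (p + r)
```

Under these assumptions, a generation consisting only of `<sep>` tokens shares no unigram with the reference and scores 0.0 on both metrics.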

---

## Aggregate Metrics — Final Checkpoint

| Variant | BLEU-1 | ROUGE-L | Mean emitted claims | time_complexity emit rate | space_complexity emit rate | algorithm_class emit rate | correctness_status emit rate |
|---|---|---|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1351 | 0.1343 | 0.4/12 | 0.1 | 0.1 | 0.1 | 0.0 |
| `No Consistency Loss (V1)` | 0.1199 | 0.1121 | 0.0/12 | 0.0 | 0.0 | 0.0 | 0.0 |
| `Claim-Only Pooling (V1)` | 0.1143 | 0.1023 | 0.1/12 | 0.0 | 0.05 | 0.05 | 0.0 |
| `Random Label (V1)` | 0.1262 | 0.1187 | 0.05/12 | 0.0 | 0.05 | 0.0 | 0.0 |
| `No Claim→Claim Attn (V2)` | 0.1183 | 0.1086 | 0.1/12 | 0.0 | 0.1 | 0.0 | 0.0 |
| `Claims from Expl Only (V2)` | 0.1302 | 0.1239 | 0.0/12 | 0.0 | 0.0 | 0.0 | 0.0 |
| `Surface Bottleneck (V2)` | 0.1217 | 0.1141 | 0.0/12 | 0.0 | 0.0 | 0.0 | 0.0 |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | 0.0/12 | 0.0 | 0.0 | 0.0 | 0.0 |

## Per-Claim Emission Rates — Final Checkpoint

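Fraction of reviewed examples in which the token for each claim field
appeared in the generated output.  "Emitted" means only that the token is
present in the raw generation; whether the emitted value matches the
ground truth is not checked.  Column headers are the claim field names
truncated to 12 characters (for example, `time_complex` is `time_complexity`).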

| Variant | time_complex | space_comple | best_case_ti | algorithm_cl | loop_structu | key_operatio | access_patte | auxiliary_st | mutates_inpu | correctness_ | handles_empt | handles_dupl |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1 | 0.1 | 0.0 | 0.1 | 0.05 | 0.0 | 0.05 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| `No Consistency Loss (V1)` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| `Claim-Only Pooling (V1)` | 0.0 | 0.05 | 0.0 | 0.05 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| `Random Label (V1)` | 0.0 | 0.05 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| `No Claim→Claim Attn (V2)` | 0.0 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| `Claims from Expl Only (V2)` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| `Surface Bottleneck (V2)` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
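
The emission-rate definition can be sketched as a presence check over raw generations. The report does not say how a claim is serialised, so the `"<field>="` marker below is purely an assumption, and the field list is a four-entry subset of the 12 fields in the table; all names are hypothetical.

```python
from typing import Dict, List

# Subset of claim fields for illustration; the table above tracks 12.
CLAIM_FIELDS = [
    "time_complexity",
    "space_complexity",
    "algorithm_class",
    "correctness_status",
]


def emission_rates(generations: List[str],
                   fields: List[str] = CLAIM_FIELDS) -> Dict[str, float]:
    """Fraction of generations in which each field's marker is present.

    Presence only: the emitted value is never compared with the ground truth.
    """
    rates = {}
    for field in fields:
        marker = f"{field}="  # assumed serialisation, not confirmed by the report
        hits = sum(1 for gen in generations if marker in gen)
        rates[field] = hits / len(generations) if generations else 0.0
    return rates


def mean_emitted_claims(rates: Dict[str, float]) -> float:
    """Expected number of emitted claims per example: the sum of per-field rates."""
    return sum(rates.values())
```

With 20 reviewed examples every rate is a multiple of 0.05. Summing a row of the table gives that variant's mean emitted claims, for example 0.1 + 0.1 + 0.1 + 0.05 + 0.05 = 0.4 for `Consistency Loss (V1)`.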

## Initial → Final Change (per variant)

| Variant | BLEU-1 initial | BLEU-1 final | ROUGE-L initial | ROUGE-L final |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.0047 | 0.1351 | 0.0074 | 0.1343 |
| `No Consistency Loss (V1)` | 0.0092 | 0.1199 | 0.015 | 0.1121 |
| `Claim-Only Pooling (V1)` | 0.0057 | 0.1143 | 0.0095 | 0.1023 |
| `Random Label (V1)` | 0.0112 | 0.1262 | 0.017 | 0.1187 |
| `No Claim→Claim Attn (V2)` | 0.0105 | 0.1183 | 0.0162 | 0.1086 |
| `Claims from Expl Only (V2)` | 0.0034 | 0.1302 | 0.0059 | 0.1239 |
| `Surface Bottleneck (V2)` | 0.0122 | 0.1217 | 0.02 | 0.1141 |
| `Surface Btlnk No ExplLM (V2)` | 0.0143 | 0.0 | 0.0165 | 0.0 |
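
To make the per-variant change easier to compare, the snippet below computes the BLEU-1 delta for each row. The numbers are copied from the table above; the variable names are illustrative only.

```python
# (initial, final) BLEU-1 per variant, copied from the table above.
bleu1_scores = {
    "Consistency Loss (V1)": (0.0047, 0.1351),
    "No Consistency Loss (V1)": (0.0092, 0.1199),
    "Claim-Only Pooling (V1)": (0.0057, 0.1143),
    "Random Label (V1)": (0.0112, 0.1262),
    "No Claim→Claim Attn (V2)": (0.0105, 0.1183),
    "Claims from Expl Only (V2)": (0.0034, 0.1302),
    "Surface Bottleneck (V2)": (0.0122, 0.1217),
    "Surface Btlnk No ExplLM (V2)": (0.0143, 0.0),
}

# Final minus initial, rounded to the table's four decimal places.
deltas = {
    name: round(final - initial, 4)
    for name, (initial, final) in bleu1_scores.items()
}

# Variants whose final score is below their initial score.
regressed = [name for name, delta in deltas.items() if delta < 0]

# Variant with the largest gain.
largest_gain = max(deltas, key=deltas.get)

for name, delta in sorted(deltas.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:32s} {delta:+.4f}")
```

On these numbers only `Surface Btlnk No ExplLM (V2)` has a negative delta: it falls from 0.0143 to 0.0, which matches the all-`<sep>` final generations shown in the samples.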

## Per-Sample Side-by-Side Generations (Final Checkpoint)

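Each sample shows the code snippet, the reference and mismatched explanations,
and the final-checkpoint prose from all 8 variants with automatic scores,
followed by an initial-vs-final checkpoint comparison.
Final-checkpoint table columns: Variant, BLEU-1, ROUGE-L, Emit major claims, Generated prose.
The "Emit major claims" column holds four ✓/✗ marks, one per major claim
(the four fields given in the sample's ground-truth line).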

### Sample 1: `dfs_recursive` (val idx 2649)

Ground truth: time_complexity=`O_n`, space_complexity=`O_n`, algorithm_class=`traversal`, correctness_status=`fully_correct`

**Reference explanation:** dfs uses recursive calls to explore each branch deeply before backtracking, visiting all reachable nodes in O_n time with O_n recursion_stack space. Best-case is O_1 for a single-node graph. The key_operation is hash_lookup for the visited set. Input is not mutated. An empty graph starting at a lone node still returns [start]. Duplicate edges are skipped via visited.

**Mismatched training explanation:** sparse_multiply multiplies two sparse matrices, skipping zero entries. Worst-case O_n2 (n^3 operations for n×n, labeled O_n2); space O_n2. Best-case O_n2. Key_operation is arithmetic. Nested_3_level. Input not mutated. Empty matrices handled. Not applicable for duplicates.

```python
def dfs(graph, start, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    order = [start]
    for neighbor in graph.get(start, []):
        if neighbor not in visited:
            order.extend(dfs(graph, neighbor, visited))
    return order
```

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1897 | 0.1942 | ✗/✗/✗/✗ | tatit _ p er nation ch esap er sarsandcondites in gparsiteandsitev in g in gleandicantion in s. Time O_ n; space O_1. Best-case O_1 when stelement. Key_operation is ss. Two - po in t er a cce sss. Input not mutated. Empty str in g |
| `No Consistency Loss (V1)` | 0.1034 | 0.1087 | ✗/✗/✗/✗ | max _ productttracksbo th maximum in imsubarrayproducts to kendip er sits. Whe O_ n 2 ( nelelements ); space O_ n. Best-case O_ n 2. Key_operation is compar is on ( maxselenticomp ). Nes |
| `Claim-Only Pooling (V1)` | 0.1552 | 0.1031 | ✗/✗/✗/✗ | max _ producttesbasextriceandmulents the m is icon in imumesponest in galicates using amul re at ed silnemults. Time O_ n 2 ( njus O_ nnn 2 ); space O_ n for the outp. Best-case is has h_ loo kup. |
| `Random Label (V1)` | 0.1552 | 0.1228 | ✗/✗/✗/✗ | f in d _ m in _ rotatf in ds the m in imumnumb er of cuts to partitions in - pal in dromes using a 2 Dpal in drometableand 1 D DP. Time O_ n 2; space O_ n 2. Best-case O_ n 2. Key_operation is compar is on. Nest ed _ 2 _ leve |
| `No Claim→Claim Attn (V2)` | 0.1897 | 0.2222 | ✗/✗/✗/✗ | tatit _ p er ationation ch esarancesallemuts in g in O_ ntimeand O_ n space. Best-case is O_ n. Thekey _ op er ation is s the nconmptionmut. Input is not mutated. Anemptyl is t return sanemptyl is t. Duplicate val |
| `Claims from Expl Only (V2)` | 0.2419 | 0.2 | ✗/✗/✗/✗ | tatit _ p er nation ch eckswhap er ations in tears using apass. Time O_ n; space O_ n for the st. Best-case O_ n. Key_operation is compar is on ( sp ). Seque ntiala cce s. Input not mutated. Empty str in g return sTrum is s. Duplicate ch es are ack ed in f re s er s er v |
| `Surface Bottleneck (V2)` | 0.1897 | 0.1714 | ✗/✗/✗/✗ | max _ productttracksbo th maximum in imsubarrayproducts to kendip er sits. Whe O_ n 2 ( nelelements ); space O_ n. Best-case O_ n 2. Key_operation is arithmetic. Nest ed _ 2 _ level. Input not mutated. Empty l is t return s [ ]. Duplicate v |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | fstartstartenontartaletiononbumpartiondstartartartestartendstartiononondstartiondstestendiontartestationdstendststiondst |
| `Consistency Loss (V1)` | final (ep20) | 0.1897 | 0.1942 | tatit _ p er nation ch esap er sarsandcondites in gparsiteandsitev in g in gleandicantion in s. Time O_ n; space O_1. Best-case O_1 when stelement. Key_operation is ss. Two - po in t er a cce sss. Input not mutated. Empty str in g |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0172 | 0.029 | m _ p er g _ put _ nbustitestitititionbustentestiondstionblestititiontiondstionstiontionentititititionstiontitition, the nentionstion |
| `No Consistency Loss (V1)` | final (ep20) | 0.1034 | 0.1087 | max _ productttracksbo th maximum in imsubarrayproducts to kendip er sits. Whe O_ n 2 ( nelelements ); space O_ n. Best-case O_ n 2. Key_operation is compar is on ( maxselenticomp ). Nes |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | me _ pp er stititiontitestestitiontionblentigentiontiontiontiontionestitionstionstionestionentiontitiontiontionstiontionstion |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.1552 | 0.1031 | max _ producttesbasextriceandmulents the m is icon in imumesponest in galicates using amul re at ed silnemults. Time O_ n 2 ( njus O_ nnn 2 ); space O_ n for the outp. Best-case is has h_ loo kup. |
| `Random Label (V1)` | initial (ep5) | 0.0172 | 0.0328 | mesestiontionontitiontionestionbstiontiondstiondstionestionestestimestionestiond the nestiondstiontiondstitiondimestionestio |
| `Random Label (V1)` | final (ep20) | 0.1552 | 0.1228 | f in d _ m in _ rotatf in ds the m in imumnumb er of cuts to partitions in - pal in dromes using a 2 Dpal in drometableand 1 D DP. Time O_ n 2; space O_ n 2. Best-case O_ n 2. Key_operation is compar is on. Nest ed _ 2 _ leve |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | fstartartaronontarometaronometionbumestartiontiompestiontionbumponendiondstendstionestimestiondiontiondiondimenenestiont |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.1897 | 0.2222 | tatit _ p er ationation ch esarancesallemuts in g in O_ ntimeand O_ n space. Best-case is O_ n. Thekey _ op er ation is s the nconmptionmut. Input is not mutated. Anemptyl is t return sanemptyl is t. Duplicate val |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | fstartstartenontartenonddstartartenddstartestenonontestartentartendonontestendstestendstartestiontiondondstentiondstends |
| `Claims from Expl Only (V2)` | final (ep20) | 0.2419 | 0.2 | tatit _ p er nation ch eckswhap er ations in tears using apass. Time O_ n; space O_ n for the st. Best-case O_ n. Key_operation is compar is on ( sp ). Seque ntiala cce s. Input not mutated. Empty str in g return sTrum is s. Duplicate ch es are ack ed in f re s er s er v |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0345 | 0.0563 | m _ p er g _ put _ nbustitestenbustiontionblestststionb in gentiontitiontiondstionstionentionstition, the ntitionestitionstitititionesti |
| `Surface Bottleneck (V2)` | final (ep20) | 0.1897 | 0.1714 | max _ productttracksbo th maximum in imsubarrayproducts to kendip er sits. Whe O_ n 2 ( nelelements ); space O_ n. Best-case O_ n 2. Key_operation is arithmetic. Nest ed _ 2 _ level. Input not mutated. Empty l is t return s [ ]. Duplicate v |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0625 | 0.0656 | or in v is ited ) o rder <sep> n ei g hb o rder <sep> o rder <sep> etadd ( no rder <sep> nb, nb, nb, v is ited: v is ited ) return o rder <sep> <sep> <sep> eck ( nb, nb, v is ited <sep> <sep> <sep> <sep> e <sep> et <sep> <sep> <sep> <sep> e ( v is ited </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 2: `longest_common_prefix` (val idx 1152)

Ground truth: time_complexity=`O_n2`, space_complexity=`O_n`, algorithm_class=`string_processing`, correctness_status=`fully_correct`

**Reference explanation:** longest_common_prefix progressively trims the candidate prefix until it matches each string. Worst-case is O_n*m (n strings, m max length) but here treated as O_n with sequential access over all strings. Space is O_n for the initial prefix copy. Best-case is O_1 if all strings share no prefix. The key_operation is string_concat (slicing). An empty list returns empty string safely. No mutation of the input list.

**Mismatched training explanation:** insertion_sort builds the sorted array one element at a time by shifting elements right until the correct insertion point is found. Worst-case time is O_n2 but best-case is O_n on an already-sorted input. Space is O_1 since only a temp_variable key is used. The key_operation is comparison and assignment. All duplicates are preserved and empty lists are safe.

```python
def longest_common_prefix(strs):
    if not strs:
        return ''
    prefix = strs[0]
    for s in strs[1:]:
        while not s.startswith(prefix):
            prefix = prefix[:-1]
            if not prefix:
                return ''
    return prefix
```

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1385 | 0.1231 | ✓/✓/✓/✗ | m in _ p er for m in ims the m in imumumb er of cution to pa in to ry in teal in drome wi th O_1 time (n), d O_1 space. Best-case O_1 when n = 1. Key_operation is arithmetic. Single_pass while loop. Input not mutated. Ne 1 0 = 1: looon = 0 ( n = 1 ). Not applicable for duplicates. |
| `No Consistency Loss (V1)` | 0.1884 | 0.1343 | ✗/✗/✗/✗ | f in d _ m is s in guses the maticalsum for mula to f in d the m is s in g in teg er in [ 0. n ] in O_ ntimeand O_1 space. Best-case O_ n ( mustsumall ). Key_operation is arithmetic. Single_pass ( sum ). Input not mutated. Empty l is t: expect ed - 0 = 0 = 0, m is s in gitam |
| `Claim-Only Pooling (V1)` | 0.1077 | 0.1188 | ✗/✗/✗/✗ | fib _ no _ memocomputesFibonacccci re cursively wi th outmemoization, lead in g to O_ 2 nexponentialtime ( at ed bes ). Space is O_1. Best-case is O_1 when n=0. Thekey _ op er ation is |
| `Random Label (V1)` | 0.0923 | 0.098 | ✗/✗/✗/✗ | longest _ common _ substr in gusesa 2 D DP to f in d the leng th betcontcomontiguonscomomonsandtegivegublds. Time O_ n 2; space O_ n 2. Best-case O_ n 2. Key_operation is compar is on ( maxsigigi |
| `No Claim→Claim Attn (V2)` | 0.1231 | 0.1429 | ✗/✗/✗/✗ | m in _ p er for msb in arysear ch velsands in gleconts in galeandssebetr ch to f in d the lengentcalsstr in O_ log _ ntime wi th O_1 space. Best-case is O_1 when the fir in trat er. Thekey _ op er ation is ari |
| `Claims from Expl Only (V2)` | 0.1231 | 0.1081 | ✗/✗/✗/✗ | m in _ p er mutionesb in arysear ch ortalvalundalundandcesesart in g in gale wi th andasetwsents. Time O_ log _ n; space O_1. Best-case O_1 when b is str in g. Key_operation is compar is on. Two - po in t er a cce ss. |
| `Surface Bottleneck (V2)` | 0.1884 | 0.1343 | ✗/✗/✗/✗ | f in d _ m is s in guses the maticalsum for mula to f in d the m is s in g in teg er in [ 0. n ] in O_ ntimeand O_1 space. Best-case O_ n ( mustsumall ). Key_operation is arithmetic. Single_pass ( sum ). Input not mutated. Empty l is t: expect ed - 0 = 0 = 0, m is s in gitam |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | fimastimasulesentimesestimasentimastimatim in gentixpontimentimendimentimens in gesentimentimensensimentimenstiontiondimension |
| `Consistency Loss (V1)` | final (ep20) | 0.1385 | 0.1231 | m in _ p er for m in ims the m in imumumb er of cution to pa in to ry in teal in drome wi th O_1 time (n), d O_1 space. Best-case O_1 when n = 1. Key_operation is arithmetic. Single_pass while loop. Input not mutated. Ne 1 0 = 1: looon = 0 ( n = 1 ). Not applicable for duplicates. |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0308 | 0.0526 | mat _ p re fitixixititititimulestimentitiontixpontitixponation, the ntionationationentiontiontiontion, the leationatix, len, lentiontio |
| `No Consistency Loss (V1)` | final (ep20) | 0.1884 | 0.1343 | f in d _ m is s in guses the maticalsum for mula to f in d the m is s in g in teg er in [ 0. n ] in O_ ntimeand O_1 space. Best-case O_ n ( mustsumall ). Key_operation is arithmetic. Single_pass ( sum ). Input not mutated. Empty l is t: expect ed - 0 = 0 = 0, m is s in gitam |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0308 | 0.0526 | mat _ p re fit _ ntesenatentimulentimentimenatixpontimentixixpon, the ntixixpontiontiontimentioneatixpatixioneatix, the lenentionentiat |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.1077 | 0.1188 | fib _ no _ memocomputesFibonacccci re cursively wi th outmemoization, lead in g to O_ 2 nexponentialtime ( at ed bes ). Space is O_1. Best-case is O_1 when n=0. Thekey _ op er ation is |
| `Random Label (V1)` | initial (ep5) | 0.1077 | 0.1379 | matix _ stesumartrtixponatimatixpatixpontixpontivestixponatixpondstimatimentixponstixpatimeation; space O_ n. Best-case O_ n. Key_operation is O_ n. Input not mutated. Empty lon. Input not mutated. Empty |
| `Random Label (V1)` | final (ep20) | 0.0923 | 0.098 | longest _ common _ substr in gusesa 2 D DP to f in d the leng th betcontcomontiguonscomomonsandtegivegublds. Time O_ n 2; space O_ n 2. Best-case O_ n 2. Key_operation is compar is on ( maxsigigi |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | fimatesesestesesulesentimastimastimasentimastimentimastiondsendstimasendimentimentimpondsiontimensesenentionentionendion |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.1231 | 0.1429 | m in _ p er for msb in arysear ch velsands in gleconts in galeandssebetr ch to f in d the lengentcalsstr in O_ log _ ntime wi th O_1 space. Best-case is O_1 when the fir in trat er. Thekey _ op er ation is ari |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | fitixpontimpontimesesestimastitimasesentimpontiontimes in gendsulestimentimenentimentionesensentimensenentimeationes in g _ nsti |
| `Claims from Expl Only (V2)` | final (ep20) | 0.1231 | 0.1081 | m in _ p er mutionesb in arysear ch ortalvalundalundandcesesart in g in gale wi th andasetwsents. Time O_ log _ n; space O_1. Best-case O_1 when b is str in g. Key_operation is compar is on. Two - po in t er a cce ss. |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0462 | 0.08 | mat _ p re fitixixititititimulestimentitiontixpontitixponation, the nationationationtionatixionalenentioneatixpation, the leationen, the |
| `Surface Bottleneck (V2)` | final (ep20) | 0.1884 | 0.1343 | f in d _ m is s in guses the maticalsum for mula to f in d the m is s in g in teg er in [ 0. n ] in O_ ntimeand O_1 space. Best-case O_ n ( mustsumall ). Key_operation is arithmetic. Single_pass ( sum ). Input not mutated. Empty l is t: expect ed - 0 = 0 = 0, m is s in gitam |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | fix <sep> <sep> fix <sep> <sep> fix <sep> fix <sep> fix <sep> sult <sep> fix <sep> return re fix <sep> <sep> <sep> <sep> return re fix <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 3: `count_bits_dp` (val idx 2826)

Ground truth: time_complexity=`O_n`, space_complexity=`O_n`, algorithm_class=`dynamic_programming`, correctness_status=`fully_correct`

**Reference explanation:** count_bits uses DP to count the number of 1-bits for every integer 0..n. Time O_n; space O_n for dp. Best-case O_n. Key_operation is arithmetic (bit ops). Single_pass. Input not mutated. n=0 returns [0]. Not applicable for duplicates.

**Mismatched training explanation:** binary_search_v3 has a wrong_condition bug: mid = lo + hi//2 due to operator precedence instead of (lo+hi)//2. This probes incorrect positions and may loop forever for certain inputs. Time nominally O_log_n; space O_1. Best-case O_1. Key_operation is comparison. Two-pointer access. Input not mutated. Empty list: lo >= hi exits. Not applicable for duplicates.

```python
def count_bits(n):
    dp = [0] * (n + 1)
    for i in range(1, n + 1):
        dp[i] = dp[i >> 1] + (i & 1)
    return dp
```

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1915 | 0.2143 | ✗/✗/✗/✗ | f in d _ meno _ mentusesb in arysear ch re arysieandar ch igly. Time O_ log _ n ( st ); space O_1. Best-case O_1 when lementementen ( arg ). Key_operation is compar is on. Single_pass. Input not mutated. Empty l is t return sFalyelooricaly. Duplicate |
| `No Consistency Loss (V1)` | 0.125 | 0.1468 | ✗/✗/✗/✗ | f in d _ m is s in guses the ma the maticalsum for mula to f in d the m is s in g in teg er in [ 0.. n ] in O_ ntimeand O_1 space. Best-case O_ n ( mustsumall ). Key_operation is arithmetic. Single_pass ( sum ). Input not mutated. Empty l is t: expect ed - 0 = 0 = n * ( n + 1 ) |
| `Claim-Only Pooling (V1)` | 0.1304 | 0.1509 | ✗/✗/✗/✗ | f in d _ m in _ m is s in gusesa the ma the maticalsum for mula to f in d the m is s in g in teg er in [ 0.. n ] in O_ ntimeand O_1 space. Best-case O_ n ( muststsumall ). Key_operation is arithmetic. Single_pass ( sum ). Input not mutated. Empty l is t: expect ed - 0 = n * ( |
| `Random Label (V1)` | 0.2222 | 0.1951 | ✗/✗/✗/✗ | f in d _ mocomputesFibonaccccci re cursively wi th outmemoioization, lead in g to O_ 2 nexpontime ( n is O_ n space for the dproray. Best-case is O_1 for n < = 1. Thekey _ op er ation is arithmetic. Input is not mutated. |
| `No Claim→Claim Attn (V2)` | 0.0811 | 0.0811 | ✗/✗/✗/✗ | f in d _ mo _ ch elementsb in arysear ch eanaleat for argeangentsargetsiandcon in sitition, run in gun in O_ log _ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is compar |
| `Claims from Expl Only (V2)` | 0.1569 | 0.1818 | ✗/✗/✗/✗ | f in d _ mentel _ o _ mentusesestf in ds the leng th filestmontightstocccur re nsionseatad of alle. Time O_ n ( so rt ); space O_ n for the re aysirrt. Best-case O_ n. Key_operation is l is t_appe n d. Seque ntial ( in g re a cce |
| `Surface Bottleneck (V2)` | 0.125 | 0.1468 | ✗/✗/✗/✗ | f in d _ m is s in guses the ma the maticalsum for mula to f in d the m is s in g in teg er in [ 0.. n ] in O_ ntimeand O_1 space. Best-case O_ n ( mustsumall ). Key_operation is arithmetic. Single_pass ( sum ). Input not mutated. Empty l is t: expect ed - 0 = 0 = n * ( n + 1 ) |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0541 | 0.0741 | matimesesestimeartimesestimelestimprtiontimentimeleatimenstionseatimenseationtimelentionseleation is O_ n is O_ n is O_ n. Best-case is tion is tion is t |
| `Consistency Loss (V1)` | final (ep20) | 0.1915 | 0.2143 | f in d _ meno _ mentusesb in arysear ch re arysieandar ch igly. Time O_ log _ n ( st ); space O_1. Best-case O_1 when lementementen ( arg ). Key_operation is compar is on. Single_pass. Input not mutated. Empty l is t return sFalyelooricaly. Duplicate |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | mat _ sesesestititilesestititiontionstiontiontiontiontiontiontiontionstiontiontion, lentiontiontiontiontiontiontiontilestio |
| `No Consistency Loss (V1)` | final (ep20) | 0.125 | 0.1468 | f in d _ m is s in guses the ma the maticalsum for mula to f in d the m is s in g in teg er in [ 0.. n ] in O_ ntimeand O_1 space. Best-case O_ n ( mustsumall ). Key_operation is arithmetic. Single_pass ( sum ). Input not mutated. Empty l is t: expect ed - 0 = 0 = n * ( n + 1 ) |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | mat _ n _ sesestiontilesestiontilentiontiontiontiontiontiontilensentionentiontiontiontiontiontionesiontioneationtiontileatio |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.1304 | 0.1509 | f in d _ m in _ m is s in gusesa the ma the maticalsum for mula to f in d the m is s in g in teg er in [ 0.. n ] in O_ ntimeand O_1 space. Best-case O_ n ( muststsumall ). Key_operation is arithmetic. Single_pass ( sum ). Input not mutated. Empty l is t: expect ed - 0 = n * ( |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | mat _ nasesestiontiontestiontimpanstiontiontimestiontiontiontiontionstionstiontiontiontiontiontionstiontiontiontiontiontio |
| `Random Label (V1)` | final (ep20) | 0.2222 | 0.1951 | f in d _ mocomputesFibonaccccci re cursively wi th outmemoioization, lead in g to O_ 2 nexpontime ( n is O_ n space for the dproray. Best-case is O_1 for n < = 1. Thekey _ op er ation is arithmetic. Input is not mutated. |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.1081 | 0.1356 | meseseseseseseantimeseseartimparanararartimeandimeartimeantimeantimeatimpantitimeatimparsean is O_ n is O_ n. Best-case O_ n. Key_operation is O_ n. Input is O_ n. Input is tion is |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.0811 | 0.0811 | f in d _ mo _ ch elementsb in arysear ch eanaleat for argeangentsargetsiandcon in sitition, run in gun in O_ log _ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is compar |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.027 | 0.0444 | matimesesestimeatimenartimesentimeselleatimentimentiontimelenseationtionstiontimelesenstionseleationtionstion is O_ ntion is O_ n is |
| `Claims from Expl Only (V2)` | final (ep20) | 0.1569 | 0.1818 | f in d _ mentel _ o _ mentusesestf in ds the leng th filestmontightstocccur re nsionseatad of alle. Time O_ n ( so rt ); space O_ n for the re aysirrt. Best-case O_ n. Key_operation is l is t_appe n d. Seque ntial ( in g re a cce |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.027 | 0.0476 | mat _ sesesestititiontionstitiontiontiontiontiontiontiontiontiontionstiontiontiontiontitiontionstiontion, iontiontitillen, the |
| `Surface Bottleneck (V2)` | final (ep20) | 0.125 | 0.1468 | f in d _ m is s in guses the ma the maticalsum for mula to f in d the m is s in g in teg er in [ 0.. n ] in O_ ntimeand O_1 space. Best-case O_ n ( mustsumall ). Key_operation is arithmetic. Single_pass ( sum ). Input not mutated. Empty l is t: expect ed - 0 = 0 = n * ( n + 1 ) |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | <sep> n - 2 [ i ] <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> e ( n - 2 ] <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 4: `is_prime` (val idx 104)

Ground truth: time_complexity=`O_log_n`, space_complexity=`O_1`, algorithm_class=`math_computation`, correctness_status=`fully_correct`

**Reference explanation:** is_prime checks divisibility up to sqrt(n), giving O_log_n time (actually O(sqrt(n)), mapped to O_log_n for the experiment). Space is O_1. Best-case is O_1 for n < 2 or when the first divisor is found immediately. The key_operation is arithmetic (modulo). Not applicable for duplicates.

**Mismatched training explanation:** two_sum_sorted uses a two_pointer approach on a sorted list, converging from both ends in O_n time and O_1 space. Best-case is O_1 when the first pair matches. The key_operation is arithmetic (sum) and comparison. Input is not mutated. An empty or single-element list returns None. Duplicate values may produce one valid pair or None if no pair exists.

```python
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True
```

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1429 | 0.129 | ✗/✗/✗/✗ | f in d _ onaten _ o _ ch arat ed usesasasasl id in glevandandandcondicebes in g: mid = in O_ ntimeand O_1 space. Best-case is O_1 when lent is ars in con ei g th atenmat ch en ei g hb or is at ch aract er. Th |
| `No Consistency Loss (V1)` | 0.1731 | 0.1667 | ✗/✗/✗/✗ | f in d _ m in _ rotat ed usesb in arysear ch onarotat ed so rt ed l is t to f in d the m in imumelement in O_ log _ ntime. Space is O_1. Best-case same _ as _ worst. Thekey _ op er ation is compar is on |
| `Claim-Only Pooling (V1)` | 0.1364 | 0.1333 | ✗/✗/✗/✗ | fib _ no _ computesFibonacci re cursively wi th outmemoization, lead in g to O_ 2 nexponentialtime due to re peat ed subproblemcomputation. Space O_ log _ n for re c |
| `Random Label (V1)` | 0.1702 | 0.1538 | ✗/✗/✗/✗ | max _ productusesasl id in g wi th as in gle _ passokuntiv er aloop, runn in g in O_ ntimeand O_ n space. Best-case is O_1 when the first ch aract er is amat ch ed clos in gbracket er s. Thekey _ op er at |
| `No Claim→Claim Attn (V2)` | 0.1591 | 0.15 | ✗/✗/✗/✗ | fib _ no _ buggyhasan of ibug: i -1 san of nebloop for nestititition ) in steadconandcomputead of ites, gititition ). Time O_ n; space O_1. Best-case O_1 ( mextexlyscor in g er ely ). Key_operation |
| `Claims from Expl Only (V2)` | 0.1136 | 0.1235 | ✗/✗/✗/✗ | f in d _ notationatesbuildsarotanationationbuitionation in s in O_ log _ ntimeand O_1 space. Best-case is O_1 when n=0 p = = 0 s in glementemus in cely. Thekey _ op er ation is ari th met |
| `Surface Bottleneck (V2)` | 0.1731 | 0.1667 | ✗/✗/✗/✗ | f in d _ m in _ rotat ed usesb in arysear ch onarotat ed so rt ed l is t to f in d the m in imumelement in O_ log _ ntime. Space is O_1. Best-case same _ as _ worst. Thekey _ op er ation is compar is on |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | ct _ contimestiontioncontiontionsestiontiontiontiontiontiontiontiontionstiontiontiontiontiontionalesstiontiontiontionsssss |
| `Consistency Loss (V1)` | final (ep20) | 0.1429 | 0.129 | f in d _ onaten _ o _ ch arat ed usesasasasl id in glevandandandcondicebes in g: mid = in O_ ntimeand O_1 space. Best-case is O_1 when lent is ars in con ei g th atenmat ch en ei g hb or is at ch aract er. Th |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | con _ pumump re cyumumumplyctitimesesesestivestiontivestiontiontiontiontionstiontiontiontiontiontiontiontiontiontiontionstion |
| `No Consistency Loss (V1)` | final (ep20) | 0.1731 | 0.1667 | f in d _ m in _ rotat ed usesb in arysear ch onarotat ed so rt ed l is t to f in d the m in imumelement in O_ log _ ntime. Space is O_1. Best-case same _ as _ worst. Thekey _ op er ation is compar is on |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | ct _ ple _ prtitiontionsesestionstilestionstiontiontionstiontionstionstionstiontionationtiontionationtioneationtiontionstion |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.1364 | 0.1333 | fib _ no _ computesFibonacci re cursively wi th outmemoization, lead in g to O_ 2 nexponentialtime due to re peat ed subproblemcomputation. Space O_ log _ n for re c |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | mut _ pume _ pontimesesesestivestitivestiontiontiontionstiondstiontiontionstiontionstiontiontionstiontionstiontiontiontionst |
| `Random Label (V1)` | final (ep20) | 0.1702 | 0.1538 | max _ productusesasl id in g wi th as in gle _ passokuntiv er aloop, runn in g in O_ ntimeand O_ n space. Best-case is O_1 when the first ch aract er is amat ch ed clos in gbracket er s. Thekey _ op er at |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | ct _ contiontiontiontiontiontionsestiontiontiontiontiontiontiontiontiontiontiontiontiontionstionationtiontiontiontionstion |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.1591 | 0.15 | fib _ no _ buggyhasan of ibug: i -1 san of nebloop for nestititition ) in steadconandcomputead of ites, gititition ). Time O_ n; space O_1. Best-case O_1 ( mextexlyscor in g er ely ). Key_operation |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | ct _ contimestiontionsesestiontilestiontiontiontiontiontiontionsestiontionstiontiontionstionstionstionelesstionstiontionst |
| `Claims from Expl Only (V2)` | final (ep20) | 0.1136 | 0.1235 | f in d _ notationatesbuildsarotanationationbuitionation in s in O_ log _ ntimeand O_1 space. Best-case is O_1 when n=0 p = = 0 s in glementemus in cely. Thekey _ op er ation is ari th met |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | con _ pumump re cyumumumplyctitimesesesestivestiontivestiontiontiontiontionstiontiontiontiontiontiontiontiontiontiontionstion |
| `Surface Bottleneck (V2)` | final (ep20) | 0.1731 | 0.1667 | f in d _ m in _ rotat ed usesb in arysear ch onarotat ed so rt ed l is t to f in d the m in imumelement in O_ log _ ntime. Space is O_1. Best-case same _ as _ worst. Thekey _ op er ation is compar is on |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 5: `find_missing_number` (val idx 365)

Ground truth: time_complexity=`O_n`, space_complexity=`O_1`, algorithm_class=`searching`, correctness_status=`fully_correct`

**Reference explanation:** find_missing uses the mathematical sum formula to find the missing integer in [0..n] in O_n time and O_1 space. Best-case O_n (must sum all). Key_operation is arithmetic. Single_pass (sum). Input not mutated. Empty list: expected - 0 = n*(n+1)//2 which is wrong if n != 0, making it a missing_edge_case for edge inputs. Duplicate values give wrong result — undefined_behavior.

**Mismatched training explanation:** flatten recursively traverses arbitrarily nested lists, visiting each element once in O_n total time with O_n recursion_stack + result space. Best-case is O_n. The key_operation is list_append (extend). Input is not mutated. An empty list returns an empty list. Duplicate values are all preserved in the flattened output.

```python
def find_missing(lst, n):
    expected = n * (n + 1) // 2
    return expected - sum(lst)

```
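The two edge cases named in the reference explanation (empty input and duplicate values) can be checked directly. This is a small sketch that reuses the sample function unchanged; the input lists are made up for illustration.

```python
def find_missing(lst, n):
    # Same body as the sample above.
    expected = n * (n + 1) // 2
    return expected - sum(lst)


if __name__ == "__main__":
    # Normal case: 2 is missing from [0..3].
    print(find_missing([0, 1, 3], 3))      # 2
    # Empty list: returns the full expected sum, which is not a value in [0..3].
    print(find_missing([], 3))             # 6
    # Duplicate value: 2 is still missing, but the extra 1 shifts the result.
    print(find_missing([0, 1, 1, 3], 3))   # 1
```

Neither edge case raises an error; both silently return a wrong value, which matches the `missing_edge_case` and `undefined_behavior` wording in the reference.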

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims (time / space / algorithm / correctness) | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1 | 0.102 | ✗/✗/✗/✗ | m er ge _ so rt _ v 2 hasasasawrondig _ conbug: the in t er esteopateval so rtiouts, so rt in is tion in gation in to mpution in to meal so rtimestution in is hi futestealldupu |
| `No Consistency Loss (V1)` | 0.15 | 0.1284 | ✗/✗/✗/✗ | max _ subarrayusesKadane ' sg re ed yalgori th m wi th as in gle _ passov er the l is t, a ch iev in g O_ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is ari th metic ( maxcompar is on ) |
| `Claim-Only Pooling (V1)` | 0.1667 | 0.1197 | ✗/✗/✗/✗ | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursion _ stac k. Best-case O_ log _ nonabalanc ed t re e. Key_operation is compar is on ( ma x) |
| `Random Label (V1)` | 0.2333 | 0.1818 | ✗/✗/✗/✗ | max _ bugyusesast for O_1 complementylookuplicatimes in g O_ n to tand O_ n space. Best-case is O_ ns in celelementsmustbev is ited. Thekey _ op er ation is arithmetic. Input is not mutated. Anemptyl is t return sanem |
| `No Claim→Claim Attn (V2)` | 0.1333 | 0.1224 | ✗/✗/✗/✗ | max _ buggyusesasastr in g _ conumputesasas in g wi th ase _ wi th as in gle _ p for O_1 time _ pasulemendoopsstrigasubaystrray. Time O_ n; space O_ n. Key_operation is compar is on. Seque ntiala |
| `Claims from Expl Only (V2)` | 0.2333 | 0.193 | ✗/✗/✗/✗ | m er ge _ so rt ed usesaditest to m - upaticaly in te so rtim using aheshapon in O_ n _ log _ ntimeap er and O_1 space. Best-case is O_ n. Thekey _ op er ation is ari th metic ( m in t er m ). Input is not mutated. Anempt |
| `Surface Bottleneck (V2)` | 0.15 | 0.1284 | ✗/✗/✗/✗ | max _ subarrayusesKadane ' sg re ed yalgori th m wi th as in gle _ passov er the l is t, a ch iev in g O_ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is ari th metic ( maxcompar is on ) |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | m _ stestititimstestestestilestiontiontestionsssstelestitiontititionstionstionsstiontitilestitiontitionstionsstionestionti |
| `Consistency Loss (V1)` | final (ep20) | 0.1 | 0.102 | m er ge _ so rt _ v 2 hasasasawrondig _ conbug: the in t er esteopateval so rtiouts, so rt in is tion in gation in to mpution in to meal so rtimestution in is hi futestealldupu |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0333 | 0.0541 | m _ so rt _ so rt _ so rtititimumumstimstimumbumstimestim in gesestimestiontimentimestimelestimentim in gimestimentiontimentiontimentimele |
| `No Consistency Loss (V1)` | final (ep20) | 0.15 | 0.1284 | max _ subarrayusesKadane ' sg re ed yalgori th m wi th as in gle _ passov er the l is t, a ch iev in g O_ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is ari th metic ( maxcompar is on ) |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | m _ so rt _ so rtimumbumstimbumestimestimestimestiontimesentimentiontimentimentionelentionentilentimentiontionentiontiationtione |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.1667 | 0.1197 | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursion _ stac k. Best-case O_ log _ nonabalanc ed t re e. Key_operation is compar is on ( ma x) |
| `Random Label (V1)` | initial (ep5) | 0.0167 | 0.0274 | m er atim er g _ n _ ntim er sestimestim er atimestimestimestimesentiondimestimentimestiontimestiontimestionstionstiontiontivelenstimens |
| `Random Label (V1)` | final (ep20) | 0.2333 | 0.1818 | max _ bugyusesast for O_1 complementylookuplicatimes in g O_ n to tand O_ n space. Best-case is O_ ns in celelementsmustbev is ited. Thekey _ op er ation is arithmetic. Input is not mutated. Anemptyl is t return sanem |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | matesestitimestitestitestititesesentimestitiontitestelelenentitionesestionelenenenentitiontiontiontionelelelensssstition |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.1333 | 0.1224 | max _ buggyusesasastr in g _ conumputesasas in g wi th ase _ wi th as in gle _ p for O_1 time _ pasulemendoopsstrigasubaystrray. Time O_ n; space O_ n. Key_operation is compar is on. Seque ntiala |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | m _ stestititititestesstestelentiontestititionssstelestitionstionstiontiontionssstionstilestititionstiontiontitionestionel |
| `Claims from Expl Only (V2)` | final (ep20) | 0.2333 | 0.193 | m er ge _ so rt ed usesaditest to m - upaticaly in te so rtim using aheshapon in O_ n _ log _ ntimeap er and O_1 space. Best-case is O_ n. Thekey _ op er ation is ari th metic ( m in t er m ). Input is not mutated. Anempt |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0333 | 0.0541 | m _ so rt _ so rt _ so rtititimumumstimstimumbumstimestim in gesestimestiontimentimestimelestimentim in gimestimentiontimentiontimentimele |
| `Surface Bottleneck (V2)` | final (ep20) | 0.15 | 0.1284 | max _ subarrayusesKadane ' sg re ed yalgori th m wi th as in gle _ passov er the l is t, a ch iev in g O_ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is ari th metic ( maxcompar is on ) |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | <sep> xpexpexpexpexpexpexp <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> e <sep> <sep> <sep> <sep> <sep> e <sep> <sep> e ( ( lst <sep> <sep> </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 6: `binary_search` (val idx 578)

Ground truth: time_complexity=`O_log_n`, space_complexity=`O_1`, algorithm_class=`searching`, correctness_status=`fully_correct`

**Reference explanation:** binary_search repeatedly halves the search interval, achieving O_log_n worst-case time with O_1 space. Best-case is O_1 when the target is at the midpoint on the first probe. It uses a two_pointer (lo/hi) access pattern with comparison as the key_operation. The function does not mutate its input. An empty list is handled correctly, returning -1.

**Mismatched training explanation:** max_profit_buggy updates min_price and then computes profit, which is actually the correct order (min updated first, then profit). This is a wrong_condition at the comment level — the comment claims wrong order but the code is functionally correct for this specific sequence. However, labeled wrong_condition due to the misleading logic description. Time O_n; space O_1. Best-case O_1. Input not mutated.

```python
def binary_search(lst, target):
    lo, hi = 0, len(lst) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if lst[mid] == target:
            return mid
        elif lst[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

```
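The best-case and worst-case claims in the reference explanation can be illustrated by counting probes. The sketch below is an instrumented copy of the sample function; `binary_search_probes` is a hypothetical name, and the 1024-element list is an arbitrary choice.

```python
def binary_search_probes(lst, target):
    """Return (index, probes), where probes counts midpoint comparisons.

    Hypothetical instrumented copy of binary_search, for illustration only.
    """
    lo, hi = 0, len(lst) - 1
    probes = 0
    while lo <= hi:
        probes += 1
        mid = (lo + hi) // 2
        if lst[mid] == target:
            return mid, probes
        elif lst[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes


if __name__ == "__main__":
    data = list(range(1024))
    # Best case: the target sits at the first midpoint.
    print(binary_search_probes(data, 511))
    # Worst case over present and absent targets, next to floor(log2(n)) + 1.
    worst = max(binary_search_probes(data, t)[1] for t in range(-1, 1025))
    print(worst, len(data).bit_length())
    # Empty list: the loop body never runs.
    print(binary_search_probes([], 5))
```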

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims (time / space / algorithm / correctness) | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.0741 | 0.0566 | ✗/✗/✗/✗ | max _ de p th rotestwo _ po in t er at er ations in t er aponaronations in t er apparparonation wi th re pel in t er sundatrp in g in gl in glasss. Time O_ n; space O_1. Best-case O_ n. Key_operation is arithmetic. Seque ntiala cce ss. Input |
| `No Consistency Loss (V1)` | 0.1852 | 0.1414 | ✗/✗/✗/✗ | max _ pr of it _ buggyusesastrict < compar is on in stead of < = in the mestep. Time O_ n _ log _ n ( ara ch ecollang th ); space O_1. Best-case O_1 when firstelementement ed lement. Key_operation is compar is on |
| `Claim-Only Pooling (V1)` | 0.0926 | 0.0594 | ✗/✗/✗/✗ | max _ de productttion ch ecks if al is tcanbepartition ed in to twoequal - sumsubsets using a DP set to track reachable sums. Time O_ n 2 ( nelements × up to n / n / 2 sums ); space O_ n for the set. Best-case |
| `Random Label (V1)` | 0.0556 | 0.0779 | ✗/✗/✗/✗ | max _ product _ buitsesar ch ot ed in to computearod of sandd the leveng th of the longestcontiguonallloorayscates ( std is gantigallllaroransstrangatitik |
| `No Claim→Claim Attn (V2)` | 0.1852 | 0.1308 | ✗/✗/✗/✗ | max _ de co de putes the numb er of at re ed y using at re cursivelyvely, v is it in g O_ ntime ( no de xe re a ch ecursit ). Spack ). Best-case is O_1 when letyog th mentiar is on. Thekey _ op er ation is arithmetic. Input |
| `Claims from Expl Only (V2)` | 0.1667 | 0.16 | ✗/✗/✗/✗ | max _ de p th rotestwo _ po in t er at er ationses using a 2 D DP re peardabenenelements, rubl in g in O_ log _ n 2 timeand O_1 space. Best-case is O_1 when lementiariatirm ed. Thekey _ op er ation is a |
| `Surface Bottleneck (V2)` | 0.1852 | 0.1429 | ✗/✗/✗/✗ | max _ pr of it _ buggyusesastrict < compar is on in stead of < = in the mestep. Time O_ n _ log _ n ( ara ch ecollang th ); space O_1. Best-case O_1 when firstelementementelement ). Key_operation is compar is o |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | masararartictitititititimesesestitionstitionstiontionstionstiontionstionstionstionstionstionstionstimpationstionstionsti |
| `Consistency Loss (V1)` | final (ep20) | 0.0741 | 0.0566 | max _ de p th rotestwo _ po in t er at er ations in t er aponaronations in t er apparparonation wi th re pel in t er sundatrp in g in gl in glasss. Time O_ n; space O_1. Best-case O_ n. Key_operation is arithmetic. Seque ntiala cce ss. Input |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | m _ so rt _ so rt _ pontiontiontionsestionstionstitionstiontitiontionstionstionstiontionstionstiontiontionstionstitionstiontitiont |
| `No Consistency Loss (V1)` | final (ep20) | 0.1852 | 0.1414 | max _ pr of it _ buggyusesastrict < compar is on in stead of < = in the mestep. Time O_ n _ log _ n ( ara ch ecollang th ); space O_1. Best-case O_1 when firstelementement ed lement. Key_operation is compar is on |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | m _ so rt _ so rt _ lestionticestimestimentiontimentionstiontiontionstimentionstiontiontiontiontionstiontiontitionstionstionstiont |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.0926 | 0.0594 | max _ de productttion ch ecks if al is tcanbepartition ed in to twoequal - sumsubsets using a DP set to track reachable sums. Time O_ n 2 ( nelements × up to n / n / 2 sums ); space O_ n for the set. Best-case |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | m _ so rom er artimesestiontimestimestiontiontiontionstiontiontionstiontionstionstiondstiontiontiontiontiontionstionstionstiont |
| `Random Label (V1)` | final (ep20) | 0.0556 | 0.0779 | max _ product _ buitsesar ch ot ed in to computearod of sandd the leveng th of the longestcontiguonallloorayscates ( std is gantigallllaroransstrangatitik |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | masarackesesesestitititimestitititionsestiontionstitiontionstiontionstionstionestionstionstionstitionselationstionstions |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.1852 | 0.1308 | max _ de co de putes the numb er of at re ed y using at re cursivelyvely, v is it in g O_ ntime ( no de xe re a ch ecursit ). Spack ). Best-case is O_1 when letyog th mentiar is on. Thekey _ op er ation is arithmetic. Input |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | maseleseselestictitititimestictimesesentititionstiontiontionstionstionstiontionstionstionstionstionstionstionstionstiont |
| `Claims from Expl Only (V2)` | final (ep20) | 0.1667 | 0.16 | max _ de p th rotestwo _ po in t er at er ationses using a 2 D DP re peardabenenelements, rubl in g in O_ log _ n 2 timeand O_1 space. Best-case is O_1 when lementiariatirm ed. Thekey _ op er ation is a |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | m _ so rt _ so rt _ pontiontiontionsestionstionstitionstiontitiontionstionstionstiontionstionstionstiontionstitionstionstionstiont |
| `Surface Bottleneck (V2)` | final (ep20) | 0.1852 | 0.1429 | max _ pr of it _ buggyusesastrict < compar is on in stead of < = in the mestep. Time O_ n _ log _ n ( ara ch ecollang th ); space O_1. Best-case O_1 when firstelementementelement ). Key_operation is compar is o |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | = mid - 1 e: mid - 1 else: hi = mid - 1 return -1 <sep> <sep> mid <sep> mid else: hi = mid <sep> mid e <sep> <sep> <sep> mid e: lo <sep> -1 <sep> mid <sep> <sep> lo <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> e <sep> <sep> e <sep> <sep> <sep> <sep> <sep> <sep> <sep> </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 7: `binary_search_recursive` (val idx 1782)

Ground truth: time_complexity=`O_log_n`, space_complexity=`O_log_n`, algorithm_class=`searching`, correctness_status=`fully_correct`

**Reference explanation:** binary_search_rec implements binary search recursively, halving the search range each call and reaching O_log_n time in the worst case. The recursion stack uses O_log_n space proportional to the number of recursive calls. Best-case is O_log_n when the target is found at the deepest half-interval (annotation: O_log_n for consistency). Actually best-case can be O_1 if found at first mid; annotated O_log_n for this template to ensure this bin is covered. The key_operation is comparison. Input is not mutated. An empty list returns -1 correctly. Not applicable for duplicates since only the first match index is returned.

**Mismatched training explanation:** string_reverse iterates over each character and prepends it to the result string, giving O_n time and O_n space due to repeated string_concat allocations. Best-case is still O_n since all characters must be visited. The access_pattern is sequential and the function does not mutate its input. An empty string safely produces an empty result. Duplicate characters are all preserved in their reversed positions.

```python
def binary_search_rec(lst, target, lo=0, hi=None):
    if hi is None:
        hi = len(lst) - 1
    if lo > hi:
        return -1
    mid = (lo + hi) // 2
    if lst[mid] == target:
        return mid
    elif lst[mid] < target:
        return binary_search_rec(lst, target, mid + 1, hi)
    else:
        return binary_search_rec(lst, target, lo, mid - 1)

```
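The `O_log_n` space label comes from the recursion stack, and the reference explanation concedes that the true best case is a single call. A minimal sketch that tracks call depth shows both points. The `depth` parameter and the name `binary_search_rec_depth` are additions made for illustration and are not in the dataset function.

```python
def binary_search_rec_depth(lst, target, lo=0, hi=None, depth=1):
    """Return (index, depth), where depth is the number of stacked calls.

    Hypothetical instrumented copy of binary_search_rec, for illustration only.
    """
    if hi is None:
        hi = len(lst) - 1
    if lo > hi:
        return -1, depth
    mid = (lo + hi) // 2
    if lst[mid] == target:
        return mid, depth
    elif lst[mid] < target:
        return binary_search_rec_depth(lst, target, mid + 1, hi, depth + 1)
    else:
        return binary_search_rec_depth(lst, target, lo, mid - 1, depth + 1)


if __name__ == "__main__":
    data = list(range(1024))
    # Target at the first midpoint: one frame only.
    print(binary_search_rec_depth(data, 511))
    # Absent target above every element: deepest recursion for this list.
    print(binary_search_rec_depth(data, 1024))
    # Empty list returns -1 from the first call.
    print(binary_search_rec_depth([], 3))
```

Because the sample function is not tail-call optimized in CPython, each halving step keeps a frame alive, so the stack grows with the logarithm of the list length.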

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims (time / space / algorithm / correctness) | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1263 | 0.1481 | ✗/✗/✗/✗ | max _ protition ch ecksarot er ationbug: the looptionstionstead of 1. Time O_ log _ n; space O_1. Best-case O_1 when n < = 0 exp = 0. Key_operation is arithmetic. Single_pass. Input not mutated. n = 0 return s |
| `No Consistency Loss (V1)` | 0.1053 | 0.1007 | ✗/✗/✗/✗ | max _ pr of it _ buggy in it er ations in to computearod of it, w hi ch it er fact er fion ( s, ld in gizarsten ). Time O_ n; space O_1. Best-case O_1 when n < = 1. Key_operation is |
| `Claim-Only Pooling (V1)` | 0.0947 | 0.08 | ✗/✗/✗/✗ | max _ de pr of it _ coolookup er for molatives in alypicsearodicit using the re cursively. Time O_ log _ n; space O_1. Best-case O_1 when firstemm ed iately. |
| `Random Label (V1)` | 0.1474 | 0.1818 | ✗/✓/✗/✗ | m in _ plefcomputes the m in imumnumb er of cumonativ in g in O_ ntime ( dowrs, n ) wi th O_1 space. Best-case O_1 when n = 1. Key_operation is arithmetic. Single_pass. Input not mutated. n < = 1 return s 1. Not applicable for duplicates. |
| `No Claim→Claim Attn (V2)` | 0.1368 | 0.1449 | ✗/✗/✗/✗ | fit _ potionationbug: mid hasan of f _ by bug: the in n er ngent er ationstionat er ation. Time O_ n; space O_1. Best-case O_1 when n = 0. Key_operation is arithmetic. Single_pass. Input not mutated. n=0 return s 0, 0 s |
| `Claims from Expl Only (V2)` | 0.0737 | 0.0976 | ✗/✗/✗/✗ | max _ protition ch ecksawhas h_ mapronditionbug: the loooopccctencour re nct er sestead of 1. Time O_ n; space O_1. Best-case O_ log _ n. Key_operation is co |
| `Surface Bottleneck (V2)` | 0.0632 | 0.0719 | ✗/✗/✗/✗ | max _ pr of it _ buggy in it er ations in to computearod of it, w hi ch it er fact er fion ( s, gib ( n -1 ) in steg er ead of fib (n) for n > = 1. Time |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | ckealonticesesestiontionticacacacestiontionacacesticestiontiontiontiontionsestiontiontiona |
| `Consistency Loss (V1)` | final (ep20) | 0.1263 | 0.1481 | max _ protition ch ecksarot er ationbug: the looptionstionstead of 1. Time O_ log _ n; space O_1. Best-case O_1 when n < = 0 exp = 0. Key_operation is arithmetic. Single_pass. Input not mutated. n = 0 return s |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | ck _ ck _ ntiontiontionstiontiontiontionstiontionstionstitionstionstiontionstiontitionstiontio |
| `No Consistency Loss (V1)` | final (ep20) | 0.1053 | 0.1007 | max _ pr of it _ buggy in it er ations in to computearod of it, w hi ch it er fact er fion ( s, ld in gizarsten ). Time O_ n; space O_1. Best-case O_1 when n < = 1. Key_operation is |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | ckestiontiontiontiontiontiontiontionstiontiontionstionstiontiontiontionestiontiontiontions |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.0947 | 0.08 | max _ de pr of it _ coolookup er for molatives in alypicsearodicit using the re cursively. Time O_ log _ n; space O_1. Best-case O_1 when firstemm ed iately. |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | mbutiontiontiontiontionstiontiondstiontionstiontiontiondstiontiontiontiontiontiontiontions |
| `Random Label (V1)` | final (ep20) | 0.1474 | 0.1818 | m in _ plefcomputes the m in imumnumb er of cumonativ in g in O_ ntime ( dowrs, n ) wi th O_1 space. Best-case O_1 when n = 1. Key_operation is arithmetic. Single_pass. Input not mutated. n < = 1 return s 1. Not applicable for duplicates. |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | ckeandickestionacackesestionackestiontiondiondiontiondiondionactionandiondicestiondiondion |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.1368 | 0.1449 | fit _ potionationbug: mid hasan of f _ by bug: the in n er ngent er ationstionat er ation. Time O_ n; space O_1. Best-case O_1 when n = 0. Key_operation is arithmetic. Single_pass. Input not mutated. n=0 return s 0, 0 s |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | ckesticestickestiontiontiontiontioncantionsesestionacestiontiontionandiontiontiontiontiona |
| `Claims from Expl Only (V2)` | final (ep20) | 0.0737 | 0.0976 | max _ protition ch ecksawhas h_ mapronditionbug: the loooopccctencour re nct er sestead of 1. Time O_ n; space O_1. Best-case O_ log _ n. Key_operation is co |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | ck _ ck _ ntiontiontionstiontiontiontionstiontionstionstionstiontionstiontiontiontionstionstio |
| `Surface Bottleneck (V2)` | final (ep20) | 0.0632 | 0.0719 | max _ pr of it _ buggy in it er ations in to computearod of it, w hi ch it er fact er fion ( s, gib ( n -1 ) in steg er ead of fib (n) for n > = 1. Time |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0105 | 0.0121 | <sep> -1 <sep> = ( lst, target, mid - 1 return -1 <sep> = mid - 1 ) <sep> elo = mid + 1, mid <sep> mid e: hi <sep> lo = mid <sep> mid <sep> elo = mid <sep> mid e <sep> mid <sep> mid e <sep> <sep> mid <sep> mid <sep> <sep> mid elo = mid <sep> elo = mid e: <sep> <sep> <sep> mid <sep> mid el |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 8: `rotate_array_k` (val idx 966)

Ground truth: time_complexity=`O_n`, space_complexity=`O_n`, algorithm_class=`sorting`, correctness_status=`missing_edge_case`

**Reference explanation:** rotate shifts array elements right by k positions using Python slice concatenation. Time O_n; space O_n for the new list. Best-case O_n. Key_operation is list_append (slice concat). Sequential access. Input is mutated via slice assignment. Empty array: k%0 raises — missing_edge_case. Duplicate elements are all rotated correctly.

**Mismatched training explanation:** make_union_find builds a Union-Find structure with path compression (halving) and union by rank, giving near O_1 amortized per operation. Space O_n. Best-case O_1. Key_operation is assignment (parent updates). Random_access to arrays. Input (parent list) is mutated by find/union. n=0 returns valid empty find/union functions. Duplicate union calls return False (already in same set).

```python
def rotate(nums, k):
    n = len(nums)
    k %= n
    nums[:] = nums[-k:] + nums[:-k]
```
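
The `missing_edge_case` label can be reproduced directly. The sketch below copies the dataset function unchanged and compares it with `rotate_guarded`, a hypothetical fixed variant that is not part of the dataset; `raises_on_empty` is likewise an illustrative helper.

```python
def rotate(nums, k):
    # Dataset sample, copied unchanged.
    n = len(nums)
    k %= n  # raises ZeroDivisionError when nums is empty (n == 0)
    nums[:] = nums[-k:] + nums[:-k]


def rotate_guarded(nums, k):
    # Hypothetical variant: return early so k % 0 is never evaluated.
    n = len(nums)
    if n == 0:
        return
    k %= n
    nums[:] = nums[-k:] + nums[:-k]


def raises_on_empty(fn):
    # True if calling fn on an empty list raises ZeroDivisionError.
    try:
        fn([], 1)
    except ZeroDivisionError:
        return True
    return False


nums = [1, 2, 3, 4, 5]
rotate(nums, 2)
print(nums)                             # [4, 5, 1, 2, 3]
print(raises_on_empty(rotate))          # True
print(raises_on_empty(rotate_guarded))  # False
```

The early return is only one possible fix; it is shown to make the labelled edge case concrete, not to suggest how the dataset defines a corrected version.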

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims (time/space/algorithm/correctness) | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1277 | 0.1299 | ✗/✗/✗/✗ | str in g _ in g er parosesasesast in glegopignscontignsigcontiveconstecontionstr in gcateditediteationationation. Time O_ n; space O_1. Best-case O_1 when b is str in g er. Key_operation is arithmetic. Single_pass. |
| `No Consistency Loss (V1)` | 0.1277 | 0.1176 | ✗/✗/✗/✗ | str in g _ comp re ssionrunsas in glepasscount in gconsecutiveduplicate ch arsandbuildsacomp re ss ed str in g. Returns the short er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is s |
| `Claim-Only Pooling (V1)` | 0.1277 | 0.1163 | ✗/✗/✗/✗ | str in g _ comp re sionrunsas in glepassscount in gconsecutiveduplicate ch arsandbuildsacomp re sss ed str in g. Returns the t er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is str in |
| `Random Label (V1)` | 0.1277 | 0.1111 | ✗/✗/✗/✗ | str in g _ comp re sionrugyas in glepasss in gcounticativeduplication ch arsacomp re sssstr in g. Returns the short er t er of the t er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is str in g _ conca |
| `No Claim→Claim Attn (V2)` | 0.1702 | 0.1379 | ✗/✓/✗/✗ | count _ pit er ations the lemumb er of n - its in givelypitsemeats using O_1 space DP. Best-case O_1 ( olexp ). Key_operation is arithmetic. Single_pass while loop. Input not mutated. n=0 return s 0. Not applicable for duplicates. |
| `Claims from Expl Only (V2)` | 0.1731 | 0.1414 | ✗/✗/✗/✗ | b in gear ch estran in g _ comp re firssstr in g wi th lead the contesteat in geat in gpationasseand of the firstir in g. Time O_ n; space O_1. Best-case O_ n. Key_operation is compar is on. Seque ntia cce s. Input not mutated. Empty str in g return |
| `Surface Bottleneck (V2)` | 0.1277 | 0.1176 | ✗/✗/✗/✗ | str in g _ comp re ssionrunsas in glepasscount in gconsecutiveduplicate ch arsandbuildsacomp re ss ed str in g. Returns the short er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is s |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | mstrtrtratratrstrantrangelllelestrangestrtimstrangentigestrtrangestrstrtrstrstrallestimarstrtrtrtrstrstrtrstrangengeleac |
| `Consistency Loss (V1)` | final (ep20) | 0.1277 | 0.1299 | str in g _ in g er parosesasesast in glegopignscontignsigcontiveconstecontionstr in gcateditediteationationation. Time O_ n; space O_1. Best-case O_1 when b is str in g er. Key_operation is arithmetic. Single_pass. |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | m _ p re fit _ sesumsestrationtentestiontimstentiontiontixpationtiontiontixpationtixiontiontimenstix, lentiontix, lentiontitixpat |
| `No Consistency Loss (V1)` | final (ep20) | 0.1277 | 0.1176 | str in g _ comp re ssionrunsas in glepasscount in gconsecutiveduplicate ch arsandbuildsacomp re ss ed str in g. Returns the short er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is s |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0213 | 0.0333 | m _ sumstratesentrantesententixpstims the ntentimstiontentiatix, ntix, ntims, the lentixpatimentiationtix, lentix, ix, the lentiontiatiati |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.1277 | 0.1163 | str in g _ comp re sionrunsas in glepassscount in gconsecutiveduplicate ch arsandbuildsacomp re sss ed str in g. Returns the t er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is str in |
| `Random Label (V1)` | initial (ep5) | 0.0213 | 0.0351 | m _ sesumsararontratrarontestesteararondsteararararargestix, gestimatix, nd the lestiontimatixix, ntixparararatixixpatix, ndix, nti |
| `Random Label (V1)` | final (ep20) | 0.1277 | 0.1111 | str in g _ comp re sionrugyas in glepasss in gcounticativeduplication ch arsacomp re sssstr in g. Returns the short er t er of the t er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is str in g _ conca |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0213 | 0.0385 | mstratimatestratratraratims the lleatims the maratimalatimatiarargengengengengeatixiatiximatiatiatiximendimenstiximenatiallatixi |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.1702 | 0.1379 | count _ pit er ations the lemumb er of n - its in givelypitsemeats using O_1 space DP. Best-case O_1 ( olexp ). Key_operation is arithmetic. Single_pass while loop. Input not mutated. n=0 return s 0. Not applicable for duplicates. |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | mstrtrtratratrtratrstixpalllelestrtratengestixtrantrtigengestixpatrstirtiongenstionstimstixpartrtrstrstrtrstiacangellele |
| `Claims from Expl Only (V2)` | final (ep20) | 0.1731 | 0.1414 | b in gear ch estran in g _ comp re firssstr in g wi th lead the contesteat in geat in gpationasseand of the firstir in g. Time O_ n; space O_1. Best-case O_ n. Key_operation is compar is on. Seque ntia cce s. Input not mutated. Empty str in g return |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | m _ p re fit _ sesumsestrationtentestiontimstentiontiontixpationtiontiontixpationtixiontiontimenstix, lentiontix, lentiontitixpat |
| `Surface Bottleneck (V2)` | final (ep20) | 0.1277 | 0.1176 | str in g _ comp re ssionrunsas in glepasscount in gconsecutiveduplicate ch arsandbuildsacomp re ss ed str in g. Returns the short er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is s |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0185 | 0.0198 | numsk: - k - k, numsums <sep> k, nums [: - k - k, nums [: ] <sep> numsumsk - k: - k, k, k, numsums <sep> numsums <sep> nums <sep> k, nums <sep> k: k, nums [: - k <sep> k: re s <sep> k <sep> <sep> <sep> k, k ( numsums <sep> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 9: `generate_parentheses` (val idx 2677)

Ground truth: time_complexity=`O_2n`, space_complexity=`O_n2`, algorithm_class=`math_computation`, correctness_status=`fully_correct`

**Reference explanation:** generate_parens uses backtracking with recursive calls to enumerate all valid parenthesis strings of length 2n. Time O_2n (Catalan number of results); space O_n2 for result plus O_n recursion_stack. Best-case same_as_worst. Key_operation is string_concat. Iterative_with_recursion. Input not mutated. n=0 returns [''] via base case. Not applicable for duplicates.

**Mismatched training explanation:** two_sum_sorted uses a two_pointer approach on a sorted list, converging from both ends in O_n time and O_1 space. Best-case is O_1 when the first pair matches. The key_operation is arithmetic (sum) and comparison. Input is not mutated. An empty or single-element list returns None. Duplicate values may produce one valid pair or None if no pair exists.

```python
def generate_parens(n):
    result = []
    def backtrack(s, open, close):
        if len(s) == 2 * n:
            result.append(s)
            return
        if open < n:
            backtrack(s + '(', open + 1, close)
        if close < open:
            backtrack(s + ')', open, close + 1)
    backtrack('', 0, 0)
    return result
```
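
The reference explanation ties the output size to the Catalan numbers and states that `n=0` returns `['']`. Both can be checked with a short sketch; `catalan` is a helper introduced here for illustration and uses the closed form C(2n, n) / (n + 1).

```python
from math import comb


def generate_parens(n):
    # Dataset sample, copied unchanged (note that `open` shadows the builtin).
    result = []
    def backtrack(s, open, close):
        if len(s) == 2 * n:
            result.append(s)
            return
        if open < n:
            backtrack(s + '(', open + 1, close)
        if close < open:
            backtrack(s + ')', open, close + 1)
    backtrack('', 0, 0)
    return result


def catalan(n):
    # n-th Catalan number; the division is always exact.
    return comb(2 * n, n) // (n + 1)


for n in range(6):
    # The number of generated strings matches the n-th Catalan number.
    assert len(generate_parens(n)) == catalan(n)

print(generate_parens(0))  # ['']
print(generate_parens(2))  # ['(())', '()()']
```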

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims (time/space/algorithm/correctness) | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.0638 | 0.0789 | ✗/✗/✗/✗ | max _ de p th re cursivesaly wi th re esursivelynduclives, ch de g to de g O_ ntimean space. Best-case is O_ n. Thekey _ op er ati |
| `No Consistency Loss (V1)` | 0.1277 | 0.1412 | ✗/✗/✗/✗ | max _ are ausestwopo in t er sconv er g in gfrombo th ends to f in d the conta in er wi th the mostwat er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is |
| `Claim-Only Pooling (V1)` | 0.1064 | 0.0952 | ✗/✗/✗/✗ | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursi |
| `Random Label (V1)` | 0.0213 | 0.0294 | ✗/✗/✗/✗ | fib _ no _ mocomputesFibonaccci re cursively wi th outmoi re sion - lemoization, lead in g to O_ 2 nexp |
| `No Claim→Claim Attn (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | f in d _ re cursivesang _ condundseb in g: the f in aronglong th stitionat er steat ch aract er alpact er ss in g |
| `Claims from Expl Only (V2)` | 0.0851 | 0.0964 | ✗/✗/✗/✗ | m in _ ch ed usecomputes the m in imumumnumb er of cars in g in g in O_ ntime ( cund O_1 space. Best-case O_1 when orst = 1. Key_operation is ari th metic ( |
| `Surface Bottleneck (V2)` | 0.1277 | 0.1412 | ✗/✗/✗/✗ | max _ are ausestwopo in t er sconv er g in gfrombo th ends to f in d the conta in er wi th the mostwat er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | ckticktiosesentiontackesesesesacktiosesesestiontiontiontiontiontiontiontiona |
| `Consistency Loss (V1)` | final (ep20) | 0.0638 | 0.0789 | max _ de p th re cursivesaly wi th re esursivelynduclives, ch de g to de g O_ ntimean space. Best-case is O_ n. Thekey _ op er ati |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | mp _ so rt _ lesesentitiontiontionstiontitiontiontiontiontiontiontiontiontiontitio |
| `No Consistency Loss (V1)` | final (ep20) | 0.1277 | 0.1412 | max _ are ausestwopo in t er sconv er g in gfrombo th ends to f in d the conta in er wi th the mostwat er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | mack _ sesestiontiontiontiontiontiontiontionstiontiontiontiontiontiontiontions |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.1064 | 0.0952 | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursi |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | mack _ so rack _ ck _ n _ stiontiontiontiontiontiontiontiontiontionstiontiontiontionti |
| `Random Label (V1)` | final (ep20) | 0.0213 | 0.0294 | fib _ no _ mocomputesFibonaccci re cursively wi th outmoi re sion - lemoization, lead in g to O_ 2 nexp |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | backtiopacktiosesesesesesesesesesesesesacktiopacktiopacktiopacktiontiontiopa |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.0 | 0.0 | f in d _ re cursivesang _ condundseb in g: the f in aronglong th stitionat er steat ch aract er alpact er ss in g |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | ckticktiontick ( sesesesesesesesacktack ( ck ( sack ( sestack ( sestack ( strstack ( s, cke |
| `Claims from Expl Only (V2)` | final (ep20) | 0.0851 | 0.0964 | m in _ ch ed usecomputes the m in imumumnumb er of cars in g in g in O_ ntime ( cund O_1 space. Best-case O_1 when orst = 1. Key_operation is ari th metic ( |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | mp _ so rt _ lesesentiontitiontionstiontiontiontitiontiontiontiontiontiontiontitio |
| `Surface Bottleneck (V2)` | final (ep20) | 0.1277 | 0.1412 | max _ are ausestwopo in t er sconv er g in gfrombo th ends to f in d the conta in er wi th the mostwat er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | ack ( ckt <sep> ack ( sulos + 1 ) return re sen <sep> back ( ', ckt <sep> <sep> ack ( ', ck ( ck ( s ) <sep> <sep> <sep> ack ( ', 0 ) <sep> ' ) <sep> en <sep> <sep> <sep> <sep> <sep> ba |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 10: `longest_subarray_no_repeat` (val idx 1901)

Ground truth: time_complexity=`O_n`, space_complexity=`O_n`, algorithm_class=`searching`, correctness_status=`fully_correct`

**Reference explanation:** longest_no_repeat uses a sliding_window with a hash_map to find the longest contiguous subarray with all distinct elements in O_n time and O_n space. Best-case is O_n. The key_operation is hash_lookup for the seen positions. Input is not mutated. An empty list returns 0. Duplicates are handled: the window shrinks past the previous occurrence.

**Mismatched training explanation:** topo_dfs uses recursive DFS to produce a reverse-post-order topological sort. Time O_n; space O_n for visited, order, and recursion_stack. Best-case O_n. Key_operation is list_append. Iterative_with_recursion loop structure. Input not mutated. Empty nodes returns []. Duplicate edges handled by visited set preventing revisits.

```python
def longest_no_repeat(lst):
    seen = {}
    left = 0
    max_len = 0
    for right, val in enumerate(lst):
        if val in seen and seen[val] >= left:
            left = seen[val] + 1
        seen[val] = right
        max_len = max(max_len, right - left + 1)
    return max_len
```
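
The behaviours named in the reference explanation (empty input, and the window shrinking past a previous occurrence) can be spot-checked as follows. The inputs are illustrative and not taken from the dataset.

```python
def longest_no_repeat(lst):
    # Dataset sample, copied unchanged.
    seen = {}
    left = 0
    max_len = 0
    for right, val in enumerate(lst):
        if val in seen and seen[val] >= left:
            left = seen[val] + 1
        seen[val] = right
        max_len = max(max_len, right - left + 1)
    return max_len


print(longest_no_repeat([]))                     # 0
print(longest_no_repeat([7, 7, 7]))              # 1
print(longest_no_repeat([1, 2, 3, 1, 2, 3, 4]))  # 4
# The `seen[val] >= left` guard ignores stale positions left of the window:
print(longest_no_repeat([1, 2, 2, 1]))           # 2
```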

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims (time/space/algorithm/correctness) | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.0943 | 0.1149 | ✗/✗/✗/✗ | m er ge _ so rt _ v 2 hasasasasandcesandcketeburay using ashet ed lops, re cursiv in g in g O_ ntimeand O_ n space. Best-case is O_1 when stamusd if the f er |
| `No Consistency Loss (V1)` | 0.1887 | 0.1667 | ✗/✗/✗/✗ | max _ are ausestwopo in t er sconv er g in gfrombo th ends to f in d the cont er in er wi th the mostwat er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is compar is on. Nest |
| `Claim-Only Pooling (V1)` | 0.1132 | 0.1053 | ✗/✗/✗/✗ | m in _ met in g _ roootg re ed ily re arigssigns the lestfirotiencomostr in g using t - hapea so rt ed l is t. Time O_ n _ log _ n; space O_1. Best-case O_ log _ n. Key_operation |
| `Random Label (V1)` | 0.1698 | 0.1474 | ✗/✗/✗/✗ | m in _ m er get in s the m in imumnumb er of computitives using the m is m is timegititime in O_ n, d O_1 space. Best-case O_1 when n is arigitime ( titialy ). Key_operation is arithmetic. Single_pass while |
| `No Claim→Claim Attn (V2)` | 0.0377 | 0.0449 | ✗/✗/✗/✗ | fib _ noo _ so rt _ bugyhasan of f _ by _ onebug: the in nenenentimesnubb ( n -1 ) in stead of n -1, so rt ed in st ed in geadsteaneand of ne |
| `Claims from Expl Only (V2)` | 0.0943 | 0.087 | ✗/✗/✗/✗ | m er ge _ so rt _ v 2 hasasawrondites in te so rt ed l is t using ane so rt ed lops, re dun in O_ ntimeand O_1 space. Best-case is O_1 when nutas in glemuste. Theke |
| `Surface Bottleneck (V2)` | 0.1887 | 0.1667 | ✗/✗/✗/✗ | max _ are ausestwopo in t er sconv er g in gfrombo th ends to f in d the cont er in er wi th the mostwat er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is compar is on. Nest |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | matimumalentimumentimalentimatimentimentimentimumestimenstimenstimestitimestimestitionenst |
| `Consistency Loss (V1)` | final (ep20) | 0.0943 | 0.1149 | m er ge _ so rt _ v 2 hasasasasandcesandcketeburay using ashet ed lops, re cursiv in g in g O_ ntimeand O_ n space. Best-case is O_1 when stamusd if the f er |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0189 | 0.0345 | mumumum er sentititimestimentimentiontimentimentimentitition, the lentiontimentititiontionstionti |
| `No Consistency Loss (V1)` | final (ep20) | 0.1887 | 0.1667 | max _ are ausestwopo in t er sconv er g in gfrombo th ends to f in d the cont er in er wi th the mostwat er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is compar is on. Nest |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | mumen _ len _ lestig _ lentimentilentigenstimenstimenumentimentimentigentightionstiontionstionst |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.1132 | 0.1053 | m in _ met in g _ roootg re ed ily re arigssigns the lestfirotiencomostr in g using t - hapea so rt ed l is t. Time O_ n _ log _ n; space O_1. Best-case O_ log _ n. Key_operation |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | max _ lestig _ lentimentimentimentimentiontimenstimentimenstiontiontiontionstiontig _ lenstionti |
| `Random Label (V1)` | final (ep20) | 0.1698 | 0.1474 | m in _ m er get in s the m in imumnumb er of computitives using the m is m is timegititime in O_ n, d O_1 space. Best-case O_1 when n is arigitime ( titialy ). Key_operation is arithmetic. Single_pass while |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | malentightitimatimalestimatimatimalentimentitimentighalenenentighalentionestitionenenentio |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.0377 | 0.0449 | fib _ noo _ so rt _ bugyhasan of f _ by _ onebug: the in nenenentimesnubb ( n -1 ) in stead of n -1, so rt ed in st ed in geadsteaneand of ne |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | matimumalentimumentimentimentimendimestimaximumumestimentimentimestimestimestimentimestime |
| `Claims from Expl Only (V2)` | final (ep20) | 0.0943 | 0.087 | m er ge _ so rt _ v 2 hasasawrondites in te so rt ed l is t using ane so rt ed lops, re dun in O_ ntimeand O_1 space. Best-case is O_1 when nutas in glemuste. Theke |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0189 | 0.0333 | mumumum er sentititim er atimentimentiontimentimentimentitimen, the mentiontiontiontitiontionstionti |
| `Surface Bottleneck (V2)` | final (ep20) | 0.1887 | 0.1667 | max _ are ausestwopo in t er sconv er g in gfrombo th ends to f in d the cont er in er wi th the mostwat er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is compar is on. Nest |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | len <sep> len <sep> <sep> left <sep> len <sep> <sep> <sep> <sep> len <sep> len <sep> sen <sep> len <sep> <sep> <sep> <sep> <sep> len <sep> <sep> e <sep> <sep> <sep> <sep> e <sep> <sep> <sep> <sep> <sep> <sep> </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 11: `longest_common_substring` (val idx 1500)

Ground truth: time_complexity=`O_n2`, space_complexity=`O_n2`, algorithm_class=`string_processing`, correctness_status=`fully_correct`

**Reference explanation:** longest_common_substring uses a 2D DP table to find the length of the longest contiguous common substring of s and t. Time O_n2; space O_n2. Best-case O_n2 (table always filled). Key_operation is comparison. Sequential access with nested 2-level loops. Input not mutated. Empty strings produce max_len=0. Duplicate characters may extend the LCS.

**Mismatched training explanation:** gcd_array computes the GCD of all elements in a list by folding gcd pairwise. Time O_n_log_n (n gcd calls each O_log_n); space O_1. Best-case O_n (single element). Key_operation is arithmetic. Single_pass. Input not mutated. Single-element list handled (loop body not entered). Empty list crashes (IndexError on lst[0]) — missing_edge_case.

```python
def longest_common_substring(s, t):
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    max_len = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
                max_len = max(max_len, dp[i][j])
    return max_len
```
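
A few illustrative calls (inputs chosen here, not drawn from the dataset) confirm the edge cases in the reference explanation: empty strings give 0, and repeated characters can extend the match.

```python
def longest_common_substring(s, t):
    # Dataset sample, copied unchanged.
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    max_len = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
                max_len = max(max_len, dp[i][j])
    return max_len


print(longest_common_substring("", "abc"))         # 0
print(longest_common_substring("abcdef", "zcdemf"))  # 3, from "cde"
print(longest_common_substring("aaa", "aa"))        # 2
```

The table is allocated in full at `(m + 1) x (n + 1)` regardless of input content, which is consistent with the reference's best-case label of O_n2.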

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims (time/space/algorithm/correctness) | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1176 | 0.1282 | ✗/✗/✗/✗ | fib _ onocomputesbok so rt ed l is t using ahapblvalgements. Time O_ 2; space O_ n for the out. Best-case O_ n 2. Key_operation is com |
| `No Consistency Loss (V1)` | 0.0196 | 0.0294 | ✗/✗/✗/✗ | fib _ no _ memocomputesFibonacci re cursively wi th outmemoization, lead in g to O_ 2 ne |
| `Claim-Only Pooling (V1)` | 0.0784 | 0.0714 | ✗/✗/✗/✗ | f in d _ m is s in gusesa the mat ch apaict er s in g wi th ast in gle _ passsov er the largest, runnn in g in O_ l |
| `Random Label (V1)` | 0.0196 | 0.0294 | ✗/✗/✗/✗ | fib _ no _ mocomputesFibonaccci re cursively wi th outmemoization, lead in g to O_ 2 nex |
| `No Claim→Claim Attn (V2)` | 0.0392 | 0.058 | ✗/✗/✗/✗ | fib _ onot er ationationsusar ch arandcemputiessarandan in g in O_ ntimeand O_1 space. Best-case is O_ n |
| `Claims from Expl Only (V2)` | 0.0784 | 0.1159 | ✗/✗/✗/✗ | f in d _ mocomputes the maximumumnulatipatiprod in garimetipation. Time O_ log _ n; space O_1. Best-case O_ |
| `Surface Bottleneck (V2)` | 0.0196 | 0.0294 | ✗/✗/✗/✗ | fib _ no _ memocomputesFibonacci re cursively wi th outmemoization, lead in g to O_ 2 ne |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | matimationtionsentiontionstionstionenesendsentiontiontionenstionst |
| `Consistency Loss (V1)` | final (ep20) | 0.1176 | 0.1282 | fib _ onocomputesbok so rt ed l is t using ahapblvalgements. Time O_ 2; space O_ n for the out. Best-case O_ n 2. Key_operation is com |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | max _ lestilesentimatimumumentimentimentiontiontiontimentiontiontime |
| `No Consistency Loss (V1)` | final (ep20) | 0.0196 | 0.0294 | fib _ no _ memocomputesFibonacci re cursively wi th outmemoization, lead in g to O_ 2 ne |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | max _ lestilesesentimestimentimentiontiontiontimentiontimentiontions |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.0784 | 0.0714 | f in d _ m is s in gusesa the mat ch apaict er s in g wi th ast in gle _ passsov er the largest, runnn in g in O_ l |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | max _ lestimax _ lestimatimatimatimestimationdstimentimentiontiontions |
| `Random Label (V1)` | final (ep20) | 0.0196 | 0.0294 | fib _ no _ mocomputesFibonaccci re cursively wi th outmemoization, lead in g to O_ 2 nex |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0196 | 0.037 | matimationtionsestiontionstionsestionestiontionstionstiones the lestio |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.0392 | 0.058 | fib _ onot er ationationsusar ch arandcemputiessarandan in g in O_ ntimeand O_1 space. Best-case is O_ n |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | matimationtionsentiontionstionsestionesendstionstionstionenstionst |
| `Claims from Expl Only (V2)` | final (ep20) | 0.0784 | 0.1159 | f in d _ mocomputes the maximumumnulatipatiprod in garimetipation. Time O_ log _ n; space O_1. Best-case O_ |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | max _ lestitimesentimentimentionstiontiontiontimentiontimentiontiont |
| `Surface Bottleneck (V2)` | final (ep20) | 0.0196 | 0.0294 | fib _ no _ memocomputesFibonacci re cursively wi th outmemoization, lead in g to O_ 2 ne |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0196 | 0.023 | en = max _ lengen ) return max _ len ( max _ len, dp [ max _ len ) return max _ len <sep> endp [ i - 1 ] ) <sep> en ( m |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 12: `rotate_array_k` (val idx 2169)

Ground truth: time_complexity=`O_n`, space_complexity=`O_n`, algorithm_class=`sorting`, correctness_status=`missing_edge_case`

**Reference explanation:** rotate shifts array elements right by k positions using Python slice concatenation. Time O_n; space O_n for the new list. Best-case O_n. Key_operation is list_append (slice concat). Sequential access. Input is mutated via slice assignment. Empty array: k%0 raises — missing_edge_case. Duplicate elements are all rotated correctly.

**Mismatched training explanation:** max_depth computes the maximum depth of a binary tree recursively. Time O_n (visits every node); space O_n for the recursion_stack. Best-case O_log_n on a balanced tree. Key_operation is comparison (max). Recursive. Input not mutated. None root returns 0. Not applicable for duplicates (structure query).

```python
def rotate(nums, k):
    n = len(nums)
    k %= n
    nums[:] = nums[-k:] + nums[:-k]
```

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims (time/space/algorithm/correctness) | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1277 | 0.1299 | ✗/✗/✗/✗ | str in g _ in g er parosesasesast in glegopignscontignsigcontiveconstecontionstr in gcateditediteationationation. Time O_ n; space O_1. Best-case O_1 when b is str in g er. Key_operation is arithmetic. Single_pass. |
| `No Consistency Loss (V1)` | 0.1277 | 0.1176 | ✗/✗/✗/✗ | str in g _ comp re ssionrunsas in glepasscount in gconsecutiveduplicate ch arsandbuildsacomp re ss ed str in g. Returns the short er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is s |
| `Claim-Only Pooling (V1)` | 0.1277 | 0.1163 | ✗/✗/✗/✗ | str in g _ comp re sionrunsas in glepassscount in gconsecutiveduplicate ch arsandbuildsacomp re sss ed str in g. Returns the t er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is str in |
| `Random Label (V1)` | 0.1277 | 0.1111 | ✗/✗/✗/✗ | str in g _ comp re sionrugyas in glepasss in gcounticativeduplication ch arsacomp re sssstr in g. Returns the short er t er of the t er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is str in g _ conca |
| `No Claim→Claim Attn (V2)` | 0.1702 | 0.1379 | ✗/✓/✗/✗ | count _ pit er ations the lemumb er of n - its in givelypitsemeats using O_1 space DP. Best-case O_1 ( olexp ). Key_operation is arithmetic. Single_pass while loop. Input not mutated. n=0 return s 0. Not applicable for duplicates. |
| `Claims from Expl Only (V2)` | 0.1731 | 0.1414 | ✗/✗/✗/✗ | b in gear ch estran in g _ comp re firssstr in g wi th lead the contesteat in geat in gpationasseand of the firstir in g. Time O_ n; space O_1. Best-case O_ n. Key_operation is compar is on. Seque ntia cce s. Input not mutated. Empty str in g return |
| `Surface Bottleneck (V2)` | 0.1277 | 0.1176 | ✗/✗/✗/✗ | str in g _ comp re ssionrunsas in glepasscount in gconsecutiveduplicate ch arsandbuildsacomp re ss ed str in g. Returns the short er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is s |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | mstrtrtratratrstrantrangelllelestrangestrtimstrangentigestrtrangestrstrtrstrstrallestimarstrtrtrtrstrstrtrstrangengeleac |
| `Consistency Loss (V1)` | final (ep20) | 0.1277 | 0.1299 | str in g _ in g er parosesasesast in glegopignscontignsigcontiveconstecontionstr in gcateditediteationationation. Time O_ n; space O_1. Best-case O_1 when b is str in g er. Key_operation is arithmetic. Single_pass. |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | m _ p re fit _ sesumsestrationtentestiontimstentiontiontixpationtiontiontixpationtixiontiontimenstix, lentiontix, lentiontitixpat |
| `No Consistency Loss (V1)` | final (ep20) | 0.1277 | 0.1176 | str in g _ comp re ssionrunsas in glepasscount in gconsecutiveduplicate ch arsandbuildsacomp re ss ed str in g. Returns the short er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is s |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0213 | 0.0333 | m _ sumstratesentrantesententixpstims the ntentimstiontentiatix, ntix, ntims, the lentixpatimentiationtix, lentix, ix, the lentiontiatiati |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.1277 | 0.1163 | str in g _ comp re sionrunsas in glepassscount in gconsecutiveduplicate ch arsandbuildsacomp re sss ed str in g. Returns the t er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is str in |
| `Random Label (V1)` | initial (ep5) | 0.0213 | 0.0351 | m _ sesumsararontratrarontestesteararondsteararararargestix, gestimatix, nd the lestiontimatixix, ntixparararatixixpatix, ndix, nti |
| `Random Label (V1)` | final (ep20) | 0.1277 | 0.1111 | str in g _ comp re sionrugyas in glepasss in gcounticativeduplication ch arsacomp re sssstr in g. Returns the short er t er of the t er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is str in g _ conca |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0213 | 0.0385 | mstratimatestratratraratims the lleatims the maratimalatimatiarargengengengengeatixiatiximatiatiatiximendimenstiximenatiallatixi |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.1702 | 0.1379 | count _ pit er ations the lemumb er of n - its in givelypitsemeats using O_1 space DP. Best-case O_1 ( olexp ). Key_operation is arithmetic. Single_pass while loop. Input not mutated. n=0 return s 0. Not applicable for duplicates. |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | mstrtrtratratrtratrstixpalllelestrtratengestixtrantrtigengestixpatrstirtiongenstionstimstixpartrtrstrstrtrstiacangellele |
| `Claims from Expl Only (V2)` | final (ep20) | 0.1731 | 0.1414 | b in gear ch estran in g _ comp re firssstr in g wi th lead the contesteat in geat in gpationasseand of the firstir in g. Time O_ n; space O_1. Best-case O_ n. Key_operation is compar is on. Seque ntia cce s. Input not mutated. Empty str in g return |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | m _ p re fit _ sesumsestrationtentestiontimstentiontiontixpationtiontiontixpationtixiontiontimenstix, lentiontix, lentiontitixpat |
| `Surface Bottleneck (V2)` | final (ep20) | 0.1277 | 0.1176 | str in g _ comp re ssionrunsas in glepasscount in gconsecutiveduplicate ch arsandbuildsacomp re ss ed str in g. Returns the short er of the two. Time O_ n; space O_ n. Best-case O_ n. Key_operation is s |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0185 | 0.0198 | numsk: - k - k, numsums <sep> k, nums [: - k - k, nums [: ] <sep> numsumsk - k: - k, k, k, numsums <sep> numsums <sep> nums <sep> k, nums <sep> k: k, nums [: - k <sep> k: re s <sep> k <sep> <sep> <sep> k, k ( numsums <sep> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 13: `activity_selection` (val idx 2731)

Ground truth: time_complexity=`O_n_log_n`, space_complexity=`O_n`, algorithm_class=`greedy`, correctness_status=`fully_correct`

**Reference explanation:** activity_selection greedily selects the maximum set of non-overlapping activities by sorting on finish time in O_n_log_n, then a single_pass scan in O_n. Overall O_n_log_n time and O_n space for the sorted index list. Best-case is O_n_log_n. The key_operation is comparison. Input lists are not mutated. Empty input returns empty list.

**Mismatched training explanation:** min_cut finds the minimum number of cuts to partition s into palindromes using a 2D palindrome table and 1D DP. Time O_n2; space O_n2. Best-case O_n2. Key_operation is comparison. Nested_2_level. Input not mutated. Empty string: loops don't run, dp[-1] = dp[-1] = -1. Duplicate characters enable longer palindrome recognitions.

```python
def activity_selection(start, finish):
    n = len(start)
    if n == 0:
        return []
    activities = sorted(range(n), key=lambda i: finish[i])
    selected = [activities[0]]
    last_finish = finish[activities[0]]
    for i in activities[1:]:
        if start[i] >= last_finish:
            selected.append(i)
            last_finish = finish[i]
    return selected
```
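As a quick sanity check on the reference explanation, the sketch below re-declares the function above and runs it on a small hand-made input. The `start` and `finish` lists are illustrative assumptions and are not drawn from the dataset. The returned values are indices into the original lists, ordered by finish time.

```python
def activity_selection(start, finish):
    n = len(start)
    if n == 0:
        return []
    # Sort activity indices by finish time (O(n log n)).
    activities = sorted(range(n), key=lambda i: finish[i])
    selected = [activities[0]]
    last_finish = finish[activities[0]]
    # Single pass: keep each activity that starts after the last one ends.
    for i in activities[1:]:
        if start[i] >= last_finish:
            selected.append(i)
            last_finish = finish[i]
    return selected


# Hypothetical example input, chosen by hand for illustration.
start = [1, 3, 0, 5, 8, 5]
finish = [2, 4, 6, 7, 9, 9]
chosen = activity_selection(start, finish)
print(chosen)  # [0, 1, 3, 4]
```

The inputs are only read, which matches the "Input lists are not mutated" claim, and the empty case returns `[]` through the early guard.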

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.04 | 0.0351 | ✗/✗/✗/✗ | f in d _ matecusesarotation of tiona |
| `No Consistency Loss (V1)` | 0.04 | 0.0714 | ✗/✗/✗/✗ | set _ in t er sectionbuildsasetfrom |
| `Claim-Only Pooling (V1)` | 0.0 | 0.0 | ✗/✗/✗/✗ | max _ producttttionbuildsack to m |
| `Random Label (V1)` | 0.06 | 0.0625 | ✗/✗/✗/✗ | f in d _ m is s in gusesas in gletadoup to f in |
| `No Claim→Claim Attn (V2)` | 0.02 | 0.0328 | ✗/✗/✗/✗ | fib _ onat _ oo _ buggyhasasan of f _ o |
| `Claims from Expl Only (V2)` | 0.02 | 0.0351 | ✗/✗/✗/✗ | f in d _ mation ch arycomputesFicccc |
| `Surface Bottleneck (V2)` | 0.04 | 0.0714 | ✗/✗/✗/✗ | set _ in t er sectionbuildsasetfrom |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.04 | 0.0741 | f in is hectititictitictiondiendi |
| `Consistency Loss (V1)` | final (ep20) | 0.04 | 0.0351 | f in d _ matecusesarotation of tiona |
| `No Consistency Loss (V1)` | initial (ep5) | 0.04 | 0.0714 | f in is he _ puseasestitionbusestio |
| `No Consistency Loss (V1)` | final (ep20) | 0.04 | 0.0714 | set _ in t er sectionbuildsasetfrom |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.04 | 0.0714 | f in is he _ pusesesestiontionastio |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.0 | 0.0 | max _ producttttionbuildsack to m |
| `Random Label (V1)` | initial (ep5) | 0.04 | 0.0714 | f in is hestiondiontiv er astionesti |
| `Random Label (V1)` | final (ep20) | 0.06 | 0.0625 | f in d _ m is s in gusesas in gletadoup to f in |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.04 | 0.0741 | f in is hesenditicendictititiendi |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.02 | 0.0328 | fib _ onat _ oo _ buggyhasasan of f _ o |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.04 | 0.0741 | f in is hectititictitictititiendi |
| `Claims from Expl Only (V2)` | final (ep20) | 0.02 | 0.0351 | f in d _ mation ch arycomputesFicccc |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.04 | 0.0714 | f in is he _ puseasestitionbusestio |
| `Surface Bottleneck (V2)` | final (ep20) | 0.04 | 0.0714 | set _ in t er sectionbuildsasetfrom |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.08 | 0.1194 | f in is h = f in is h = f in is hf in is hectitiecec |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 14: `frequency_map` (val idx 1021)

Ground truth: time_complexity=`O_n`, space_complexity=`O_n`, algorithm_class=`math_computation`, correctness_status=`fully_correct`

**Reference explanation:** frequency_map builds a dictionary mapping each value to its count in O_n time and O_n space. Best-case is O_n since all elements must be visited. The key_operation is hash_lookup. The input list is not mutated. An empty list returns an empty dict. Duplicate elements are counted, their distinct count tracked—so duplicate presence is essential behavior.

**Mismatched training explanation:** word_frequency splits text and builds a frequency hash_map in O_n time and O_n space. Best-case is O_n. The key_operation is hash_lookup. Input string is not mutated. An empty string returns an empty dict. Repeated words are counted, merging duplicates into a single count entry.

```python
def frequency_map(lst):
    freq = {}
    for x in lst:
        freq[x] = freq.get(x, 0) + 1
    return freq
```
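To make the duplicate-handling claim in the reference explanation concrete, the sketch below re-declares the function and compares it with `collections.Counter` from the standard library on a hand-made list. The sample list is an illustrative assumption, not dataset content.

```python
from collections import Counter


def frequency_map(lst):
    freq = {}
    for x in lst:
        # dict.get supplies 0 for values not seen yet.
        freq[x] = freq.get(x, 0) + 1
    return freq


# Hypothetical sample with a repeated element.
sample = ["a", "b", "a", "c", "a"]
result = frequency_map(sample)
print(result)  # {'a': 3, 'b': 1, 'c': 1}
print(result == dict(Counter(sample)))  # True
```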

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1636 | 0.18 | ✗/✗/✗/✗ | m er ge _ so rt _ v er gestwon er gest er ationbly in g wi d th asestestadcondictes, cun in givestelyn, g O_ ntimeand O_ n space. Best-case is O_ n. Thekey _ op er ation is swap. Input is not mutated. Anem |
| `No Consistency Loss (V1)` | 0.1579 | 0.125 | ✗/✗/✗/✗ | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursion _ stac k. Best-case O_ log _ nonabalanc ed t re e. Key_operation is compar is on ( ma x) |
| `Claim-Only Pooling (V1)` | 0.1636 | 0.1308 | ✗/✗/✗/✗ | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursion _ stac k. Best-case O_ log _ nonalabralay. Key_operation is compar is on. Seque ntia |
| `Random Label (V1)` | 0.1636 | 0.1273 | ✗/✗/✗/✗ | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursion _ stac k. Best-case O_ log _ nonalancalanc ed t re e. Key_operation is compar is on ( |
| `No Claim→Claim Attn (V2)` | 0.1273 | 0.0971 | ✗/✗/✗/✗ | max _ de p th re cursives the l is ts in goncursestas in glecentsas in glepasssstr in glet er v er ep, r in g O_ ntimeand O_ n space. Best-case is O_ n. Thekey _ op er ation is s in putr in g _ conmut |
| `Claims from Expl Only (V2)` | 0.129 | 0.1197 | ✗/✗/✗/✗ | m er ge _ so rt _ v er geswandit er ationbugy in to m er ge so rt er ge so rt er gatition, in t er g in n, in glapaticas in s wi th les er ates. Time O_ n; space O_1. Best-case O_ n. Key_operation is compar is on. Two - po in t er a cce s |
| `Surface Bottleneck (V2)` | 0.1579 | 0.125 | ✗/✗/✗/✗ | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursion _ stac k. Best-case O_ log _ nonabalanc ed t re e. Key_operation is compar is on ( ma x) |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | m _ stestesestiontentesulllontseststiontstionstssststarstionstiontionstionsstiontionstionsstionssstionssstionstionstionsst |
| `Consistency Loss (V1)` | final (ep20) | 0.1636 | 0.18 | m er ge _ so rt _ v er gestwon er gest er ationbly in g wi d th asestestadcondictes, cun in givestelyn, g O_ ntimeand O_ n space. Best-case is O_ n. Thekey _ op er ation is swap. Input is not mutated. Anem |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | max _ m er atitionationtiontionstiontionstillestiontiontiontiontiontiontiontionstionationstiontilationstionstigiontionatigion |
| `No Consistency Loss (V1)` | final (ep20) | 0.1579 | 0.125 | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursion _ stac k. Best-case O_ log _ nonabalanc ed t re e. Key_operation is compar is on ( ma x) |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | max _ max _ m er gesestentestitiontiontiontiontionstiontilenentiontiontiontiontilentionstionentionentiontiontiontiationationsti |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.1636 | 0.1308 | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursion _ stac k. Best-case O_ log _ nonalabralay. Key_operation is compar is on. Seque ntia |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | m _ masesesesteartentiv er stiontiontiontiontionstiondstiontionstionstiontionstionestiontionstionestionestiontiontiontiontion |
| `Random Label (V1)` | final (ep20) | 0.1636 | 0.1273 | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursion _ stac k. Best-case O_ log _ nonalancalanc ed t re e. Key_operation is compar is on ( |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | mationtesestiontentesallontiontiontiontiontsestionestationsesestiontionsestionstionstionstionstionesstionastionstiontion |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.1273 | 0.0971 | max _ de p th re cursives the l is ts in goncursestas in glecentsas in glepasssstr in glet er v er ep, r in g O_ ntimeand O_ n space. Best-case is O_ n. Thekey _ op er ation is s in putr in g _ conmut |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | m _ stestesestiontentesestiontiontsellontiontstionstionesstiontionstionstionstionsssssssstionstionstionsstionestastiontion |
| `Claims from Expl Only (V2)` | final (ep20) | 0.129 | 0.1197 | m er ge _ so rt _ v er geswandit er ationbugy in to m er ge so rt er ge so rt er gatition, in t er g in n, in glapaticas in s wi th les er ates. Time O_ n; space O_1. Best-case O_ n. Key_operation is compar is on. Two - po in t er a cce s |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | max _ m er atitionationtiontionstiontionstillestiontiontiontiontiontiontiontionstionationstiontilationstionstigiontionatigion |
| `Surface Bottleneck (V2)` | final (ep20) | 0.1579 | 0.125 | max _ de p th computes the maximum de p th of ab in aryt re e re cursively. Time O_ n ( v is itsev er yno de ); space O_ n for the re cursion _ stac k. Best-case O_ log _ nonabalanc ed t re e. Key_operation is compar is on ( ma x) |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | f re q. get ( x, 0 ) <sep> ( x, 0 ) <sep> f re q <sep> f re q <sep> f re q [ x ] = f re q <sep> ( x, 0 ) - = f re q [ old ( f re q [ x ] return f re q <sep> ( x, 0 ) + 1 return f re q <sep> <sep> <sep> <sep> <sep> ( hi - 1 return f re q <sep> ( x <sep> <sep> lo <sep> ( hi <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 15: `is_power_of_two` (val idx 160)

Ground truth: time_complexity=`O_1`, space_complexity=`O_1`, algorithm_class=`math_computation`, correctness_status=`fully_correct`

**Reference explanation:** is_power_of_two uses a single bitwise-AND trick to check if n is a power of two in O_1 time and O_1 space. Best-case is also O_1. The key_operation is arithmetic (bitwise AND). Input n is not mutated. Non-positive integers return False correctly. Not applicable for duplicates.

**Mismatched training explanation:** count_distinct_chars inserts all characters into a set and returns its size. Time O_n; space O_n for the set. Best-case O_n (must process all chars). Key_operation is hash_lookup (set insertion). Sequential access. Input not mutated. Empty string returns 0. Duplicate characters are deduplicated by the set, so effectively removes_duplicates.

```python
def is_power_of_two(n):
    if n <= 0:
        return False
    return (n & (n - 1)) == 0
```
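The bitwise-AND trick named in the reference explanation relies on `n & (n - 1)` clearing the lowest set bit of `n`, so the result is zero only when `n` has exactly one set bit. A minimal sketch, re-declaring the function above and using hand-picked values (8 and 12 are illustrative assumptions):

```python
def is_power_of_two(n):
    if n <= 0:
        return False
    # A power of two has a single set bit, so clearing it leaves zero.
    return (n & (n - 1)) == 0


# Show the bit patterns for one power of two and one non-power.
for n in (8, 12):
    print(f"{n:04b} & {n - 1:04b} = {n & (n - 1):04b}")
# 1000 & 0111 = 0000
# 1100 & 1011 = 1000

powers = [n for n in range(-2, 17) if is_power_of_two(n)]
print(powers)  # [1, 2, 4, 8, 16]
```

The `n <= 0` guard matters because `0 & -1` is also `0`; without the guard, zero would be reported as a power of two.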

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.2444 | 0.2353 | ✗/✗/✗/✗ | max _ wi ndow _ sumusesasestackummatexivelup to re atendates in gles in g wi nglticon in O_ ntimeand O_ n space. Best-case is O_ n. Thekey _ op er ation is has h_ loo kup. Input is not mutated. Anemptyl |
| `No Consistency Loss (V1)` | 0.2167 | 0.1905 | ✗/✗/✗/✗ | max _ prow er _ sumsesatwo in t er sconv er g in gfrombo th ends in gward to f in d the cont er wi th the mostwat er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is compar is on. Nest ed _ 2 _ level. Input not mutated. Empty l is t return s 0. |
| `Claim-Only Pooling (V1)` | 0.2 | 0.1524 | ✗/✗/✗/✗ | max _ po in t er _ ch at er secomputes the m in imumCPU in t er vals using ag re ed y for mulas ed on the maximumf re quencytas k. Time O_ n; space O_ n for Count er. Best-case O_ n. Key_operation is has h_ loo kup. Seque n tial. Input not mutated. Empty tasks: Co |
| `Random Label (V1)` | 0.1778 | 0.1136 | ✗/✗/✗/✗ | max _ p th re cursivelytrattl is ts in to ea ch pack in O_ ntime ( nsumsumsqumet ( to ntann ) to talacomp er supaltives ). Space is O_ n for re cursion _ stac k. Best-case is O_ n. Thekey _ |
| `No Claim→Claim Attn (V2)` | 0.2 | 0.1839 | ✗/✗/✗/✗ | max _ wi ndow _ sumusesasesasubarr in g wi th ay in gletea ch to f in d the masumallementsumat in O_ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is has h_ lookup for the in punonunu |
| `Claims from Expl Only (V2)` | 0.0351 | 0.0392 | ✗/✗/✗/✗ | max _ wi ndow _ sum in _ wi ndowtesusesubart in g wi ndowset in g wi ndowt in t in t in g wi ndowsundowt in t in t in t in t ch arandowsut in gando in t er stpa in t in t er st ch aco in t er s. Time O_ n; |
| `Surface Bottleneck (V2)` | 0.2083 | 0.1935 | ✗/✗/✗/✗ | max _ subarrayusesKadane ' sg re ed lyargori th m wi th as in gle _ passov er the l is t, a ch iev in g O_ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is ari th metic ( maxcompar is on |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | co _ st _ ntitiontitiontionticontiontilestiontilestionsestilestionstionstiontionstionstilestionstionstionsstionstionstionsti |
| `Consistency Loss (V1)` | final (ep20) | 0.2444 | 0.2353 | max _ wi ndow _ sumusesasestackummatexivelup to re atendates in gles in g wi nglticon in O_ ntimeand O_ n space. Best-case is O_ n. Thekey _ op er ation is has h_ loo kup. Input is not mutated. Anemptyl |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | m _ sum _ sumumbumsesesesentimesesestitiontiontiontiventiv er sentiontiontimentionstiontimelentiontiontimeationtiontiontimellen |
| `No Consistency Loss (V1)` | final (ep20) | 0.2167 | 0.1905 | max _ prow er _ sumsesatwo in t er sconv er g in gfrombo th ends in gward to f in d the cont er wi th the mostwat er. Time O_ n; space O_ n. Best-case O_ n. Key_operation is compar is on. Nest ed _ 2 _ level. Input not mutated. Empty l is t return s 0. |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | m _ sum _ sumumsesumsumsaxpllllesesentimestimentimentimentimensimentimentimentiontimelentiontimentiontimentiontimentiontions |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.2 | 0.1524 | max _ po in t er _ ch at er secomputes the m in imumCPU in t er vals using ag re ed y for mulas ed on the maximumf re quencytas k. Time O_ n; space O_ n for Count er. Best-case O_ n. Key_operation is has h_ loo kup. Seque n tial. Input not mutated. Empty tasks: Co |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | bum _ sesesesestioontiontiontionselonstionstiontionstionsestiontiontionstionstiontivestiondstionstiontionstiontiontivelest |
| `Random Label (V1)` | final (ep20) | 0.1778 | 0.1136 | max _ p th re cursivelytrattl is ts in to ea ch pack in O_ ntime ( nsumsumsqumet ( to ntann ) to talacomp er supaltives ). Space is O_ n for re cursion _ stac k. Best-case is O_ n. Thekey _ |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | m _ se _ po _ sesesestiontiontioncontiontiontiontiontiontionsestiontiontionanstionstionanananstigionacontionationdiantionansti |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.2 | 0.1839 | max _ wi ndow _ sumusesasesasubarr in g wi th ay in gletea ch to f in d the masumallementsumat in O_ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is has h_ lookup for the in punonunu |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | co _ st _ ntitiontitiontionticontiontitiontiontilesticonstilestiontionstionstionstiontionsstionssstiontionstionstionsstionti |
| `Claims from Expl Only (V2)` | final (ep20) | 0.0351 | 0.0392 | max _ wi ndow _ sum in _ wi ndowtesusesubart in g wi ndowset in g wi ndowt in t in t in g wi ndowsundowt in t in t in t in t ch arandowsut in gando in t er stpa in t in t er st ch aco in t er s. Time O_ n; |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | m _ sum _ sumumbumsesesesentimesesestitiontiontiontiventiv er sentiontiontimentionstiontimelentiontiontimeationtiontiontimellen |
| `Surface Bottleneck (V2)` | final (ep20) | 0.2083 | 0.1935 | max _ subarrayusesKadane ' sg re ed lyargori th m wi th as in gle _ passov er the l is t, a ch iev in g O_ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is ari th metic ( maxcompar is on |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | e <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 16: `find_min_rotated` (val idx 2047)

Ground truth: time_complexity=`O_log_n`, space_complexity=`O_1`, algorithm_class=`searching`, correctness_status=`missing_edge_case`

**Reference explanation:** find_min_rotated uses binary search on a rotated sorted list to find the minimum element in O_log_n time. Space is O_1. Best-case same_as_worst. The key_operation is comparison. Divide_conquer access pattern. Input not mutated. Empty list crashes (missing guard), making it a missing_edge_case bug. Duplicates: not_applicable.

**Mismatched training explanation:** kmp_search finds the first occurrence of pattern in text in O_n time (assuming the failure table is precomputed). Space O_n for the fail array. Best-case O_1 when pattern matches immediately. Key_operation is comparison. Sequential access. Input not mutated. Empty pattern returns 0. Duplicate characters in pattern are handled by the failure function.

```python
def find_min_rotated(lst):
    lo, hi = 0, len(lst) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if lst[mid] > lst[hi]:
            lo = mid + 1
        else:
            hi = mid
    return lst[lo]
```
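The `missing_edge_case` label can be reproduced directly. The sketch below re-declares the function above, runs it on a hand-made rotated list (an illustrative assumption, not dataset content), and then calls it on an empty list to show the crash described in the reference explanation.

```python
def find_min_rotated(lst):
    lo, hi = 0, len(lst) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if lst[mid] > lst[hi]:
            # Minimum lies strictly to the right of mid.
            lo = mid + 1
        else:
            # Minimum is at mid or to its left.
            hi = mid
    return lst[lo]


# Hypothetical rotated sorted input.
minimum = find_min_rotated([4, 5, 6, 7, 0, 1, 2])
print(minimum)  # 0

# Empty input: hi becomes -1, the loop is skipped, and lst[0] raises.
try:
    find_min_rotated([])
    crashed = False
except IndexError:
    crashed = True
print(crashed)  # True
```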

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.2222 | 0.1837 | ✗/✗/✗/✗ | m in _ paro in t er valses the m in imume ed ime ed ime ed iments using ato in timumet to hand O_1 space. Best-case O_1 when musarksetelooorments. Key_operation is compar is on ( ma x). No in t er ss for mut er. Input not mutated. Empty in g in g |
| `No Consistency Loss (V1)` | 0.2034 | 0.1942 | ✗/✗/✗/✗ | m in _ plat for msf in ds the m in imumnumb er of tra in plat for msne ed ed. Time O_ n 2 ( tamesfrily ); space O_ n for tarray. Best-case O_ n 2. Key_operation is compar is on ( m in ). Nest ed _ 2 _ level. Input not mutated. Empty matrices return s |
| `Claim-Only Pooling (V1)` | 0.2321 | 0.22 | ✗/✓/✓/✗ | m in _ protmovesow er roapo in tsonarot using a 2 using f in dbleamend 1 D DP. Time O_ log _ n; space O_1. Best-case O_1. Key_operation is compar is on ( m in ). Nest ed _ 2 _ level. Input not mutated. Empty oorn = 1 return s 0. Not applicable for duplicates. |
| `Random Label (V1)` | 0.1364 | 0.1739 | ✗/✗/✗/✗ | max _ productusesastromaximumultrip to hexpodup in O_ ntime ( nsqrtsqracens ) to numenntsb er scks. Space is O_ n for the re siltarraymentrayditras. Space is |
| `No Claim→Claim Attn (V2)` | 0.1818 | 0.1446 | ✗/✗/✗/✗ | max _ prow _ modusesbastrigumestrovelelements, k in g to re adypeandatitiand O_1 space. Best-case is O_ n. They _ op er ation is c ( mulion ). Input is not mutated. Nonegly return s 0. Duplicate pues are alll |
| `Claims from Expl Only (V2)` | 0.2766 | 0.2418 | ✗/✗/✗/✗ | m in _ paro in _ mumb er utses the mumumnumb er of araritititives in g to me ed laritsexpandrititition. Time O_ log _ n; space O_1. Best-case O_1 when metexpry. Key_operation is compar is on. Single_pass. Input not mutated. Empty l is t return s 0 |
| `Surface Bottleneck (V2)` | 0.2034 | 0.1942 | ✗/✗/✗/✗ | m in _ plat for msf in ds the m in imumnumb er of tra in plat for msne ed ed. Time O_ n 2 ( tamesfrims ); space O_ n for tarray. Best-case O_ n 2. Key_operation is compar is on ( m in ). Nest ed _ 2 _ level. Input not mutated. Empty matrices return s |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | malallomumacallomallatimatimesesestimestimestimestimestimentimestimestimelestimelestimelontimestiontimelontimentionstime |
| `Consistency Loss (V1)` | final (ep20) | 0.2222 | 0.1837 | m in _ paro in t er valses the m in imume ed ime ed ime ed iments using ato in timumet to hand O_1 space. Best-case O_1 when musarksetelooorments. Key_operation is compar is on ( ma x). No in t er ss for mut er. Input not mutated. Empty in g in g |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | mumut _ cont _ so rtimuntimplycontimentimentimentimentimentiontimentimelentimentimentimentimentimentiontiontimentimentiontiont |
| `No Consistency Loss (V1)` | final (ep20) | 0.2034 | 0.1942 | m in _ plat for msf in ds the m in imumnumb er of tra in plat for msne ed ed. Time O_ n 2 ( tamesfrily ); space O_ n for tarray. Best-case O_ n 2. Key_operation is compar is on ( m in ). Nest ed _ 2 _ level. Input not mutated. Empty matrices return s |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | mumumumumumumsesesumumstimstimentimentimentimentimentimentimentimentimentimentimentimentiontimentimentimentimentiontiont |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.2321 | 0.22 | m in _ protmovesow er roapo in tsonarot using a 2 using f in dbleamend 1 D DP. Time O_ log _ n; space O_1. Best-case O_1. Key_operation is compar is on ( m in ). Nest ed _ 2 _ level. Input not mutated. Empty oorn = 1 return s 0. Not applicable for duplicates. |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | marimbuticomplesesestestimentimentimentimentimestimentiontiontimestimentiontimentiontionstiontiontiontimentiontimenstion |
| `Random Label (V1)` | final (ep20) | 0.1364 | 0.1739 | max _ productusesastromaximumultrip to hexpodup in O_ ntime ( nsqrtsqracens ) to numenntsb er scks. Space is O_ n for the re siltarraymentrayditras. Space is |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | malatalomatimalomalacatimeseselesendicondimestimatitimestimestitimatititiontimestiontitiondiontiontiontiontionelelatitia |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.1818 | 0.1446 | max _ prow _ modusesbastrigumestrovelelements, k in g to re adypeandatitiand O_1 space. Best-case is O_ n. They _ op er ation is c ( mulion ). Input is not mutated. Nonegly return s 0. Duplicate pues are alll |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | malallomumacallomallatimesticatimesesestimestimentimestiontimestimestimestimestimestimelontiontiontiontiontiontimelelele |
| `Claims from Expl Only (V2)` | final (ep20) | 0.2766 | 0.2418 | m in _ paro in _ mumb er utses the mumumnumb er of araritititives in g to me ed laritsexpandrititition. Time O_ log _ n; space O_1. Best-case O_1 when metexpry. Key_operation is compar is on. Single_pass. Input not mutated. Empty l is t return s 0 |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | mumut _ cont _ so rtimuntimplestiontimentimentimentimentimentiontimentimentimentimentimentiontiontiontiontiontiontiontiontiont |
| `Surface Bottleneck (V2)` | final (ep20) | 0.2034 | 0.1942 | m in _ plat for msf in ds the m in imumnumb er of tra in plat for msne ed ed. Time O_ n 2 ( tamesfrims ); space O_ n for tarray. Best-case O_ n 2. Key_operation is compar is on ( m in ). Nest ed _ 2 _ level. Input not mutated. Empty matrices return s |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | e: hi = mid return lo <sep> <sep> <sep> else: hi - = mid return lst <sep> <sep> <sep> <sep> mid - 1 <sep> e <sep> <sep> <sep> <sep> <sep> <sep> <sep> e: hi <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> e <sep> <sep> <sep> <sep> <sep> <sep> e <sep> <sep> </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 17: `search_rotated_sorted` (val idx 561)

Ground truth: time_complexity=`O_log_n`, space_complexity=`O_1`, algorithm_class=`searching`, correctness_status=`fully_correct`

**Reference explanation:** search_rotated adapts binary search for a once-rotated sorted array. Time O_log_n; space O_1. Best-case is O_1 when mid hits target immediately. Key_operation is comparison. Divide_conquer access pattern. Input not mutated. Empty list returns -1 via lo > hi. Duplicates: not_applicable (only one matching index returned).

**Mismatched training explanation:** shell_sort is an in-place sorting algorithm using diminishing gap sequences. It uses nested loops (outer gap, middle insertion, inner shift), giving O_n2 worst-case time but O_n best-case on nearly-sorted data. Space is O_1 with a single temp_variable. The key_operation is comparison. Input is mutated. Empty inputs are safe and duplicates are fully preserved.

```python
def search_rotated(lst, target):
    lo, hi = 0, len(lst) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if lst[mid] == target:
            return mid
        if lst[lo] <= lst[mid]:
            if lst[lo] <= target < lst[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        else:
            if lst[mid] < target <= lst[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1
```
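
The complexity and edge-case claims in the reference explanation can be sanity-checked by running the listing. The sketch below repeats the same search logic and adds a probe counter. The name `search_rotated_counting` and the test array are illustrative assumptions, not part of the dataset or the evaluation pipeline.

```python
def search_rotated_counting(lst, target):
    """Same logic as search_rotated above; additionally returns the number
    of loop iterations (probes). The counter is an illustrative addition."""
    lo, hi = 0, len(lst) - 1
    probes = 0
    while lo <= hi:
        probes += 1
        mid = (lo + hi) // 2
        if lst[mid] == target:
            return mid, probes
        if lst[lo] <= lst[mid]:
            # Left half [lo, mid] is sorted.
            if lst[lo] <= target < lst[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        else:
            # Right half [mid, hi] is sorted.
            if lst[mid] < target <= lst[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1, probes


rotated = [4, 5, 6, 7, 0, 1, 2]
print(search_rotated_counting(rotated, 7))  # (3, 1): best case, first mid hits the target
print(search_rotated_counting(rotated, 0))  # (4, 3)
print(search_rotated_counting([], 5))       # (-1, 0): empty list, loop never runs
```

The first call corresponds to the O_1 best case and the last to the empty-list behaviour described in the reference explanation.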

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.2889 | 0.2529 | ✓/✓/✓/✗ | fat _ p er mution _ buggyhasan of frombay is stexp. Time O_ n; space O_1. Best-case O_1 when firiri th menten. Key_operation is arithmetic. Single_pass while loop. Input not mutated. Non < = 1 return s 0. Not applicable for duplicates. |
| `No Consistency Loss (V1)` | 0.0 | 0.0 | ✗/✗/✗/✗ | fib _ no _ menocomputesFibonacci re cursively wi th outmemoization, lead in g to O_ 2 nexponenenentialtime due to re peat ed s |
| `Claim-Only Pooling (V1)` | 0.0667 | 0.0519 | ✗/✗/✗/✗ | max _ productttttionbuildsack ed mution using al - hap er en the n th re cursively. Time O_ n ( to tatelements ); space O_ n for the re cursion _ sta |
| `Random Label (V1)` | 0.0889 | 0.0811 | ✗/✗/✗/✗ | max _ productusesastroouns in g wi th ahetwopopsapdapdaroun in g in O_ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is has h_ loo kup. |
| `No Claim→Claim Attn (V2)` | 0.0222 | 0.0267 | ✗/✗/✗/✗ | fan _ bugyhasawrong _ conditionbug: the barongestest ed in stead of batex ed str in gad of b, so the in t ed, so rt ed l is ngunemputit. T |
| `Claims from Expl Only (V2)` | 0.0889 | 0.0822 | ✗/✗/✗/✗ | f in d _ muten re ed usilycomputes the maximumumult of atip re fativelu re haltendablipesicatement. Time O_ n; space O_ n for re sult. Best-case O_ n |
| `Surface Bottleneck (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | fib _ no _ menocomputesFibonacci re cursively wi th outmemoization, lead in g to O_ 2 nexponenenentialtime due to re peat ed s |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | mpompompompomponticondsesticasticondsticondsticacondsticondsticonstictiticonsticalesticalesti |
| `Consistency Loss (V1)` | final (ep20) | 0.2889 | 0.2529 | fat _ p er mution _ buggyhasan of frombay is stexp. Time O_ n; space O_1. Best-case O_1 when firiri th menten. Key_operation is arithmetic. Single_pass while loop. Input not mutated. Non < = 1 return s 0. Not applicable for duplicates. |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | mpon _ pontitiontitionsestiondstionstionstiondstionstitiondstiondstiondstitiontiondstitiontitio |
| `No Consistency Loss (V1)` | final (ep20) | 0.0 | 0.0 | fib _ no _ menocomputesFibonacci re cursively wi th outmemoization, lead in g to O_ 2 nexponenenentialtime due to re peat ed s |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | m _ pon _ contitionasesenastiondstiondstitionstiondstiondstitiontiontiondstiglendstiontionstionds |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.0667 | 0.0519 | max _ productttttionbuildsack ed mution using al - hap er en the n th re cursively. Time O_ n ( to tatelements ); space O_ n for the re cursion _ sta |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | marion _ so ritiondsestionstiondstiondstimestiondstimestimestiv er arondstiondstiondstiondstiontiond |
| `Random Label (V1)` | final (ep20) | 0.0889 | 0.0811 | max _ productusesastroouns in g wi th ahetwopopsapdapdaroun in g in O_ ntimeand O_1 space. Best-case is O_ n. Thekey _ op er ation is has h_ loo kup. |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | mpondsestitionsesestititiondstitiondstionstitiondstitionstionstionstiondstionstiticonstionale |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.0222 | 0.0267 | fan _ bugyhasawrong _ conditionbug: the barongestest ed in stead of batex ed str in gad of b, so the in t ed, so rt ed l is ngunemputit. T |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | mponcompompomponticonsesticticticondsticonticondsesticonsticonticonstionstionstiticonstionsti |
| `Claims from Expl Only (V2)` | final (ep20) | 0.0889 | 0.0822 | f in d _ muten re ed usilycomputes the maximumumult of atip re fativelu re haltendablipesicatement. Time O_ n; space O_ n for re sult. Best-case O_ n |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | mpon _ pontitiontitionsestiondstionstionstiondstionstitiondstiondstiondstitiontiondstitionstiti |
| `Surface Bottleneck (V2)` | final (ep20) | 0.0 | 0.0 | fib _ no _ menocomputesFibonacci re cursively wi th outmemoization, lead in g to O_ 2 nexponenenentialtime due to re peat ed s |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0435 | 0.0526 | lse: hi = mid - 1 e: hi - 1 e: lse <sep> lse: hi = mid - 1 else: hi = mid - 1 return -1 <sep> -1 <sep> lse: hi = mid - 1 return -1 <sep> lo <sep> <sep> -1 <sep> -1 <sep> <sep> <sep> -1 <sep> <sep> <sep> <sep> lse <sep> -1 <sep> <sep> elo <sep> <sep> <sep> e <sep> <sep> <sep> <sep> <sep> </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 18: `caesar_cipher` (val idx 2237)

Ground truth: time_complexity=`O_n`, space_complexity=`O_n`, algorithm_class=`string_processing`, correctness_status=`fully_correct`

**Reference explanation:** caesar_cipher applies a rotational shift to each alphabetic character, preserving case and non-alphabetic characters. Time O_n; space O_n for the result list. Best-case O_n (every character must be processed). The key_operation is arithmetic (modular shift). Sequential access. Input not mutated. Empty string returns empty. All characters preserved.

**Mismatched training explanation:** two_sum_hash uses a hash_map to find a pair summing to target in O_n time and O_n space. However, the element is added to seen before checking for its complement, which can return an index pair where both indices are the same element when x+x==target (wrong_condition). Best-case is O_1 when the pair is found early. The key_operation is hash_lookup. Input is not mutated. An empty list returns None safely. Duplicates: the first qualifying pair is returned.

```python
def caesar_cipher(text, shift):
    result = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return ''.join(result)
```
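
A short usage check makes the reference claims concrete: case and non-alphabetic characters are preserved, and the empty string maps to the empty string. The function body repeats the listing above; the sample strings and the round-trip check with a negative shift are illustrative assumptions.

```python
def caesar_cipher(text, shift):
    """Rotate each alphabetic character by `shift`, preserving case."""
    result = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return ''.join(result)


plain = "Hello, World!"
encoded = caesar_cipher(plain, 3)
print(encoded)                     # Khoor, Zruog!
print(caesar_cipher(encoded, -3))  # Hello, World!
print(repr(caesar_cipher("", 3)))  # ''
```

Python's `%` returns a non-negative result for a positive modulus, so the negative shift decodes without extra handling.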

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.0426 | 0.0506 | ✗/✗/✗/✗ | m er ge _ so rt _ v 2 hasawrondconbug: the out er lopatestead of < = 1, comp are d in stealearund of 1, mp are ds the yicontelement ed s |
| `No Consistency Loss (V1)` | 0.0213 | 0.0317 | ✗/✗/✗/✗ | max _ productttttracksbo th maximum in imsubarrrayproducts to kendip er sitsfromnegativenumbl in O_ n 2 timeand |
| `Claim-Only Pooling (V1)` | 0.0213 | 0.0328 | ✗/✗/✗/✗ | max _ de producttttsbo th maximumumandm in imumsubarrayproducts to handlesignflipssfromnestsign if ivegn if i |
| `Random Label (V1)` | 0.1489 | 0.1867 | ✗/✗/✗/✗ | max _ productttsboductraymaximum in arayp re cursively. Time O_ n; space O_ n for the re sultarray. Best-case O_ n. Key_operation is compar is on ( maxsit ) |
| `No Claim→Claim Attn (V2)` | 0.0851 | 0.08 | ✗/✗/✗/✗ | max _ productes the maximumumb er of arativelypltititititions using aget er s ( opounstit er ation ). Time O_ n; space O_1. Best-case O_1 when n < lork |
| `Claims from Expl Only (V2)` | 0.0213 | 0.0253 | ✗/✗/✗/✗ | m er ge _ so rt _ v 2 hasawrondconbug: the out er lopat er s in stead of n - itead of 1, c in gf in steallempassssteanorund, mb in gi |
| `Surface Bottleneck (V2)` | 0.0213 | 0.0317 | ✗/✗/✗/✗ | max _ productttttracksbo th maximum in imsubarrrayproducts to kendip er sitsfromnegativenumbl in O_ n 2 timeand |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | masendsesendimesentimentiontimendimentimellesentimentiontimentiontiontiontiontiontionelond |
| `Consistency Loss (V1)` | final (ep20) | 0.0426 | 0.0506 | m er ge _ so rt _ v 2 hasawrondconbug: the out er lopatestead of < = 1, comp are d in stealearund of 1, mp are ds the yicontelement ed s |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | mat _ so rt _ so rt _ ngesentititiontitiontitionstiontiontitiontiontiontiontiontionstititiontionenti |
| `No Consistency Loss (V1)` | final (ep20) | 0.0213 | 0.0317 | max _ productttttracksbo th maximum in imsubarrrayproducts to kendip er sitsfromnegativenumbl in O_ n 2 timeand |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | masulontitiontititiontimentiontiontiontionsentiontiontiontiontiontiontiontiontionenstionti |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.0213 | 0.0328 | max _ de producttttsbo th maximumumandm in imumsubarrayproducts to handlesignflipssfromnestsign if ivegn if i |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | masulonticondsesesestionticondstiondstimentiondstiontiondstiontiontiontiondiontiondstionti |
| `Random Label (V1)` | final (ep20) | 0.1489 | 0.1867 | max _ productttsboductraymaximum in arayp re cursively. Time O_ n; space O_ n for the re sultarray. Best-case O_ n. Key_operation is compar is on ( maxsit ) |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | masendsesesesesendsesesendsendsendseaseseaseaseaseaseaseaseaseaseaseasendseaseaseaseaparti |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.0851 | 0.08 | max _ productes the maximumumb er of arativelypltititititions using aget er s ( opounstit er ation ). Time O_ n; space O_1. Best-case O_1 when n < lork |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | masendsesendimesentimentiontimendimentimelontimentimentimentimelontimelontimentimentiontio |
| `Claims from Expl Only (V2)` | final (ep20) | 0.0213 | 0.0253 | m er ge _ so rt _ v 2 hasawrondconbug: the out er lopat er s in stead of n - itead of 1, c in gf in steallempassssteanorund, mb in gi |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | mat _ so rt _ so rt _ ngesentititiontitiontitionstiontiontiontiontiontiontiontionstiontitiontionenti |
| `Surface Bottleneck (V2)` | final (ep20) | 0.0213 | 0.0317 | max _ productttttracksbo th maximum in imsubarrrayproducts to kendip er sitsfromnegativenumbl in O_ n 2 timeand |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | co in ( o in ( re sult <sep> basult <sep> <sep> sulasend ( re sult ) <sep> <sep> <sep> <sep> else + base <sep> sext <sep> <sep> <sep> <sep> </claim> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 19: `longest_palindrome_substr` (val idx 595)

Ground truth: time_complexity=`O_n2`, space_complexity=`O_1`, algorithm_class=`string_processing`, correctness_status=`wrong_condition`

**Reference explanation:** longest_palindrome_substr expands around every center (odd and even length) to find the longest palindromic substring. However, the final line uses stale l and r from the last expansion, which is a wrong_condition bug. Time O_n2; space O_1 (ignoring output). Best-case O_n. Key_operation is comparison. Sliding_window pattern. Input not mutated. Empty string: outer loop never runs; best stays empty but final line crashes.

**Mismatched training explanation:** coin_change_greedy uses the largest coin first, which is optimal for canonical coin systems but not_applicable as a general solution. Time O_n2 (n coin types × inner while subtractions); space O_1. Best-case O_n. Key_operation is arithmetic. Nested_2_level. Input not mutated. amount=0 returns 0 via count. Duplicate coin values cause extra but harmless iterations.

```python
def longest_palindrome_substr(s):
    best = ''
    for i in range(len(s)):
        for odd in (True, False):
            l, r = i, i + (0 if odd else 1)
            while l >= 0 and r < len(s) and s[l] == s[r]:
                if r - l + 1 > len(best):
                    best = s[l:r+1]
                l -= 1; r += 1
    best = s[l:r+1]
    return best
```
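
The wrong_condition bug named in the reference explanation can be reproduced directly. The sketch below repeats the buggy listing and compares it with a variant that omits the stale final assignment. The name `longest_palindrome_substr_fixed` is a hypothetical helper introduced for illustration and is not part of the dataset.

```python
def longest_palindrome_substr(s):
    """Buggy version from the listing: overwrites `best` with stale l, r."""
    best = ''
    for i in range(len(s)):
        for odd in (True, False):
            l, r = i, i + (0 if odd else 1)
            while l >= 0 and r < len(s) and s[l] == s[r]:
                if r - l + 1 > len(best):
                    best = s[l:r+1]
                l -= 1
                r += 1
    best = s[l:r+1]  # stale l and r from the last expansion
    return best


def longest_palindrome_substr_fixed(s):
    """Hypothetical fix: identical loop, without the stale final assignment."""
    best = ''
    for i in range(len(s)):
        for odd in (True, False):
            l, r = i, i + (0 if odd else 1)
            while l >= 0 and r < len(s) and s[l] == s[r]:
                if r - l + 1 > len(best):
                    best = s[l:r+1]
                l -= 1
                r += 1
    return best


print(longest_palindrome_substr("abba"))        # a
print(longest_palindrome_substr_fixed("abba"))  # abba

try:
    longest_palindrome_substr("")
except NameError as exc:  # UnboundLocalError is a subclass of NameError
    print("empty input crashes:", type(exc).__name__)
```

On the empty string the outer loop never runs, so `l` and `r` are never bound and the final line raises, which matches the crash described in the reference explanation.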

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.0161 | 0.025 | ✗/✗/✗/✗ | fib _ nononation ch ecksartitionbug: the stearad of n - itition in stead of 1 0 in steadparteandparititi |
| `No Consistency Loss (V1)` | 0.0161 | 0.0253 | ✗/✗/✗/✗ | fib _ no _ memocomputesFibonacci re cursively wi th outmememoization, lead in g to O_ 2 nexponenential |
| `Claim-Only Pooling (V1)` | 0.0484 | 0.0471 | ✗/✗/✗/✗ | max _ producttttion ch ecks if al is tcanbepartition ed in to twoequal - sumsubsets using a DP set to track reachable sums |
| `Random Label (V1)` | 0.0968 | 0.0889 | ✗/✗/✗/✗ | max _ produckspomputes the m in imumumumnucarm in g to paroducs in gam in O_ ntime wi th O_ n space for dparrowl is ts. Best-case is O_ |
| `No Claim→Claim Attn (V2)` | 0.0323 | 0.0396 | ✗/✗/✗/✗ | fib _ nononghasan of f _ by _ onebug: the in n er engentitiounsad of b ( n -1 ) in stead of n - i -1 ) in stead of n > = 1. Time O_ n 2 |
| `Claims from Expl Only (V2)` | 0.0968 | 0.1064 | ✗/✗/✗/✗ | fib _ nononation _ so rt ed usaraysart ed l is t to fom in O_ log _ ntimeand O_1 space. Best-case is O_1. Thekey _ op er ation is compar is |
| `Surface Bottleneck (V2)` | 0.0161 | 0.0253 | ✗/✗/✗/✗ | fib _ no _ memocomputesFibonacci re cursively wi th outmememoization, lead in g to O_ 2 nexponenential |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | comestimestimestitimestimestimestimestimestiomestimestitiomestimestitiomestiomes |
| `Consistency Loss (V1)` | final (ep20) | 0.0161 | 0.025 | fib _ nononation ch ecksartitionbug: the stearad of n - itition in stead of 1 0 in steadparteandparititi |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | mpumestivestitiontiontitimentionstiventititionstitiontitiontiontiontiontiontions |
| `No Consistency Loss (V1)` | final (ep20) | 0.0161 | 0.0253 | fib _ no _ memocomputesFibonacci re cursively wi th outmememoization, lead in g to O_ 2 nexponenential |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | comest _ lestimestitigestitionstitiontititiontionstionstitionstionstiontionstionti |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.0484 | 0.0471 | max _ producttttion ch ecks if al is tcanbepartition ed in to twoequal - sumsubsets using a DP set to track reachable sums |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | b _ lest _ lestivestivestartiventionstivestiventivestivestivendstiondiventiondstives |
| `Random Label (V1)` | final (ep20) | 0.0968 | 0.0889 | max _ produckspomputes the m in imumumumnucarm in g to paroducs in gam in O_ ntime wi th O_ n space for dparrowl is ts. Best-case is O_ |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | comestitimestititimestitiomestitititiontiomestititiontitionstionstititionstionti |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.0323 | 0.0396 | fib _ nononghasan of f _ by _ onebug: the in n er engentitiounsad of b ( n -1 ) in stead of n - i -1 ) in stead of n > = 1. Time O_ n 2 |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | comestimestimestitimestimestimestimestimestiomestimestitiomestimestititiontionti |
| `Claims from Expl Only (V2)` | final (ep20) | 0.0968 | 0.1064 | fib _ nononation _ so rt ed usaraysart ed l is t to fom in O_ log _ ntimeand O_1 space. Best-case is O_1. Thekey _ op er ation is compar is |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0 | 0.0 | mpumestivestitiontiontitimentionstiventititionstiontiontiontiontitiontionstionti |
| `Surface Bottleneck (V2)` | final (ep20) | 0.0161 | 0.0253 | fib _ no _ memocomputesFibonacci re cursively wi th outmememoization, lead in g to O_ 2 nexponenential |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0323 | 0.0171 | ast <sep> best <sep> l <sep> lar + 1 esest <sep> len <sep> ben ( beses [ l <sep> <sep> be <sep> lsen ( be <sep> s <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> e <sep> <sep> <sep> e <sep> <sep> <sep> <sep> <sep> le <sep> |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

### Sample 20: `merge_sort_missing_base` (val idx 40)

Ground truth: time_complexity=`O_n_log_n`, space_complexity=`O_n`, algorithm_class=`sorting`, correctness_status=`missing_edge_case`

**Reference explanation:** merge_sort_v2 is missing the base case check (len <= 1), causing infinite recursion even on single-element lists. This is a missing_edge_case bug that creates infinite_loop_risk in practice. Time undefined (infinite); space O_n nominally. Key_operation is comparison. Recursive. Input not mutated. Any list causes infinite recursion.

**Mismatched training explanation:** missing_number uses the Gauss sum formula to find the missing integer in O_1 time (or O_n if sum is counted, here treated as O_1 due to Gauss formula) with O_1 space. Best-case is O_1. The key_operation is arithmetic. Input is not mutated. An empty list returns 0 (missing number from [0..0]). Assumes no duplicates; behavior with duplicates is undefined_behavior.

```python
def merge_sort_v2(lst):
    # bug: missing base case for len==1, infinite recursion
    mid = len(lst) // 2
    left  = merge_sort_v2(lst[:mid])
    right = merge_sort_v2(lst[mid:])
    return sorted(left + right)
```
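
The missing_edge_case behaviour is easy to confirm. In CPython the unbounded recursion surfaces as a `RecursionError` once the interpreter's recursion limit is reached. The sketch below repeats the buggy listing next to a variant with the base case restored. The names `merge_sort_fixed` and `hits_recursion_limit` are hypothetical helpers introduced for illustration.

```python
def merge_sort_v2(lst):
    """Buggy version from the listing: no base case, so it never bottoms out."""
    mid = len(lst) // 2
    left = merge_sort_v2(lst[:mid])
    right = merge_sort_v2(lst[mid:])
    return sorted(left + right)


def merge_sort_fixed(lst):
    """Hypothetical fix: same structure with the len <= 1 base case restored."""
    if len(lst) <= 1:
        return list(lst)
    mid = len(lst) // 2
    left = merge_sort_fixed(lst[:mid])
    right = merge_sort_fixed(lst[mid:])
    return sorted(left + right)


def hits_recursion_limit(lst):
    """Return True if merge_sort_v2 exhausts the recursion limit on lst."""
    try:
        merge_sort_v2(lst)
    except RecursionError:
        return True
    return False


print(hits_recursion_limit([1]))    # True: even a single element recurses forever
print(hits_recursion_limit([]))     # True
print(merge_sort_fixed([3, 1, 2]))  # [1, 2, 3]
```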

**Final checkpoint (epoch_020):**

| Variant | BLEU-1 | ROUGE-L | Emit major claims | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | 0.1897 | 0.1748 | ✗/✗/✗/✗ | b in a ry_ sear ch par ch ed leftsarondst ed locevaly using alogenditsev is im in ge. Time O_ log _ n ( nts ); space O_1. Best-case O_1 when m = 1. Key_operation is compar is on. Seque ntiala cce ss. Input not mutated. Empty l is t return s 0. Duplicate dues are al ed |
| `No Consistency Loss (V1)` | 0.12 | 0.1053 | ✗/✗/✗/✗ | f in d _ m in _ rotat ed usesb in arysear ch onarotat ed l is t to f in d the m in imumelement in O_ log _ ntime. Space is O_1. Best-case same _ as _ worst. Thekey _ op er ation is compar is on. Div |
| `Claim-Only Pooling (V1)` | 0.1176 | 0.1416 | ✗/✗/✗/✗ | f in d _ m is s in guses the ma the maticalsum for mula to f in d the m is s in g in teg er in [ 0. n ] in O_ ntimeand O_1 space. Best-case O_ n. Thekey _ op er ation is compar is on. Input is not mutated. Empty l is t return s 0 ]. Duplicate values are s are s |
| `Random Label (V1)` | 0.1091 | 0.1 | ✗/✗/✗/✗ | f in d _ m in _ m er gestf in ds the m in imumnumb er ggytart re ed y in g re ed ypentationss the maxigimumfilictionfumat ch is ham. Time O_ n _ log _ n ( so rts ); space O_ n for map. Best-case O_ n _ log _ n. Key_operation |
| `No Claim→Claim Attn (V2)` | 0.2708 | 0.1935 | ✗/✗/✗/✗ | b in a ry_ sear ch _ v 2 hasasawrongetf in d the rongetatighton of the l is targengementstition of stalemest. Time O_ n 2; space O_ n 2. Best-case O_ n 2. Key_operation is compar is on ( stig ). Input not mutated. Empty str in gs |
| `Claims from Expl Only (V2)` | 0.2222 | 0.1818 | ✗/✗/✗/✗ | b in a ry_ sear ch f in ds the m in imumnumb er of centargencompatively. Time O_ log _ n; space O_1. Best-case O_ log _ n. Key_operation is compar is on. Seque ntiala cce s. Input not mutated. Empty l is t return s 0. Duplicate dues are callundunt ed in coun |
| `Surface Bottleneck (V2)` | 0.12 | 0.1053 | ✗/✗/✗/✗ | f in d _ m in _ rotat ed usesb in arysear ch onarotat ed l is t to f in d the m in imumelement in O_ log _ ntime. Space is O_1. Best-case same _ as _ worst. Thekey _ op er ation is compar is on. Div |
| `Surface Btlnk No ExplLM (V2)` | 0.0 | 0.0 | ✗/✗/✗/✗ | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

**Initial checkpoint comparison:**

| Variant | Ckpt | BLEU-1 | ROUGE-L | Generated prose |
|---|---|---|---|---|
| `Consistency Loss (V1)` | initial (ep5) | 0.0 | 0.0 | m er g _ so rtig _ so rtig _ lestightig _ ntightiontiontightiontiontionalestigastionastitiontionalestionstiontiondightionalestiontional |
| `Consistency Loss (V1)` | final (ep20) | 0.1897 | 0.1748 | b in a ry_ sear ch par ch ed leftsarondst ed locevaly using alogenditsev is im in ge. Time O_ log _ n ( nts ); space O_1. Best-case O_1 when m = 1. Key_operation is compar is on. Seque ntiala cce ss. Input not mutated. Empty l is t return s 0. Duplicate dues are al ed |
| `No Consistency Loss (V1)` | initial (ep5) | 0.0444 | 0.058 | m er g _ so rt _ so rt _ so rt _ so rt _ so rtestiongestiontitionstiontiontiontionstigestiontionstionstionstighalestititigentig the lealentig the f in gha |
| `No Consistency Loss (V1)` | final (ep20) | 0.12 | 0.1053 | f in d _ m in _ rotat ed usesb in arysear ch onarotat ed l is t to f in d the m in imumelement in O_ log _ ntime. Space is O_1. Best-case same _ as _ worst. Thekey _ op er ation is compar is on. Div |
| `Claim-Only Pooling (V1)` | initial (ep5) | 0.0 | 0.0 | m er g _ so rt _ so rt _ so rt _ so rt _ so rtigg _ ntigalestionstiglestigalentigalenstiontiontionstiglestitionationstiontigalenstionstig _ ntionst |
| `Claim-Only Pooling (V1)` | final (ep20) | 0.1176 | 0.1416 | f in d _ m is s in guses the ma the maticalsum for mula to f in d the m is s in g in teg er in [ 0. n ] in O_ ntimeand O_1 space. Best-case O_ n. Thekey _ op er ation is compar is on. Input is not mutated. Empty l is t return s 0 ]. Duplicate values are s are s |
| `Random Label (V1)` | initial (ep5) | 0.0 | 0.0 | m er g _ so rt _ so rt _ so rt _ so rtightig _ ng _ ngationg _ ntiontiontiontiontionstiontionstiontionstiontionationdstiontiontiontionstiontionat |
| `Random Label (V1)` | final (ep20) | 0.1091 | 0.1 | f in d _ m in _ m er gestf in ds the m in imumnumb er ggytart re ed y in g re ed ypentationss the maxigimumfilictionfumat ch is ham. Time O_ n _ log _ n ( so rts ); space O_ n for map. Best-case O_ n _ log _ n. Key_operation |
| `No Claim→Claim Attn (V2)` | initial (ep5) | 0.0 | 0.0 | m er g _ so rtig _ so rtig _ so rtightitightightitiontiontiontiontightigastighastionastitiondionalestiondiontighastionditiondighastiond |
| `No Claim→Claim Attn (V2)` | final (ep20) | 0.2708 | 0.1935 | b in a ry_ sear ch _ v 2 hasasawrongetf in d the rongetatighton of the l is targengementstition of stalemest. Time O_ n 2; space O_ n 2. Best-case O_ n 2. Key_operation is compar is on ( stig ). Input not mutated. Empty str in gs |
| `Claims from Expl Only (V2)` | initial (ep5) | 0.0 | 0.0 | m er g _ so rtig _ so rtig _ lestightig _ ntightiontiontightiontiontionalestigastionastitiontionalestionstiontiondightionalestiontional |
| `Claims from Expl Only (V2)` | final (ep20) | 0.2222 | 0.1818 | b in a ry_ sear ch f in ds the m in imumnumb er of centargencompatively. Time O_ log _ n; space O_1. Best-case O_ log _ n. Key_operation is compar is on. Seque ntiala cce s. Input not mutated. Empty l is t return s 0. Duplicate dues are callundunt ed in coun |
| `Surface Bottleneck (V2)` | initial (ep5) | 0.0444 | 0.058 | m er g _ so rt _ so rt _ so rt _ so rt _ so rtestiongestiontiontightionstitiontionstitiontionstitionstighalestitighalentigentig the lealendig the f in ght |
| `Surface Bottleneck (V2)` | final (ep20) | 0.12 | 0.1053 | f in d _ m in _ rotat ed usesb in arysear ch onarotat ed l is t to f in d the m in imumelement in O_ log _ ntime. Space is O_1. Best-case same _ as _ worst. Thekey _ op er ation is compar is on. Div |
| `Surface Btlnk No ExplLM (V2)` | initial (ep5) | 0.0 | 0.0 | left ed ( rt ) right ) right = 1 ) <sep> mid = [ mid: right = rt ] while left = left [ mid: ] < left [ mid + 1 eft [ ] < len ( len ( left [ mid + = mid ] ); ight + = meft <sep> lseft [ seft [ ] ); ightend ( |
| `Surface Btlnk No ExplLM (V2)` | final (ep20) | 0.0 | 0.0 | <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> <sep> |

---

## Manual Scoring Guide

For each sample, score each variant's **final-checkpoint prose** on:

| Criterion | Scale | Notes |
|---|---|---|
| Fluency | 0–2 | 0=incoherent, 1=partial, 2=fluent |
| Factual accuracy | 0–2 | Does prose match ground-truth complexity / correctness? |
| Claim alignment | 0–1 | Do emitted claim tokens match all 12 ground-truth values? |
| Initial→Final improvement | 0–1 | Did prose quality improve from initial to final checkpoint? |

Sum the four criterion scores for each variant (at most 6 points per sample) and compare the totals across the V1+V2 ablation ladder.
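
One possible way to total the manual scores is sketched below. The criterion keys, the row layout, and the scores in `example_rows` are assumptions made for illustration. In particular, the scores are made-up placeholders and not review results.

```python
from collections import defaultdict

# Maximum score per criterion, taken from the scoring table above.
# The key names are hypothetical.
MAX_SCORE = {
    "fluency": 2,
    "factual_accuracy": 2,
    "claim_alignment": 1,
    "improvement": 1,
}


def total_scores(rows):
    """Sum all criterion scores per variant, range-checking each score."""
    totals = defaultdict(int)
    for row in rows:
        for criterion, limit in MAX_SCORE.items():
            score = row[criterion]
            if not 0 <= score <= limit:
                raise ValueError(
                    f"{criterion}={score} is outside 0..{limit} "
                    f"for {row['variant']!r}, sample {row['sample']}"
                )
            totals[row["variant"]] += score
    return dict(totals)


# Placeholder scores only; replace with the reviewer's actual entries.
example_rows = [
    {"variant": "Consistency Loss (V1)", "sample": 17,
     "fluency": 1, "factual_accuracy": 0, "claim_alignment": 0, "improvement": 1},
    {"variant": "Consistency Loss (V1)", "sample": 18,
     "fluency": 0, "factual_accuracy": 0, "claim_alignment": 0, "improvement": 1},
    {"variant": "No Consistency Loss (V1)", "sample": 17,
     "fluency": 0, "factual_accuracy": 0, "claim_alignment": 0, "improvement": 1},
]

print(total_scores(example_rows))
```

Range-checking each entry catches transcription slips, such as entering a 2 for a 0–1 criterion, before they distort the per-variant totals.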

Key research questions:
- Does `consistency_loss` (V1) produce more factually accurate prose than `no_consistency_loss`?
- Do V2 architectural constraints (e.g. `claims_from_explanation_only`) improve claim emission accuracy?
- Do `surface_bottleneck_*` variants emit fewer correctly valued claims than hidden-state variants?

## Files in This Directory

| File | Description |
|---|---|
| `qualitative_side_by_side_rich.csv` | Machine-readable; one row per (variant, checkpoint, example) |
| `qualitative_side_by_side_rich.json` | Same data as JSON with manifest header |
| `qualitative_side_by_side_rich.pplx.md` | This document |
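
As a starting point for working with the machine-readable export, the sketch below averages one metric per variant at a chosen checkpoint. The column names `variant`, `checkpoint`, `example`, and `bleu1`, and the checkpoint labels `initial` and `final`, are assumptions made for illustration; the actual header of `qualitative_side_by_side_rich.csv` may differ. The BLEU-1 values in the inline sample are the Sample 17 and Sample 18 scores for `Consistency Loss (V1)` from the tables above.

```python
import csv
import io
from collections import defaultdict


def mean_metric_by_variant(csv_file, metric="bleu1", checkpoint="final"):
    """Average `metric` per variant over rows matching `checkpoint`.

    Column names are hypothetical; adjust them to the real CSV header.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        if row["checkpoint"] != checkpoint:
            continue
        sums[row["variant"]] += float(row[metric])
        counts[row["variant"]] += 1
    return {variant: sums[variant] / counts[variant] for variant in sums}


# Inline stand-in for the real file, using the assumed column layout.
sample_csv = io.StringIO(
    "variant,checkpoint,example,bleu1\n"
    "Consistency Loss (V1),final,561,0.2889\n"
    "Consistency Loss (V1),final,2237,0.0426\n"
    "Consistency Loss (V1),initial,561,0.0\n"
)

means = mean_metric_by_variant(sample_csv)
print(means)
```

For the real file, pass an open file handle instead, for example `open("qualitative_side_by_side_rich.csv", newline="")`, after confirming the column names.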