# ICLR 2026 Supplementary Material - Code Submission

**Paper Title**: Think Just Enough: Sequence-Level Entropy as a Confidence Signal for LLM Reasoning

**Submission Type**: Code Supplementary Material for Anonymous Review

## Overview

This supplementary package contains the source code for the entropy-based early stopping framework described in our ICLR 2026 submission. It is provided so that reviewers can reproduce the experimental results and inspect the implementation.

## Compliance with ICLR Guidelines

✅ **Anonymization**: All identifying information has been removed  
✅ **Self-Contained**: Complete implementation with all dependencies listed  
✅ **Reproducible**: Includes sample data and clear instructions  
✅ **Well-Documented**: Comprehensive documentation and comments  

## Contents

This supplementary material includes:

1. **Core Framework** (`entropy_framework.py`): Complete implementation of Shannon entropy calculation, threshold methods, and early stopping logic
2. **Experiment Templates**: Ready-to-run experiments for AIME'24, AIME'25, and GPQA Diamond datasets
3. **Analysis Tools**: Visualization and statistical analysis utilities
4. **Documentation**: Complete setup and usage instructions
5. **Sample Data**: Built-in sample problems for immediate testing
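
As a rough illustration of what the core framework covers, the sketch below computes a per-token Shannon entropy from top-k log-probabilities, averages it over a sequence, and compares the result against a fixed threshold. It is not taken from `entropy_framework.py`: the function names (`token_entropy`, `sequence_entropy`, `should_stop`), the renormalization over the top-k candidates, the mean aggregation, and the fixed-threshold rule are all assumptions made for this example.

```python
import math
from typing import Sequence


def token_entropy(logprobs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution.

    `logprobs` holds the log-probabilities of the top-k candidates at a
    single position. They are renormalized to sum to 1 (an assumption of
    this sketch, since top-k mass is usually slightly below 1).
    """
    probs = [math.exp(lp) for lp in logprobs]
    total = sum(probs)
    if total <= 0:
        return 0.0
    return -sum((p / total) * math.log(p / total) for p in probs if p > 0)


def sequence_entropy(step_logprobs: Sequence[Sequence[float]]) -> float:
    """Mean per-token entropy over a generated sequence (hypothetical aggregation)."""
    if not step_logprobs:
        raise ValueError("need at least one token position")
    return sum(token_entropy(step) for step in step_logprobs) / len(step_logprobs)


def should_stop(step_logprobs: Sequence[Sequence[float]], threshold: float) -> bool:
    """Stop reasoning early when sequence-level entropy is at or below the threshold."""
    return sequence_entropy(step_logprobs) <= threshold


# Usage: a confident sequence falls below the threshold, an uncertain one does not.
confident = [[math.log(0.99), math.log(0.01)]] * 4
uncertain = [[math.log(0.5), math.log(0.5)]] * 4
print(should_stop(confident, threshold=0.1))
print(should_stop(uncertain, threshold=0.1))
```

A low sequence-level entropy is read here as a sign that the model is confident in its answer, so further reasoning tokens can be skipped. How the threshold is chosen is left open in this sketch.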

## Quick Validation (5 minutes)

To quickly verify the implementation works:

```bash
cd iclr_code
pip install -r requirements.txt
python templates/aime24_experiment.py --model gpt-3.5-turbo --problems 3
```

This runs a small three-problem test using the built-in sample problems.

## Reproducible Key Results

The code reproduces all main results from the paper:

- **Token Savings**: 25-50% computational cost reduction
- **Accuracy Preservation**: Negligible accuracy degradation (Δ-Acc ≈ 0%)
- **Effect Sizes**: Cohen's d > 0.5 for reasoning-optimized models
- **Cross-Domain Validation**: Consistent performance across AIME and GPQA
- **Emergent Confidence**: Demonstration that standard models lack entropy discrimination
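
For readers checking the effect-size claim, the following is a minimal sketch of Cohen's d with a pooled standard deviation. It assumes the two groups being compared are entropy values for incorrect and correct responses. That grouping, the function name `cohens_d`, and the toy numbers are illustrative assumptions, not values or code from the paper.

```python
import math
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d between two samples, using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Toy data (made up for illustration): entropies of incorrect vs. correct answers.
incorrect_entropy = [2.0, 4.0, 6.0]
correct_entropy = [1.0, 3.0, 5.0]
print(cohens_d(incorrect_entropy, correct_entropy))
```

By the usual convention, |d| around 0.5 is described as a medium effect, which is the scale the bullet above refers to.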

## File Structure

```
iclr_code/
├── SUPPLEMENTARY_MATERIAL_README.md    # This file
├── README.md                           # Main documentation
├── SETUP_INSTRUCTIONS.md               # Quick start guide
├── entropy_framework.py                # Core implementation
├── requirements.txt                    # Dependencies
├── templates/                          # Experiment templates
├── analysis/                           # Analysis tools
├── data/                               # Dataset directory
└── experiments/                        # Results directory
```

## Review Notes

- **Runtime**: Small experiments (3-5 problems) complete in 2-5 minutes
- **Dependencies**: Standard Python packages (numpy, matplotlib, openai)
- **API Requirements**: Requires OpenAI/OpenRouter API key for full experiments
- **Sample Mode**: Built-in samples work without external data/APIs

## Technical Requirements

- Python 3.7+
- Standard machine learning packages (see requirements.txt)
- API access for full experiments (sample experiments work offline)
- ~100MB disk space for full installation

## Ethical Considerations

- No personal data is collected or stored
- Uses publicly available academic datasets (AIME, GPQA)  
- API usage follows standard terms of service
- The implementation is limited to inference-efficiency techniques (entropy-based early stopping)

This supplementary material is designed to facilitate thorough review and enable complete reproduction of our research findings.