# NeuroMamba Ablation Study on the Induction Heads Task

This repository provides a framework for running **ablation studies** on the **NeuroMamba** model, targeting its performance on the multi-level **Induction Heads** task. The setup supports diagnostic evaluation of key architectural components across task variants of increasing difficulty.

## Framework Overview

The framework is built for systematic and reproducible experimentation with a focus on ease of use and detailed reporting:

1.  **Centralized Ablation Control**: Experiments are managed via simple boolean flags in `config_ablation.py`, making it easy to switch between the baseline model and its ablated variants.
2.  **Advanced Task Generation**: `data_generator.py` includes a generator for the Induction Heads task with several levels of increasing difficulty, each probing a different model capability.
3.  **Non-Invasive Ablation**: A "monkey patching" technique is used to inject the modified `NeuroMamba_ab` block at runtime, enabling ablations without altering the core `neuromamba` library code.
4.  **Structured Logging**: The training script generates separate, detailed CSV logs for training and validation, with a descriptive filename convention that encodes the precise experimental conditions.
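
The runtime swap described in item 3 can be illustrated with a minimal, self-contained sketch. Everything here is hypothetical: the stand-in module `neuromamba_core_demo`, the class names, and the `ablate_extra` flag are invented for illustration and do not reflect the real `neuromamba` package layout or the actual `NeuroMamba_ab` code.

```python
import sys
import types

# Stand-in for the real library module (hypothetical name and contents).
core = types.ModuleType("neuromamba_core_demo")


class NeuroMambaBlock:
    """Placeholder for the library's original block."""

    def forward(self, x):
        return x + 1  # pretend: main path plus one extra pathway


core.NeuroMambaBlock = NeuroMambaBlock
sys.modules[core.__name__] = core


class NeuroMambaBlockAb(NeuroMambaBlock):
    """Ablated variant: same interface, extra pathway disabled by a flag."""

    ablate_extra = True  # hypothetical flag, analogous to ablate_gc / ablate_y2

    def forward(self, x):
        if self.ablate_extra:
            return x  # skip the extra pathway
        return super().forward(x)


def apply_patch(module, replacement):
    """Replace the block class on the module and return the original."""
    original = module.NeuroMambaBlock
    module.NeuroMambaBlock = replacement
    return original


original_cls = apply_patch(core, NeuroMambaBlockAb)

# Code that looks the class up through the module now gets the ablated block.
import neuromamba_core_demo

patched_output = neuromamba_core_demo.NeuroMambaBlock().forward(1)

# Restoring the original class undoes the patch.
core.NeuroMambaBlock = original_cls
restored_output = neuromamba_core_demo.NeuroMambaBlock().forward(1)
```

With this kind of patching, the swap has to happen before the model is built. A module that already ran `from ... import NeuroMambaBlock` keeps its own reference to the original class and is not affected by a later reassignment.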

## Files

-   `config_ablation.py`: The central **configuration file**. It defines settings for the training loop, the difficulty level of the Induction Heads task, the NeuroMamba architecture, and the **Ablation Flags** (`ablate_gc`, `ablate_y2`).
-   `data_generator.py`: Generates the multi-level datasets for the Induction Heads task.
-   `neuromamba_ab.py`: A modified NeuroMamba block containing the logic to programmatically disable the `gc` (gated connection) and `y2` (secondary output) pathways based on the flags from the config file.
-   `train_ablation.py`: The main **executable script**. It orchestrates the entire experiment, from applying the ablation via monkey patching to running the training loop and logging detailed results.

## The Induction Heads Task Levels

`data_generator.py` can create sequences of varying difficulty, each testing a different model capability:

-   **Level 0: Baseline**: Clean `[P, A, B]` triplets to test basic selective recall.
-   **Level 1: Memory Robustness**: Noise is added *between* triplets, testing long-term memory.
-   **Level 2: Abstract Pattern Recognition**: Noise is added *within* triplets (`[P, N, A, N, B]`), testing abstract rule learning.
-   **Level 3: Combined Stress Test**: Noise is added both within and between triplets.
-   **Level 4: Autonomous Learning Suite (No Prefix)**:
    -   **4.0 (Sanity Check)**: Clean `[A, B]` pairs.
    -   **4.1 (Robust Discovery)**: Noise between pairs.
    -   **4.2 (Dynamic World Modeling)**: Includes conflicting information (`A,B` ... `A,C`) to test state updating.

## How to Run an Experiment

1.  **Configure**: Open `config_ablation.py`.
    -   Set `dataset_config['difficulty_level']` to the desired level (0-4).
    -   Scroll to the "ABLATION FLAGS" section and set the boolean flags for the desired experiment.

    **Example Configurations:**
    -   **Baseline (Full Model)**:
        ```python
        neuromamba_config.ablate_gc = False
        neuromamba_config.ablate_y2 = False
        ```
    -   **Ablate `gc` only on Level 3**:
        ```python
        dataset_config['difficulty_level'] = 3
        ...
        neuromamba_config.ablate_gc = True
        neuromamba_config.ablate_y2 = False
        ```

2.  **Execute**: Run the training script from your terminal:
    ```bash
    python train_ablation.py
    ```
    The script will print the model configuration, initialize training, and show a live progress bar.

## Results

The framework generates two detailed, timestamped CSV log files for each run: one for training and one for validation.

The filename convention encodes each experiment's settings, so a run can be identified from its log name alone:

`ih_{level}_{ablation}_{params}_{arch}_{timestamp}_{type}.csv`

A sample filename might be `ih_lv2_ab_y2_120K_4_128_20250718-140000_validation.csv`, which tells you:
-   **Task**: Induction Heads (`ih`)
-   **Difficulty**: Level 2 (`lv2`)
-   **Ablation**: `y2` branch was ablated (`ab_y2`)
-   **Model Size**: ~120K parameters (`120K`)
-   **Architecture**: 4 layers, dimension 128 (`4_128`)
-   **Timestamp**: The exact date and time of the run.
-   **Type**: `validation` log
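
When collecting results from many runs, it can help to recover these fields from the filename programmatically. The helper below is a sketch, not part of the repository: `parse_log_name` is a hypothetical name, and it assumes the trailing five fields never contain underscores, so that an ablation tag such as `ab_y2` can be read from the middle. The tag used for baseline runs is not documented here, so the helper returns whatever sits in that position.

```python
from datetime import datetime


def parse_log_name(filename):
    """Split a log filename into its fields, reading fixed fields from the right."""
    stem = filename[:-4] if filename.endswith(".csv") else filename
    parts = stem.split("_")
    if len(parts) < 8 or parts[0] != "ih":
        raise ValueError(f"unexpected log filename: {filename!r}")
    return {
        "task": parts[0],
        "level": parts[1],                     # e.g. "lv2"
        "ablation": "_".join(parts[2:-5]),     # may itself contain underscores
        "params": parts[-5],                   # e.g. "120K"
        "layers": int(parts[-4]),
        "dim": int(parts[-3]),
        "timestamp": datetime.strptime(parts[-2], "%Y%m%d-%H%M%S"),
        "type": parts[-1],                     # "training" or "validation"
    }


info = parse_log_name("ih_lv2_ab_y2_120K_4_128_20250718-140000_validation.csv")
```

For the sample filename, `info["ablation"]` is `"ab_y2"`, `info["layers"]` is `4`, and `info["dim"]` is `128`.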

You can monitor progress in the terminal, which will look similar to this:
```
2025-07-18 14:00:00,100 - INFO - Using device: cuda
2025-07-18 14:00:01,200 - INFO - Initializing Model and Optimizer...
Layer 0: Ablating y2 branch. Setting out_cathree_proj parameters to zero.
Layer 1: Ablating y2 branch. Setting out_cathree_proj parameters to zero.
...
2025-07-18 14:00:03,300 - INFO - Model created. Total trainable parameters: 120,320 (~120K)
2025-07-18 14:00:03,400 - INFO - Logging training metrics to: ih_lv2_ab_y2_120K_4_128_20250718-140000_training.csv
2025-07-18 14:00:03,500 - INFO - Logging validation results to: ih_lv2_ab_y2_120K_4_128_20250718-140000_validation.csv
2025-07-18 14:00:04,600 - INFO - Generating fixed validation sets...
2025-07-18 14:00:10,700 - INFO - --- Starting Experiment ---
Training:  4%|▍         | 8192/204800 [05:10<2:03:00, 26.31it/s, loss=0.8123, train_acc=65.50%]
...
--- Validation at Step 8192 ---
  SeqLen 64      -> Loss: 0.5123, Accuracy: 75.00%
  SeqLen 128     -> Loss: 0.5234, Accuracy: 74.50%
  ...
  SeqLen 131072  -> Loss: 0.6123, Accuracy: 70.10%
---------------------------------
...
```