# NeuroMamba Ablation Study on the Selective Copying Task

This repository provides a targeted framework for conducting **ablation studies** on the **NeuroMamba** model. It is specifically designed to evaluate the contribution of key architectural components on the challenging **Selective Copying** task.

## Framework Overview

The framework enables researchers to systematically disable specific pathways within the NeuroMamba architecture and measure the impact on performance. The core design principles are:

1.  **Centralized Ablation Control**: Experiments are configured by simply toggling boolean flags in a single configuration file, `config_ablation.py`. This allows for rapid switching between the baseline model and its ablated variants.
2.  **Non-Invasive Modification**: The ablation logic is applied at runtime using a technique known as "monkey patching." This injects the modified `NeuroMamba_ab` block into the standard model structure without requiring any permanent changes to the core `neuromamba` library code.
3.  **Automated Logging**: The training script automatically generates detailed CSV log files with filenames that clearly encode the specific ablation, hyperparameters, and timestamp of each run.

## Files

-   `config_ablation.py`: The central **configuration file** for the study. It holds the settings for the training loop, the dataset, and the base NeuroMamba model, along with the **Ablation Flags** (`ablate_gc`, `ablate_y2`) that select which experiment to run.
-   `data_generator.py`: Generates the Selective Copying dataset. The task requires the model to memorize a small set of tokens from a long, noisy sequence and recall them later. Adapted from the S4 repository.
-   `neuromamba_ab.py`: A modified version of the NeuroMamba block specifically designed for this study. It contains the logic to programmatically disable the `gc` (gated connection) and `y2` (secondary output) pathways based on the flags passed during initialization.
-   `train_ablation.py`: The main **executable script**. It reads the settings from `config_ablation.py`, uses "monkey patching" to inject the `NeuroMamba_ab` block into the model, runs the training and evaluation loop, and logs the results.

## The Ablation Mechanism

This study targets two critical components of the NeuroMamba architecture:

-   **Gated Connection (`gc`)**: A pathway that provides a memory-like factor to the SSM state transition.
-   **Secondary Output (`y2`)**: A secondary projection from the SSM state that is added to the primary output.

### Control Flags

You can control the ablations using two boolean flags at the bottom of `config_ablation.py`:
-   `neuma_config.ablate_gc`: Set to `True` to remove the `gc` branch.
-   `neuma_config.ablate_y2`: Set to `True` to remove the `y2` output path.

### Implementation

The `neuromamba_ab.py` module checks these flags during model initialization. If an ablation is enabled, it sets the parameters of the corresponding projection layers to zero and disables their gradients, so that each ablated projection outputs zeros and receives no updates during training. The layers themselves stay in place, which keeps the model structure identical across variants. The `train_ablation.py` script then injects this modified block into the standard `NeuroMambaLMHeadModel` at runtime.
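The zero-and-freeze step can be sketched in PyTorch as follows. This is an illustration under stated assumptions rather than the contents of `neuromamba_ab.py`: `TinyBlock`, `main_proj`, and `y2_proj` are hypothetical names, and only `mf_proj` is taken from the training log. The real block wires these projections into an SSM, which the sketch replaces with a plain sum.

```python
import torch
from torch import nn


class TinyBlock(nn.Module):
    """Hypothetical stand-in for a block with two optional pathways."""

    def __init__(self, d_model=8, ablate_gc=False, ablate_y2=False):
        super().__init__()
        self.main_proj = nn.Linear(d_model, d_model)
        self.mf_proj = nn.Linear(d_model, d_model)  # gc pathway (name from the log)
        self.y2_proj = nn.Linear(d_model, d_model)  # y2 pathway (hypothetical name)
        if ablate_gc:
            self._zero_and_freeze(self.mf_proj)
        if ablate_y2:
            self._zero_and_freeze(self.y2_proj)

    @staticmethod
    def _zero_and_freeze(module):
        # Zero weights and biases in place, then exclude them from training.
        for param in module.parameters():
            with torch.no_grad():
                param.zero_()
            param.requires_grad_(False)

    def forward(self, x):
        # Simplified combination; an ablated projection adds only zeros.
        return self.main_proj(x) + self.mf_proj(x) + self.y2_proj(x)


block = TinyBlock(d_model=4, ablate_gc=True)
x = torch.randn(2, 4)
block(x).sum().backward()
```

After the backward pass, `block.mf_proj.weight.grad` is `None` while `block.y2_proj.weight.grad` is populated, which is a quick way to confirm that an ablation took effect.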

## How to Run an Experiment

Running an experiment is a straightforward two-step process:

1.  **Configure**: Open `config_ablation.py`. Scroll to the bottom to the "ABLATION FLAGS" section and set the flags for the desired experiment.

    **Example Configurations:**
    -   **Baseline (Full Model)**:
        ```python
        neuma_config.ablate_gc = False
        neuma_config.ablate_y2 = False
        ```
    -   **Ablate `gc` only**:
        ```python
        neuma_config.ablate_gc = True
        neuma_config.ablate_y2 = False
        ```
    -   **Ablate `y2` only**:
        ```python
        neuma_config.ablate_gc = False
        neuma_config.ablate_y2 = True
        ```

2.  **Execute**: Run the training script from your terminal:
    ```bash
    python train_ablation.py
    ```
    The script will print the model configuration, including which components are being ablated, and begin training.

## Results

All results are saved as CSV files in a directory named `results/`.

The filename encodes each experiment's settings:
`log_{optimizer}_{ablation_str}_gc{expand_gc}_lr{learning_rate}_{timestamp}.csv`

A sample filename might be `log_adam_ab_gc_gc2_lr1e-04_20250718-120000.csv`, which tells you:
-   **Optimizer**: Adam (`adam`)
-   **Ablation**: `gc` branch was ablated (`ab_gc`)
-   **GC Expansion**: `expand_gc` was 2 (`gc2`)
-   **Learning Rate**: 1e-4 (`lr1e-04`)
-   **Timestamp**: The exact date and time of the run.
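A filename in this format could be assembled as in the sketch below. The function `build_log_name` is hypothetical, and only the `ab_gc` label is confirmed by the sample filename; the labels `full`, `ab_y2`, and `ab_gc_y2` are assumptions made for illustration.

```python
from datetime import datetime


def build_log_name(optimizer, ablate_gc, ablate_y2, expand_gc, learning_rate, when):
    """Assemble a log filename following the convention shown above."""
    # Only "ab_gc" appears in the sample filename; the other labels are assumed.
    if ablate_gc and ablate_y2:
        ablation_str = "ab_gc_y2"
    elif ablate_gc:
        ablation_str = "ab_gc"
    elif ablate_y2:
        ablation_str = "ab_y2"
    else:
        ablation_str = "full"

    timestamp = when.strftime("%Y%m%d-%H%M%S")
    # The ".0e" format renders 1e-4 as "1e-04", matching the sample filename.
    return (
        f"log_{optimizer}_{ablation_str}_gc{expand_gc}"
        f"_lr{learning_rate:.0e}_{timestamp}.csv"
    )


name = build_log_name(
    optimizer="adam",
    ablate_gc=True,
    ablate_y2=False,
    expand_gc=2,
    learning_rate=1e-4,
    when=datetime(2025, 7, 18, 12, 0, 0),
)
```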

You can monitor progress in the terminal, which provides regular updates:
```
2025-07-18 12:00:01,123 - INFO - Using device: cuda:0
Layer 0: Ablating gc branch. Setting mf_proj parameters to zero.
Layer 1: Ablating gc branch. Setting mf_proj parameters to zero.
2025-07-18 12:00:05,456 - INFO - Evaluation results will be saved to: results/log_adam_ab_gc_gc2_lr1e-04_20250718-120000.csv
2025-07-18 12:00:05,456 - INFO - --- Starting Training ---
2025-07-18 12:00:15,789 - INFO - Step [200/400000], Training Loss: 2.5123
...
2025-07-18 12:05:00,100 - INFO - --- Step 1000: Starting evaluation... ---
2025-07-18 12:05:10,200 - INFO - >>> Evaluation at step [1000]: Average Validation Loss: 1.8765, Validation Accuracy: 45.67% <<<
...
2025-07-19 10:30:00,500 - INFO - --- Training and evaluation finished. Total duration: 1350.00 minutes ---
Restored original create_block function.
```
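If the terminal output is captured to a file, the evaluation lines can be pulled out with a regular expression. The sketch below assumes the line format shown in the sample output; `parse_eval_lines` is a hypothetical helper, not part of the repository.

```python
import re

# Matches the evaluation summary lines from the sample output above.
EVAL_LINE = re.compile(
    r"Evaluation at step \[(?P<step>\d+)\]: "
    r"Average Validation Loss: (?P<loss>[\d.]+), "
    r"Validation Accuracy: (?P<acc>[\d.]+)%"
)


def parse_eval_lines(lines):
    """Return (step, validation_loss, accuracy_percent) for each evaluation line."""
    results = []
    for line in lines:
        match = EVAL_LINE.search(line)
        if match:
            results.append(
                (int(match["step"]), float(match["loss"]), float(match["acc"]))
            )
    return results


sample = [
    "2025-07-18 12:00:15,789 - INFO - Step [200/400000], Training Loss: 2.5123",
    "2025-07-18 12:05:10,200 - INFO - >>> Evaluation at step [1000]: "
    "Average Validation Loss: 1.8765, Validation Accuracy: 45.67% <<<",
]
parsed = parse_eval_lines(sample)
```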

## Acknowledgments

The data generation code for the selective copying task is adapted from the official S4 repository. We thank the authors for their contribution.