# Selective Copying Task with NeuroMamba

This repository provides a self-contained implementation for training and evaluating a **NeuroMamba** model on the Selective Copying task. The code is structured for reproducible research: it evaluates the model periodically during training and logs the results to a CSV file.

## Files

-   `config.py`: Contains all configurations for the training process, dataset parameters, and the NeuroMamba model architecture (via the `neuma_config` object).
-   `data_generator.py`: Provides functions to generate data for the Selective Copying task on the fly.
-   `train.py`: The main script that orchestrates the entire experiment. It trains the NeuroMamba model, periodically evaluates its performance, and logs the results to a CSV file.
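
For readers unfamiliar with the task, the sketch below shows one way a single Selective Copying example could be generated: a few content tokens are scattered at random positions among noise tokens, and the target is the content tokens in their original order. This is an illustration only, not the code in `data_generator.py`. The function name, the default sizes, and the choice of `0` as the noise token are all assumptions.

```python
import random


def make_selective_copy_example(seq_len=16, num_tokens=4, vocab_size=8, rng=None):
    """Return (inputs, targets) for one illustrative Selective Copying example.

    Assumption: token 0 is the noise token and tokens 1..vocab_size carry content.
    """
    rng = rng or random.Random()
    # Pick distinct positions for the content tokens, in left-to-right order.
    positions = sorted(rng.sample(range(seq_len), num_tokens))
    # Draw the content tokens the model must reproduce.
    targets = [rng.randint(1, vocab_size) for _ in range(num_tokens)]
    # Start from an all-noise sequence and place the content tokens.
    inputs = [0] * seq_len
    for pos, tok in zip(positions, targets):
        inputs[pos] = tok
    return inputs, targets


inputs, targets = make_selective_copy_example(rng=random.Random(0))
```

Filtering the noise tokens out of `inputs` recovers `targets`, which is the mapping the model has to learn.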

## Workflow

1.  **Configure**: Adjust training, dataset, and model parameters in `config.py`.
2.  **Run**: Execute `train.py` to start the training and evaluation process.
3.  **Analyze**: Monitor the terminal for real-time progress and analyze the generated CSV file in the `results/` directory for a detailed record of the model's performance.

## Running the Script

To run the main training script, use the following command:

```bash
python train.py
```

## Results and Logging

The script produces two forms of output: real-time terminal logs and a CSV file containing the evaluation history.

### Terminal Output

During execution, the script will print periodic updates to the terminal, including training loss and detailed evaluation results at each interval. A sample of the terminal output might look like this:

```
2024-06-04 12:30:11,543 - INFO - Using device: cuda:0
2024-06-04 12:30:11,890 - INFO - Evaluation results will be saved to: results/log_adam_2_lr1e-04_20240604-123011.csv
2024-06-04 12:30:11,890 - INFO - --- Starting Training ---
...
2024-06-04 12:35:40,111 - INFO - --- Step 1000: Starting evaluation... ---
2024-06-04 12:36:05,321 - INFO - >>> Evaluation at step [1000]: Average Validation Loss: 1.9832, Validation Accuracy: 25.12% <<<
...
2024-06-04 14:00:15,910 - INFO - --- Training and evaluation finished. Total duration: 90.07 minutes ---
```
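
The line layout above (timestamp, level, message) matches what Python's standard `logging` module produces with the format string shown below. The sketch is a guess at a compatible setup, not the configuration used in `train.py`. The logger name and the helper `make_logger` are hypothetical.

```python
import io
import logging

# Assumed format string; it reproduces the "timestamp - LEVEL - message" layout.
LOG_FORMAT = "%(asctime)s - %(levelname)s - %(message)s"


def make_logger(name, stream):
    """Build a logger that writes INFO-level records to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    return logger


# Write to an in-memory buffer here; pass sys.stderr to log to the terminal.
buffer = io.StringIO()
log = make_logger("selective_copy_demo", buffer)
log.info(
    ">>> Evaluation at step [%d]: Average Validation Loss: %.4f, "
    "Validation Accuracy: %.2f%% <<<",
    1000, 1.9832, 25.12,
)
line = buffer.getvalue().strip()
```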

### CSV Log File

The primary result of the experiment is a CSV file saved in the **`results/`** directory (e.g., `results/log_adam_2_lr1e-04_20240604-123011.csv`). The filename includes key hyperparameters, such as the `expand_gc` value, for easy identification. The file is a compact record of model performance and is well suited to plotting learning curves.
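
As an illustration of the naming scheme, the sketch below rebuilds the example filename from its parts. It is a guess at the pattern, not the code in `train.py`. It assumes the fields are, in order, the optimizer name, the `expand_gc` value, the learning rate, and a start timestamp. The helper `build_log_path` and its parameters are hypothetical.

```python
from datetime import datetime


def build_log_path(optimizer, expand_gc, lr, now=None, out_dir="results"):
    """Compose a log path in the style of the example filename (assumed layout)."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    # ".0e" renders 1e-4 as "1e-04", matching the example filename.
    return f"{out_dir}/log_{optimizer}_{expand_gc}_lr{lr:.0e}_{stamp}.csv"


path = build_log_path("adam", 2, 1e-4, now=datetime(2024, 6, 4, 12, 30, 11))
```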

**Important**: The `loss` and `accuracy` columns in the CSV file represent the **average validation metrics** calculated during the periodic evaluation phases, not the training metrics.

The CSV file contains the following columns, with `accuracy` expressed as a percentage:

| step | loss   | accuracy |
| :--- | :----- | :------- |
| 1000 | 1.9832 | 25.12    |
| 2000 | 1.2543 | 48.75    |
| ...  | ...    | ...      |
| 400000 | 0.0001 | 100.00 |
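
To analyze the log, the CSV can be read with Python's standard library. The sketch below assumes the file starts with a header row naming the `step`, `loss`, and `accuracy` columns. The helpers `load_eval_log` and `first_step_reaching` are hypothetical, and the in-memory sample reuses the first two rows of the table above.

```python
import csv
import io


def load_eval_log(fileobj):
    """Parse the evaluation log into a list of dicts with numeric values."""
    rows = []
    for row in csv.DictReader(fileobj):
        rows.append({
            "step": int(row["step"]),
            "loss": float(row["loss"]),
            "accuracy": float(row["accuracy"]),
        })
    return rows


def first_step_reaching(rows, threshold):
    """Return the first step whose validation accuracy meets the threshold."""
    for r in rows:
        if r["accuracy"] >= threshold:
            return r["step"]
    return None


# In-memory stand-in for a real log file.
sample = io.StringIO("step,loss,accuracy\n1000,1.9832,25.12\n2000,1.2543,48.75\n")
rows = load_eval_log(sample)
```

For a real run, open the file with `open(path, newline="")` and pass the handle to `load_eval_log`. The resulting rows can then be handed to any plotting library to draw the learning curve.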

## Acknowledgments

This work builds upon the foundational Mamba architecture. We would like to thank the authors, Albert Gu and Tri Dao, for their original paper, "[Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/pdf/2312.00752.pdf)," and their public [implementation](https://github.com/state-spaces/mamba).