# SST2 Bayesian Transformer Experiment

This folder contains the notebooks, modules, data, checkpoints, and results for
the SST2 Bayesian transformer experiments and multi-seed robustness analysis.

## Quick start
- Run the notebooks from this folder so relative imports like `from modules...`
  and paths like `processed_data_agnews/` resolve correctly.
- Recommended environment:
  - Python 3.x
  - Jupyter
  - tensorflow, tensorflow_probability
  - numpy, pandas, scikit-learn
  - matplotlib, seaborn
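
Before opening a notebook, a quick check along these lines can confirm that the working directory is the experiment folder. This is a minimal sketch, not part of the repository: the `missing_items` helper is hypothetical, and the list of expected entries is taken from the folder names mentioned in this README.

```python
from pathlib import Path

# Entries this README says should sit next to the notebooks.
EXPECTED_ITEMS = ("modules", "processed_data_agnews")


def missing_items(root=".", expected=EXPECTED_ITEMS):
    """Return the expected entries that are absent from `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_items()
    if missing:
        print(f"Wrong working directory? Missing: {missing}")
    else:
        print("Working directory looks correct.")
```

An empty result means imports such as `from modules...` and the relative data path should resolve.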

## Notebooks
- `Bayesian_Transformers_final.ipynb`
  - Main end-to-end experiment. Loads data, trains models, evaluates, and
    writes figures/results.
- `Multi_Seed_Robustness_Check.ipynb`
  - Multi-seed stability analysis. Reuses seed=42 checkpoints when present.
  - Creates `temp_baseline_cache/` during runs (disk cache for baselines).

## Folder structure (key items)
- `modules/` - Core experiment code (model builders, training, evaluation, etc.)
- `processed_data_agnews/` - NPY arrays used by default (`DATA_DIR` in config).
- `processed_data/` - Alternate preprocessed arrays (same naming schema).
- `checkpoints/`
  - `model_0.h5` through `model_4.h5` - seed=42 pretrained checkpoints
  - `model_5/` - deep ensemble members (`member_*.h5`)
  - `multi_seed/` - progress/results for multi-seed runs
- `figures/` - Saved plots (png)
- `results_csv/` - Metrics CSV outputs
- `multi_seed_results/` - Aggregated multi-seed outputs (csv/json/png)
- `tstex_modules/` - TS-TeX API stubs (not used by notebooks)

## Data expectations
Each data directory contains:
- `train_ids.npy`, `train_mask.npy`, `train_labels.npy`
- `dev_ids.npy`, `dev_mask.npy`, `dev_labels.npy`
- `ood_ids.npy`, `ood_mask.npy`, `ood_labels.npy`
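
The file layout above can be loaded and sanity-checked with a few lines of NumPy. The sketch below is illustrative only: `load_split` and `check_data_dir` are hypothetical helpers, and the shape checks (ids and mask share a shape, one label per example) are assumptions about the arrays rather than something this README states.

```python
from pathlib import Path

import numpy as np

SPLITS = ("train", "dev", "ood")
FIELDS = ("ids", "mask", "labels")


def load_split(data_dir, split):
    """Load `<split>_ids.npy`, `<split>_mask.npy`, `<split>_labels.npy`."""
    data_dir = Path(data_dir)
    return {field: np.load(data_dir / f"{split}_{field}.npy") for field in FIELDS}


def check_data_dir(data_dir):
    """Load every split and return {split: number of examples}."""
    sizes = {}
    for split in SPLITS:
        arrays = load_split(data_dir, split)
        # Assumption: ids and mask are aligned, so their shapes match.
        if arrays["ids"].shape != arrays["mask"].shape:
            raise ValueError(f"{split}: ids and mask shapes differ")
        # Assumption: there is exactly one label per example.
        if len(arrays["labels"]) != len(arrays["ids"]):
            raise ValueError(f"{split}: label count does not match ids")
        sizes[split] = len(arrays["ids"])
    return sizes
```

Calling `check_data_dir("processed_data_agnews")` from the experiment folder raises `FileNotFoundError` if any of the nine files is missing.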

The default data directory is `processed_data_agnews/`, as defined in:
- `modules/config.py`
- `modules/config_seed.py`

To switch datasets, update `DATA_DIR` in both config files.
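
Because `DATA_DIR` lives in two files, it is easy to change one and forget the other. The following sketch compares the two values by reading the files as text. It assumes each file assigns `DATA_DIR` on a single line of the form `DATA_DIR = ...`, which this README does not confirm; `read_data_dir` and `data_dirs_agree` are hypothetical helpers.

```python
import re
from pathlib import Path

CONFIG_FILES = ("modules/config.py", "modules/config_seed.py")

# Assumption: DATA_DIR is assigned on one line, optionally followed by a comment.
_ASSIGNMENT = re.compile(r"^DATA_DIR\s*=\s*(.+?)\s*(?:#.*)?$", re.MULTILINE)


def read_data_dir(config_path):
    """Return the raw right-hand side of the first DATA_DIR assignment, or None."""
    text = Path(config_path).read_text(encoding="utf-8")
    match = _ASSIGNMENT.search(text)
    return match.group(1) if match else None


def data_dirs_agree(paths=CONFIG_FILES):
    """True when every file defines DATA_DIR and all definitions are identical."""
    values = {read_data_dir(path) for path in paths}
    return len(values) == 1 and None not in values
```

The comparison is purely textual, so two spellings of the same path (for example with and without a trailing slash) are reported as a mismatch.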

## Notes
- The parent folder name ends with a trailing space:
  `SST2_Bayesian_transformer_experiment ` (note the space).
  Be careful when `cd`-ing from a terminal.
- The notebooks save figures/results in the existing `figures/`,
  `results_csv/`, and `multi_seed_results/` folders.
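
For the trailing-space folder name noted above, the path must be quoted in a POSIX shell or the space is dropped. A small sketch using the standard library's `shlex.quote` shows the quoted form; the `cd_command` helper is hypothetical and only builds the command string.

```python
import shlex

# The trailing space is part of the folder name.
FOLDER = "SST2_Bayesian_transformer_experiment "


def cd_command(folder=FOLDER):
    """Build a POSIX-shell `cd` command with the folder name safely quoted."""
    return f"cd {shlex.quote(folder)}"


print(cd_command())  # prints: cd 'SST2_Bayesian_transformer_experiment '
```

From Python, no quoting is needed: passing the name with its trailing space to `os.chdir` or `pathlib.Path` works as long as the string is kept exact.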
