## Model fitting with Stacking Variational Bayesian Monte Carlo (S-VBMC)

Code and installation instructions for **S-VBMC** can be found [here](https://github.com/acerbilab/S-VBMC/tree/main).

S-VBMC takes as input a set of `VariationalPosterior` objects obtained from independent runs of [Variational Bayesian Monte Carlo (VBMC)](https://github.com/acerbilab/pyvbmc).

---

## Layout

This repository contains everything needed to (a) prepare the data, (b) run many independent VBMC fits (locally), and (c) stack the resulting posteriors with S-VBMC.

```
.
├── scripts/
│   └── fit_entrypoint_truncated.py    # Reproducible single VBMC run (CLI); writes vp.npz + metadata.json
├── bav_model.py                       # Likelihood + priors; exposes BAVModel(log_joint, ...)
├── bav_sampler_multi.py               # Core BAV model utilities
├── extract_data.py                    # Load MATLAB .mat and turn into PyTorch tensors
├── get_splits.py                      # Make two balanced splits (400 trials each) without overlap
├── stack_posteriors.py                # Collect VPs and run S-VBMC per subject/ρ/split
├── bav_data.mat                       # Source dataset (MATLAB format)
├── trial_idx_400_split_1.json         # Indices for split 1 (400 trials)
├── trial_idx_400_split_2.json         # Indices for split 2 (400 trials)
├── requirements.txt                   # Python dependencies
└── README.md
```

### What the important pieces do

- **`scripts/fit_entrypoint_truncated.py`**  
  Runs a single, reproducible VBMC fit: it seeds the Python, NumPy, and Torch random number generators, sets the hard and plausible bounds, initializes `x0` inside the plausible bounds, runs `pyvbmc.VBMC.optimize()`, and saves:
  - `vp.npz`: the fitted variational posterior,
  - `metadata.json`: run metadata (bounds, seed, args, versions, status).

- **`get_splits.py`**  
  Creates **two non-overlapping splits of 400 trials each**, balanced with respect to `(response_type, V_level)`. The resulting indices are stored in `trial_idx_400_split_1.json` and `trial_idx_400_split_2.json`.

- **`extract_data.py`**  
  Loads the MATLAB dataset (`bav_data.mat`) and converts it to tensors used by the model.  
  The function `get_sbj_data(data_path, sbj, idx_path)` applies the split indices and returns `(x, y)` for a given subject and split.

- **`bav_model.py`**  
  Defines `BAVModel`, which wires together data extraction, the negative log-likelihood (via `bav_sampler_multi.py`), and the truncated Gaussian priors. It exposes a `log_joint(theta)` callable for VBMC.

- **`bav_sampler_multi.py`**  
  Implements the multi-level BAV model and Gauss–Hermite quadrature NLL (`nll_bav_constant_gaussian`).

- **`stack_posteriors.py`**  
  Gathers multiple `VariationalPosterior` files for a given `(subject, ρ, split)`, filters out unstable runs/outliers, and runs **S-VBMC** to produce a **stacked posterior**. The result is serialized for later analysis/plotting.
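
The balanced, non-overlapping splits described for `get_splits.py` can be illustrated with a small stratified-split sketch. This is not the repository's implementation: the function name `make_balanced_splits`, the toy labels, and the equal-quota-per-stratum strategy are assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def make_balanced_splits(labels, n_per_split, seed=0):
    """Return two disjoint, sorted index lists with identical stratum counts.

    `labels` holds one hashable stratum key per trial, e.g. a
    (response_type, V_level) tuple.
    """
    rng = random.Random(seed)

    # Group trial indices by stratum.
    by_stratum = defaultdict(list)
    for idx, key in enumerate(labels):
        by_stratum[key].append(idx)

    # Give every stratum the same quota in each split.
    quota, remainder = divmod(n_per_split, len(by_stratum))
    if remainder:
        raise ValueError("n_per_split must be divisible by the number of strata")

    split_1, split_2 = [], []
    for key in sorted(by_stratum):
        idxs = list(by_stratum[key])
        if len(idxs) < 2 * quota:
            raise ValueError(f"stratum {key!r} has too few trials")
        rng.shuffle(idxs)
        split_1.extend(idxs[:quota])            # first chunk goes to split 1
        split_2.extend(idxs[quota:2 * quota])   # next, disjoint chunk goes to split 2
    return sorted(split_1), sorted(split_2)


# Toy data (hypothetical): 2 response types x 5 levels, 100 trials per stratum.
labels = [(r, v) for r in ("A", "V") for v in range(5) for _ in range(100)]
split_1, split_2 = make_balanced_splits(labels, n_per_split=400, seed=1)
print(len(split_1), len(split_2), set(split_1).isdisjoint(split_2))
```

Drawing both splits from a single shuffle per stratum is what guarantees that they never share a trial.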

---

## Minimal usage

Install dependencies:

```bash
pip install -r requirements.txt
```

### Fit a single VBMC run (local)

```python
from scripts.fit_entrypoint_truncated import compute_bounds
from pyvbmc import VBMC
from bav_model import BAVModel
import numpy as np

# Subject 0, rho = 4/3, first split
model = BAVModel(sbj=0, RHO_A=4/3, data_path="bav_data.mat",
                 idx_path="trial_idx_400_split_1.json", truncated=True)
LB, UB, PLB, PUB = compute_bounds()
x0 = np.random.uniform(low=PLB, high=PUB, size=PLB.shape)
vbmc = VBMC(log_density=model.log_joint,
            lower_bounds=LB,
            upper_bounds=UB,
            plausible_lower_bounds=PLB,
            plausible_upper_bounds=PUB,
            x0=x0)

vp, _ = vbmc.optimize()
```
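
VBMC only needs a callable that returns the log joint density (log-likelihood plus log-prior) at a parameter vector. The sketch below shows how such a callable might combine a negative log-likelihood with independent truncated Gaussian priors. It is not the code in `bav_model.py`: `truncated_gaussian_logpdf`, `make_log_joint`, and the toy quadratic NLL are hypothetical stand-ins, and a real implementation would also handle array inputs.

```python
import math


def _norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_gaussian_logpdf(x, mu, sigma, lower, upper):
    """Log-density of N(mu, sigma^2) truncated to [lower, upper]."""
    if not (lower <= x <= upper):
        return -math.inf
    z = (x - mu) / sigma
    log_pdf = -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    # Renormalize by the probability mass that falls inside the bounds.
    mass = _norm_cdf((upper - mu) / sigma) - _norm_cdf((lower - mu) / sigma)
    return log_pdf - math.log(mass)


def make_log_joint(nll, prior_specs):
    """Build log_joint(theta) = -nll(theta) + sum of log-priors.

    `prior_specs` holds one (mu, sigma, lower, upper) tuple per parameter.
    """
    def log_joint(theta):
        log_prior = sum(
            truncated_gaussian_logpdf(t, *spec)
            for t, spec in zip(theta, prior_specs)
        )
        if log_prior == -math.inf:
            return -math.inf  # outside the prior support
        return -nll(theta) + log_prior
    return log_joint


def toy_nll(theta):
    """Hypothetical stand-in for the model's negative log-likelihood."""
    return 0.5 * sum((t - 1.0) ** 2 for t in theta)


specs = [(0.0, 2.0, -5.0, 5.0), (0.0, 2.0, -5.0, 5.0)]
log_joint = make_log_joint(toy_nll, specs)
print(log_joint([1.0, 1.0]))
```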

Alternatively, you can run VBMC more systematically through `scripts/fit_entrypoint_truncated.py` by running the following command in a terminal:

```bash
python -m scripts.fit_entrypoint_truncated --sbj 0 --run-idx 0 --seed 23415543 --data-path bav_data.mat --idx-path trial_idx_400_split_1.json --rho-a 1 --outdir results_test --verbose
```

This performs a single VBMC run (indexed as `0`) for subject `0` with $\rho = 1$ and stores the results in a folder named `results_test`.

Repeat the run several times (different seeds/initializations) to build a list of `VariationalPosterior` objects.
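
One way to organize the repeated runs is to generate the command lines programmatically. The sketch below only builds the commands (it does not launch them) and reuses the flags from the command above. The helper name `build_fit_commands`, the master seed, and the choice of 20 runs are assumptions, and the `module` argument should be set to match the entrypoint script in your checkout.

```python
import random
import shlex
import sys


def build_fit_commands(n_runs, sbj, rho_a, idx_path, outdir,
                       data_path="bav_data.mat",
                       module="scripts.fit_entrypoint_truncated",
                       master_seed=12345):
    """Return one argument list per run, each with a distinct seed."""
    rng = random.Random(master_seed)
    # Sampling without replacement guarantees that no two runs share a seed.
    seeds = rng.sample(range(2**31 - 1), n_runs)
    commands = []
    for run_idx, seed in enumerate(seeds):
        commands.append([
            sys.executable, "-m", module,
            "--sbj", str(sbj),
            "--run-idx", str(run_idx),
            "--seed", str(seed),
            "--data-path", data_path,
            "--idx-path", idx_path,
            "--rho-a", str(rho_a),
            "--outdir", outdir,
        ])
    return commands


commands = build_fit_commands(
    n_runs=20, sbj=0, rho_a=1,
    idx_path="trial_idx_400_split_1.json", outdir="results_test",
)
for cmd in commands[:2]:
    print(shlex.join(cmd))  # inspect before launching anything
```

Each list can then be passed to `subprocess.run(cmd, check=True)`. Deriving all seeds from one master seed keeps the whole batch reproducible.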

Each `(ρ, subject, run)` combination gets its own output directory. For example, with ρ = 4/3 and an output directory named `results`:

```
results/
  rho_1.3333333333/
    sbj_00/
      run_00/
        vp.npz
        metadata.json
      ...
```
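
Given this layout, the runs belonging to one `(ρ, subject)` pair can be located with a glob. The sketch below only collects directories and reads `metadata.json`. It assumes the directory names follow the pattern shown above, and the helper name `find_runs` is hypothetical. Loading each `vp.npz` back into a `VariationalPosterior` is left to the repository's own code.

```python
import json
import tempfile
from pathlib import Path


def find_runs(results_dir, rho_dir, sbj):
    """Return (run_dir, metadata) pairs for runs that wrote both output files."""
    sbj_dir = Path(results_dir) / rho_dir / f"sbj_{sbj:02d}"
    runs = []
    for run_dir in sorted(sbj_dir.glob("run_*")):
        vp_file = run_dir / "vp.npz"
        meta_file = run_dir / "metadata.json"
        # Skip incomplete runs, e.g. jobs that crashed before saving.
        if vp_file.is_file() and meta_file.is_file():
            runs.append((run_dir, json.loads(meta_file.read_text())))
    return runs


# Demo on a throwaway directory tree that mimics the layout.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        demo_dir = Path(tmp) / "rho_1.3333333333" / "sbj_00" / f"run_{i:02d}"
        demo_dir.mkdir(parents=True)
        (demo_dir / "metadata.json").write_text(json.dumps({"seed": i}))
        if i != 1:  # simulate a run that never wrote vp.npz
            (demo_dir / "vp.npz").touch()
    found = find_runs(tmp, "rho_1.3333333333", sbj=0)
    run_names = [run_dir.name for run_dir, _ in found]

print(run_names)  # ['run_00', 'run_02']
```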


### Stack the posteriors with S-VBMC

Once you have multiple `vp.npz` files per subject (ideally at least 20 runs that pass the stability and outlier filtering described below), stack them:

```python
from svbmc import SVBMC

# Suppose you collected vp objects into vp_list
stacked_vp = SVBMC(vp_list)
stacked_vp.optimize()
_ = stacked_vp.plot()
```

Or use the convenience script that:

1. loads VPs for a given `(subject, ρ, split)`,
2. filters unstable/outlier runs,
3. stacks the first 20 remaining posteriors,
4. saves the stacked posterior.

```python
from stack_posteriors import run_svbmc
run_svbmc(s=0, rho=4/3, split=1)
```
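
The filter-then-truncate logic in steps 2 and 3 can be sketched as follows. The actual criteria used by `stack_posteriors.py` are not described here, so everything below is an assumption: the record fields `stable` and `elbo`, the median-absolute-deviation rule, and the threshold are illustrative placeholders.

```python
import statistics


def select_runs(runs, max_runs=20, mad_threshold=5.0):
    """Drop unstable runs and ELBO outliers, then keep the first `max_runs`.

    Each run is a dict with hypothetical keys "stable" (bool) and "elbo" (float).
    """
    stable = [r for r in runs if r["stable"]]
    if not stable:
        return []

    elbos = [r["elbo"] for r in stable]
    med = statistics.median(elbos)
    # Median absolute deviation: a scale estimate that outliers barely affect.
    mad = statistics.median([abs(e - med) for e in elbos])
    scale = max(mad, 1e-12)  # avoid a zero scale when most ELBOs coincide

    kept = [r for r in stable if abs(r["elbo"] - med) <= mad_threshold * scale]
    return kept[:max_runs]


# Toy records (hypothetical): 25 runs, one unstable, one with an outlying ELBO.
runs = [{"run": i, "stable": True, "elbo": -100.0 + 0.1 * (i % 5)} for i in range(25)]
runs[3]["stable"] = False
runs[7]["elbo"] = -250.0

selected = select_runs(runs)
print(len(selected), [r["run"] for r in selected[:5]])
```

Here runs 3 and 7 are dropped and the first 20 of the remaining 23 are kept.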

---


