# Neural Bayesian Filtering (NBF)

This repository contains the code for the paper **Neural Bayesian Filtering** submitted to NeurIPS 2025.

The code is organized into two main directories: `src/pomdp/` for the Gridworld experiments and `src/goofspiel/` for the Goofspiel experiments. Each directory contains scripts for training and evaluation of models.

## Setting up the Environment

To replicate our environment, use Python 3.12 and install the Python packages found in `requirements.txt` to a virtual environment by executing the following commands:

```bash
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install -r requirements.txt
```

After a successful installation, patch the `sbx` library to support action masking by running:

```bash
./tools/patch_sbx.sh
```
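
Before launching long training runs, it can help to confirm that the interpreter matches the version named above and that the virtual environment is active. The following is a minimal sketch using only the standard library; the helper names `python_ok` and `in_virtualenv` are hypothetical and not part of this repository.

```python
import sys

REQUIRED = (3, 12)  # Python version named in this README


def python_ok(version_info=sys.version_info, required=REQUIRED):
    """Return True if the major.minor version matches the required one."""
    return tuple(version_info[:2]) == required


def in_virtualenv():
    """Return True inside a venv, where sys.prefix differs from sys.base_prefix."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    # Report only; nothing is installed or modified.
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {version}:", "ok" if python_ok() else "expected 3.12")
    print("virtual environment:", "active" if in_virtualenv() else "not active")
```

The check only reports the interpreter state, so it is safe to run at any point during setup.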

## Running Experiments

Follow these instructions to replicate the results in our submission. Each training and evaluation script documents its command-line arguments in its source.

### Gridworld Experiments

The two main training scripts are `src/pomdp/train_grid_flow.py` (for Approx Beliefs and NBF) and `src/pomdp/train_grid_rnn.py` (for Recurrent). Both scripts share the same structure. The following example trains a model on a fixed 5x5 grid and saves it to the `grid-models/` directory:

```bash
python3 src/pomdp/train_grid_flow.py --dir='grid-models' --size=5 --ndim=2 --fixed
```

The script `src/pomdp/eval_grid_models.py` evaluates all Gridworld models. It expects a directory in which the training scripts have saved both flow and recurrent models, and it takes the Gridworld parameters from the training output. The following command writes its results to an `eval/` subdirectory of `grid-models/`:

```bash
python3 src/pomdp/eval_grid_models.py --dir='grid-models' --nrepeats=500
```
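
The Gridworld steps above have a fixed order: train both model families into one directory, then evaluate that directory. As a rough sketch, the sequence could be assembled as below. The function `gridworld_commands` is hypothetical, and passing the flow script's flags to `train_grid_rnn.py` is an assumption based on the two scripts sharing the same structure; check each script for its actual arguments.

```python
import shlex
import sys


def gridworld_commands(model_dir="grid-models", size=5, ndim=2,
                       fixed=True, nrepeats=500):
    """Build argv lists for training both model families, then evaluating them."""
    train_flags = [f"--dir={model_dir}", f"--size={size}", f"--ndim={ndim}"]
    if fixed:
        train_flags.append("--fixed")
    return [
        [sys.executable, "src/pomdp/train_grid_flow.py", *train_flags],
        # Assumption: the RNN script accepts the same flags as the flow script.
        [sys.executable, "src/pomdp/train_grid_rnn.py", *train_flags],
        [sys.executable, "src/pomdp/eval_grid_models.py",
         f"--dir={model_dir}", f"--nrepeats={nrepeats}"],
    ]


if __name__ == "__main__":
    for argv in gridworld_commands():
        # Print only; to execute, call subprocess.run(argv, check=True) instead.
        print(shlex.join(argv))
```

Using `sys.executable` keeps every step on the interpreter of the active virtual environment.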

### Goofspiel Experiments

Before running the experiments, generate the sequence of policies that the training scripts expect by running the following command:

```bash
python3 src/goofspiel/generate_policies.py --num-cards=5 --num-envs=16 --self-play-iters=4 --train-iters=128 --train-timesteps=4096 --seed=10
```

The two main training scripts are `src/goofspiel/train_flow.py` (for Approx Beliefs and NBF) and `src/goofspiel/train_rnn.py` (for Recurrent). Both scripts share the same structure. The following example trains a model for five-card Goofspiel:

```bash
python3 src/goofspiel/train_flow.py --num-cards=5 --num-epochs=32 --batch-size=32 --num-samples=64 --seed=10
```

The script `src/goofspiel/filter.py` evaluates the particle filter together with a given pair of flow and recurrent models. It takes as input the names of the saved flow and recurrent models produced by `train_flow.py` and `train_rnn.py`. The following command evaluates the given checkpoints on five-card Goofspiel:

```bash
python3 src/goofspiel/filter.py --num-cards=5 --flow-model-ckpt='model-flow-05-32-10.eqx' --rnn-model-ckpt='model-rnn-05-32-10.eqx' --seed=10
```
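
Because `filter.py` needs both checkpoints, a quick pre-flight check can catch a missing file before the evaluation starts. The sketch below is illustrative only: `missing_checkpoints` is a hypothetical helper, the file names are copied from the example command above, and looking for them in the current working directory is an assumption.

```python
from pathlib import Path


def missing_checkpoints(paths):
    """Return the paths (as strings) that do not exist as regular files."""
    return [str(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    # Names from the example command; their location is an assumption.
    ckpts = ["model-flow-05-32-10.eqx", "model-rnn-05-32-10.eqx"]
    missing = missing_checkpoints(ckpts)
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All checkpoints found.")
```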

### Donuts Experiments

Training and plotting code for the toy domain introduced in the paper can be found in the `notebooks/` directory. The `donuts_demo.ipynb` notebook contains code for training Normalizing Flow models and `flow_matching_demo.ipynb` follows the same structure but uses Conditional Flow Matching.
