# Experiment Plan: Comparing Flow-Based Generative Models with Discrete-Time Models

## Overview
This experiment investigates whether continuous-time flow-based generative models (FBGMs) offer a significant advantage over discrete-time models on time series prediction tasks. The hypothesis is that for inherently predictable time series the performance gap between the continuous-time FBGM and the discrete-time model will be small, and that the gap will widen as the data becomes less predictable.

The key theoretical difference is that flow-based models operate in continuous time and can condition their predictions on both the current state and the history, whereas discrete-time models use only the history.

## Experiment Checklist

### 1. Dataset Construction
- [X] Create a 1D harmonic oscillator CRF with controllable parameters:
  - [X] Observation noise (varying levels)
  - [X] Process noise (varying levels)
  - [X] Time between observations (varying intervals)
- [X] Generate separate datasets for each parameter combination
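
The three controllable parameters above can be illustrated with a minimal simulation sketch. It assumes the oscillator is driven by white noise on the velocity and that only the position is observed; the function name `simulate_oscillator` and the parameters `omega`, `dt`, and `seed` are hypothetical and do not come from the existing codebase.

```python
import numpy as np

def simulate_oscillator(t_max=10.0, obs_interval=0.5, process_noise=0.1,
                        obs_noise=0.1, omega=1.0, dt=0.01, seed=0):
    """Simulate a noisy 1D harmonic oscillator and observe its position.

    Assumptions (not taken from the plan): the state is (position, velocity),
    process noise enters through the velocity, and integration uses a
    semi-implicit Euler-Maruyama step of size dt.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(t_max / dt))
    stride = max(1, int(round(obs_interval / dt)))  # fine steps per observation

    x, v = 1.0, 0.0  # arbitrary initial condition
    positions = np.empty(n_steps + 1)
    positions[0] = x
    for k in range(n_steps):
        # Update the velocity first, then the position with the new velocity.
        v += -omega**2 * x * dt + process_noise * np.sqrt(dt) * rng.normal()
        x += v * dt
        positions[k + 1] = x

    ts = np.arange(n_steps + 1) * dt
    obs_idx = np.arange(0, n_steps + 1, stride)
    # Observations are noisy readings of the latent position.
    ys = positions[obs_idx] + obs_noise * rng.normal(size=obs_idx.size)
    return ts, positions, ts[obs_idx], ys
```

Calling `simulate_oscillator` once per parameter combination, each time with a different `seed`, would give one dataset per setting. The dense `positions` array is returned alongside the observations so that predictions between observation times can later be compared against the latent trajectory.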

### 2. Model Implementation
- [ ] Modify existing models:
  - [ ] Use the existing autoregressive model from `autoregressive.py`
  - [ ] Create a modified version of the neural SDE from `neural_sde.py`, adapted to work in continuous time
  - [ ] Simplify both models to a single-layer RNN
  - [ ] Neither model should condition on `yts` when making predictions
  - [ ] Implement a function in `autoregressive.py` that returns the continuous extension of the backward messages
  - [ ] (Optional) Experiment with different base linear SDEs:
    - [ ] Use Brownian motion SDE (first priority)
    - [ ] Use harmonic oscillator SDE
    - [ ] Use tracking model SDE
    - [ ] Compare performance across different base dynamics

### 3. Training Setup
- [X] Create configuration files for experiments
  - [X] Set up appropriate hyperparameters
  - [X] Create a configuration for each dataset combination
  - [X] Ensure the configurations make it easy to launch experiments in batches
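
One way to produce a configuration per dataset combination is to expand a parameter grid into one file per run. The sketch below is illustrative only: the grid values, the key names, the model labels, and the JSON layout are all assumptions, since this plan does not specify the actual configuration format.

```python
import itertools
import json
from pathlib import Path

# Hypothetical grid; the real parameter values are not given in this plan.
GRID = {
    "obs_noise": [0.01, 0.1, 1.0],
    "process_noise": [0.01, 0.1, 1.0],
    "obs_interval": [0.1, 0.5, 1.0],
    "model": ["autoregressive", "neural_sde"],
}

def build_configs(grid):
    """Return one dict per point in the Cartesian product of the grid."""
    keys = list(grid)
    return [dict(zip(keys, values))
            for values in itertools.product(*grid.values())]

def write_configs(configs, out_dir):
    """Write each configuration to its own JSON file and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, cfg in enumerate(configs):
        path = out_dir / f"exp_{i:03d}.json"
        path.write_text(json.dumps(cfg, indent=2))
        paths.append(path)
    return paths
```

With one file per run, a batch launcher only needs to iterate over the directory, which keeps launching independent of how the grid was defined.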

### 4. Evaluation
- [ ] Implement an evaluation method that computes the log likelihood:
  - [ ] For the continuous-time model: use its regular predictions
  - [ ] For the discrete-time model: use the continuous extension of the autoregressive model
  - [ ] Collect the log likelihood at each time point in the interval (0, t_max)
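
The evaluation steps above could be sketched as follows, under the assumption that each model exposes a Gaussian predictive distribution (a mean and a standard deviation) at arbitrary query times. That interface is hypothetical: `predict_fn`, `interior_time_grid`, and `log_likelihood_curve` are illustrative names, not functions from `autoregressive.py` or `neural_sde.py`.

```python
import numpy as np

def interior_time_grid(t_max, n_points):
    """Evenly spaced query times strictly inside the open interval (0, t_max)."""
    return np.linspace(0.0, t_max, n_points + 2)[1:-1]

def gaussian_log_likelihood(y_true, pred_mean, pred_std):
    """Pointwise log density of y_true under N(pred_mean, pred_std**2)."""
    y_true, pred_mean, pred_std = map(np.asarray, (y_true, pred_mean, pred_std))
    var = pred_std ** 2
    return -0.5 * (np.log(2.0 * np.pi * var) + (y_true - pred_mean) ** 2 / var)

def log_likelihood_curve(predict_fn, ts, y_true):
    """Log likelihood at each query time.

    predict_fn is assumed to map an array of times to a (mean, std) pair of
    arrays with the same shape; both model types would need such a wrapper.
    """
    mean, std = predict_fn(ts)
    return gaussian_log_likelihood(y_true, mean, std)
```

Averaging the returned curve gives a single score per (dataset, model) pair, while keeping the full curve makes it possible to see how the likelihood varies with the time since the last observation.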

### 5. Results Aggregation
- [ ] Create a pandas DataFrame with:
  - [ ] One row per (dataset, model) pair
  - [ ] Columns for:
    - [ ] Observation noise
    - [ ] Process noise
    - [ ] Time between observations
    - [ ] Model type
    - [ ] Average log likelihood across the entire time range
  - [ ] (Optional) Include base SDE type as an additional column if multiple base SDEs are tested
    - [ ] Compare performance differences between Brownian motion, harmonic oscillator, and tracking models
- [ ] Generate an initial analysis of the results
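
A minimal sketch of the aggregation step is shown below. The column names and the model labels `"neural_sde"` and `"autoregressive"` are assumptions chosen to mirror the checklist, and `likelihood_gap` is a hypothetical helper for the initial analysis rather than an existing function.

```python
import pandas as pd

COLUMNS = ["obs_noise", "process_noise", "obs_interval",
           "model", "avg_log_likelihood"]

def aggregate_results(records):
    """Build one row per (dataset, model) pair from a list of result dicts."""
    return pd.DataFrame(records, columns=COLUMNS)

def likelihood_gap(df, continuous="neural_sde", discrete="autoregressive"):
    """Continuous-minus-discrete average log likelihood per dataset setting."""
    wide = df.pivot_table(
        index=["obs_noise", "process_noise", "obs_interval"],
        columns="model",
        values="avg_log_likelihood",
    )
    # A positive gap means the continuous-time model scored higher.
    wide["gap"] = wide[continuous] - wide[discrete]
    return wide.reset_index()
```

Sorting the output of `likelihood_gap` by the noise columns gives a direct first look at whether the gap grows as the data becomes less predictable. If several base SDEs are tested, the base SDE type would need to be added both to `COLUMNS` and to the pivot index.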

## Next Steps
Once these steps are complete, more detailed plots and analyses will follow to explore the hypothesis fully.