# TACD-GRU

## Python Environment
This code was developed and tested with Python 3.8.13, so that version is recommended for reproducibility. To set up the Python environment from the provided `requirements.txt`:
1. Install Miniconda (https://docs.anaconda.com/free/miniconda/index.html)
2. Create a conda environment if you do not have one yet: `conda create -n <your env> python=3.8.13`
3. `conda activate <your env>`
4. `conda install pip`
5. `pip install -r requirements.txt`
6. Set up wandb; we use it for logging and plotting the final results (https://docs.wandb.ai/quickstart)

## Data setup
1. Clone the repository
2. Create a directory `data` in the project root (at the same level as `run_experiment.py` and `lib`), then create two subdirectories, `ushcn` and `mimic`, inside `data`
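
As a sketch, the directory layout from step 2 can be created in one command. This assumes the current directory is the project root, i.e. the one that contains `lib`:

```shell
# Run from the project root (the directory that contains `lib`).
# -p creates `data` as well and does not fail if the directories already exist.
mkdir -p data/ushcn data/mimic
```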

### USHCN
1. Download the USHCN data (https://data.ess-dive.lbl.gov/view/doi%3A10.3334%2FCDIAC%2FCLI.NDP019) and extract it inside `data/ushcn`: `tar -xvf ushcn_daily.tar.gz`
2. Copy the state-wise `*.gz` files to `raw/`: `cp pub12/ushcn_daily/state* raw/`
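
The two steps above can be wrapped in a small shell function. This is only a sketch: `setup_ushcn` is a helper name made up for this example (it is not part of the repository), and it assumes that `raw/` is meant to live inside `data/ushcn` and that the archive unpacks to `pub12/ushcn_daily/`, as the `cp` command above suggests.

```shell
# Hypothetical helper, not part of the repository.
# $1: path to the downloaded ushcn_daily.tar.gz
# $2: target directory (defaults to data/ushcn)
setup_ushcn() {
  archive="$1"
  target="${2:-data/ushcn}"
  # Create the target directory and the raw/ subdirectory (assumed location).
  mkdir -p "$target/raw"
  # Extract the archive into the target directory.
  tar -xvf "$archive" -C "$target"
  # Copy the state-wise files into raw/.
  cp "$target"/pub12/ushcn_daily/state* "$target/raw/"
}
```

For example, `setup_ushcn ~/Downloads/ushcn_daily.tar.gz` run from the project root would populate `data/ushcn/raw/`; the download path here is just an illustration.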

### Physionet
No steps required. The code automatically generates the dataset splits the first time it runs a Physionet experiment.

### MIMIC-III
1. Extract the contents of the tar file: `tar -xvf mimic3_splits.tar.gz`
2. Move the file `feature_headers.pt` and the folder `subsampled_1k` into `data/mimic`.




## Running USHCN multi-step prediction
The following commands run the baseline and TACD-GRU models. To replicate the 3 distinct runs, set `--random-seed` to each of {0, 341, 700} in turn.
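
Running the three seeds can be scripted with a short loop. The sketch below uses the GRU-D command from this section with only `--random-seed` varied; `run_all_seeds` and the `DRY_RUN` variable are names made up for this example, not part of the repository. By default it only prints the commands so nothing is launched by accident.

```shell
# Hypothetical helper: prints (or runs) the GRU-D USHCN command once per seed.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to actually start training.
run_all_seeds() {
  for seed in 0 341 700; do
    cmd="python run_experiment.py --grud --dataset ushcn -lsd 20 --task extrapolation --num-workers 8 --wandb-project ushcn_grud_seed_runs -b 50 --lr 0.01 --lr-decay 0.99 --grad-clip --log-wandb --epochs 100 --sample-rate 0.5 --unobserved-rate 0.2 --random-seed ${seed}"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "CUDA_VISIBLE_DEVICES=0 ${cmd}"
    else
      # Left unquoted on purpose so the string splits into separate arguments.
      CUDA_VISIBLE_DEVICES=0 ${cmd}
    fi
  done
}

run_all_seeds
```

The same pattern applies to any other command in this README: paste its flags into `cmd` and keep `--random-seed ${seed}` at the end.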

### mTAND
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --mTAND --dataset ushcn --task extrapolation -lsd 8 --wandb-project ushcn_mTAND_seed_runs -b 25 --num-workers 8 --lr 0.01 --epochs 100 --mTAND-use-norm --lr-decay 0.99 --mTAND-num-ref-points 32 --sample-rate 0.5 --unobserved-rate 0.2 --log-wandb --random-seed 0`

### GRU-D
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grud --dataset ushcn -lsd 20 --task extrapolation --num-workers 8 --wandb-project ushcn_grud_seed_runs -b 50 --lr 0.01 --lr-decay 0.99 --grad-clip --log-wandb --epochs 100 --sample-rate 0.5 --unobserved-rate 0.2 --random-seed 0`

### f-CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset ushcn --task extrapolation -lsd 10 --ts 0.0001 --sample-rate 0.5 --unobserved-rate 0.2 --enc-var-activation square --dec-var-activation exp --trans-var-activation relu --grad-clip --num-basis 20 --bandwidth 10 --f-cru --lr 0.05 --epochs 100 --wandb-project ushcn_fcru_hopt -b 50 --log-wandb --lr-decay 0.99 --random-seed 0`

### TACD-GRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset ushcn --task extrapolation -lsd 20 --num-workers 8 --sample-rate 0.5 --unobserved-rate 0.2 --wandb-project tacd_ushcn_seed_runs -b 50 --lr 0.1 --lr-decay 0.99 --tacd-time-emb 2 --tacd-event-emb 2 --grad-clip --grudplus_ablation_mode no_ablation --tacd-add_noise 0.0 --log-wandb --random-seed 0`

### TACD-GRU-xc-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset ushcn --task extrapolation -lsd 20 --num-workers 8 --sample-rate 0.5 --unobserved-rate 0.2 --wandb-project tacd_xc_ushcn_seed_runs -b 50 --lr 0.1 --lr-decay 0.99 --tacd-time-emb 2 --tacd-event-emb 2 --grad-clip --grudplus_ablation_mode context_only --tacd-add_noise 0.0 --log-wandb --random-seed 0`

### TACD-GRU-xa-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset ushcn --task extrapolation -lsd 20 --num-workers 8 --sample-rate 0.5 --unobserved-rate 0.2 --wandb-project tacd_xa_ushcn_seed_runs -b 50 --lr 0.1 --lr-decay 0.99 --tacd-time-emb 2 --tacd-event-emb 2 --grad-clip --grudplus_ablation_mode attention_only --tacd-add_noise 0.0 --log-wandb --random-seed 0`

### CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset ushcn --task extrapolation -lsd 16 --ts 0.0001 --sample-rate 0.5 --unobserved-rate 0.2 --enc-var-activation square --dec-var-activation exp --trans-var-activation relu --grad-clip --num-basis 15 --bandwidth 3 --lr 0.05 --epochs 100 --wandb-project ushcn_cru_extrapolation -b 50 --log-wandb --lr-decay 0.99 --random-seed 0`

### ContiFormer
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --contiformer --dataset ushcn --task extrapolation -lsd 16 --ts 0.001 --sample-rate 0.5 --unobserved-rate 0.2 --grad-clip --lr 0.0001 --epochs 100 --wandb-project ushcn_contiformer_extrapolation -b 4 --log-wandb --lr-decay 0.99 --num-workers 8 --random-seed 700`

### RKN-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --rkn --t-sensitive-trans-net --dataset ushcn --task extrapolation -lsd 20 --sample-rate 0.5 --unobserved-rate 0.2 --ts 0.0001 --grad-clip --lr 0.001 --epochs 100 --wandb-project ushcn_rkn_extrapolation -b 50 --log-wandb --num-workers 2 --random-seed 0 --lr-decay 0.99`

### GRU-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grudelta --dataset ushcn --task extrapolation -lsd 16 --ts 0.001 --sample-rate 0.5 --unobserved-rate 0.2 --grad-clip --lr 0.005 --epochs 100 --wandb-project ushcn_grudelta_extrapolation -b 50 --log-wandb --lr-decay 0.99 --num-workers 8 --random-seed 0`

### ODE-RNN
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset ushcn --task extrapolation --ts 0.001 --sample-rate 0.5 --unobserved-rate 0.2 --ode_rnn -b 10 --wandb-project ushcn_odernn_extrapolation --lr 0.005 --epochs 100 --lr-decay 0.99 --grad-clip -lsd 16 --num-workers 8 --random-seed 0 --log-wandb`

### Latent-ODE
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset ushcn --task extrapolation --sample-rate 0.5 --unobserved-rate 0.2 --latent_ode -b 10 --wandb-project ushcn_lode_extrapolation --log-wandb --lr 0.01 --epochs 100 --lr-decay 0.99 --grad-clip -lsd 20 --lode-rec-dim 32 --lode-units 32 --lode-gru-units 32 --lode-gen-layers 3 --lode-rec-layers 3 --random-seed 0`




## Running Physionet multi-step prediction
The following commands run the baseline and TACD-GRU models. To replicate the 3 distinct runs, set `--random-seed` to each of {0, 341, 700} in turn.

### mTAND
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --mTAND --dataset physionet --task extrapolation -lsd 22 --wandb-project physionet_mTAND_seed_runs -b 100 --num-workers 8 --lr 0.01 --epochs 100 --mTAND-use-norm --lr-decay 0.99 --mTAND-num-ref-points 64 --mTAND-time-embed-dim 32 --log-wandb --random-seed 0`

### GRU-D
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grud --dataset physionet -lsd 16 --task extrapolation --num-workers 8 --wandb-project physionet_grud_seed_runs -b 100 --lr 0.01 --lr-decay 0.99 --grad-clip --log-wandb --epochs 100 --random-seed 0`

### f-CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset physionet --task extrapolation -lsd 16 --ts 0.2 --enc-var-activation square --dec-var-activation exp --trans-var-activation relu --grad-clip --num-basis 20 --bandwidth 10 --f-cru --lr 0.001 --epochs 20 --wandb-project physionet_fcru_seed_runs -b 100 --log-wandb --random-seed 0`

### TACD-GRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset physionet -lsd 20 --task extrapolation --num-workers 8 --wandb-project tacd_physionet_seed_runs -b 100 --lr 0.025 --tacd-time-emb 64 --tacd-event-emb 64 --lr-decay 0.99 --grad-clip --grudplus_ablation_mode no_ablation --log-wandb --random-seed 0`

### TACD-GRU-xc-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset physionet -lsd 20 --task extrapolation --num-workers 8 --wandb-project physionet_tacd_xc_seed_runs -b 100 --lr 0.025 --tacd-time-emb 64 --tacd-event-emb 64 --lr-decay 0.99 --grad-clip --grudplus_ablation_mode context_only --log-wandb --random-seed 0`

### TACD-GRU-xa-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset physionet -lsd 20 --task extrapolation --num-workers 8 --wandb-project physionet_tacd_xa_seed_runs -b 100 --lr 0.025 --tacd-time-emb 64 --tacd-event-emb 64 --lr-decay 0.99 --grad-clip --grudplus_ablation_mode attention_only --log-wandb --random-seed 0`

### CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset physionet --task extrapolation -lsd 32 --ts 0.2 --enc-var-activation square --dec-var-activation exp --trans-var-activation relu --grad-clip --num-basis 20 --bandwidth 10 --lr 0.005 --epochs 100 --wandb-project physionet_cru_extrapolation -b 100 --log-wandb --random-seed 0 --num-workers 8`

### ContiFormer
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --contiformer --dataset physionet --task extrapolation -lsd 16 --grad-clip -b 8 --num-workers 8 --lr 0.001 --epochs 100 --lr-decay 0.99 --random-seed 700 --wandb-project physionet_contiformer_extrapolation --log-wandb`

### RKN-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --rkn --t-sensitive-trans-net --dataset physionet --task extrapolation -lsd 32 --ts 0.2 --grad-clip --num-basis 20 --bandwidth 10 --lr 0.005 --epochs 100 --wandb-project physionet_rkn_extrapolation -b 100 --log-wandb --random-seed 341 --num-workers 8 --lr-decay 0.99`

### GRU-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grudelta --dataset physionet --task extrapolation -lsd 32 --ts 0.2 --grad-clip --lr 0.005 --epochs 100 --wandb-project physionet_grudelta_extrapolation -b 100 --random-seed 0 --num-workers 2 --lr-decay 0.99 --log-wandb`

### ODE-RNN
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset physionet --task extrapolation --ode_rnn -b 4 --grad-clip --wandb-project physionet_odernn_extrapolation --lr 0.001 --epochs 10 --lr-decay 0.99 -lsd 16 --random-seed 0 --log-wandb`

### Latent-ODE
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset physionet --task extrapolation --latent_ode -b 100 --wandb-project physionet_lode_extrapolation --lr 0.005 --epochs 100 --lr-decay 0.99 --grad-clip -lsd 32 --random-seed 346 --log-wandb`


## Running MIMIC-III multi-step prediction
The following commands run the baseline and TACD-GRU models. To replicate the 3 distinct runs, set `--random-seed` to each of {0, 341, 700} in turn.

### mTAND
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --mTAND --dataset mimic --task extrapolation -lsd 64 --wandb-project mimic_mtand_seed_runs -b 1 --num-workers 8 --lr 0.001 --epochs 20 --mTAND-use-norm --lr-decay 0.99 --mTAND-num-ref-points 64 --log-wandb --random-seed 0`

### GRU-D
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grud --dataset mimic --task extrapolation -lsd 32 --grad-clip --wandb-project mimic_grud_seed_runs -b 1 --num-workers 8 --lr 0.001 --epochs 20 --log-wandb --lr-decay 0.99 --random-seed 0`

### f-CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset mimic --task extrapolation -lsd 64 --ts 0.0000000001 --enc-var-activation square --dec-var-activation square --trans-var-activation relu --grad-clip --num-basis 20 --bandwidth 10 --f-cru --lr 0.001 --epochs 20 --wandb-project mimic_fcru_seed_runs -b 1 --log-wandb --num-workers 8 --random-seed 0`

### Latent-ODE
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset mimic --task extrapolation --latent_ode -b 1 --wandb-project mimic_lode_seed_runs --log-wandb --lr 0.001 --epochs 20 --lr-decay 0.99 --grad-clip -lsd 32 --lode-rec-dim 32 --lode-units 32 --lode-gru-units 32 --lode-gen-layers 3 --lode-rec-layers 3 --random-seed 967`

### TACD-GRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset mimic --task extrapolation -lsd 16 --grad-clip --wandb-project mimic_tacd_seed_runs -b 1 --num-workers 8 --lr 0.001 --epochs 20 --lr-decay 0.99 --tacd-time-emb 64 --tacd-event-emb 64 --grudplus_ablation_mode no_ablation --log-wandb --random-seed 0`

### TACD-GRU-xc-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset mimic --task extrapolation -lsd 16 --grad-clip --wandb-project mimic_tacd_xc_seed_runs -b 1 --num-workers 8 --lr 0.001 --epochs 20 --lr-decay 0.99 --tacd-time-emb 64 --tacd-event-emb 64 --grudplus_ablation_mode context_only --log-wandb --random-seed 0`

### TACD-GRU-xa-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset mimic --task extrapolation -lsd 16 --grad-clip --wandb-project mimic_tacd_xa_seed_runs -b 1 --num-workers 8 --lr 0.001 --epochs 20 --lr-decay 0.99 --tacd-time-emb 64 --tacd-event-emb 64 --grudplus_ablation_mode attention_only --log-wandb --random-seed 0`

### CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset mimic --task extrapolation -lsd 32 --ts 0.0000000001 --enc-var-activation square --dec-var-activation square --trans-var-activation relu --grad-clip --num-basis 20 --bandwidth 10 --lr 0.0001 --epochs 20 --wandb-project mimic_cru_extrapolation -b 1 --log-wandb --num-workers 2 --random-seed 0`

### ContiFormer
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --contiformer --dataset mimic --task extrapolation -lsd 64 --grad-clip -b 1 --num-workers 2 --lr 0.001 --epochs 20 --lr-decay 0.99 --random-seed 0 --wandb-project mimic_contiformer_extrapolation --log-wandb`

### RKN-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --rkn --t-sensitive-trans-net --dataset mimic --task extrapolation -lsd 32 --ts 0.0000000001 --grad-clip --lr 0.00005 --epochs 20 --wandb-project mimic_rkn_extrapolation -b 1 --log-wandb --num-workers 2 --random-seed 0`

### GRU-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grudelta --dataset mimic --task extrapolation -lsd 16 --grad-clip --lr 0.001 --epochs 20 --wandb-project mimic_grudelta_extrapolation -b 1 --log-wandb --random-seed 0 --num-workers 8 --lr-decay 0.99`

### ODE-RNN
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset mimic --task extrapolation --ode_rnn -b 1 --wandb-project mimic_odernn_extrapolation --lr 0.005 --epochs 20 --lr-decay 0.99 --grad-clip -lsd 32 --num-workers 0 --random-seed 700 --log-wandb`


## Running USHCN single-step prediction
The following commands run the baseline and TACD-GRU models. To replicate the 3 distinct runs, set `--random-seed` to each of {0, 341, 700} in turn.

### mTAND
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --mTAND --dataset ushcn --task next_obs_prediction -lsd 16 --wandb-project ushcn_mtand_nobs_pred -b 25 --num-workers 8 --lr 0.001 --epochs 100 --mTAND-use-norm --lr-decay 0.99 --mTAND-num-ref-points 32 --sample-rate 0.5 --unobserved-rate 0.2 --log-wandb --random-seed 0`

### GRU-D
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grud --dataset ushcn -lsd 20 --task next_obs_prediction --num-workers 8 --wandb-project ushcn_grud_nobs_pred -b 50 --lr 0.01 --lr-decay 0.99 --grad-clip --log-wandb --epochs 100 --sample-rate 0.5 --unobserved-rate 0.2 --random-seed 0`

### f-CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset ushcn --task next_obs_prediction -lsd 10 --ts 0.0001 --sample-rate 0.5 --unobserved-rate 0.2 --enc-var-activation square --dec-var-activation exp --trans-var-activation relu --grad-clip --num-basis 20 --bandwidth 10 --f-cru --lr 0.05 --epochs 100 --wandb-project ushcn_fcru_nobs_pred -b 50 --log-wandb --lr-decay 0.99 --random-seed 0`

### TACD-GRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset ushcn --task next_obs_prediction -lsd 8 --num-workers 8 --sample-rate 0.5 --unobserved-rate 0.2 --epochs 20 --wandb-project ushcn_tacd_nobs_seeds3 -b 25 --lr 0.001 --lr-decay 0.99 --tacd-time-emb 2 --tacd-event-emb 2 --grad-clip --grudplus_ablation_mode no_ablation --tacd-add_noise 0.0 --log-wandb --random-seed 0`

### TACD-GRU-xc-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset ushcn --task next_obs_prediction -lsd 8 --num-workers 8 --sample-rate 0.5 --unobserved-rate 0.2 --wandb-project ushcn_tacd_xc_nobs_pred -b 50 --lr 0.1 --lr-decay 0.99 --tacd-time-emb 2 --tacd-event-emb 2 --grad-clip --grudplus_ablation_mode context_only --tacd-add_noise 0.0 --log-wandb --random-seed 0`

### TACD-GRU-xa-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset ushcn --task next_obs_prediction -lsd 20 --num-workers 8 --sample-rate 0.5 --unobserved-rate 0.2 --wandb-project ushcn_tacd_xa_nobs_pred -b 50 --lr 0.1 --lr-decay 0.99 --tacd-time-emb 2 --tacd-event-emb 2 --grad-clip --grudplus_ablation_mode attention_only --tacd-add_noise 0.0 --log-wandb --random-seed 0`

### CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset ushcn --task next_obs_prediction -lsd 16 --ts 0.0001 --sample-rate 0.5 --unobserved-rate 0.2 --enc-var-activation square --dec-var-activation exp --trans-var-activation relu --grad-clip --num-basis 15 --bandwidth 3 --lr 0.05 --epochs 100 --wandb-project ushcn_cru_nobs_pred -b 50 --log-wandb --lr-decay 0.99 --random-seed 0`

### ContiFormer
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --contiformer --dataset ushcn --task next_obs_prediction -lsd 16 --ts 0.0001 --sample-rate 0.5 --unobserved-rate 0.2 --grad-clip --lr 0.0005 --epochs 100 --wandb-project ushcn_contiformer_nobs_pred -b 10 --log-wandb --lr-decay 0.99 --num-workers 8 --random-seed 0`

### RKN-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --rkn --t-sensitive-trans-net --dataset ushcn --task next_obs_prediction -lsd 20 --sample-rate 0.5 --unobserved-rate 0.2 --ts 0.0001 --grad-clip --lr 0.001 --epochs 100 --wandb-project ushcn_rkn_nobs_pred -b 50 --log-wandb --num-workers 2 --random-seed 0 --lr-decay 0.99`

### GRU-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grudelta --dataset ushcn --task next_obs_prediction -lsd 20 --ts 0.001 --sample-rate 0.5 --unobserved-rate 0.2 --grad-clip --lr 0.005 --epochs 100 --wandb-project ushcn_grudelta_nobs_pred -b 50 --log-wandb --lr-decay 0.99 --num-workers 8 --random-seed 0`

### ODE-RNN
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset ushcn --task next_obs_prediction --ts 0.001 --sample-rate 0.5 --unobserved-rate 0.2 --ode_rnn -b 10 --wandb-project ushcn_odernn_next_obs --lr 0.00005 --epochs 100 --lr-decay 0.99 --grad-clip -lsd 20 --num-workers 8 --random-seed 0 --log-wandb`

### Latent-ODE
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset ushcn --task next_obs_prediction --sample-rate 0.5 --unobserved-rate 0.2 --latent_ode -b 10 --wandb-project ushcn_lode_nobs_pred --log-wandb --lr 0.01 --epochs 100 --lr-decay 0.99 --grad-clip -lsd 20 --lode-rec-dim 32 --lode-units 32 --lode-gru-units 32 --lode-gen-layers 3 --lode-rec-layers 3 --random-seed 0`



## Running Physionet single-step prediction
The following commands run the baseline and TACD-GRU models. To replicate the 3 distinct runs, set `--random-seed` to each of {0, 341, 700} in turn.

### mTAND
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --mTAND --dataset physionet --task next_obs_prediction -lsd 22 --wandb-project physionet_tacd_nobs_pred -b 100 --num-workers 8 --lr 0.01 --epochs 100 --mTAND-use-norm --lr-decay 0.99 --mTAND-num-ref-points 64 --mTAND-time-embed-dim 32 --log-wandb --random-seed 0`

### GRU-D
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grud --dataset physionet -lsd 16 --task next_obs_prediction --num-workers 8 --wandb-project physionet_tacd_nobs_pred -b 100 --lr 0.01 --lr-decay 0.99 --grad-clip --log-wandb --epochs 100 --random-seed 0`

### f-CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset physionet --task next_obs_prediction -lsd 16 --ts 0.2 --enc-var-activation square --dec-var-activation exp --trans-var-activation relu --grad-clip --num-basis 20 --bandwidth 10 --f-cru --lr 0.001 --epochs 100 --wandb-project physionet_tacd_nobs_pred -b 100 --log-wandb --random-seed 0 --num-workers 8`

### TACD-GRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset physionet -lsd 20 --task next_obs_prediction --num-workers 2 --wandb-project physionet_tacd_nobs_pred -b 100 --lr 0.025 --tacd-time-emb 64 --tacd-event-emb 64 --lr-decay 0.99 --grad-clip --random-seed 0 --grudplus_ablation_mode no_ablation --log-wandb`

### TACD-GRU-xc-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset physionet -lsd 20 --task next_obs_prediction --num-workers 2 --wandb-project physionet_tacd_nobs_pred -b 100 --lr 0.025 --tacd-time-emb 64 --tacd-event-emb 64 --lr-decay 0.99 --grad-clip --random-seed 0 --grudplus_ablation_mode context_only --log-wandb`

### TACD-GRU-xa-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset physionet -lsd 20 --task next_obs_prediction --num-workers 2 --wandb-project physionet_tacd_nobs_pred -b 100 --lr 0.025 --tacd-time-emb 64 --tacd-event-emb 64 --lr-decay 0.99 --grad-clip --random-seed 0 --grudplus_ablation_mode attention_only --log-wandb`

### CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset physionet --task next_obs_prediction -lsd 32 --ts 0.2 --enc-var-activation square --dec-var-activation exp --trans-var-activation relu --grad-clip --num-basis 20 --bandwidth 10 --lr 0.005 --epochs 100 --wandb-project physionet_tacd_nobs_pred -b 100 --log-wandb --random-seed 0 --num-workers 8`

### ContiFormer
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --contiformer --dataset physionet --task next_obs_prediction -lsd 16 --grad-clip -b 4 --num-workers 2 --lr 0.001 --epochs 100 --lr-decay 0.99 --random-seed 0 --wandb-project physionet_tacd_nobs_pred --log-wandb`

### RKN-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --rkn --t-sensitive-trans-net --dataset physionet --task next_obs_prediction -lsd 32 --ts 0.2 --grad-clip --num-basis 20 --bandwidth 10 --lr 0.005 --epochs 100 --wandb-project physionet_tacd_nobs_pred -b 100 --log-wandb --random-seed 0 --num-workers 8 --lr-decay 0.99`

### GRU-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grudelta --dataset physionet --task next_obs_prediction -lsd 32 --ts 0.2 --grad-clip --lr 0.01 --epochs 100 --wandb-project physionet_tacd_nobs_pred -b 100 --random-seed 0 --num-workers 8 --lr-decay 0.99 --log-wandb`

### ODE-RNN
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset physionet --task next_obs_prediction --ode_rnn -b 10 --wandb-project physionet_tacd_nobs_pred --lr 0.1 --epochs 100 --lr-decay 0.99 --grad-clip -lsd 32 --random-seed 0 --log-wandb`

### Latent-ODE
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset physionet --task next_obs_prediction --latent_ode -b 100 --wandb-project physionet_tacd_nobs_pred --lr 0.01 --epochs 100 --lr-decay 0.99 --grad-clip -lsd 32 --random-seed 0 --log-wandb`




## Running MIMIC-III single-step prediction
The following commands run the baseline and TACD-GRU models. To replicate the 3 distinct runs, set `--random-seed` to each of {0, 341, 700} in turn.

### mTAND
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --mTAND --dataset mimic --task next_obs_prediction -lsd 64 --wandb-project mimic_tacd_nobs_pred -b 1 --num-workers 8 --lr 0.001 --epochs 20 --mTAND-use-norm --lr-decay 0.99 --mTAND-num-ref-points 64 --log-wandb --random-seed 0`

### GRU-D
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grud --dataset mimic --task next_obs_prediction -lsd 32 --grad-clip --wandb-project mimic_tacd_nobs_pred -b 1 --num-workers 2 --lr 0.001 --epochs 20 --log-wandb --lr-decay 0.99 --random-seed 0`

### f-CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset mimic --task next_obs_prediction -lsd 64 --ts 0.0000000001 --enc-var-activation square --dec-var-activation square --trans-var-activation relu --grad-clip --num-basis 20 --bandwidth 10 --f-cru --lr 0.001 --epochs 20 --wandb-project mimic_tacd_nobs_pred -b 1 --log-wandb --num-workers 2 --random-seed 0`

### Latent-ODE
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset mimic --task next_obs_prediction --latent_ode -b 1 --wandb-project mimic_tacd_nobs_pred --log-wandb --lr 0.001 --epochs 20 --lr-decay 0.99 --grad-clip -lsd 32 --lode-rec-dim 32 --lode-units 32 --lode-gru-units 32 --lode-gen-layers 3 --lode-rec-layers 3 --num-workers 0 --random-seed 0`

### TACD-GRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset mimic --task next_obs_prediction -lsd 16 --grad-clip --wandb-project mimic_tacd_nobs_pred -b 1 --num-workers 8 --lr 0.001 --epochs 20 --lr-decay 0.99 --tacd-time-emb 64 --tacd-event-emb 64 --grudplus_ablation_mode no_ablation --log-wandb --random-seed 0`

### TACD-GRU-xc-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset mimic --task next_obs_prediction -lsd 16 --grad-clip --wandb-project mimic_tacd_nobs_pred -b 1 --num-workers 8 --lr 0.001 --epochs 20 --lr-decay 0.99 --tacd-time-emb 64 --tacd-event-emb 64 --grudplus_ablation_mode context_only --log-wandb --random-seed 0`

### TACD-GRU-xa-only
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --tacd_gru --grudplus_refine refinement_6 --dataset mimic --task next_obs_prediction -lsd 16 --grad-clip --wandb-project mimic_tacd_nobs_pred -b 1 --num-workers 8 --lr 0.001 --epochs 20 --lr-decay 0.99 --tacd-time-emb 64 --tacd-event-emb 64 --grudplus_ablation_mode attention_only --log-wandb --random-seed 0`

### CRU
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset mimic --task next_obs_prediction -lsd 32 --ts 0.0000000001 --enc-var-activation square --dec-var-activation square --trans-var-activation relu --grad-clip --num-basis 20 --bandwidth 10 --lr 0.001 --epochs 14 --wandb-project mimic_tacd_nobs_pred -b 1 --log-wandb --num-workers 2 --random-seed 0`

### ContiFormer
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --contiformer --dataset mimic --task next_obs_prediction -lsd 64 --grad-clip -b 1 --num-workers 2 --lr 0.001 --epochs 20 --lr-decay 0.99 --random-seed 0 --wandb-project mimic_tacd_nobs_pred --log-wandb`

### RKN-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --rkn --t-sensitive-trans-net --dataset mimic --task next_obs_prediction -lsd 32 --ts 0.0000000001 --grad-clip --lr 0.00005 --epochs 20 --wandb-project mimic_tacd_nobs_pred -b 1 --log-wandb --num-workers 2 --random-seed 0`

### GRU-Delta
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --grudelta --dataset mimic --task next_obs_prediction -lsd 16 --grad-clip --lr 0.001 --epochs 20 --wandb-project mimic_tacd_nobs_pred -b 1 --log-wandb --random-seed 0 --num-workers 8 --lr-decay 0.99`

### ODE-RNN
`$ CUDA_VISIBLE_DEVICES=0 python run_experiment.py --dataset mimic --task next_obs_prediction --ode_rnn -b 1 --wandb-project mimic_tacd_nobs_pred --lr 0.05 --epochs 20 --lr-decay 0.99 --grad-clip -lsd 32 --num-workers 0 --random-seed 0 --log-wandb`





## Generic notes
* To run time-normalized TACD-GRU, add the flag `--tacd-norm-time`.
* To add noise to the TACD-GRU context-based predictor, use `--tacd-add_noise 0.25`.
* We highly recommend setting up wandb, because stdout logging for some models is deprecated.
* Pass `--log-wandb` to log a run to the project named by `--wandb-project`.
