
# DS4 - Decision S4

## Overview

An implementation of DS4, which uses S4 to model MuJoCo reinforcement learning tasks as sequence-to-sequence problems.
A more comprehensive version will be uploaded upon acceptance.

The model is implemented using variants of the [S4 architecture](https://arxiv.org/abs/2111.00396).
Inference runs one step at a time. Each step uses only its current inputs, the previous action, and the S4 state, without looking back at earlier inputs.
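
To illustrate why a single step can be enough at inference time, here is a minimal sketch of a toy diagonal linear state space recurrence. It is not the S4 layer used in this repository: the function names (`ssm_step`, `ssm_full_sequence`) and the parameter values are assumptions chosen for illustration. It shows that carrying a state forward gives the same outputs as re-reading the whole input history.

```python
# Toy diagonal linear state space model (illustrative only, not the repository's S4 layer).
# Recurrence: x_k = a * x_{k-1} + b * u_k,   y_k = sum_i c_i * x_k[i]


def ssm_step(state, u, a, b, c):
    """Advance one step using only the current input u and the previous state."""
    new_state = [a_i * x_i + b_i * u for a_i, x_i, b_i in zip(a, state, b)]
    y = sum(c_i * x_i for c_i, x_i in zip(c, new_state))
    return y, new_state


def ssm_full_sequence(inputs, a, b, c):
    """Reference computation that re-reads the full input history for every output."""
    outputs = []
    for k in range(len(inputs)):
        y = 0.0
        for j in range(k + 1):
            # Kernel value at lag (k - j): sum_i c_i * a_i**(k - j) * b_i
            kernel = sum(c_i * (a_i ** (k - j)) * b_i for a_i, b_i, c_i in zip(a, b, c))
            y += kernel * inputs[j]
        outputs.append(y)
    return outputs


# Hypothetical parameters and inputs, chosen only for the demonstration.
a = [0.9, 0.5]
b = [1.0, 0.5]
c = [0.3, -0.2]
inputs = [1.0, 0.0, 2.0, -1.0]

# Step-by-step inference: the state is the only memory carried between steps.
state = [0.0, 0.0]
stepwise = []
for u in inputs:
    y, state = ssm_step(state, u, a, b, c)
    stepwise.append(y)

full = ssm_full_sequence(inputs, a, b, c)
```

The two output lists agree up to floating-point error, which is the property that lets a recurrent model act at each environment step without storing past observations.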

Based on the base code from [Decision Transformer: Reinforcement Learning via Sequence Modeling](https://sites.google.com/berkeley.edu/decision-transformer).

![image info](./base_architecture.png)

## Instructions

Requires Python 3.8, PyTorch and cudatoolkit.
Prerequisites and dataset downloads are listed in:
```area_prepare/installation.sh```

For example, to run an experiment:
```
python reinforcementlearn-dt-s4/gym/experiment.py --env walker2d --dataset medium --max_iters 50 --num_steps_per_iter 2000 --s4_singlestep 1
```
This command will create a log file in the experiment directory.

The environment (`--env`) can be halfcheetah, walker2d or hopper.
The dataset (`--dataset`) can be medium, medium-replay or expert.
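
To cover every environment and dataset combination, the command lines can be generated programmatically. The sketch below only builds and prints the commands and does not launch them. The helper `build_command` is hypothetical, and the flag values other than `--env` and `--dataset` are copied from the example command above.

```python
from itertools import product

# Environment and dataset names as listed in this README.
ENVS = ["halfcheetah", "walker2d", "hopper"]
DATASETS = ["medium", "medium-replay", "expert"]


def build_command(env, dataset, max_iters=50, num_steps_per_iter=2000):
    """Return the argument list for one experiment run (hypothetical helper)."""
    return [
        "python", "reinforcementlearn-dt-s4/gym/experiment.py",
        "--env", env,
        "--dataset", dataset,
        "--max_iters", str(max_iters),
        "--num_steps_per_iter", str(num_steps_per_iter),
        "--s4_singlestep", "1",
    ]


# One command per (environment, dataset) pair.
commands = [build_command(env, dataset) for env, dataset in product(ENVS, DATASETS)]

for cmd in commands:
    print(" ".join(cmd))
```

Each printed line can be pasted into a shell, or the argument lists can be passed to `subprocess.run` to launch the runs one after another.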
