# **Hierarchical Subspaces of Policies for Continual Offline Reinforcement Learning**

This project accompanies the [research article](./HILOW.pdf) of the same name, which explores continual reinforcement learning methods for navigation tasks in an offline setting.

## **Code and Datasets**

The code repository and the datasets are available [here](https://drive.google.com/drive/folders/1X5NTzN5ADsU69xhOVxW08lpRufpQTo78?usp=sharing). Download the `datasets.zip` file and unzip it inside the CODE folder (which should contain a copy of this README.md file).
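
If you prefer to extract the archive from a script rather than by hand, the following is a minimal sketch using only the Python standard library. The helper name `extract_datasets` is hypothetical and not part of this repository, and the sketch assumes `datasets.zip` has already been downloaded manually from the link above.

```python
import zipfile
from pathlib import Path


def extract_datasets(zip_path, dest_dir="."):
    """Extract a zip archive into dest_dir and return its top-level entries.

    Hypothetical helper for illustration; it is not provided by this repository.
    """
    zip_path = Path(zip_path)
    dest_dir = Path(dest_dir)
    if not zip_path.is_file():
        raise FileNotFoundError(f"Archive not found: {zip_path}")
    with zipfile.ZipFile(zip_path) as archive:
        # Unpack every member of the archive under dest_dir.
        archive.extractall(dest_dir)
        names = archive.namelist()
    # Report the distinct top-level files or folders that were extracted.
    return sorted({name.split("/")[0] for name in names if name})
```

Calling `extract_datasets("datasets.zip")` from inside the CODE folder should unpack the archive next to this README. The names of the extracted folders depend on the archive contents, which are not assumed here.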

## **Installation**

See the installation page for your platform to set up and use the library: [Windows 11](./installation/WINDOWS.md) / [WSL2](./installation/WSL.md) / [Linux (Ubuntu 22.04)](./installation/LINUX.md).

## **Abstract**

*In dynamic domains such as autonomous robotics and video game simulations, agents must continuously adapt to new tasks while retaining previously acquired skills.
This ongoing process, known as Continual Reinforcement Learning, presents significant challenges, including the risk of forgetting past knowledge and the need for scalable solutions as the number of tasks increases.
To address these issues, we introduce **HIerarchical LOW-rank Subspaces of Policies (HILOW)**, a novel framework designed for continual learning in offline navigation settings.
HILOW leverages hierarchical policy subspaces to enable flexible and efficient adaptation to new tasks while preserving existing knowledge. We demonstrate, through a careful experimental study, the effectiveness of our method in both classical MuJoCo maze environments and complex video game-like simulations, showcasing competitive performance and satisfying adaptability according to classical continual learning metrics, in particular regarding memory usage.
Our work provides a promising framework for real-world applications where continuous learning from pre-collected data is essential.*

## **Objectives**

The core objective of this repository is to support the development of learning agents that improve continuously and remain ready for future tasks, without retraining from scratch each time a new task is encountered.

Specifically, the system aims to adapt video game bots to new game versions efficiently. When a new version of the game is released, the current bot should adapt to the new environment at minimal computational cost, leveraging prior knowledge to handle variations quickly.

___
___
