# 🧾 Introduction


<p align="center">
  <img src="architecture.png" width="100%"/>
</p>

We consider the setting of online dynamics adaptation, where a policy is trained with abundant data from the source domain but only limited interaction with the target domain is allowed. In this paper, we study this domain adaptation problem from a generative modeling perspective. Specifically, we introduce DADiff, a diffusion-based framework that estimates the dynamics mismatch from the discrepancy between the source and target generative trajectories when generating the next state. We develop two variants, reward modification and data filtering, to adapt the policy to the target domain. We also provide a theoretical analysis showing that the performance difference of a given policy between the two domains is bounded by the generative trajectory deviation. The results demonstrate that our method outperforms existing approaches and effectively addresses the dynamics mismatch.
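As a rough illustration of the two variants, the sketch below assumes that a per-transition deviation score (the estimated dynamics mismatch) has already been computed by some model; it does not implement the diffusion model itself. The function names, the penalty form `r - beta * deviation`, and the `keep_percent` parameter are illustrative assumptions, not the code in `algo/`; see the paper for the exact definitions.

```python
# Minimal sketch under stated assumptions; NOT the repository's implementation.
# `deviations` stands in for the generative trajectory deviation of each
# source-domain transition, which is assumed to be computed elsewhere.

def modify_rewards(rewards, deviations, beta=0.1):
    """Reward modification (assumed form): penalize each source reward
    by beta times its estimated dynamics mismatch."""
    return [r - beta * d for r, d in zip(rewards, deviations)]


def select_transitions(deviations, keep_percent=50.0):
    """Data filtering (assumed form): return the indices of the
    keep_percent% of transitions with the smallest deviation."""
    n_keep = int(len(deviations) * keep_percent / 100.0)
    order = sorted(range(len(deviations)), key=lambda i: deviations[i])
    return sorted(order[:n_keep])


if __name__ == "__main__":
    rewards = [1.0, 2.0, 3.0]
    deviations = [0.0, 1.0, 4.0]
    print(modify_rewards(rewards, deviations, beta=0.5))
    print(select_transitions([0.0, 1.0, 4.0, 0.5], keep_percent=50.0))
```

In this sketch, a larger `beta` makes the policy more conservative about source transitions that look unlike the target dynamics, while a smaller `keep_percent` discards more source data.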



# 💻 Repository Structure

- `train.py`: Main script to run the DADiff algorithm.
- `algo/`: Contains the implementation of `DADiff-modify`, `DADiff-select` and `DAFlow-modify`.
- `config/`: Configuration files for each method.
- `envs/`: Contains the `xml` files for the different environments.
- `requirements.txt`: List of dependencies required to run the project.
- `dockerfile`: Dockerfile to set up the environment.

# 🛠️ Usage

## Prerequisites

Build a container using the provided Dockerfile, or set up a Python environment with the dependencies listed in `requirements.txt`.

## Running DADiff

Run the following command to execute the DADiff algorithm:
```sh
CUDA_VISIBLE_DEVICES=0 python train.py --policy dadiff_modify --env halfcheetah_morph --beta 0.1 --seed 2025 --dir runs
```
You can modify the parameters such as `--policy`, `--env`, `--beta` ($\lambda$ in the paper), `--seed`, `--gate_threshold` ($\xi\%$ in the paper), and `--dir` as needed.
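
To repeat a run over several seeds, a small launcher along the following lines may be convenient. This is only a sketch: the helper names `build_command` and `run_seeds` are hypothetical and not part of the repository, and it uses only the flags shown in the command above.

```python
import os
import subprocess


def build_command(policy, env, beta, seed, out_dir):
    """Assemble the train.py invocation using the flags from the README."""
    return [
        "python", "train.py",
        "--policy", policy,
        "--env", env,
        "--beta", str(beta),
        "--seed", str(seed),
        "--dir", out_dir,
    ]


def run_seeds(seeds, gpu="0", dry_run=True):
    """Build one command per seed; launch them sequentially unless dry_run."""
    commands = [
        build_command("dadiff_modify", "halfcheetah_morph", 0.1, seed, "runs")
        for seed in seeds
    ]
    if not dry_run:
        # Pin every run to the chosen GPU, as in the shell example.
        env_vars = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
        for cmd in commands:
            subprocess.run(cmd, env=env_vars, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: print the commands without launching any training.
    for cmd in run_seeds([2025, 2026, 2027]):
        print(" ".join(cmd))
```

Pass `dry_run=False` to actually launch the runs one after another from the repository root.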