Code for Minimum Description Length Control 

Hyperparameters are as described in the paper, Minimum Description Length Control. To train in the sequential setting, run train_agent_seq.py with --env set to the desired environment. A task ordering can be specified via the --order flag; e.g., for the walker domain, '0-1-2' trains walker-stand, walker-walk, and walker-run, in that order. If no order is specified, a random ordering of the tasks in the specified environment is chosen. To train an agent in the parallel setting, pass the number of parallel tasks as num_tasks to the agent (defined in mdlc_sac.py).
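As a sketch, a sequential run on the walker domain might look like the following. The script name and the --env/--order flags come from this README; the exact value expected by --env (e.g., 'walker') is an assumption:

```shell
# Train MDL-C sequentially on walker-stand -> walker-walk -> walker-run
# ('walker' as the --env value is an assumption; use your environment's name)
python train_agent_seq.py --env walker --order 0-1-2

# Omit --order to train the walker tasks in a random order
python train_agent_seq.py --env walker
```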

File descriptions:
- train_agent_seq.py: train MDL-C on sequential tasks 
- train_agent_single.py: functions for training and evaluating the control policy on a single task 
- mdlc_sac.py: agent class 
- networks.py: network architectures and custom layers for policy, Q-functions, and variational dropout 
- utils.py: helper functions