Metadata-Version: 2.1
Name: scalable_mappo_lagr
Version: 0.1.0
Summary: Scalable MAPPO-Lagrangian (Scal-MAPPO-L) algorithms for marlbenchmark
Home-page: UNKNOWN
License: UNKNOWN
Description: # Scal-MAPPO-L
        
        ## 1. Usage
        All core code is located within the onpolicy folder. The algorithms/ subfolder contains algorithm-specific code
        for MAPPO. 
        
        * The envs/ subfolder contains environment wrapper implementations for the MPEs, SMAC, and Hanabi. 
        
        * Code to perform training rollouts and policy updates is contained within the runner/ folder; there is a runner for each environment. 
        
        * Executable scripts for training with default hyperparameters can be found in the scripts/ folder. The files are named
        in the following manner: train_algo_environment.sh. Within each file, the map name (in the case of SMAC and the MPEs) can be altered. 
        * Python training scripts for each environment can be found in the scripts/train/ folder. 
        
        * The config.py file contains relevant hyperparameter and environment settings. Most hyperparameters default to the values
        used in the paper; please refer to the appendix for the full list of hyperparameters. 
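Because config.py builds its settings with argparse, defaults can be overridden from the command line without editing the file. A minimal sketch of that pattern (the flag names `--lr` and `--num_env_steps` are illustrative assumptions; check config.py for the actual flags):

```python
import argparse

def get_config():
    # Sketch of a config.py-style parser; flag names here are illustrative.
    parser = argparse.ArgumentParser(description="hyperparameter settings")
    parser.add_argument("--lr", type=float, default=5e-4,
                        help="learning rate")
    parser.add_argument("--num_env_steps", type=int, default=10_000_000,
                        help="total environment steps")
    return parser

# Override one default from the command line instead of editing the file:
args = get_config().parse_args(["--lr", "1e-3"])
print(args.lr)             # 0.001 (overridden)
print(args.num_env_steps)  # 10000000 (default)
```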
        
        
        ## 2. Installation
        
        Here we give an example installation with CUDA 10.1. For CPU-only or other CUDA versions, please refer to the [PyTorch website](https://pytorch.org/get-started/locally/).
        
        ```bash
        # create conda environment
        conda create -n marl python==3.8
        conda activate marl
        pip install torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html
        ```
        
        ## 3. Train
        Here we use train_mujoco.sh as an example:
        ```bash
        cd scalable_mappo_lagr/scripts
        chmod +x ./train_mujoco.sh
        ./train_mujoco.sh
        ```
        Local results are stored in the scripts/results subfolder. Note that we use Weights & Biases as the default visualization platform; to use Weights & Biases, please register and log in to the platform first. More instructions for using Weights & Biases can be found in the official [documentation](https://docs.wandb.ai/). Adding `--use_wandb` on the command line or in the .sh file will use TensorBoard instead of Weights & Biases. 
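The flag semantics are slightly counter-intuitive: Weights & Biases is on by default, and passing `--use_wandb` switches logging to TensorBoard. A minimal argparse sketch of that behavior (an assumption based on the description above, not the repo's exact code):

```python
import argparse

parser = argparse.ArgumentParser()
# store_false: the value defaults to True (log to W&B); passing the flag
# sets it to False, which selects the TensorBoard logging branch instead.
parser.add_argument("--use_wandb", action="store_false", dest="use_wandb",
                    help="pass this flag to log with TensorBoard instead of W&B")

wandb_args = parser.parse_args([])            # no flag: Weights & Biases
tb_args = parser.parse_args(["--use_wandb"])  # flag given: TensorBoard
print(wandb_args.use_wandb, tb_args.use_wandb)  # True False
```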


        ## 4. Cite
        Our work builds on the following papers.
        ```
        @misc{yu2021surprising,
              title={The Surprising Effectiveness of MAPPO in Cooperative Multi-Agent Games}, 
              author={Chao Yu and Akash Velu and Eugene Vinitsky and Yu Wang and Alexandre Bayen and Yi Wu},
              year={2021},
              eprint={2103.01955},
              archivePrefix={arXiv},
              primaryClass={cs.LG}
        }

        @article{gu2023safe,
              title={Safe multi-agent reinforcement learning for multi-robot control},
              author={Gu, Shangding and Kuba, Jakub Grudzien and Chen, Yuanpei and Du, Yali and Yang, Long and Knoll, Alois and Yang, Yaodong},
              journal={Artificial Intelligence},
              volume={319},
              pages={103905},
              year={2023},
              publisher={Elsevier}
        }
        ```

Keywords: multi-agent reinforcement learning platform pytorch
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
