This repository contains implementations of model-free on-policy single-agent reinforcement learning algorithms.

MDP Generation Classes:
Env.py and utils.py

Algorithms Implemented:
FedQ_EarlySettled_simple.py: Our proposed FedQ-EarlySettled-LowCost algorithm
FedQHoeffding.py: FedQ-Hoeffding algorithm
FedQBernstein.py: FedQ-Bernstein algorithm
fed_adv.py: FedQ-Advantage algorithm


Experimental Setup:
All experiments are run on a server with Intel Xeon E5-2650v4 (2.2GHz) processors and 100 cores.
Each replication is limited to five cores and 15GB of RAM. The total execution time is about 15 hours.
Two test configurations are provided:
532single.py: Code submitted to the server for (H,S,A) = (5,3,2) 
7105single.py: Code submitted to the server for (H,S,A) = (7,10,5)
These configurations include all necessary hyper-parameter values for the experiments.
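
For readers unfamiliar with the (H,S,A) notation, the sketch below assumes the usual convention for episodic tabular MDPs: H is the number of steps per episode, S the number of states, and A the number of actions. It is only an illustration and is not the code in Env.py or utils.py. The names make_random_mdp and rollout, the Dirichlet transition kernel, and the uniform rewards in [0, 1] are all hypothetical choices made for this example.

```python
import numpy as np


def make_random_mdp(H, S, A, seed=0):
    """Build a random episodic tabular MDP (illustrative, not the repo's Env.py)."""
    rng = np.random.default_rng(seed)
    # P[h, s, a] is a probability vector over the S possible next states.
    P = rng.dirichlet(np.ones(S), size=(H, S, A))
    # R[h, s, a] is a deterministic reward in [0, 1] (an assumption of this sketch).
    R = rng.uniform(0.0, 1.0, size=(H, S, A))
    return P, R


def rollout(P, R, policy, rng, s0=0):
    """Run one episode under a deterministic policy[h, s] and return the total reward."""
    H, S, A = R.shape
    s, total = s0, 0.0
    for h in range(H):
        a = policy[h, s]
        total += R[h, s, a]
        s = rng.choice(S, p=P[h, s, a])
    return total


# Same sizes as the smaller test configuration, (H,S,A) = (5,3,2).
P, R = make_random_mdp(5, 3, 2)
policy = np.zeros((5, 3), dtype=int)  # always take action 0
episode_return = rollout(P, R, policy, np.random.default_rng(1))
```

With rewards bounded in [0, 1], the return of any single episode lies between 0 and H, which is a quick sanity check on an environment of this shape.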