Good settings found by hyper-parameter search on the Reacher domain (`NS_Reacher`), listed per algorithm and per speed.

The relevant hyper-parameters are:

   - actor_lr
   - delta
   - entropy_lambda
   - fourier_k
   - NN_basis_dim
   - importance_clip
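
Each entry below is a raw Python dict dump, so the relevant keys can be pulled out programmatically rather than read by eye. The following is a minimal sketch, assuming an entry line has been copied into a string; `parse_setting` and `RELEVANT_KEYS` are hypothetical names introduced here for illustration and are not part of the original code base. The example string is an abbreviated copy of the ProOLS "Speed 0" entry.

```python
import ast

# Keys this note lists as relevant, spelled as they appear in the dumps.
# (RELEVANT_KEYS and parse_setting are illustrative names, not from the repo.)
RELEVANT_KEYS = ("actor_lr", "delta", "entropy_lambda", "fourier_k",
                 "NN_basis_dim", "importance_clip")


def parse_setting(line):
    """Parse one '- Speed N :  {...}' entry into (speed, relevant hyper-params)."""
    # Everything from the first '{' onward is the dict literal.
    _, _, dict_text = line.partition("{")
    config = ast.literal_eval("{" + dict_text.strip())
    relevant = {key: config[key] for key in RELEVANT_KEYS}
    # NN_basis_dim is stored as a string in the dumps, so convert it to int.
    relevant["NN_basis_dim"] = int(relevant["NN_basis_dim"])
    return config["speed"], relevant


# Abbreviated copy of the ProOLS "Speed 0" entry (only a few keys kept).
example = ("- Speed 0 :  {'NN_basis_dim': '16', 'actor_lr': 0.0025151323441388726, "
           "'delta': 5, 'entropy_lambda': 0.47150591457186375, 'fourier_k': 3, "
           "'importance_clip': 10.0, 'speed': 0}")

speed, params = parse_setting(example)
print(speed, params)
```

`ast.literal_eval` is used instead of `eval` because it only accepts Python literals, which is all these dumps contain.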

-----------------


### ProOLS


- Speed 0 :  {'NN_basis_dim': '16', 'Policy_basis_dim': '32', 'actor_lr': 0.0025151323441388726, 'algo_name': 'ProOLS', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 5, 'entropy_lambda': 0.47150591457186375, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '660_NS_0_-1_NS_Reacher_Fourier_1000_5_150_3_10.0_0.47150591457186375_0.0025151323441388726_0.0010542534035091538_0.99_False_-1_16_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 3, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher0', 'importance_clip': 10.0, 'inc': 19819, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 150, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 19, 'speed': 0, 'state_lr': 0.0010542534035091538, 'summary': True, 'swarm': True, 'timestamp': '8|5|22:11:39'}

- Speed 1 :  {'NN_basis_dim': '64', 'Policy_basis_dim': '32', 'actor_lr': 0.0009087136254135959, 'algo_name': 'ProOLS', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 1, 'entropy_lambda': 0.544920264272891, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '420_NS_1_-1_NS_Reacher_Fourier_1000_1_30_7_10.0_0.544920264272891_0.0009087136254135959_0.00025351719449965676_0.99_False_-1_64_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher0', 'importance_clip': 10.0, 'inc': 12618, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 30, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 18, 'speed': 1, 'state_lr': 0.00025351719449965676, 'summary': True, 'swarm': True, 'timestamp': '8|5|23:30:22'}

- Speed 2 :  {'NN_basis_dim': '32', 'Policy_basis_dim': '32', 'actor_lr': 0.00010410478690710495, 'algo_name': 'ProOLS', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 1, 'entropy_lambda': 0.20749601514073318, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '477_NS_2_-1_NS_Reacher_Fourier_1000_1_10_5_10.0_0.20749601514073318_0.00010410478690710495_0.0018353258829523621_0.99_False_-1_32_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 5, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher0', 'importance_clip': 10.0, 'inc': 14318, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 10, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 8, 'speed': 2, 'state_lr': 0.0018353258829523621, 'summary': True, 'swarm': True, 'timestamp': '8|5|21:45:57'}

- Speed 3 :  {'NN_basis_dim': '16', 'Policy_basis_dim': '32', 'actor_lr': 8.984181548239339e-05, 'algo_name': 'ProOLS', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 1, 'entropy_lambda': 0.025105471059275894, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '142_NS_3_-1_NS_Reacher_Fourier_1000_1_30_5_5.0_0.025105471059275894_8.984181548239339e-05_0.0017452121140649852_0.99_False_-1_16_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 5, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher0', 'importance_clip': 5.0, 'inc': 4278, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 30, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 18, 'speed': 3, 'state_lr': 0.0017452121140649852, 'summary': True, 'swarm': True, 'timestamp': '8|5|22:11:20'}

- Speed 4 :  {'NN_basis_dim': '64', 'Policy_basis_dim': '32', 'actor_lr': 6.169702330038067e-05, 'algo_name': 'ProOLS', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 1, 'entropy_lambda': 0.18385178579023276, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '486_NS_4_-1_NS_Reacher_Fourier_1000_1_10_7_10.0_0.18385178579023276_6.169702330038067e-05_0.0013967876705326396_0.99_False_-1_64_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher0', 'importance_clip': 10.0, 'inc': 14599, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 10, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 19, 'speed': 4, 'state_lr': 0.0013967876705326396, 'summary': True, 'swarm': True, 'timestamp': '8|5|22:2:37'}

### ProWLS


- Speed 0 :  {'NN_basis_dim': '16', 'Policy_basis_dim': '32', 'actor_lr': 0.00038637556467132825, 'algo_name': 'ProWLS', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 3, 'entropy_lambda': 0.852303948290338, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '78_NS_0_-1_NS_Reacher_Fourier_1000_3_90_3_10.0_0.852303948290338_0.00038637556467132825_0.002298474857474361_0.99_False_-1_16_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 3, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher1', 'importance_clip': 10.0, 'inc': 2358, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 90, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 18, 'speed': 0, 'state_lr': 0.002298474857474361, 'summary': True, 'swarm': True, 'timestamp': '8|5|22:28:31'}

- Speed 1 :  {'NN_basis_dim': '16', 'Policy_basis_dim': '32', 'actor_lr': 0.0008983323779180773, 'algo_name': 'ProWLS', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 1, 'entropy_lambda': 0.03658414163684841, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '177_NS_1_-1_NS_Reacher_Fourier_1000_1_30_3_10.0_0.03658414163684841_0.0008983323779180773_0.0006282757553755635_0.99_False_-1_16_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 3, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher1', 'importance_clip': 10.0, 'inc': 5338, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 30, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 28, 'speed': 1, 'state_lr': 0.0006282757553755635, 'summary': True, 'swarm': True, 'timestamp': '8|5|22:24:4'}

- Speed 2 :  {'NN_basis_dim': '16', 'Policy_basis_dim': '32', 'actor_lr': 0.001183383633096802, 'algo_name': 'ProWLS', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 1, 'entropy_lambda': 0.04798482575060038, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '391_NS_2_-1_NS_Reacher_Fourier_1000_1_30_3_10.0_0.04798482575060038_0.001183383633096802_8.766331348170455e-05_0.99_False_-1_16_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 3, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher1', 'importance_clip': 10.0, 'inc': 11738, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 30, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 8, 'speed': 2, 'state_lr': 8.766331348170455e-05, 'summary': True, 'swarm': True, 'timestamp': '8|5|22:31:47'}

- Speed 3 :  {'NN_basis_dim': '16', 'Policy_basis_dim': '32', 'actor_lr': 5.76279709125753e-05, 'algo_name': 'ProWLS', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 1, 'entropy_lambda': 0.08523441021197838, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '303_NS_3_-1_NS_Reacher_Fourier_1000_1_30_5_5.0_0.08523441021197838_5.76279709125753e-05_0.0011939296813377324_0.99_False_-1_16_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 5, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher1', 'importance_clip': 5.0, 'inc': 9098, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 30, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 8, 'speed': 3, 'state_lr': 0.0011939296813377324, 'summary': True, 'swarm': True, 'timestamp': '8|5|22:48:48'}

- Speed 4 :  {'NN_basis_dim': '16', 'Policy_basis_dim': '32', 'actor_lr': 0.0007014196697630904, 'algo_name': 'ProWLS', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 3, 'entropy_lambda': 0.13518553754347654, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '183_NS_4_-1_NS_Reacher_Fourier_1000_3_60_3_5.0_0.13518553754347654_0.0007014196697630904_6.65399331805308e-05_0.99_False_-1_16_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 3, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher1', 'importance_clip': 5.0, 'inc': 5498, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 60, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 8, 'speed': 4, 'state_lr': 6.65399331805308e-05, 'summary': True, 'swarm': True, 'timestamp': '8|5|22:44:49'}

### ONPG

- Speed 0 :  {'NN_basis_dim': '64', 'Policy_basis_dim': '32', 'actor_lr': 0.0025564383610271126, 'algo_name': 'ONPG', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 3, 'entropy_lambda': 0.8150803872189464, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '110_NS_0_-1_NS_Reacher_Fourier_1000_3_10.0_0.8150803872189464_0.0025564383610271126_0.00010291532385191142_0.99_False_-1_64_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher2', 'importance_clip': 10.0, 'inc': 3319, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 150, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 19, 'speed': 0, 'state_lr': 0.00010291532385191142, 'summary': True, 'swarm': True, 'timestamp': '8|5|21:12:33'}

- Speed 1 :  {'NN_basis_dim': '64', 'Policy_basis_dim': '32', 'actor_lr': 0.0011295723307354554, 'algo_name': 'ONPG', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 3, 'entropy_lambda': 0.971054508524769, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '623_NS_1_-1_NS_Reacher_Fourier_1000_3_5.0_0.971054508524769_0.0011295723307354554_0.0016376207911007281_0.99_False_-1_64_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher2', 'importance_clip': 5.0, 'inc': 18699, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 150, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 9, 'speed': 1, 'state_lr': 0.0016376207911007281, 'summary': True, 'swarm': True, 'timestamp': '8|5|21:18:5'}

- Speed 2 :  {'NN_basis_dim': '16', 'Policy_basis_dim': '32', 'actor_lr': 0.0036088123335136233, 'algo_name': 'ONPG', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 5, 'entropy_lambda': 0.9492547361189186, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '275_NS_2_-1_NS_Reacher_Fourier_1000_5_10.0_0.9492547361189186_0.0036088123335136233_8.034672630538665e-05_0.99_False_-1_16_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher2', 'importance_clip': 10.0, 'inc': 8259, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 150, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 9, 'speed': 2, 'state_lr': 8.034672630538665e-05, 'summary': True, 'swarm': True, 'timestamp': '8|5|21:13:59'}

- Speed 3 :  {'NN_basis_dim': '64', 'Policy_basis_dim': '32', 'actor_lr': 0.0003416956372421785, 'algo_name': 'ONPG', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 5, 'entropy_lambda': 0.08662175254994073, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '637_NS_3_-1_NS_Reacher_Fourier_1000_5_15.0_0.08662175254994073_0.0003416956372421785_0.002256403073773172_0.99_False_-1_64_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher2', 'importance_clip': 15.0, 'inc': 19139, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 150, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 29, 'speed': 3, 'state_lr': 0.002256403073773172, 'summary': True, 'swarm': True, 'timestamp': '8|5|21:18:5'}

- Speed 4 :  {'NN_basis_dim': '32', 'Policy_basis_dim': '32', 'actor_lr': 0.0009665699604673259, 'algo_name': 'ONPG', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 5, 'entropy_lambda': 0.057159323347818515, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '237_NS_4_-1_NS_Reacher_Fourier_1000_5_5.0_0.057159323347818515_0.0009665699604673259_0.0013582022521035483_0.99_False_-1_32_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher2', 'importance_clip': 5.0, 'inc': 7138, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 150, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 28, 'speed': 4, 'state_lr': 0.0013582022521035483, 'summary': True, 'swarm': True, 'timestamp': '8|5|21:14:8'}

### FTRL-PG (OFPG)

- Speed 0 :  {'NN_basis_dim': '64', 'Policy_basis_dim': '32', 'actor_lr': 0.002817936246364995, 'algo_name': 'OFPG', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 5, 'entropy_lambda': 0.6422433796467522, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '323_NS_0_-1_NS_Reacher_Fourier_1000_5_50_5.0_0.6422433796467522_0.002817936246364995_0.0044767050895122016_0.99_False_-1_64_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher3', 'importance_clip': 5.0, 'inc': 9699, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 50, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 9, 'speed': 0, 'state_lr': 0.0044767050895122016, 'summary': True, 'swarm': True, 'timestamp': '8|5|22:30:36'}

- Speed 1 :  {'NN_basis_dim': '32', 'Policy_basis_dim': '32', 'actor_lr': 0.0016571651445758976, 'algo_name': 'OFPG', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 3, 'entropy_lambda': 0.24140084170275078, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '325_NS_1_-1_NS_Reacher_Fourier_1000_3_90_10.0_0.24140084170275078_0.0016571651445758976_0.001988346646864447_0.99_False_-1_32_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher3', 'importance_clip': 10.0, 'inc': 9779, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 90, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 29, 'speed': 1, 'state_lr': 0.001988346646864447, 'summary': True, 'swarm': True, 'timestamp': '8|5|22:56:26'}

- Speed 2 :  {'NN_basis_dim': '64', 'Policy_basis_dim': '32', 'actor_lr': 5.008159176581739e-05, 'algo_name': 'OFPG', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 5, 'entropy_lambda': 0.0907275730455918, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '92_NS_2_-1_NS_Reacher_Fourier_1000_5_150_15.0_0.0907275730455918_5.008159176581739e-05_0.0025805013849066246_0.99_False_-1_64_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher3', 'importance_clip': 15.0, 'inc': 2778, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 150, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 18, 'speed': 2, 'state_lr': 0.0025805013849066246, 'summary': True, 'swarm': True, 'timestamp': '8|5|23:8:53'}

- Speed 3 :  {'NN_basis_dim': '16', 'Policy_basis_dim': '32', 'actor_lr': 0.003918249107531506, 'algo_name': 'OFPG', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 5, 'entropy_lambda': 0.2598993570667304, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '238_NS_3_-1_NS_Reacher_Fourier_1000_5_150_5.0_0.2598993570667304_0.003918249107531506_0.0037569489060064994_0.99_False_-1_16_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher3', 'importance_clip': 5.0, 'inc': 7158, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 150, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 18, 'speed': 3, 'state_lr': 0.0037569489060064994, 'summary': True, 'swarm': True, 'timestamp': '8|5|22:31:59'}

- Speed 4 :  {'NN_basis_dim': '64', 'Policy_basis_dim': '32', 'actor_lr': 0.0023792644585433676, 'algo_name': 'OFPG', 'base': 0, 'batch_size': 1000, 'buffer_size': 1000, 'debug': False, 'delta': 5, 'entropy_lambda': 0.2971209546693609, 'env_name': 'NS_Reacher', 'experiment': 'NS', 'extrapolator_basis': 'Fourier', 'folder_suffix': '400_NS_4_-1_NS_Reacher_Fourier_1000_5_100_5.0_0.2971209546693609_0.0023792644585433676_0.002408705523471258_0.99_False_-1_64_1000_1000_100_rmsprop_True_False_False_False_term_0', 'fourier_coupled': True, 'fourier_k': 7, 'fourier_order': -1, 'gamma': 0.99, 'gauss_std': 1.5, 'gpu': 0, 'hyper': 'Reacher3', 'importance_clip': 5.0, 'inc': 12018, 'log_output': 'term', 'max_episodes': 1000, 'max_inner': 100, 'max_steps': 500, 'optim': 'rmsprop', 'oracle': -1, 'raw_basis': False, 'restore': False, 'save_count': 100, 'save_model': False, 'seed': 18, 'speed': 4, 'state_lr': 0.002408705523471258, 'summary': True, 'swarm': True, 'timestamp': '8|5|23:25:23'}
