storm_kit.mpc.control.mppi module

class MPPI(d_action, horizon, init_cov, init_mean, base_action, beta, num_particles, step_size_mean, step_size_cov, alpha, gamma, kappa, n_iters, action_lows, action_highs, null_act_frac=0.0, rollout_fn=None, sample_mode='mean', hotstart=True, squash_fn='clamp', update_cov=False, cov_type='sigma_I', seed=0, sample_params={'filter_coeffs': None, 'fixed_samples': True, 'seed': 0, 'type': 'halton'}, tensor_args={'device': device(type='cpu'), 'dtype': torch.float32}, visual_traj='state_seq')[source]

Bases: storm_kit.mpc.control.olgaussian_mpc.OLGaussianMPC


Class that implements the Model Predictive Path Integral (MPPI) controller

The implementation is based on Williams et al., Information Theoretic MPC for Model-Based Reinforcement Learning, with additional functions for updating the covariance matrix and calculating the soft-value function.

Parameters
  • base_action (str) – Action appended at the end of the plan when the solution is shifted to the next timestep: ‘random’ appends a random action, ‘null’ appends a zero action, ‘repeat’ repeats the second-to-last action

  • num_particles (int) – Number of action sequences sampled at every iteration
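
A minimal instantiation sketch follows. Only the parameter names come from the signature above; the concrete values, tensor shapes, and the placeholder rollout_fn=None are illustrative assumptions, and a task-specific rollout function must be supplied in practice.

    import torch
    from storm_kit.mpc.control.mppi import MPPI

    tensor_args = {'device': torch.device('cpu'), 'dtype': torch.float32}
    horizon, d_action = 30, 7

    controller = MPPI(
        d_action=d_action,
        horizon=horizon,
        init_cov=0.5,                                              # illustrative value
        init_mean=torch.zeros(horizon, d_action, **tensor_args),   # assumed shape
        base_action='repeat',
        beta=1.0,
        num_particles=500,
        step_size_mean=0.9,
        step_size_cov=0.7,
        alpha=1,
        gamma=0.99,
        kappa=0.005,
        n_iters=1,
        action_lows=-torch.ones(d_action, **tensor_args),
        action_highs=torch.ones(d_action, **tensor_args),
        rollout_fn=None,   # supply a task-specific rollout object/function here
        tensor_args=tensor_args,
    )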

_calc_val(trajectories)[source]

Calculate the value of the current state given rollouts from the policy
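
The value referred to here is the free-energy of the rollout costs described by Williams et al. Below is a minimal standalone sketch of that computation, assuming total_costs holds one accumulated cost per sampled trajectory and beta is the temperature from the constructor; whether _calc_val follows exactly this form is not shown on this page.

    import torch

    def soft_value(total_costs: torch.Tensor, beta: float) -> torch.Tensor:
        """Free-energy estimate from N rollout costs, total_costs of shape (N,)."""
        # V ~= -beta * log( (1/N) * sum_i exp(-cost_i / beta) )
        m = total_costs.min()
        # subtract the minimum cost before exponentiating for numerical stability
        lse = torch.logsumexp(-(total_costs - m) / beta, dim=0)
        n = torch.tensor(float(total_costs.numel()), dtype=total_costs.dtype)
        return m - beta * (lse - torch.log(n))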

_control_costs(actions)[source]
_exp_util(costs, actions)[source]

Calculate per-trajectory weights using the exponential utility of the costs
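
A standalone sketch of this weighting, assuming costs is a length-N vector of per-trajectory costs and beta is the temperature; any control-cost term computed from actions is omitted here.

    import torch

    def exp_util_weights(costs: torch.Tensor, beta: float) -> torch.Tensor:
        """Map N trajectory costs to normalized importance weights of shape (N,)."""
        # w_i is proportional to exp(-(cost_i - min_cost) / beta); subtracting the
        # minimum cost keeps the exponent bounded and the softmax numerically stable.
        return torch.softmax(-(costs - costs.min()) / beta, dim=0)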

_shift(shift_steps)[source]

Predict good parameters for the next time step by shifting the mean forward one step and growing the covariance
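
A sketch of the shift applied to the mean (the covariance growth is omitted), assuming the mean is a (horizon, d_action) tensor; the function name and the distribution used for ‘random’ are illustrative.

    import torch

    def shift_mean(mean_seq: torch.Tensor, shift_steps: int = 1,
                   base_action: str = 'repeat') -> torch.Tensor:
        """Shift a (horizon, d_action) mean action sequence forward by shift_steps."""
        shifted = mean_seq.roll(-shift_steps, dims=0)
        if base_action == 'null':
            shifted[-shift_steps:] = 0.0                        # append zero actions
        elif base_action == 'repeat':
            shifted[-shift_steps:] = shifted[-shift_steps - 1]  # repeat last kept action
        elif base_action == 'random':
            shifted[-shift_steps:] = torch.randn_like(shifted[-shift_steps:])
        return shifted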

_update_distribution(trajectories)[source]

Update the moments of the action sampling distribution using sampled trajectories
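
A sketch of the mean update using the exponential-utility weights (the covariance update, enabled via update_cov and cov_type, is omitted); the function and argument names are illustrative.

    import torch

    def update_mean(mean_seq: torch.Tensor, sampled_actions: torch.Tensor,
                    weights: torch.Tensor, step_size_mean: float) -> torch.Tensor:
        """Blend the current mean toward the weighted average of sampled actions.

        mean_seq:        (horizon, d_action)
        sampled_actions: (num_particles, horizon, d_action)
        weights:         (num_particles,) exponential-utility weights summing to 1
        """
        weighted_mean = (weights[:, None, None] * sampled_actions).sum(dim=0)
        # step_size_mean controls how far the old mean moves toward the new estimate
        return (1.0 - step_size_mean) * mean_seq + step_size_mean * weighted_mean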