Optimization

class gmmvi.optimization.sample_db.SampleDB(dim, diagonal_covariances, keep_samples, max_samples=None)

A database for storing samples and meta-information.

Along with the samples, we also store:

  1. The parameters of the Gaussian distribution that were used for obtaining each sample

  2. log-density evaluations of the target distribution, \log p(\mathbf{x})

  3. (if available) gradients of the log-densities of the target distribution, \nabla_\mathbf{x} \log p(\mathbf{x})

Parameters:
  • dim – int dimensionality of the samples to be stored

  • diagonal_covariances – bool True, if the samples are always drawn from Gaussians with diagonal covariances (saves memory)

  • keep_samples – bool If this is False, the samples are not actually stored

  • max_samples – int Maximum number of samples to store. If adding new samples would exceed this limit, every N-th sample in the database is deleted.

add_samples(samples, means, chols, target_lnpdfs, target_grads, mapping)

Add the given samples to the database.

Parameters:
  • samples – tf.Tensor a two-dimensional tensor of shape num_samples x num_dimensions containing the samples to be added.

  • means – tf.Tensor a two-dimensional tensor containing the mean of each Gaussian distribution that was used for obtaining the samples. The first dimension of the tensor can be smaller than the number of samples if several samples were drawn from the same Gaussian (see the parameter mapping).

  • chols – tf.Tensor a three-dimensional tensor containing the Cholesky matrix of each Gaussian distribution that was used for obtaining the samples. The first dimension of the tensor can be smaller than the number of samples if several samples were drawn from the same Gaussian (see the parameter mapping).

  • target_lnpdfs – tf.Tensor a one-dimensional tensor containing the log-densities of the (unnormalized) target distribution, \log p(\mathbf{x}).

  • target_grads – tf.Tensor a two-dimensional tensor containing the gradients of the log-densities of the (unnormalized) target distribution, \nabla_{\mathbf{x}} \log p(\mathbf{x}).

  • mapping – tf.Tensor a tensor of size number_of_samples that contains, for every sample, the index into means and chols of the Gaussian distribution that was used for drawing that sample.
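The role of mapping is easiest to see on a small example. The following is a minimal sketch in plain Python (not TensorFlow, and not calling SampleDB itself) with made-up values; expand_by_mapping is a hypothetical helper introduced only for illustration.

```python
# Two 2-D Gaussians, but three samples: the first two samples share Gaussian 0.
means = [[0.0, 0.0], [5.0, 5.0]]                   # 2 x 2
chols = [[[1.0, 0.0], [0.0, 1.0]],                 # 2 x 2 x 2
         [[2.0, 0.0], [0.5, 1.0]]]
samples = [[0.1, -0.3], [-0.7, 0.4], [5.9, 4.2]]   # 3 x 2
mapping = [0, 0, 1]                                # one index per sample


def expand_by_mapping(per_gaussian, mapping):
    """Hypothetical helper: look up the per-Gaussian entry for every sample."""
    return [per_gaussian[index] for index in mapping]


means_per_sample = expand_by_mapping(means, mapping)
chols_per_sample = expand_by_mapping(chols, mapping)
```

With tensors, the same lookup is what tf.gather(means, mapping) computes. Storing only the distinct Gaussians together with mapping avoids duplicating a mean and a Cholesky matrix for every sample.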

static build_from_config(config, num_dimensions)

A static method to conveniently create a SampleDB from a given config dictionary.

Parameters:
  • config – dict The dictionary is typically read from a YAML file and holds all hyperparameters.

  • num_dimensions – int dimensionality of the samples to be stored

evaluate_background(weights, means, chols, inv_chols, samples)

Evaluates the log-densities of the given samples on a GMM with the given parameters. This function is implemented in a memory-efficient way to scale to mixture models with many components.

Parameters:
  • weights – tf.Tensor The weights of the GMM that should be evaluated

  • means – tf.Tensor The means of the GMM that should be evaluated

  • chols – tf.Tensor The Cholesky matrices of the GMM that should be evaluated

  • inv_chols – tf.Tensor The inverses of the Cholesky matrices given in chols

  • samples – tf.Tensor The samples to be evaluated
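To make the quantity concrete: the log-density of a GMM is \log q(\mathbf{x}) = \log \sum_k w_k \mathcal{N}(\mathbf{x}; \boldsymbol{\mu}_k, \mathbf{L}_k \mathbf{L}_k^\top). The sketch below evaluates it in plain Python for a single point, visiting one component at a time with a running log-sum-exp. This is only one possible way to keep memory low and is an assumption, not the library's implementation; gmm_log_density is a hypothetical name, it takes only the inverse Cholesky factors, and it assumes strictly positive weights.

```python
import math


def gmm_log_density(weights, means, inv_chols, x):
    """Log-density of a GMM at a single point x, one component at a time.

    A running log-sum-exp keeps only two scalars between components instead
    of a full (num_components x num_samples) matrix of log-terms.
    """
    dim = len(x)
    running_max = -math.inf   # largest log-term seen so far
    running_sum = 0.0         # sum of exp(log_term - running_max)
    for w, mu, inv_chol in zip(weights, means, inv_chols):
        diff = [xi - mi for xi, mi in zip(x, mu)]
        # z = L^{-1} (x - mu); its squared norm is the Mahalanobis distance.
        z = [sum(inv_chol[r][c] * diff[c] for c in range(dim)) for r in range(dim)]
        # log det(L^{-1}) = sum of the log-diagonal (L^{-1} is triangular).
        log_det_inv = sum(math.log(inv_chol[r][r]) for r in range(dim))
        log_term = (math.log(w) + log_det_inv
                    - 0.5 * sum(zi * zi for zi in z)
                    - 0.5 * dim * math.log(2.0 * math.pi))
        if log_term > running_max:
            # Rescale the accumulated sum to the new maximum.
            running_sum = running_sum * math.exp(running_max - log_term) + 1.0
            running_max = log_term
        else:
            running_sum += math.exp(log_term - running_max)
    return running_max + math.log(running_sum)


# Two identical 1-D standard-normal components with equal weights.
log_q = gmm_log_density([0.5, 0.5], [[0.0], [0.0]], [[[1.0]], [[1.0]]], [0.0])
```

Because both components coincide, log_q equals the log-density of a single standard normal at zero, -0.5 \log(2\pi).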

get_newest_samples(N)

Returns (up to) the N newest samples, and their meta-information.

Returns:
  • log_pdfs – the log-densities of the GMM that was effectively used for drawing the samples (used for importance sampling)

  • active_sample – the selected samples

  • active_mapping – contains, for every sample, the index of the component that was used for drawing it

  • active_target_lnpdfs – log-density evaluations of the target distribution for the selected samples

  • active_target_grads – gradient evaluations of the log-density of the target distribution for the selected samples

Return type:
tuple(tf.Tensor, tf.Tensor, tf.Tensor, tf.Tensor, tf.Tensor)
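The text above only states that log_pdfs is used for importance sampling; how the GMMVI modules consume it is not shown here. As a generic illustration of why the sampling density is needed, the following plain-Python sketch computes self-normalized importance weights, proportional to \tilde{p}(\mathbf{x}_i) / q(\mathbf{x}_i), from log-values. The function name and the numbers are hypothetical.

```python
import math


def self_normalized_importance_weights(target_lnpdfs, log_pdfs):
    """Weights proportional to p~(x_i) / q(x_i), normalized to sum to one."""
    log_ratios = [lp - lq for lp, lq in zip(target_lnpdfs, log_pdfs)]
    max_log_ratio = max(log_ratios)   # subtract the max for numerical stability
    unnormalized = [math.exp(lr - max_log_ratio) for lr in log_ratios]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Every sample has the same log-ratio here, so the weights come out uniform.
weights = self_normalized_importance_weights([-1.0, -2.0, -3.0],
                                             [-1.5, -2.5, -3.5])
```

Working with differences of log-densities, rather than ratios of densities, avoids underflow when the densities are very small.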

get_random_sample(N: int)

Get N random samples from the database.

Parameters:

N – int the number of random samples to return

Returns:
  • samples – the chosen samples

  • target_lnpdfs – the corresponding log-densities of the target distribution

Return type:
tuple(tf.Tensor, tf.Tensor)

remove_every_nth_sample(N: int)

Deletes every N-th sample from the database, together with the associated meta-information.

Parameters:

N – int the deletion interval; every N-th sample is removed
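The thinning can be pictured with plain Python lists. This is a sketch under an assumption: the documentation does not say which positions count as "every N-th", so the hypothetical helper below removes positions N, 2N, ... (counted from 1); the actual offset in SampleDB may differ.

```python
def remove_every_nth(rows, n):
    """Return rows without every n-th entry (assumed positions n, 2n, ..., 1-based)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [row for position, row in enumerate(rows, start=1)
            if position % n != 0]


samples = ["s1", "s2", "s3", "s4", "s5", "s6", "s7"]
target_lnpdfs = [-1.0, -2.0, -3.0, -4.0, -5.0, -6.0, -7.0]

# Apply the same thinning to the samples and to their meta-information
# so that both stay aligned.
samples = remove_every_nth(samples, 3)
target_lnpdfs = remove_every_nth(target_lnpdfs, 3)
```

Thinning at a regular interval, rather than dropping the oldest entries, keeps samples from all stages of the optimization in the database.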

class gmmvi.optimization.gmmvi.GMMVI(model: GmmWrapper, sample_db: SampleDB, temperature: tf.float32, sample_selector: SampleSelector, num_component_adapter: ComponentAdaptation, component_stepsize_adapter: ComponentStepsizeAdaptation, ng_estimator: NgEstimator, ng_based_updater: NgBasedComponentUpdater, weight_stepsize_adapter: WeightStepsizeAdaptation, weight_updater: WeightUpdater)

The main class of this framework, which provides the functionality to perform a complete update step for the GMM.

Responsibilities for performing the necessary sub-steps (sample selection, natural gradient estimation, etc.) and for keeping track of data are delegated to the GMMVI Modules, the SampleDB and GmmWrapper. Hence, this class acts mainly as a manager between these components.

Parameters:
  • model – GmmWrapper The (wrapped) model that we are optimizing.

  • sample_db – SampleDB The database for storing samples.

  • temperature – tf.float32 The temperature parameter \beta for weighting the model entropy H(q) in the optimization problem \arg\max_q \mathbb{E}_{q(x)}\left[ \log(\tilde{p}(x)) \right] + \beta H(q).

  • sample_selector – SampleSelector The SampleSelector for selecting the samples that are used during each iteration.

  • num_component_adapter – NumComponentAdaptation The NumComponentAdapter used for adding and deleting components.

  • component_stepsize_adapter – ComponentStepsizeAdaptation The ComponentStepsizeAdapter for choosing the learning rates for the component update.

  • ng_estimator – NgEstimator The NgEstimator for estimating the natural gradient for the component update.

  • ng_based_updater – NgBasedComponentUpdater The NgBasedComponentUpdater for updating the components based on the estimated natural gradients.

  • weight_stepsize_adapter – WeightStepsizeAdaptation The WeightStepsizeAdapter for choosing the learning rate for updating the mixture weights.

  • weight_updater – WeightUpdater The WeightUpdater for updating the mixture weights.
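To illustrate the objective that temperature enters, the sketch below estimates \mathbb{E}_q[\log \tilde{p}(x)] + \beta H(q) by Monte Carlo in plain Python. It is a toy under stated assumptions: q is a single univariate Gaussian rather than a GMM, the target is a standard normal, the entropy is estimated from the same samples as -\mathbb{E}_q[\log q(x)], and all function names are hypothetical.

```python
import math
import random


def gaussian_log_density(x, mean, std):
    """Log-density of a univariate Gaussian N(mean, std^2)."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def estimate_objective(target_lnpdf, q_mean, q_std, beta, num_samples, rng):
    """Monte Carlo estimate of E_q[log p~(x)] + beta * H(q) for a Gaussian q."""
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(q_mean, q_std)
        # -log q(x) is a single-sample estimate of the entropy H(q).
        total += target_lnpdf(x) - beta * gaussian_log_density(x, q_mean, q_std)
    return total / num_samples


def target_lnpdf(x):
    """Toy target: a standard normal N(0, 1)."""
    return gaussian_log_density(x, 0.0, 1.0)


rng = random.Random(0)
matched = estimate_objective(target_lnpdf, 0.0, 1.0, beta=1.0,
                             num_samples=1000, rng=rng)
shifted = estimate_objective(target_lnpdf, 3.0, 1.0, beta=1.0,
                             num_samples=1000, rng=rng)
```

For \beta = 1 the objective equals the negative KL divergence from q to the normalized target, up to the target's log-normalizer. It is therefore maximal when q matches the target (matched) and lower for the shifted q.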

static build_from_config(config: dict, target_distribution: LNPDF, model: GmmWrapper)

Create a GMMVI instance from a configuration dictionary.

This static method provides a convenient way to create a GMMVI instance, based on an initial GMM (a wrapped model), a target_distribution and a dictionary containing the types and parameters of the GMMVI modules.

Parameters:
  • config – dict The dictionary should contain, for each GMMVI module, an entry of the form XXX_type (a string) and an entry of the form XXX_config (a dict), which specify the type of the module and its module-specific hyperparameters. For example, the dictionary could contain sample_selector_type="component-based" and sample_selector_config={"desired_samples_per_component": 100, "ratio_reused_samples_to_desired": 2.}. Refer to the example YAML configs, or to the individual GMMVI modules, for the expected parameters and type strings.

  • target_distribution – LNPDF The (unnormalized) target distribution that we want to approximate.

  • model – GmmWrapper The (wrapped) model that we are optimizing.
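The XXX_type / XXX_config naming convention can be sketched with an ordinary Python dictionary. Only the two sample_selector entries below come from the description of config; a complete configuration needs entries for the other modules as well, and their keys are not listed here. split_module_entries is a hypothetical helper that shows how such pairs group together; it is not part of gmmvi.

```python
config = {
    # Entries quoted in the description of the config parameter.
    "sample_selector_type": "component-based",
    "sample_selector_config": {
        "desired_samples_per_component": 100,
        "ratio_reused_samples_to_desired": 2.0,
    },
}


def split_module_entries(config):
    """Hypothetical helper: group XXX_type / XXX_config pairs by module name XXX."""
    modules = {}
    for key, value in config.items():
        if key.endswith("_type"):
            name = key[: -len("_type")]
            # A missing XXX_config is treated as "no hyperparameters" here.
            modules[name] = (value, config.get(name + "_config", {}))
    return modules


modules = split_module_entries(config)
```

Keeping the type string and the hyperparameters in separate entries lets a YAML file swap one module implementation for another without touching the remaining settings.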

train_iter()

Perform a single training iteration.

This method takes no parameters and returns nothing. However, it may have several side effects, such as

  • drawing new samples from the model and evaluating them on the target distribution,

  • updating the gmmvi.optimization.gmmvi.GMMVI.model parameters,

  • adapting learning rates, etc.
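Since train_iter() works purely through side effects, an optimization run amounts to calling it in a loop. The sketch below shows that pattern; run_training is a hypothetical helper, and CountingStub is a stand-in for a GMMVI instance so that the example runs without the library installed.

```python
def run_training(optimizer, num_iterations):
    """Call train_iter() repeatedly; all progress happens through side effects."""
    for _ in range(num_iterations):
        optimizer.train_iter()


class CountingStub:
    """Stand-in for a GMMVI instance so that this sketch runs on its own."""

    def __init__(self):
        self.num_calls = 0

    def train_iter(self):
        self.num_calls += 1


stub = CountingStub()
run_training(stub, num_iterations=5)
```

In real use, the object passed to run_training would be a GMMVI instance, for example one created with build_from_config, and the updated mixture would be read from its model afterwards.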