NgEstimator¶

- class gmmvi.optimization.gmmvi_modules.ng_estimator.NgEstimator(temperature, model: GmmWrapper, requires_gradient: bool, only_use_own_samples: bool, use_self_normalized_importance_weights: bool)[source]¶
This class provides a common interface for estimating the natural gradient for a Gaussian component.
There are currently two options for estimating the natural gradient:
- The MoreNgEstimator uses compatible function approximation to estimate the natural gradient from a quadratic reward surrogate [ALL+15, PTA+19, PS08, SMSM99].
- The SteinNgEstimator uses Stein's Lemma to estimate the natural gradient using first-order information [LKS19b].
- Parameters:
temperature – float Usually temperature=1.; can be used to scale the importance of maximizing the model entropy.
model – GmmWrapper The wrapped model whose components we want to update.
requires_gradient – bool Does this object require first-order information?
only_use_own_samples – bool If true, we do not use importance sampling to update one component based on samples from a different component.
use_self_normalized_importance_weights – bool If true, use self-normalized importance weighting (normalizing the importance weights such that they sum to one) rather than standard importance weighting.
- static build_from_config(config, temperature, gmm_wrapper)[source]¶
This static method provides a convenient way to create a MoreNgEstimator or a SteinNgEstimator, depending on the provided config.
- Parameters:
config – dict The dictionary is typically read from a YAML file and holds all hyperparameters.
temperature – float Usually temperature=1.; can be used to scale the importance of maximizing the model entropy.
gmm_wrapper – GmmWrapper The wrapped model whose components we want to update.
- get_expected_hessian_and_grad(samples: Tensor, mapping: Tensor, background_densities: Tensor, target_lnpdfs: Tensor, target_lnpdfs_grads: Tensor)[source]¶
Performs the natural gradient estimation; needs to be implemented by the deriving class.
- Parameters:
samples – tf.Tensor A tensor of shape num_samples x num_dimensions containing the samples used for the approximation.
mapping – tf.Tensor A one-dimensional tensor of integers, storing for every sample the index of the component it was sampled from.
background_densities – tf.Tensor The log-probability densities of the background distribution (which was used for sampling the provided samples); a one-dimensional tensor of size num_samples.
target_lnpdfs – tf.Tensor The rewards are given by the log-densities of the target distribution, $\log \tilde{p}(\mathbf{x}_i)$.
target_lnpdfs_grads – tf.Tensor The gradients of the target_lnpdfs with respect to the samples, $\nabla_{\mathbf{x}_i} \log \tilde{p}(\mathbf{x}_i)$.
- Returns:
expected_hessian_neg - A tensor of shape num_components x num_dimensions x num_dimensions containing for each component an estimate of the (negated) expected Hessian
expected_gradient_neg - A tensor of shape num_components x num_dimensions containing for each component an estimate of the (negated) expected gradient
- Return type:
tuple(tf.Tensor, tf.Tensor)
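The effect of the use_self_normalized_importance_weights option can be illustrated with a minimal NumPy sketch. This is not gmmvi code; the one-dimensional proposal and target densities below are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples = 5000

# Background (proposal) distribution q = N(0, 1); target p = N(1, 1).
samples = rng.normal(0.0, 1.0, size=num_samples)
log_bg = -0.5 * samples**2 - 0.5 * np.log(2.0 * np.pi)
log_target = -0.5 * (samples - 1.0)**2 - 0.5 * np.log(2.0 * np.pi)

# Standard importance weighting: w_i = p(x_i) / q(x_i), averaged over N.
log_w = log_target - log_bg
w_standard = np.exp(log_w) / num_samples

# Self-normalized importance weighting: normalize the weights such that they
# sum to one (via log-sum-exp for numerical stability). This variant also
# works when the target log-densities are only known up to a constant.
m = np.max(log_w)
w_selfnorm = np.exp(log_w - (m + np.log(np.sum(np.exp(log_w - m)))))

# Both weightings yield Monte-Carlo estimates of E_p[x] = 1.
est_standard = np.sum(w_standard * samples)
est_selfnorm = np.sum(w_selfnorm * samples)
```

Self-normalization introduces a small bias but typically reduces variance and, importantly, does not require the target density to be normalized.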
MoreNgEstimator¶
- class gmmvi.optimization.gmmvi_modules.ng_estimator.MoreNgEstimator(temperature, model, only_use_own_samples: bool, initial_l2_regularizer: float, use_self_normalized_importance_weights: bool)[source]¶
Use compatible function approximation to estimate the natural gradient from a quadratic reward surrogate. See [ALL+15, PTA+19, PS08, SMSM99].
- Parameters:
temperature – float Usually temperature=1.; can be used to scale the importance of maximizing the model entropy.
model – GmmWrapper The wrapped model whose components we want to update.
only_use_own_samples – bool If true, we do not use importance sampling to update one component based on samples from a different component.
initial_l2_regularizer – float The l2_regularizer is used as a regularizer during weighted least squares (ridge regression) for fitting the compatible surrogate.
use_self_normalized_importance_weights – bool If true, use self-normalized importance weighting (normalizing the importance weights such that they sum to one) rather than standard importance weighting.
- get_expected_hessian_and_grad(samples: Tensor, mapping: Tensor, background_densities: Tensor, target_lnpdfs: Tensor, target_lnpdfs_grads: Tensor) → tuple[tf.Tensor, tf.Tensor][source]¶
Estimates the natural gradient using compatible function approximation. This method does not require or make use of the provided gradients; it only uses the function evaluations target_lnpdfs for estimating the natural gradient. The method fits a quadratic reward function $\tilde{R}(\mathbf{x}) = \mathbf{x}^\top \mathbf{R} \mathbf{x} + \mathbf{x}^\top \mathbf{r} + r_0$ to approximate the target distribution using importance-weighted least squares, where the targets are given by target_lnpdfs, $\log \tilde{p}(\mathbf{x}_i)$. The natural gradient estimate can then be computed from the coefficients $\mathbf{R}$ and $\mathbf{r}$.
- Parameters:
samples – tf.Tensor A tensor of shape num_samples x num_dimensions containing the samples used for the approximation.
mapping – tf.Tensor A one-dimensional tensor of integers, storing for every sample the index of the component it was sampled from.
background_densities – tf.Tensor The log-probability densities of the background distribution (which was used for sampling the provided samples); a one-dimensional tensor of size num_samples.
target_lnpdfs – tf.Tensor The rewards are given by the (unnormalized) log-densities of the target distribution, $\log \tilde{p}(\mathbf{x}_i)$.
target_lnpdfs_grads – tf.Tensor The gradients of the target_lnpdfs with respect to the samples (not used), $\nabla_{\mathbf{x}_i} \log \tilde{p}(\mathbf{x}_i)$.
- Returns:
expected_hessian_neg - A tensor of shape num_components x num_dimensions x num_dimensions containing for each component an estimate of the (negated) expected Hessian
expected_gradient_neg - A tensor of shape num_components x num_dimensions containing for each component an estimate of the (negated) expected gradient
- Return type:
tuple(tf.Tensor, tf.Tensor)
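The core of this estimator, fitting a quadratic surrogate by ridge regression and reading off its Hessian and gradient, can be sketched in a few lines of NumPy. This is not the gmmvi implementation: it omits the importance weights for clarity and uses a made-up quadratic "reward" in place of the target log-density, so that the recovered coefficients can be checked against ground truth:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 2

# A known quadratic "reward" standing in for the target log-density:
#   R(x) = -0.5 x^T H x + g^T x, so its Hessian is -H and its gradient at 0 is g.
H = np.array([[2.0, 0.5], [0.5, 1.0]])
g = np.array([1.0, -1.0])

samples = rng.normal(0.0, 1.0, size=(200, dim))
targets = -0.5 * np.einsum('ni,ij,nj->n', samples, H, samples) + samples @ g

# Design matrix: upper-triangular quadratic monomials, linear terms, and a bias.
iu = list(zip(*np.triu_indices(dim)))
quad_feats = np.stack([samples[:, i] * samples[:, j] for i, j in iu], axis=1)
phi = np.concatenate([quad_feats, samples, np.ones((len(samples), 1))], axis=1)

# Ridge regression; the l2 term plays the role of initial_l2_regularizer.
l2_regularizer = 1e-6
coeffs = np.linalg.solve(phi.T @ phi + l2_regularizer * np.eye(phi.shape[1]),
                         phi.T @ targets)

# Reassemble the surrogate's Hessian and gradient from the fitted coefficients.
A = np.zeros((dim, dim))
for k, (i, j) in enumerate(iu):
    A[i, j] = A[j, i] = coeffs[k] if i == j else 0.5 * coeffs[k]
hessian_fit = 2.0 * A                      # Hessian of x^T A x + b^T x + c is 2A
grad_fit = coeffs[len(iu):len(iu) + dim]   # gradient of the surrogate at x = 0
```

Because the surrogate is fit from function evaluations alone, this estimator is applicable even when gradients of the target log-density are unavailable.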
SteinNgEstimator¶
- class gmmvi.optimization.gmmvi_modules.ng_estimator.SteinNgEstimator(temperature, model, only_use_own_samples: bool, use_self_normalized_importance_weights: bool)[source]¶
Use Stein’s Lemma to estimate the natural gradient using first-order information. See [LKS19b].
- Parameters:
temperature – float Usually temperature=1.; can be used to scale the importance of maximizing the model entropy.
model – GmmWrapper The wrapped model whose components we want to update.
only_use_own_samples – bool If true, we do not use importance sampling to update one component based on samples from a different component.
use_self_normalized_importance_weights – bool If true, use self-normalized importance weighting (normalizing the importance weights such that they sum to one) rather than standard importance weighting.
- get_expected_hessian_and_grad(samples: Tensor, mapping: Tensor, background_densities: Tensor, target_lnpdfs: Tensor, target_lnpdfs_grads: Tensor)[source]¶
Estimates the natural gradient using Stein's Lemma [LKS19b]. The expected gradient is a simple importance-weighted Monte-Carlo estimate based on the provided target_lnpdfs_grads and the gradients of the component log-densities. The expected Hessians are estimated as $\mathbb{E}\left[\boldsymbol{\Sigma}^{-1}(\mathbf{x}_i - \boldsymbol{\mu})\,\mathbf{g}_i^\top\right]$, where $\mathbf{g}_i$ is the gradient of the log-ratio with respect to the corresponding sample.
- Parameters:
samples – tf.Tensor A tensor of shape num_samples x num_dimensions containing the samples used for the approximation.
mapping – tf.Tensor A one-dimensional tensor of integers, storing for every sample the index of the component it was sampled from.
background_densities – tf.Tensor The log-probability densities of the background distribution (which was used for sampling the provided samples); a one-dimensional tensor of size num_samples.
target_lnpdfs – tf.Tensor The rewards are given by the log-densities of the target distribution, $\log \tilde{p}(\mathbf{x}_i)$.
target_lnpdfs_grads – tf.Tensor The gradients of the target_lnpdfs with respect to the samples, $\nabla_{\mathbf{x}_i} \log \tilde{p}(\mathbf{x}_i)$.
- Returns:
expected_hessian_neg - A tensor of shape num_components x num_dimensions x num_dimensions containing for each component an estimate of the (negated) expected Hessian
expected_gradient_neg - A tensor of shape num_components x num_dimensions containing for each component an estimate of the (negated) expected gradient
- Return type:
tuple(tf.Tensor, tf.Tensor)
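The Stein-based Hessian estimate rests on Stein's Lemma: for $\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, $\mathbb{E}[\boldsymbol{\Sigma}^{-1}(\mathbf{x} - \boldsymbol{\mu})\,\nabla f(\mathbf{x})^\top] = \mathbb{E}[\nabla^2 f(\mathbf{x})]$, so first-order information suffices to estimate an expected Hessian. A minimal NumPy sketch, not the gmmvi implementation (it uses the target gradient rather than the log-ratio gradient and a made-up quadratic log-density so the estimate can be checked against the exact Hessian):

```python
import numpy as np

rng = np.random.default_rng(2)
num_samples = 50000

# Gaussian component q = N(mu, Sigma).
mu = np.array([0.5, -0.5])
Sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
Sigma_inv = np.linalg.inv(Sigma)

# Quadratic target log-density log p(x) = -0.5 x^T H x (up to a constant),
# so grad log p(x) = -H x and the exact Hessian is the constant -H.
H = np.array([[1.5, 0.2], [0.2, 0.8]])

chol = np.linalg.cholesky(Sigma)
samples = mu + rng.normal(size=(num_samples, 2)) @ chol.T
grads = -(samples @ H)  # row i holds grad log p(x_i); H is symmetric

# Stein estimate: E_q[grad^2 log p(x)] ≈ (1/N) sum_i Sigma^{-1}(x_i - mu) grad_i^T
scaled = (samples - mu) @ Sigma_inv
hessian_est = scaled.T @ grads / num_samples
hessian_est = 0.5 * (hessian_est + hessian_est.T)  # symmetrize the MC estimate
```

Because only gradient evaluations enter the estimate, this avoids computing second derivatives of the target log-density altogether.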