<!DOCTYPE html>
<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Customize your own flow layer · NormalizingFlows.jl</title><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.045/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.24/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><div class="docs-package-name"><span class="docs-autofit"><a href="../">NormalizingFlows.jl</a></span></div><form class="docs-search" action="../search/"><input class="docs-search-query" id="documenter-search-query" name="q" type="text" placeholder="Search docs"/></form><ul class="docs-menu"><li><a class="tocitem" 
href="../">Home</a></li><li><a class="tocitem" href="../api/">API</a></li><li><a class="tocitem" href="../example/">Example</a></li><li class="is-active"><a class="tocitem" href>Customize your own flow layer</a><ul class="internal"><li><a class="tocitem" href="#Affine-Coupling-Flow"><span>Affine Coupling Flow</span></a></li><li><a class="tocitem" href="#Implementing-Affine-Coupling-Layer"><span>Implementing Affine Coupling Layer</span></a></li><li><a class="tocitem" href="#Reference"><span>Reference</span></a></li></ul></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>Customize your own flow layer</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>Customize your own flow layer</a></li></ul></nav><div class="docs-right"><a class="docs-edit-link" href="https://github.com/TuringLang/NormalizingFlows.jl/blob/main/docs/src/customized_layer.md#" title="Edit on GitHub"><span class="docs-icon fab"></span><span class="docs-label is-hidden-touch">Edit on GitHub</span></a><a class="docs-settings-button fas fa-cog" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-sidebar-button fa fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a></div></header><article class="content" id="documenter-page"><h1 id="Defining-Your-Own-Flow-Layer"><a class="docs-heading-anchor" href="#Defining-Your-Own-Flow-Layer">Defining Your Own Flow Layer</a><a id="Defining-Your-Own-Flow-Layer-1"></a><a class="docs-heading-anchor-permalink" href="#Defining-Your-Own-Flow-Layer" title="Permalink"></a></h1><p>In practice, users may want to define 
their own normalizing flow.  As briefly noted in <a href="../#What-are-normalizing-flows?">What are normalizing flows?</a>, the key is to define a customized normalizing flow layer, including its transformation and inverse, as well as the log-determinant of the Jacobian of the transformation. <code>Bijectors.jl</code> offers a convenient interface to define a customized bijection. We refer users to <a href="https://turinglang.org/Bijectors.jl/dev/transforms/#Implementing-a-transformation">the documentation of <code>Bijectors.jl</code></a> for more details. <code>Flux.jl</code> is also a useful package, offering a convenient interface to define neural networks.</p><p>In this tutorial, we demonstrate how to define a customized normalizing flow layer – an <code>Affine Coupling Layer</code> (Dinh <em>et al.</em>, 2016) – using <code>Bijectors.jl</code> and <code>Flux.jl</code>.</p><h2 id="Affine-Coupling-Flow"><a class="docs-heading-anchor" href="#Affine-Coupling-Flow">Affine Coupling Flow</a><a id="Affine-Coupling-Flow-1"></a><a class="docs-heading-anchor-permalink" href="#Affine-Coupling-Flow" title="Permalink"></a></h2><p>Given an input vector <span>$\boldsymbol{x}$</span>, the general <em>coupling transformation</em> splits it into two parts: <span>$\boldsymbol{x}_{I_1}$</span> and <span>$\boldsymbol{x}_{I\setminus I_1}$</span>. Only one part (e.g., <span>$\boldsymbol{x}_{I_1}$</span>) undergoes a bijective transformation <span>$f$</span>, called the <em>coupling law</em>, whose parameters depend on the values of the other part (e.g., <span>$\boldsymbol{x}_{I\setminus I_1}$</span>), which itself remains unchanged. </p><p class="math-container">\[\begin{array}{llll}
c_{I_1}(\cdot ; f, \theta): &amp; \mathbb{R}^d \rightarrow \mathbb{R}^d &amp; c_{I_1}^{-1}(\cdot ; f, \theta): &amp; \mathbb{R}^d \rightarrow \mathbb{R}^d \\
&amp; \boldsymbol{x}_{I \backslash I_1} \mapsto \boldsymbol{x}_{I \backslash I_1} &amp; &amp; \boldsymbol{y}_{I \backslash I_1} \mapsto \boldsymbol{y}_{I \backslash I_1} \\
&amp; \boldsymbol{x}_{I_1} \mapsto f\left(\boldsymbol{x}_{I_1} ; \theta\left(\boldsymbol{x}_{I\setminus I_1}\right)\right) &amp; &amp; \boldsymbol{y}_{I_1} \mapsto f^{-1}\left(\boldsymbol{y}_{I_1} ; \theta\left(\boldsymbol{y}_{I\setminus I_1}\right)\right)
\end{array}\]</p><p>Here <span>$\theta$</span> can be an arbitrary function, e.g., a neural network. As long as <span>$f(\cdot; \theta(\boldsymbol{x}_{I\setminus I_1}))$</span> is invertible, <span>$c_{I_1}$</span> is invertible, and the  Jacobian determinant of <span>$c_{I_1}$</span> is easy to compute:</p><p class="math-container">\[\left|\text{det} \nabla_x c_{I_1}(x)\right| = \left|\text{det} \nabla_{x_{I_1}} f(x_{I_1}; \theta(x_{I\setminus I_1}))\right|\]</p><p>The affine coupling layer is a special case of the coupling transformation, where the coupling law <span>$f$</span> is an affine function:</p><p class="math-container">\[\begin{aligned}
\boldsymbol{x}_{I_1} &amp;\mapsto \boldsymbol{x}_{I_1} \odot s\left(\boldsymbol{x}_{I\setminus I_1}\right) + t\left(\boldsymbol{x}_{I \setminus I_1}\right) \\
\boldsymbol{x}_{I \backslash I_1} &amp;\mapsto \boldsymbol{x}_{I \backslash I_1}
\end{aligned}\]</p><p>Here, <span>$s$</span> and <span>$t$</span> are arbitrary functions (often neural networks) called the &quot;scaling&quot; and &quot;translation&quot; functions, respectively.  They produce vectors of the same dimension as <span>$\boldsymbol{x}_{I_1}$</span>, and the transformation is invertible as long as every entry of <span>$s\left(\boldsymbol{x}_{I\setminus I_1}\right)$</span> is nonzero.</p><h2 id="Implementing-Affine-Coupling-Layer"><a class="docs-heading-anchor" href="#Implementing-Affine-Coupling-Layer">Implementing Affine Coupling Layer</a><a id="Implementing-Affine-Coupling-Layer-1"></a><a class="docs-heading-anchor-permalink" href="#Implementing-Affine-Coupling-Layer" title="Permalink"></a></h2><p>We start by defining a simple 3-layer multi-layer perceptron (MLP) using <code>Flux.jl</code>,  which will serve as the scaling function <span>$s$</span> and the translation function <span>$t$</span> in the affine coupling layer.</p><pre><code class="language-julia hljs">using Flux

function MLP_3layer(input_dim::Int, hdims::Int, output_dim::Int; activation=Flux.leakyrelu)
    return Chain(
        Flux.Dense(input_dim, hdims, activation),
        Flux.Dense(hdims, hdims, activation),
        Flux.Dense(hdims, output_dim),
    )
end</code></pre><pre class="documenter-example-output"><code class="nohighlight hljs ansi">MLP_3layer (generic function with 1 method)</code></pre><h4 id="Construct-the-Object"><a class="docs-heading-anchor" href="#Construct-the-Object">Construct the Object</a><a id="Construct-the-Object-1"></a><a class="docs-heading-anchor-permalink" href="#Construct-the-Object" title="Permalink"></a></h4><p>Following the user interface of <code>Bijectors.jl</code>, we define a struct <code>AffineCoupling</code> as a subtype of <code>Bijectors.Bijector</code>. The functions <code>partition</code> and <code>combine</code> split a vector into three disjoint subvectors and recombine them, and <code>PartitionMask</code> stores the partition rule.  All three are defined in <code>Bijectors.jl</code>; see the <a href="https://github.com/TuringLang/Bijectors.jl/blob/49c138fddd3561c893592a75b211ff6ad949e859/src/bijectors/coupling.jl#L3">documentation</a> for more details.</p><pre><code class="language-julia hljs">using Functors
using Bijectors
using Bijectors: partition, combine, PartitionMask

struct AffineCoupling &lt;: Bijectors.Bijector
    dim::Int
    mask::Bijectors.PartitionMask
    s::Flux.Chain
    t::Flux.Chain
end

# to apply functions to the parameters that are contained in AffineCoupling.s and AffineCoupling.t,
# and to re-build the struct from the parameters, we use the functor interface of `Functors.jl`
# see https://fluxml.ai/Flux.jl/stable/models/functors/#Functors.functor
@functor AffineCoupling (s, t)

function AffineCoupling(
    dim::Int,  # dimension of input
    hdims::Int, # dimension of hidden units for s and t
    mask_idx::AbstractVector, # index of dimension that one wants to apply transformations on
)
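    cdims = length(mask_idx) # number of coordinates that are transformed
    # s and t take the unchanged part (dim - cdims entries) as input and
    # return one scale/shift per transformed coordinate (cdims entries)
    s = MLP_3layer(dim - cdims, hdims, cdims)
    t = MLP_3layer(dim - cdims, hdims, cdims)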
    mask = PartitionMask(dim, mask_idx)
    return AffineCoupling(dim, mask, s, t)
end</code></pre><pre class="documenter-example-output"><code class="nohighlight hljs ansi">Main.AffineCoupling</code></pre><p>By default, we define <span>$s$</span> and <span>$t$</span> using the <code>MLP_3layer</code> function, which is a 3-layer MLP with leaky ReLU activation function.</p><h4 id="Implement-the-Forward-and-Inverse-Transformations"><a class="docs-heading-anchor" href="#Implement-the-Forward-and-Inverse-Transformations">Implement the Forward and Inverse Transformations</a><a id="Implement-the-Forward-and-Inverse-Transformations-1"></a><a class="docs-heading-anchor-permalink" href="#Implement-the-Forward-and-Inverse-Transformations" title="Permalink"></a></h4><pre><code class="language-julia hljs">function Bijectors.transform(af::AffineCoupling, x::AbstractVector)
    # partition vector using `af.mask::PartitionMask`
    x₁, x₂, x₃ = partition(af.mask, x)
    y₁ = x₁ .* af.s(x₂) .+ af.t(x₂)
    return combine(af.mask, y₁, x₂, x₃)
end

function Bijectors.transform(iaf::Inverse{&lt;:AffineCoupling}, y::AbstractVector)
    af = iaf.orig
    # partition vector using `af.mask::PartitionMask`
    y_1, y_2, y_3 = partition(af.mask, y)
    # inverse transformation
    x_1 = (y_1 .- af.t(y_2)) ./ af.s(y_2)
    return combine(af.mask, x_1, y_2, y_3)
end</code></pre><h4 id="Implement-the-Log-determinant-of-the-Jacobian"><a class="docs-heading-anchor" href="#Implement-the-Log-determinant-of-the-Jacobian">Implement the Log-determinant of the Jacobian</a><a id="Implement-the-Log-determinant-of-the-Jacobian-1"></a><a class="docs-heading-anchor-permalink" href="#Implement-the-Log-determinant-of-the-Jacobian" title="Permalink"></a></h4><p>Notice that here we wrap the transformation and the log-determinant of the Jacobian into a single function, <code>with_logabsdet_jacobian</code>.</p><pre><code class="language-julia hljs">function Bijectors.with_logabsdet_jacobian(af::AffineCoupling, x::AbstractVector)
    x_1, x_2, x_3 = Bijectors.partition(af.mask, x)
    y_1 = af.s(x_2) .* x_1 .+ af.t(x_2)
    logjac = sum(log ∘ abs, af.s(x_2))
    return combine(af.mask, y_1, x_2, x_3), logjac
end
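
# --- Optional sanity check (an illustrative sketch, not part of Bijectors.jl) ---
# The formula used above, logjac = sum(log|s(x_2)|), can be compared against a
# finite-difference Jacobian on a small hand-written coupling map.
# All names below (`toy_s`, `toy_t`, `toy_coupling`, `fd_jacobian`, ...) are
# hypothetical helpers introduced only for this check.
using LinearAlgebra

toy_s(x2) = [1.5 + x2[1]^2, 0.5 + x2[2]^2]   # scaling, bounded away from zero
toy_t(x2) = [sin(x2[1]), x2[1] * x2[2]]      # translation
# transform the first two coordinates, keep the last two unchanged
toy_coupling(x) = vcat(x[1:2] .* toy_s(x[3:4]) .+ toy_t(x[3:4]), x[3:4])

# central finite-difference Jacobian of f at x
function fd_jacobian(f, x; h=1e-6)
    n = length(x)
    J = zeros(n, n)
    for j in 1:n
        e = zeros(n)
        e[j] = h
        J[:, j] = (f(x .+ e) .- f(x .- e)) ./ (2h)
    end
    return J
end

toy_x = [0.3, -1.2, 0.7, 2.0]
toy_logjac_formula = sum(log ∘ abs, toy_s(toy_x[3:4]))
toy_logjac_numeric = first(logabsdet(fd_jacobian(toy_coupling, toy_x)))
# the two values should agree up to finite-difference error
@assert isapprox(toy_logjac_formula, toy_logjac_numeric; atol=1e-6)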

function Bijectors.with_logabsdet_jacobian(
    iaf::Inverse{&lt;:AffineCoupling}, y::AbstractVector
)
    af = iaf.orig
    # partition vector using `af.mask::PartitionMask`
    y_1, y_2, y_3 = partition(af.mask, y)
    # inverse transformation
    x_1 = (y_1 .- af.t(y_2)) ./ af.s(y_2)
    logjac = -sum(log ∘ abs, af.s(y_2))
    return combine(af.mask, x_1, y_2, y_3), logjac
end</code></pre><h4 id="Construct-Normalizing-Flow"><a class="docs-heading-anchor" href="#Construct-Normalizing-Flow">Construct Normalizing Flow</a><a id="Construct-Normalizing-Flow-1"></a><a class="docs-heading-anchor-permalink" href="#Construct-Normalizing-Flow" title="Permalink"></a></h4><p>With the pieces above in place, we can build a normalizing flow by composing several <code>AffineCoupling</code> layers and applying the composition to a base distribution <span>$q_0$</span>. The layers alternate between transforming coordinates 1 to 2 and coordinates 3 to 4, so that every coordinate is updated.</p><pre><code class="language-julia hljs">using Random, Distributions, LinearAlgebra
dim = 4
hdims = 10
Ls = [
    AffineCoupling(dim, hdims, 1:2),
    AffineCoupling(dim, hdims, 3:4),
    AffineCoupling(dim, hdims, 1:2),
    AffineCoupling(dim, hdims, 3:4),
]
ts = reduce(∘, Ls)
q₀ = MvNormal(zeros(Float32, dim), I)
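
# Optional check (a sketch that relies on `ts` and `dim` defined above; the names
# `x_chk`, `y_chk`, `x_rec`, `roundtrip_err` and `logjac_gap` are introduced here
# for illustration only). The composed map should be undone by its inverse, and
# the forward and inverse log-determinants should cancel up to Float32 round-off.
x_chk = randn(Float32, dim)
y_chk, logjac_fwd = Bijectors.with_logabsdet_jacobian(ts, x_chk)
x_rec, logjac_inv = Bijectors.with_logabsdet_jacobian(Bijectors.inverse(ts), y_chk)
roundtrip_err = maximum(abs, x_rec .- x_chk)  # expected to be small
logjac_gap = abs(logjac_fwd + logjac_inv)     # expected to be small
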
flow = Bijectors.transformed(q₀, ts)</code></pre><pre class="documenter-example-output"><code class="nohighlight hljs ansi">Bijectors.MultivariateTransformed{Distributions.MvNormal{Float32, PDMats.ScalMat{Float32}, Vector{Float32}}, ComposedFunction{ComposedFunction{ComposedFunction{Main.AffineCoupling, Main.AffineCoupling}, Main.AffineCoupling}, Main.AffineCoupling}}(
dist: Distributions.MvNormal{Float32, PDMats.ScalMat{Float32}, Vector{Float32}}(
dim: 4
μ: Float32[0.0, 0.0, 0.0, 0.0]
Σ: Float32[1.0 0.0 0.0 0.0; 0.0 1.0 0.0 0.0; 0.0 0.0 1.0 0.0; 0.0 0.0 0.0 1.0]
)

transform: Main.AffineCoupling(4, Bijectors.PartitionMask{Bool, SparseArrays.SparseMatrixCSC{Bool, Int64}}(sparse([1, 2], [1, 2], Bool[1, 1], 4, 2), sparse([3, 4], [1, 2], Bool[1, 1], 4, 2), sparse(Int64[], Int64[], Bool[], 4, 0)), Chain(Dense(2 =&gt; 10, leakyrelu), Dense(10 =&gt; 10, leakyrelu), Dense(10 =&gt; 2)), Chain(Dense(2 =&gt; 10, leakyrelu), Dense(10 =&gt; 10, leakyrelu), Dense(10 =&gt; 2))) ∘ Main.AffineCoupling(4, Bijectors.PartitionMask{Bool, SparseArrays.SparseMatrixCSC{Bool, Int64}}(sparse([3, 4], [1, 2], Bool[1, 1], 4, 2), sparse([1, 2], [1, 2], Bool[1, 1], 4, 2), sparse(Int64[], Int64[], Bool[], 4, 0)), Chain(Dense(2 =&gt; 10, leakyrelu), Dense(10 =&gt; 10, leakyrelu), Dense(10 =&gt; 2)), Chain(Dense(2 =&gt; 10, leakyrelu), Dense(10 =&gt; 10, leakyrelu), Dense(10 =&gt; 2))) ∘ Main.AffineCoupling(4, Bijectors.PartitionMask{Bool, SparseArrays.SparseMatrixCSC{Bool, Int64}}(sparse([1, 2], [1, 2], Bool[1, 1], 4, 2), sparse([3, 4], [1, 2], Bool[1, 1], 4, 2), sparse(Int64[], Int64[], Bool[], 4, 0)), Chain(Dense(2 =&gt; 10, leakyrelu), Dense(10 =&gt; 10, leakyrelu), Dense(10 =&gt; 2)), Chain(Dense(2 =&gt; 10, leakyrelu), Dense(10 =&gt; 10, leakyrelu), Dense(10 =&gt; 2))) ∘ Main.AffineCoupling(4, Bijectors.PartitionMask{Bool, SparseArrays.SparseMatrixCSC{Bool, Int64}}(sparse([3, 4], [1, 2], Bool[1, 1], 4, 2), sparse([1, 2], [1, 2], Bool[1, 1], 4, 2), sparse(Int64[], Int64[], Bool[], 4, 0)), Chain(Dense(2 =&gt; 10, leakyrelu), Dense(10 =&gt; 10, leakyrelu), Dense(10 =&gt; 2)), Chain(Dense(2 =&gt; 10, leakyrelu), Dense(10 =&gt; 10, leakyrelu), Dense(10 =&gt; 2)))
)
</code></pre><p>We can now sample from the flow:</p><pre><code class="language-julia hljs">x = rand(flow, 10)</code></pre><pre class="documenter-example-output"><code class="nohighlight hljs ansi">4×10 Matrix{Float32}:
  0.129723    0.187571   -0.0116294   2.01838   …  -0.00746077   0.0829434
 -0.0153817  -0.0920763  -0.0236419  -0.408698     -0.0229209   -0.0247007
  0.198745    0.313873   -0.154054    1.00878      -0.131809     0.151409
 -0.213462   -0.270188   -0.0423213  -1.26537      -0.0720663   -0.137927</code></pre><p>And evaluate the density of the flow:</p><pre><code class="language-julia hljs">logpdf(flow, x[:,1])</code></pre><pre class="documenter-example-output"><code class="nohighlight hljs ansi">7.9496574f0</code></pre><h2 id="Reference"><a class="docs-heading-anchor" href="#Reference">Reference</a><a id="Reference-1"></a><a class="docs-heading-anchor-permalink" href="#Reference" title="Permalink"></a></h2><p>Dinh, L., Sohl-Dickstein, J. and Bengio, S., 2016. <em>Density estimation using real nvp.</em>  arXiv:1605.08803.</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../example/">« Example</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 0.27.25 on <span class="colophon-date" title="Saturday 19 August 2023 23:38">Saturday 19 August 2023</span>. Using Julia version 1.9.2.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
