<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1" />
<meta name="generator" content="pdoc 0.10.0" />
<title>fathom.core.optimizers API documentation</title>
<meta name="description" content="Lightweight library for working with optimizers.
Intended as a replacement for fedjax.core.optimizers, with optimizers instantiated so that the learning rate can change dynamically." />
<link rel="preload stylesheet" as="style" href="https://cdnjs.cloudflare.com/ajax/libs/10up-sanitize.css/11.0.1/sanitize.min.css" integrity="sha256-PK9q560IAAa6WVRRh76LtCaI8pjTJ2z11v0miyNNjrs=" crossorigin>
<link rel="preload stylesheet" as="style" href="https://cdnjs.cloudflare.com/ajax/libs/10up-sanitize.css/11.0.1/typography.min.css" integrity="sha256-7l/o7C8jubJiy74VsKTidCy1yBkRtiUGbVkYBylBqUg=" crossorigin>
<link rel="stylesheet preload" as="style" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/10.1.1/styles/github.min.css" crossorigin>
<style>:root{--highlight-color:#fe9}.flex{display:flex !important}body{line-height:1.5em}#content{padding:20px}#sidebar{padding:30px;overflow:hidden}#sidebar > *:last-child{margin-bottom:2cm}.http-server-breadcrumbs{font-size:130%;margin:0 0 15px 0}#footer{font-size:.75em;padding:5px 30px;border-top:1px solid #ddd;text-align:right}#footer p{margin:0 0 0 1em;display:inline-block}#footer p:last-child{margin-right:30px}h1,h2,h3,h4,h5{font-weight:300}h1{font-size:2.5em;line-height:1.1em}h2{font-size:1.75em;margin:1em 0 .50em 0}h3{font-size:1.4em;margin:25px 0 10px 0}h4{margin:0;font-size:105%}h1:target,h2:target,h3:target,h4:target,h5:target,h6:target{background:var(--highlight-color);padding:.2em 0}a{color:#058;text-decoration:none;transition:color .3s ease-in-out}a:hover{color:#e82}.title code{font-weight:bold}h2[id^="header-"]{margin-top:2em}.ident{color:#900}pre code{background:#f8f8f8;font-size:.8em;line-height:1.4em}code{background:#f2f2f1;padding:1px 4px;overflow-wrap:break-word}h1 code{background:transparent}pre{background:#f8f8f8;border:0;border-top:1px solid #ccc;border-bottom:1px solid #ccc;margin:1em 0;padding:1ex}#http-server-module-list{display:flex;flex-flow:column}#http-server-module-list div{display:flex}#http-server-module-list dt{min-width:10%}#http-server-module-list p{margin-top:0}.toc ul,#index{list-style-type:none;margin:0;padding:0}#index code{background:transparent}#index h3{border-bottom:1px solid #ddd}#index ul{padding:0}#index h4{margin-top:.6em;font-weight:bold}@media (min-width:200ex){#index .two-column{column-count:2}}@media (min-width:300ex){#index .two-column{column-count:3}}dl{margin-bottom:2em}dl dl:last-child{margin-bottom:4em}dd{margin:0 0 1em 3em}#header-classes + dl > dd{margin-bottom:3em}dd dd{margin-left:2em}dd p{margin:10px 0}.name{background:#eee;font-weight:bold;font-size:.85em;padding:5px 10px;display:inline-block;min-width:40%}.name:hover{background:#e0e0e0}dt:target .name{background:var(--highlight-color)}.name > 
span:first-child{white-space:nowrap}.name.class > span:nth-child(2){margin-left:.4em}.inherited{color:#999;border-left:5px solid #eee;padding-left:1em}.inheritance em{font-style:normal;font-weight:bold}.desc h2{font-weight:400;font-size:1.25em}.desc h3{font-size:1em}.desc dt code{background:inherit}.source summary,.git-link-div{color:#666;text-align:right;font-weight:400;font-size:.8em;text-transform:uppercase}.source summary > *{white-space:nowrap;cursor:pointer}.git-link{color:inherit;margin-left:1em}.source pre{max-height:500px;overflow:auto;margin:0}.source pre code{font-size:12px;overflow:visible}.hlist{list-style:none}.hlist li{display:inline}.hlist li:after{content:',\2002'}.hlist li:last-child:after{content:none}.hlist .hlist{display:inline;padding-left:1em}img{max-width:100%}td{padding:0 .5em}.admonition{padding:.1em .5em;margin-bottom:1em}.admonition-title{font-weight:bold}.admonition.note,.admonition.info,.admonition.important{background:#aef}.admonition.todo,.admonition.versionadded,.admonition.tip,.admonition.hint{background:#dfd}.admonition.warning,.admonition.versionchanged,.admonition.deprecated{background:#fd4}.admonition.error,.admonition.danger,.admonition.caution{background:lightpink}</style>
<style media="screen and (min-width: 700px)">@media screen and (min-width:700px){#sidebar{width:30%;height:100vh;overflow:auto;position:sticky;top:0}#content{width:70%;max-width:100ch;padding:3em 4em;border-left:1px solid #ddd}pre code{font-size:1em}.item .name{font-size:1em}main{display:flex;flex-direction:row-reverse;justify-content:flex-end}.toc ul ul,#index ul{padding-left:1.5em}.toc > ul > li{margin-top:.5em}}</style>
<style media="print">@media print{#sidebar h1{page-break-before:always}.source{display:none}}@media print{*{background:transparent !important;color:#000 !important;box-shadow:none !important;text-shadow:none !important}a[href]:after{content:" (" attr(href) ")";font-size:90%}a[href][title]:after{content:none}abbr[title]:after{content:" (" attr(title) ")"}.ir a:after,a[href^="javascript:"]:after,a[href^="#"]:after{content:""}pre,blockquote{border:1px solid #999;page-break-inside:avoid}thead{display:table-header-group}tr,img{page-break-inside:avoid}img{max-width:100% !important}@page{margin:0.5cm}p,h2,h3{orphans:3;widows:3}h1,h2,h3,h4,h5,h6{page-break-after:avoid}}</style>
<script defer src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/10.1.1/highlight.min.js" integrity="sha256-Uv3H6lx7dJmRfRvH8TH6kJD1TSK1aFcwgx+mdg3epi8=" crossorigin></script>
<script>window.addEventListener('DOMContentLoaded', () => hljs.initHighlighting())</script>
</head>
<body>
<main>
<article id="content">
<header>
<h1 class="title">Module <code>fathom.core.optimizers</code></h1>
</header>
<section id="section-intro">
<p>Lightweight library for working with optimizers.
Intended as a replacement for <code>fedjax.core.optimizers</code>, with optimizers instantiated so that the learning rate can change dynamically.</p>
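<p>Each function in this module types its <code>learning_rate</code> argument as <code>ScalarOrSchedule</code>, so it should accept either a float or a callable that maps the step count to a value. The following is a minimal plain-Python sketch of such a callable. The helper name <code>make_exponential_decay</code> and its parameters are hypothetical and are not part of this module.</p>
<pre><code class="python">def make_exponential_decay(init_value, decay_rate, transition_steps):
    """Return a schedule: a function from step count to learning rate."""
    def schedule(step):
        # The rate is multiplied by decay_rate once every transition_steps steps.
        return init_value * decay_rate ** (step / transition_steps)
    return schedule

# Hypothetical values, chosen only for illustration.
schedule = make_exponential_decay(init_value=0.1, decay_rate=0.5, transition_steps=100)
rates = [schedule(step) for step in (0, 100, 200)]
print(rates)
</code></pre>
<p>A callable of this shape could presumably be passed as <code>learning_rate</code> in place of a float.</p>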
<details class="source">
<summary>
<span>Expand source code</span>
</summary>
<pre><code class="python"># Copyright 2022 FATHOM Authors
#
# Licensed under the Apache License, Version 2.0 (the &#34;License&#34;);
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an &#34;AS IS&#34; BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
&#34;&#34;&#34;Lightweight library for working with optimizers.
   To replace fedjax.core.optimizers with different instantiations to allow dynamic learning rate.
&#34;&#34;&#34;

from typing import Callable, List, Optional, Tuple, Union
from fedjax.core.typing import Params
from fedjax.core.optimizers import Optimizer, ScalarOrSchedule, create_optimizer_from_optax
import optax

Grads = Params

def adagrad(learning_rate: ScalarOrSchedule,
            initial_accumulator_value: float = 0.1,
            eps: float = 1e-6) -&gt; Optimizer:
    &#34;&#34;&#34;The Adagrad optimizer.

    Adagrad is an algorithm for gradient based optimisation that anneals the
    learning rate for each parameter during the course of training.

    WARNING: Adagrad&#39;s main limit is the monotonic accumulation of squared
    gradients in the denominator: since all terms are &gt;0, the sum keeps growing
    during training and the learning rate eventually becomes vanishingly small.

    References:
    [Duchi et al, 2011](https://jmlr.org/papers/v12/duchi11a.html)

    Args:
    learning_rate: A global scaling factor, either a fixed scalar or a schedule.
    initial_accumulator_value: Initialisation for the accumulator.
    eps: A small constant applied to denominator inside of the square root (as
      in RMSProp) to avoid dividing by zero when rescaling.

    Returns:
    The corresponding `Optimizer`.
    &#34;&#34;&#34;
    return create_optimizer_from_optax(
        optax.inject_hyperparams(optax.adagrad)(
            learning_rate=learning_rate,
            initial_accumulator_value=initial_accumulator_value,
            eps=eps,
        )
    )


def adam(learning_rate: ScalarOrSchedule,
         b1: float = 0.9,
         b2: float = 0.999,
         eps: float = 1e-8,
         eps_root: float = 0.0) -&gt; Optimizer:
    &#34;&#34;&#34;The classic Adam optimiser.

    Adam is an SGD variant with learning rate adaptation. The `learning_rate`
    used for each weight is computed from estimates of first- and second-order
    moments of the gradients (using suitable exponential moving averages).

    References:
    [Kingma et al, 2014](https://arxiv.org/abs/1412.6980)

    Args:
    learning_rate: A global scaling factor, either a fixed scalar or a schedule.
    b1: The exponential decay rate to track the first moment of past gradients.
    b2: The exponential decay rate to track the second moment of past gradients.
    eps: A small constant applied to denominator outside of the square root (as
      in the Adam paper) to avoid dividing by zero when rescaling.
    eps_root: A small constant applied to denominator inside the square root (as
      in RMSProp), to avoid dividing by zero when rescaling. This is needed for
      example when computing (meta-)gradients through Adam.

    Returns:
    The corresponding `Optimizer`.
    &#34;&#34;&#34;
    return create_optimizer_from_optax(
        optax.inject_hyperparams(optax.adam)(
            learning_rate=learning_rate, 
            b1=b1, 
            b2=b2, 
            eps=eps,
            eps_root=eps_root,
        )
    )

def rmsprop(learning_rate: ScalarOrSchedule,
            decay: float = 0.9,
            eps: float = 1e-8,
            initial_scale: float = 0.,
            centered: bool = False,
            momentum: Optional[float] = None,
            nesterov: bool = False) -&gt; Optimizer:
    &#34;&#34;&#34;A flexible RMSProp optimiser.

    RMSProp is an SGD variant with learning rate adaptation. The `learning_rate`
    used for each weight is scaled by a suitable estimate of the magnitude of the
    gradients on previous steps. Several variants of RMSProp can be found
    in the literature. This alias provides an easy to configure RMSProp
    optimiser that can be used to switch between several of these variants.

    References:
    [Tieleman and Hinton, 2012](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf)
    [Graves, 2013](https://arxiv.org/abs/1308.0850)

    Args:
    learning_rate: A global scaling factor, either a fixed scalar or a schedule.
    decay: The decay used to track the magnitude of previous gradients.
    eps: A small numerical constant to avoid dividing by zero when rescaling.
    initial_scale: Initialisation of accumulators tracking the magnitude of
      previous updates. PyTorch uses `0`, TF1 uses `1`. When reproducing results
      from a paper, verify the value used by the authors.
    centered: Whether the second moment or the variance of the past gradients is
      used to rescale the latest gradients.
    momentum: The `decay` rate used by the momentum term, when it is set to
      `None`, then momentum is not used at all.
    nesterov: Whether nesterov momentum is used.

    Returns:
    The corresponding `Optimizer`.
    &#34;&#34;&#34;
    return create_optimizer_from_optax(
        optax.inject_hyperparams(optax.rmsprop)(
            learning_rate=learning_rate,
            decay=decay,
            eps=eps,
            initial_scale=initial_scale,
            centered=centered,
            momentum=momentum,
            nesterov=nesterov,
        )
    )

def sgd(learning_rate: ScalarOrSchedule,
        momentum: Optional[float] = None,
        nesterov: bool = False) -&gt; Optimizer:
    &#34;&#34;&#34;A canonical Stochastic Gradient Descent optimiser.

    This implements stochastic gradient descent. It also includes support for
    momentum, and nesterov acceleration, as these are standard practice when
    using stochastic gradient descent to train deep neural networks.

    References:
    [Sutskever et al, 2013](http://proceedings.mlr.press/v28/sutskever13.pdf)

    Args:
    learning_rate: A global scaling factor, either a fixed scalar or a schedule.
    momentum: The `decay` rate used by the momentum term, when it is set to
      `None`, then momentum is not used at all.
    nesterov: Whether nesterov momentum is used.

    Returns:
    The corresponding `Optimizer`.
    &#34;&#34;&#34;
    return create_optimizer_from_optax(
        optax.inject_hyperparams(optax.sgd)(
            learning_rate=learning_rate,
            momentum=momentum,
            nesterov=nesterov,
        )
    )</code></pre>
</details>
</section>
<section>
</section>
<section>
</section>
<section>
<h2 class="section-title" id="header-functions">Functions</h2>
<dl>
<dt id="fathom.core.optimizers.adagrad"><code class="name flex">
<span>def <span class="ident">adagrad</span></span>(<span>learning_rate: Union[float, Callable[[Union[jax._src.numpy.lax_numpy.ndarray, float, int]], Union[jax._src.numpy.lax_numpy.ndarray, float, int]]], initial_accumulator_value: float = 0.1, eps: float = 1e-06) ‑> fedjax.core.optimizers.Optimizer</span>
</code></dt>
<dd>
<div class="desc"><p>The Adagrad optimizer.</p>
<p>Adagrad is an algorithm for gradient-based optimisation that anneals the
learning rate for each parameter during the course of training.</p>
<p>WARNING: Adagrad's main limitation is the monotonic accumulation of squared
gradients in the denominator: since every term is non-negative, the sum keeps growing
during training and the learning rate eventually becomes vanishingly small.</p>
<p>References:
<a href="https://jmlr.org/papers/v12/duchi11a.html">Duchi et al, 2011</a></p>
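<p>Args:</p>
<dl>
<dt><code>learning_rate</code></dt>
<dd>A global scaling factor: either a fixed scalar or a schedule mapping the step count to a value.</dd>
<dt><code>initial_accumulator_value</code></dt>
<dd>Initial value of the accumulator.</dd>
<dt><code>eps</code></dt>
<dd>A small constant added to the denominator inside the square root (as in RMSProp) to avoid dividing by zero when rescaling.</dd>
</dl>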
<p>Returns:
The corresponding <code>Optimizer</code>.</p></div>
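<p>The warning above can be illustrated with a plain-Python sketch of the Adagrad update rule. This is not the code path used by this module, which delegates to <code>optax.adagrad</code>. The function name <code>adagrad_step</code> and the constant gradients are hypothetical, chosen only to show the shrinking step size.</p>
<pre><code class="python">import math

def adagrad_step(params, grads, accum, learning_rate=0.1, eps=1e-6):
    """One Adagrad update on plain Python lists (illustrative only)."""
    # Accumulate squared gradients; the sum can only grow.
    new_accum = [a + g * g for a, g in zip(accum, grads)]
    # Scale each gradient by 1 / sqrt(accumulator + eps), with eps inside the root.
    new_params = [p - learning_rate * g / math.sqrt(a + eps)
                  for p, g, a in zip(params, grads, new_accum)]
    return new_params, new_accum

params = [1.0, 1.0]
accum = [0.1, 0.1]  # plays the role of initial_accumulator_value
step_sizes = []
for _ in range(3):
    grads = [1.0, 0.1]  # constant gradients, chosen only for illustration
    new_params, accum = adagrad_step(params, grads, accum)
    step_sizes.append(params[0] - new_params[0])
    params = new_params
print(step_sizes)
</code></pre>
<p>Even though the gradient stays constant, each successive step for the first parameter is smaller than the one before, because its accumulator keeps growing.</p>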
<details class="source">
<summary>
<span>Expand source code</span>
</summary>
<pre><code class="python">def adagrad(learning_rate: ScalarOrSchedule,
            initial_accumulator_value: float = 0.1,
            eps: float = 1e-6) -&gt; Optimizer:
    &#34;&#34;&#34;The Adagrad optimizer.

    Adagrad is an algorithm for gradient based optimisation that anneals the
    learning rate for each parameter during the course of training.

    WARNING: Adagrad&#39;s main limit is the monotonic accumulation of squared
    gradients in the denominator: since all terms are &gt;0, the sum keeps growing
    during training and the learning rate eventually becomes vanishingly small.

    References:
    [Duchi et al, 2011](https://jmlr.org/papers/v12/duchi11a.html)

    Args:
    learning_rate: A global scaling factor, either a fixed scalar or a schedule.
    initial_accumulator_value: Initialisation for the accumulator.
    eps: A small constant applied to denominator inside of the square root (as
      in RMSProp) to avoid dividing by zero when rescaling.

    Returns:
    The corresponding `Optimizer`.
    &#34;&#34;&#34;
    return create_optimizer_from_optax(
        optax.inject_hyperparams(optax.adagrad)(
            learning_rate=learning_rate,
            initial_accumulator_value=initial_accumulator_value,
            eps=eps,
        )
    )</code></pre>
</details>
</dd>
<dt id="fathom.core.optimizers.adam"><code class="name flex">
<span>def <span class="ident">adam</span></span>(<span>learning_rate: Union[float, Callable[[Union[jax._src.numpy.lax_numpy.ndarray, float, int]], Union[jax._src.numpy.lax_numpy.ndarray, float, int]]], b1: float = 0.9, b2: float = 0.999, eps: float = 1e-08, eps_root: float = 0.0) ‑> fedjax.core.optimizers.Optimizer</span>
</code></dt>
<dd>
<div class="desc"><p>The classic Adam optimiser.</p>
<p>Adam is an SGD variant with learning rate adaptation. The <code>learning_rate</code>
used for each weight is computed from estimates of first- and second-order
moments of the gradients (using suitable exponential moving averages).</p>
<p>References:
<a href="https://arxiv.org/abs/1412.6980">Kingma and Ba, 2014</a></p>
<p>Args:</p>
<dl>
<dt><code>learning_rate</code></dt>
<dd>A global scaling factor: either a fixed scalar or a schedule mapping the step count to a value.</dd>
<dt><code>b1</code></dt>
<dd>The exponential decay rate used to track the first moment of past gradients.</dd>
<dt><code>b2</code></dt>
<dd>The exponential decay rate used to track the second moment of past gradients.</dd>
<dt><code>eps</code></dt>
<dd>A small constant added to the denominator outside the square root (as in the Adam paper) to avoid dividing by zero when rescaling.</dd>
<dt><code>eps_root</code></dt>
<dd>A small constant added to the denominator inside the square root (as in RMSProp) to avoid dividing by zero when rescaling. This is needed, for example, when computing (meta-)gradients through Adam.</dd>
</dl>
<p>Returns:
The corresponding <code>Optimizer</code>.</p></div>
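<p>To show where <code>eps</code> and <code>eps_root</code> enter the computation, here is a plain-Python sketch of one Adam step for a single scalar weight. It follows the update rule from the Adam paper and is not the code path used by this module, which delegates to <code>optax.adam</code>. The function name <code>adam_step</code> and the step counter <code>t</code> are hypothetical.</p>
<pre><code class="python">import math

def adam_step(param, grad, m, v, t, learning_rate=0.001,
              b1=0.9, b2=0.999, eps=1e-8, eps_root=0.0):
    """One Adam update for a single scalar weight; t is the 1-based step count."""
    m = b1 * m + (1.0 - b1) * grad          # first-moment moving average
    v = b2 * v + (1.0 - b2) * grad * grad   # second-moment moving average
    m_hat = m / (1.0 - b1 ** t)             # bias correction
    v_hat = v / (1.0 - b2 ** t)
    # eps_root sits inside the square root, eps outside it.
    update = learning_rate * m_hat / (math.sqrt(v_hat + eps_root) + eps)
    return param - update, m, v

param, m, v = 1.0, 0.0, 0.0
new_param, m, v = adam_step(param, grad=5.0, m=m, v=v, t=1)
print(param - new_param)
</code></pre>
<p>On the first step the bias-corrected moments cancel the gradient scale, so the weight moves by roughly <code>learning_rate</code> whatever the size of the gradient.</p>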
<details class="source">
<summary>
<span>Expand source code</span>
</summary>
<pre><code class="python">def adam(learning_rate: ScalarOrSchedule,
         b1: float = 0.9,
         b2: float = 0.999,
         eps: float = 1e-8,
         eps_root: float = 0.0) -&gt; Optimizer:
    &#34;&#34;&#34;The classic Adam optimiser.

    Adam is an SGD variant with learning rate adaptation. The `learning_rate`
    used for each weight is computed from estimates of first- and second-order
    moments of the gradients (using suitable exponential moving averages).

    References:
    [Kingma et al, 2014](https://arxiv.org/abs/1412.6980)

    Args:
    learning_rate: A global scaling factor, either a fixed scalar or a schedule.
    b1: The exponential decay rate to track the first moment of past gradients.
    b2: The exponential decay rate to track the second moment of past gradients.
    eps: A small constant applied to denominator outside of the square root (as
      in the Adam paper) to avoid dividing by zero when rescaling.
    eps_root: A small constant applied to denominator inside the square root (as
      in RMSProp), to avoid dividing by zero when rescaling. This is needed for
      example when computing (meta-)gradients through Adam.

    Returns:
    The corresponding `Optimizer`.
    &#34;&#34;&#34;
    return create_optimizer_from_optax(
        optax.inject_hyperparams(optax.adam)(
            learning_rate=learning_rate, 
            b1=b1, 
            b2=b2, 
            eps=eps,
            eps_root=eps_root,
        )
    )</code></pre>
</details>
</dd>
<dt id="fathom.core.optimizers.rmsprop"><code class="name flex">
<span>def <span class="ident">rmsprop</span></span>(<span>learning_rate: Union[float, Callable[[Union[jax._src.numpy.lax_numpy.ndarray, float, int]], Union[jax._src.numpy.lax_numpy.ndarray, float, int]]], decay: float = 0.9, eps: float = 1e-08, initial_scale: float = 0.0, centered: bool = False, momentum: Optional[float] = None, nesterov: bool = False) ‑> fedjax.core.optimizers.Optimizer</span>
</code></dt>
<dd>
<div class="desc"><p>A flexible RMSProp optimiser.</p>
<p>RMSProp is an SGD variant with learning rate adaptation. The <code>learning_rate</code>
used for each weight is scaled by a suitable estimate of the magnitude of the
gradients on previous steps. Several variants of RMSProp can be found
in the literature. This alias provides an easy-to-configure RMSProp
optimiser that can be used to switch between several of these variants.</p>
<p>References:
<a href="http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf">Tieleman and Hinton, 2012</a>,
<a href="https://arxiv.org/abs/1308.0850">Graves, 2013</a></p>
<p>Args:</p>
<dl>
<dt><code>learning_rate</code></dt>
<dd>A global scaling factor: either a fixed scalar or a schedule mapping the step count to a value.</dd>
<dt><code>decay</code></dt>
<dd>The decay rate used to track the magnitude of previous gradients.</dd>
<dt><code>eps</code></dt>
<dd>A small numerical constant to avoid dividing by zero when rescaling.</dd>
<dt><code>initial_scale</code></dt>
<dd>Initial value of the accumulators tracking the magnitude of previous updates. PyTorch uses <code>0</code>, TF1 uses <code>1</code>. When reproducing results from a paper, verify the value used by the authors.</dd>
<dt><code>centered</code></dt>
<dd>If <code>True</code>, the latest gradients are rescaled by an estimate of the variance of past gradients; if <code>False</code>, by their uncentered second moment.</dd>
<dt><code>momentum</code></dt>
<dd>The decay rate used by the momentum term. If <code>None</code>, momentum is not used.</dd>
<dt><code>nesterov</code></dt>
<dd>Whether Nesterov momentum is used.</dd>
</dl>
<p>Returns:
The corresponding <code>Optimizer</code>.</p></div>
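<p>The effect of <code>centered</code> can be seen in a plain-Python sketch of one RMSProp step for a single scalar weight. This is not the code path used by this module, which delegates to <code>optax.rmsprop</code>. The function name <code>rmsprop_step</code> is hypothetical, momentum is left out, and placing <code>eps</code> inside the square root is an assumption.</p>
<pre><code class="python">import math

def rmsprop_step(param, grad, nu, mu, learning_rate=0.01,
                 decay=0.9, eps=1e-8, centered=False):
    """One RMSProp update for a single scalar weight, without momentum."""
    nu = decay * nu + (1.0 - decay) * grad * grad  # second-moment average
    mu = decay * mu + (1.0 - decay) * grad         # mean, used only if centered
    # centered=True rescales by the variance estimate nu - mu**2.
    scale = nu - mu * mu if centered else nu
    # Assumption: eps is added inside the square root.
    return param - learning_rate * grad / math.sqrt(scale + eps), nu, mu

# nu=0.0 plays the role of initial_scale=0.
plain, _, _ = rmsprop_step(1.0, 2.0, nu=0.0, mu=0.0)
centred, _, _ = rmsprop_step(1.0, 2.0, nu=0.0, mu=0.0, centered=True)
print(1.0 - plain, 1.0 - centred)
</code></pre>
<p>Because the variance estimate is never larger than the second moment, the centered variant takes a step at least as large as the uncentered one for the same gradient.</p>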
<details class="source">
<summary>
<span>Expand source code</span>
</summary>
<pre><code class="python">def rmsprop(learning_rate: ScalarOrSchedule,
            decay: float = 0.9,
            eps: float = 1e-8,
            initial_scale: float = 0.,
            centered: bool = False,
            momentum: Optional[float] = None,
            nesterov: bool = False) -&gt; Optimizer:
    &#34;&#34;&#34;A flexible RMSProp optimiser.

    RMSProp is an SGD variant with learning rate adaptation. The `learning_rate`
    used for each weight is scaled by a suitable estimate of the magnitude of the
    gradients on previous steps. Several variants of RMSProp can be found
    in the literature. This alias provides an easy to configure RMSProp
    optimiser that can be used to switch between several of these variants.

    References:
    [Tieleman and Hinton, 2012](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf)
    [Graves, 2013](https://arxiv.org/abs/1308.0850)

    Args:
    learning_rate: A global scaling factor, either a fixed scalar or a schedule.
    decay: The decay used to track the magnitude of previous gradients.
    eps: A small numerical constant to avoid dividing by zero when rescaling.
    initial_scale: Initialisation of accumulators tracking the magnitude of
      previous updates. PyTorch uses `0`, TF1 uses `1`. When reproducing results
      from a paper, verify the value used by the authors.
    centered: Whether the second moment or the variance of the past gradients is
      used to rescale the latest gradients.
    momentum: The `decay` rate used by the momentum term, when it is set to
      `None`, then momentum is not used at all.
    nesterov: Whether nesterov momentum is used.

    Returns:
    The corresponding `Optimizer`.
    &#34;&#34;&#34;
    return create_optimizer_from_optax(
        optax.inject_hyperparams(optax.rmsprop)(
            learning_rate=learning_rate,
            decay=decay,
            eps=eps,
            initial_scale=initial_scale,
            centered=centered,
            momentum=momentum,
            nesterov=nesterov,
        )
    )</code></pre>
</details>
</dd>
<dt id="fathom.core.optimizers.sgd"><code class="name flex">
<span>def <span class="ident">sgd</span></span>(<span>learning_rate: Union[float, Callable[[Union[jax._src.numpy.lax_numpy.ndarray, float, int]], Union[jax._src.numpy.lax_numpy.ndarray, float, int]]], momentum: Optional[float] = None, nesterov: bool = False) ‑> fedjax.core.optimizers.Optimizer</span>
</code></dt>
<dd>
<div class="desc"><p>A canonical Stochastic Gradient Descent optimiser.</p>
<p>This implements stochastic gradient descent. It also includes support for
momentum and Nesterov acceleration, as these are standard practice when
using stochastic gradient descent to train deep neural networks.</p>
<p>References:
<a href="http://proceedings.mlr.press/v28/sutskever13.pdf">Sutskever et al, 2013</a></p>
<p>Args:</p>
<dl>
<dt><code>learning_rate</code></dt>
<dd>A global scaling factor: either a fixed scalar or a schedule mapping the step count to a value.</dd>
<dt><code>momentum</code></dt>
<dd>The decay rate used by the momentum term. If <code>None</code>, momentum is not used.</dd>
<dt><code>nesterov</code></dt>
<dd>Whether Nesterov momentum is used.</dd>
</dl>
<p>Returns:
The corresponding <code>Optimizer</code>.</p></div>
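<p>How <code>momentum</code> and <code>nesterov</code> change the update can be sketched in plain Python for a single scalar weight. This is not the code path used by this module, which delegates to <code>optax.sgd</code>. The function name <code>sgd_step</code> is hypothetical, and the form of the momentum accumulator (<code>trace</code>) is an assumption based on a common convention.</p>
<pre><code class="python">def sgd_step(param, grad, trace, learning_rate=0.1, momentum=None, nesterov=False):
    """One SGD update for a single scalar weight."""
    if momentum is None:
        return param - learning_rate * grad, trace  # plain SGD
    trace = grad + momentum * trace                 # accumulate past gradients
    # Nesterov looks one step ahead along the updated trace.
    update = grad + momentum * trace if nesterov else trace
    return param - learning_rate * update, trace

param, trace = 1.0, 0.0
for _ in range(2):
    param, trace = sgd_step(param, grad=1.0, trace=trace, momentum=0.9)
print(param)
</code></pre>
<p>With a constant gradient the accumulator grows from step to step, so the second update is larger than the first. Leaving <code>momentum</code> as <code>None</code> reduces this to a plain gradient step.</p>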
<details class="source">
<summary>
<span>Expand source code</span>
</summary>
<pre><code class="python">def sgd(learning_rate: ScalarOrSchedule,
        momentum: Optional[float] = None,
        nesterov: bool = False) -&gt; Optimizer:
    &#34;&#34;&#34;A canonical Stochastic Gradient Descent optimiser.

    This implements stochastic gradient descent. It also includes support for
    momentum, and nesterov acceleration, as these are standard practice when
    using stochastic gradient descent to train deep neural networks.

    References:
    [Sutskever et al, 2013](http://proceedings.mlr.press/v28/sutskever13.pdf)

    Args:
    learning_rate: A global scaling factor, either a fixed scalar or a schedule.
    momentum: The `decay` rate used by the momentum term, when it is set to
      `None`, then momentum is not used at all.
    nesterov: Whether nesterov momentum is used.

    Returns:
    The corresponding `Optimizer`.
    &#34;&#34;&#34;
    return create_optimizer_from_optax(
        optax.inject_hyperparams(optax.sgd)(
            learning_rate=learning_rate,
            momentum=momentum,
            nesterov=nesterov,
        )
    )</code></pre>
</details>
</dd>
</dl>
</section>
<section>
</section>
</article>
<nav id="sidebar">
<h1>Index</h1>
<div class="toc">
<ul></ul>
</div>
<ul id="index">
<li><h3>Super-module</h3>
<ul>
<li><code><a title="fathom.core" href="index.html">fathom.core</a></code></li>
</ul>
</li>
<li><h3><a href="#header-functions">Functions</a></h3>
<ul class="">
<li><code><a title="fathom.core.optimizers.adagrad" href="#fathom.core.optimizers.adagrad">adagrad</a></code></li>
<li><code><a title="fathom.core.optimizers.adam" href="#fathom.core.optimizers.adam">adam</a></code></li>
<li><code><a title="fathom.core.optimizers.rmsprop" href="#fathom.core.optimizers.rmsprop">rmsprop</a></code></li>
<li><code><a title="fathom.core.optimizers.sgd" href="#fathom.core.optimizers.sgd">sgd</a></code></li>
</ul>
</li>
</ul>
</nav>
</main>
<footer id="footer">
<p>Generated by <a href="https://pdoc3.github.io/pdoc" title="pdoc: Python API documentation generator"><cite>pdoc</cite> 0.10.0</a>.</p>
</footer>
</body>
</html>