{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "fb2b2856",
   "metadata": {},
   "source": [
    "# Flow Matching tutorial: training ODE generative models like Diffusion models"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "66040230",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "\n",
    "**Flow Matching was introduced in three different ICLR 2023 papers and has drawn a lot of attention in the machine learning community recently. We would like to highlight all of them here: Flow Matching [(Anonymous et al.)](https://arxiv.org/abs/2210.02747), Stochastic Interpolants [(Anonymous et al.)](https://arxiv.org/abs/2209.15571) and Rectified Flow [(Liu et al.)](https://arxiv.org/abs/2209.03003).**\n",
    "\n",
    "Flow Matching, a recently introduced generative model, leverages an ordinary differential equation (ODE) to mold a base density into the desired data distribution. In contrast, diffusion is based on a stochastic differential equation (SDE). This notebook illustrates the training of Flow Matching methods, highlighting key components. In this notebook, we present two Flow Matching models built upon the original formulation: Independent Conditional Flow Matching (I-CFM) and Optimal Transport Conditional Flow Matching (OT-CFM).\n",
    "\n",
    "In our notation, $\\alpha$ represents the noise distribution, typically a Gaussian, while $\\beta$ denotes the distribution corresponding to real data.\n",
    "\n",
    "Note from the authors: this is a beta tutorial! Do not hesitate to suggest improvements through the opened issue https://github.com/anon_repo/conditional-flow-matching/issues/88"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0b833711",
   "metadata": {},
   "source": [
    "## ODE: transforming a data distribution into another one\n",
    "\n",
    "Let's delve into some foundational concepts.  We consider a smooth time-varying vector field $u : [0, 1] \\times \\mathbb{R}^d \\to \\mathbb{R}^d$ that defines an ordinary differential equation:\n",
    "$$\n",
    "dx = u_t(x)\\,dt.\n",
    "$$\n",
    "\n",
    "We denote by $\\phi_t(x)$ the solution of the ODE with initial condition $\\phi_0(x)=x$. In essence, $\\phi_t(x)$ represents the trajectory of a point $x$ transported along the vector field $u$ from time $0$ to $t$. \n",
    "\n",
    "Assuming the knowledge of the distributions $p_0$ and $p_1$, a transformative process between them can be initiated. The integration map $\\phi_t$ induces a pushforward measure $p_t:=[\\phi_t]_\\#(p_0)$ between $p_0$ and $p_1$. This measure characterizes the density of points $x\\sim p_0$ transported by $u$ from time $0$ to time $t$. The time-varying density $p_t$, viewed as a function $p:[0,1]\\times\\mathbb{R}^d\\to\\mathbb{R}$, is characterized by the initial conditions $\\phi_0(x)=x$ and the well-known $\\textbf{continuity equation}$:\n",
    "\\begin{equation}\n",
    "\\frac{\\partial p}{\\partial t}=-\\nabla\\cdot(p_t u_t)\n",
    "\\end{equation}"
   ]
  },
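  {
   "cell_type": "markdown",
   "id": "a1c3e5f7",
   "metadata": {},
   "source": [
    "As a sanity check of these definitions, here is a minimal worked example. It is only an illustration, assuming a one-dimensional state ($d=1$), a constant vector field $u_t(x) = c$ and a differentiable initial density $p_0$; none of these choices is used later in the tutorial.\n",
    "\n",
    "The ODE $dx = c\\,dt$ has the flow $\\phi_t(x) = x + ct$, so the pushforward density is a translated copy of $p_0$:\n",
    "$$\n",
    "p_t(x) = p_0(x - ct).\n",
    "$$\n",
    "Both sides of the continuity equation then agree:\n",
    "$$\n",
    "\\frac{\\partial p_t}{\\partial t}(x) = -c\\,p_0'(x - ct), \\qquad -\\nabla\\cdot(p_t u_t)(x) = -\\frac{\\partial}{\\partial x}\\big(c\\,p_0(x - ct)\\big) = -c\\,p_0'(x - ct).\n",
    "$$"
   ]
  },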
  {
   "cell_type": "markdown",
   "id": "7bf34b36",
   "metadata": {},
   "source": [
    "## Regressing the vector field: an intractable loss\n",
    "\n",
    "Suppose we have knowledge of the probability path $p_t(x)$ and the associated vector field $u_t(x)$ that generates it, with the added convenience that $p_t(x)$ is readily samplable.\n",
    "\n",
    "If $v_\\theta(\\cdot,\\cdot):[0,1]\\times\\mathbb{R}^d\\to\\mathbb{R}^d$ is a time-dependent vector field parametrized as a neural network (with weights $\\theta$), we aim to regress $v_\\theta$ to $u$ using the $\\textbf{flow matching (FM)}$ objective:\n",
    "\n",
    "$$\n",
    "\\mathcal{L}_{\\text{FM}}(\\theta) := \\mathbb{E}_{t\\sim [0,1],x\\sim p_t(x)} \\| v_\\theta(t, x) - u_t(x) \\|^2.\n",
    "$$\n",
    "\n",
    "This objective allows us to align $v_\\theta$ with $u$ by minimizing the expected squared norm difference across various time points $t$ and sampled points $x$ from $p_t(x)$. Then we can generate data by using $v_\\theta$ in place of $u$ in the above ODE. However, this objective becomes intractable for general source and target distributions as $p_t(x)$ and $u_t$ are unknown general functions."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "74f3a461",
   "metadata": {},
   "source": [
    "## Conditional Flow Matching\n",
    "\n",
    "The primary limitation of the above approach is that it is untractable as we do not know $u_t$ and $p_t$. To render the objective feasible, we opt for a more manageable strategy by specifying the form of the probability path and vector field.\n",
    "\n",
    "Here, we assume that the probability path $p_t(x)$ takes the form of a mixture involving **conditional probability paths $p_t(x|z)$**, where $z$ serves as a conditioning variable:\n",
    "\n",
    "\\begin{equation}\\label{eq:ppath}\n",
    "p_t(x) := \\int p_t(x | z) q(z)\\, dz.\n",
    "\\end{equation}\n",
    "$q(z)$ represents a distribution over the conditioning variable. Furthermore, we want $p_1(x | z)$ to be a distribution concentrated around $z$.  A natural design choise for the conditional probability path $p_t(x|z)$ is to adopt Gaussian distributions, expressed as $p_t(x | z) = \\mathcal{N}(\\mu_t(z), \\sigma_t^2)$. In our paper, we choose to condition over a tuple of source and target samples $z = (x_0, x_1) \\sim \\alpha \\otimes \\beta$. We define $\\mu_t$ as the linear interpolation between $x_0$ and $x_1$ (with respect to time, *i.e.,* $\\mu_t(x_0, x_1) = t x_1 + (1 - t) x_0$), and set the standard deviation $\\sigma_t = \\sigma>0$ as a constant float. The motivation behind this specific $\\mu_t$ lies in the optimal transport theory that we will detail later. It's worth noting that alternate selections are possible as highlithed in (Anonymous et al, 2023).\n",
    "\n",
    "Unfortunately, direct sampling from the unconditional probability path $p_t(x)$ is hindered by the integral in its formulation. Instead, we will sample from the conditional probability path $p_t(x|z)$ and show later that we can build a conditional Flow Matching loss which shares a lot of benefits with its unconditional counter part."
   ]
  },
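  {
   "cell_type": "markdown",
   "id": "b2d4f6a8",
   "metadata": {},
   "source": [
    "To see how this choice connects the source and the target, we can evaluate the marginal path at both endpoints. This is a short check, sketched using only the Gaussian conditional path and $q = \\alpha \\otimes \\beta$ defined above; the notation $\\mathcal{N}(x; m, \\sigma^2)$, introduced here for illustration, denotes the Gaussian density with mean $m$ evaluated at $x$. Since $\\mu_0 = x_0$ and $\\mu_1 = x_1$, integrating out the other variable gives\n",
    "$$\n",
    "p_0(x) = \\int \\mathcal{N}(x; x_0, \\sigma^2)\\,\\alpha(x_0)\\,dx_0, \\qquad p_1(x) = \\int \\mathcal{N}(x; x_1, \\sigma^2)\\,\\beta(x_1)\\,dx_1.\n",
    "$$\n",
    "In words, $p_0$ and $p_1$ are the source and target distributions blurred by Gaussian noise of standard deviation $\\sigma$, and they approach $\\alpha$ and $\\beta$ as $\\sigma \\to 0$."
   ]
  },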
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "278f89ee",
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "\n",
    "def sample_conditional_pt(x0, x1, t, sigma):\n",
    "    \"\"\"\n",
    "    Draw a sample from the probability path N(t * x1 + (1 - t) * x0, sigma), see (Eq.14) [1].\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    x0 : Tensor, shape (bs, *dim)\n",
    "        represents the source minibatch\n",
    "    x1 : Tensor, shape (bs, *dim)\n",
    "        represents the target minibatch\n",
    "    t : FloatTensor, shape (bs)\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    xt : Tensor, shape (bs, *dim)\n",
    "\n",
    "    References\n",
    "    ----------\n",
    "    [1] Improving and Generalizing Flow-Based Generative Models with minibatch optimal transport, Preprint, Anonymous et al.\n",
    "    \"\"\"\n",
    "    t = t.reshape(-1, *([1] * (x0.dim() - 1)))\n",
    "    mu_t = t * x1 + (1 - t) * x0\n",
    "    epsilon = torch.randn_like(x0)\n",
    "    return mu_t + sigma * epsilon"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8121e08c",
   "metadata": {},
   "source": [
    "## Building an admissible vector field\n",
    "\n",
    "We proceed with the assumption that the conditional probability paths $p_t(x|z)$, used to define the unconditional probability path, are generated by some conditional vector field $u_t(x|z)$. \n",
    "\n",
    "The next challenge is to define an unconditional vector field $u_t(x)$ that, when paired with $p_t(x)$, satisfies the continuity equation.\n",
    "\n",
    "Drawing from an adapted theorem based on [(Anonymous et al., Theorem 1)](https://arxiv.org/abs/2210.02747), we can establish the existence of such a vector field, as demonstrated in [(Anonymous et al., Theorem 3.1)](https://arxiv.org/abs/2302.00482). Inheritantly influenced by the structure of $p_t$, $u_t$ takes the form of a mixture involving the conditional vector field $u_t(x | z)$. Formally, it will be defined by:\n",
    "\n",
    "\\begin{equation}\\label{eq:mvec}\n",
    "u_t(x) := \\mathbb{E}_{q(z)} \\frac{u_t(x | z) p_t(x | z)}{p_t(x)}.\n",
    "\\end{equation}\n",
    "\n",
    "This formulation ensures that $u_t$ aligns with the theoretical requirements to satisfy the continuity equation when associated with $p_t(x)$. Once more, the complexity introduced by the integral renders the above $u_t(x)$ impractical for direct use. Instead, we rely on use its conditional counterpart $u_t(x|z)$. \n",
    "\n",
    "In the specefic case where the conditional probability path $p_t(x|z) = \\mathcal{N}(\\mu_t(z), \\sigma_t^2)$ takes the form of a Gaussian, the conditional vector field $u_t(x|z)$ possesses a unique closed-form expression:\n",
    "\n",
    "\\begin{equation}\n",
    "u_t(x|x_0, x_1) = \\frac{\\sigma_t'}{\\sigma_t} (x - \\mu_t) + \\mu_t',\n",
    "\\end{equation}\n",
    "\n",
    "As $\\mu_t(x_0, x_1) = t x_1 + (1-t) x_0$ and $\\sigma_t = \\sigma>0$, the conditional vector field $u_t(x∣z)$ simplifies further:\n",
    "\n",
    "\\begin{equation}\n",
    "u_t(x|x_0, x_1) = x_1 - x_0.\n",
    "\\end{equation}"
   ]
  },
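  {
   "cell_type": "markdown",
   "id": "c3e5a7b9",
   "metadata": {},
   "source": [
    "The closed-form expression above can be recovered with a short computation. The following is a sketch, assuming we write a sample of the Gaussian conditional path as $x = \\mu_t + \\sigma_t \\varepsilon$ with $\\varepsilon \\sim \\mathcal{N}(0, I)$ held fixed along a trajectory; the map $\\psi_t$ is notation introduced here for illustration only. Differentiating in time and substituting $\\varepsilon = (x - \\mu_t)/\\sigma_t$ gives\n",
    "$$\n",
    "\\psi_t(\\varepsilon) = \\mu_t + \\sigma_t \\varepsilon, \\qquad \\frac{d}{dt}\\psi_t(\\varepsilon) = \\mu_t' + \\sigma_t' \\varepsilon = \\mu_t' + \\frac{\\sigma_t'}{\\sigma_t}\\,(x - \\mu_t).\n",
    "$$\n",
    "For the linear interpolation $\\mu_t = t x_1 + (1-t) x_0$ we have $\\mu_t' = x_1 - x_0$, and a constant $\\sigma_t = \\sigma$ gives $\\sigma_t' = 0$. The term in $x - \\mu_t$ therefore vanishes, and the regression target for a given pair $(x_0, x_1)$ is the same for every $t$ and every $x$."
   ]
  },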
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ec3460f1",
   "metadata": {},
   "outputs": [],
   "source": [
    "def compute_conditional_vector_field(x0, x1):\n",
    "    \"\"\"\n",
    "    Compute the conditional vector field ut(x1|x0) = x1 - x0, see Eq.(15) [1].\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    x0 : Tensor, shape (bs, *dim)\n",
    "        represents the source minibatch\n",
    "    x1 : Tensor, shape (bs, *dim)\n",
    "        represents the target minibatch\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    ut : conditional vector field ut(x1|x0) = x1 - x0\n",
    "\n",
    "    References\n",
    "    ----------\n",
    "    [1] Improving and Generalizing Flow-Based Generative Models with minibatch optimal transport, Preprint, Anonymous et al.\n",
    "    \"\"\"\n",
    "    return x1 - x0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0a2eb9af",
   "metadata": {},
   "source": [
    "As explained earlier,  the computation of the unconditional vector field $u_t(x)$ and probability path $p_t(x)$ are intractable. Nevertheless, we can easily sample from the conditional probability path $p_t(x|z)$, given its Gaussian nature, and compute the conditional vector field $u_t(x|z)$ with the available closed-form expression.  Consequently, instead of regressing against the unconditional vector field, we opt to regress our network $v_\\theta$ against the $\\textit{conditional vector field}$ $u_t(x|z)$. The Conditional Flow Matching loss is thus defined as:\n",
    "\n",
    "$$\n",
    "\\mathcal{L}_{\\text{CFM}}(\\theta) := \\mathbb{E}_{t, q(z), p_t(x | z)} \\|v_\\theta(t, x) - u_t(x | z)\\|^2.\n",
    "$$\n",
    "\n",
    "Remarkably, the expectation in this loss reflects the definition of $u_t(x)$. Furthermore, this loss is equal to the unconditional Flow Matching loss, up to a constant that is independent of $\\theta$ (Theorem 2, [(Anonymous et al.)](https://arxiv.org/abs/2210.02747), Theorem 3.2 [(Anonymous et al.)](https://arxiv.org/abs/2302.00482)).\n",
    "\n",
    "It's important to note that the constructed marginal probability path $p_t(x)$ and the derived vector field $u_t(x)$ do not encompass all probability paths and vector fields that satisfy the continuity equation as defined in the ODE section."
   ]
  },
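  {
   "cell_type": "markdown",
   "id": "d4f6b8c0",
   "metadata": {},
   "source": [
    "The equivalence of the two losses can be sketched in a few lines. This is an informal outline, assuming all expectations are finite and the order of integration can be exchanged; the cited theorems give the precise statement. For any target $u$, the squared norm expands as\n",
    "$$\n",
    "\\|v_\\theta(t, x) - u\\|^2 = \\|v_\\theta(t, x)\\|^2 - 2 \\langle v_\\theta(t, x), u \\rangle + \\|u\\|^2.\n",
    "$$\n",
    "The first term has the same expectation under both losses because $p_t(x) = \\int p_t(x | z) q(z)\\, dz$. For the cross term, the definition of $u_t(x)$ gives\n",
    "$$\n",
    "\\mathbb{E}_{p_t(x)} \\langle v_\\theta(t, x), u_t(x) \\rangle = \\int \\Big\\langle v_\\theta(t, x), \\int u_t(x | z)\\, p_t(x | z)\\, q(z)\\, dz \\Big\\rangle dx = \\mathbb{E}_{q(z), p_t(x | z)} \\langle v_\\theta(t, x), u_t(x | z) \\rangle.\n",
    "$$\n",
    "The last term does not depend on $\\theta$ in either loss. Hence $\\mathcal{L}_{\\text{FM}}$ and $\\mathcal{L}_{\\text{CFM}}$ differ only by a constant, and $\\nabla_\\theta \\mathcal{L}_{\\text{FM}}(\\theta) = \\nabla_\\theta \\mathcal{L}_{\\text{CFM}}(\\theta)$."
   ]
  },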
  {
   "cell_type": "markdown",
   "id": "543e9716",
   "metadata": {},
   "source": [
    "## Choosing $q$\n",
    "\n",
    "Another crucial element to consider is the choice of the latent distribution $q(z)$. Presently, we opt for a straightforward approach by considering a uniform distribution over a tuple of noise and real data, i.e., $q(z) = q(x_0,x_1) = q(x_0) q(x_1)$. We will see later different choices for $q$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2035a615",
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "import os\n",
    "import time\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import ot as pot\n",
    "import torch\n",
    "import torchdyn\n",
    "from torchdyn.core import NeuralODE\n",
    "from torchdyn.datasets import generate_moons\n",
    "\n",
    "import sys\n",
    "import os\n",
    "sys.path.insert(0, './Desktop/uot_wfm/conditional-flow-matching')\n",
    "\n",
    "from torchcfm.conditional_flow_matching import *\n",
    "from torchcfm.models.models import *\n",
    "from torchcfm.utils import *\n",
    "\n",
    "savedir = \"models/8gaussian-moons\"\n",
    "os.makedirs(savedir, exist_ok=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "faf18883",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "sigma = 0.1\n",
    "dim = 2\n",
    "batch_size = 256\n",
    "model = MLP(dim=dim, time_varying=True)\n",
    "optimizer = torch.optim.Adam(model.parameters())\n",
    "\n",
    "start = time.time()\n",
    "for k in range(20000):\n",
    "    optimizer.zero_grad()\n",
    "\n",
    "    x0 = sample_8gaussians(batch_size)\n",
    "    x1 = sample_moons(batch_size)\n",
    "\n",
    "    t = torch.rand(x0.shape[0]).type_as(x0)\n",
    "    xt = sample_conditional_pt(x0, x1, t, sigma=0.01)\n",
    "    ut = compute_conditional_vector_field(x0, x1)\n",
    "\n",
    "    vt = model(torch.cat([xt, t[:, None]], dim=-1))\n",
    "    loss = torch.mean((vt - ut) ** 2)\n",
    "\n",
    "    loss.backward()\n",
    "    optimizer.step()\n",
    "\n",
    "    if (k + 1) % 5000 == 0:\n",
    "        end = time.time()\n",
    "        print(f\"{k+1}: loss {loss.item():0.3f} time {(end - start):0.2f}\")\n",
    "        start = end\n",
    "        node = NeuralODE(\n",
    "            torch_wrapper(model), solver=\"dopri5\", sensitivity=\"adjoint\", atol=1e-4, rtol=1e-4\n",
    "        )\n",
    "        with torch.no_grad():\n",
    "            traj = node.trajectory(\n",
    "                sample_8gaussians(1024),\n",
    "                t_span=torch.linspace(0, 1, 100),\n",
    "            )\n",
    "            plot_trajectories(traj.cpu().numpy())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c2a025c6",
   "metadata": {},
   "source": [
    "As we can see above, the proposed method works really well to generate samples. However, the paths are rather curved. These paths are not efficient and lead to a longer inference than straighter paths would lead to. To get straighter paths, we can choose a different distribution $q$."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f2d9e10d",
   "metadata": {},
   "source": [
    "# Optimal Transport"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5d9991b3",
   "metadata": {},
   "source": [
    "To get straighter paths, we can leverage the optimal transport (OT) theory. Indeed, as discussed in [(Peyré et al., chapter 7)](https://arxiv.org/abs/1803.00567), the OT problem can be reformulated as a minimal-path length problem. Thus, we want to leverage this property in order to produce straighter flows that would lead to a faster inference process.\n",
    "\n",
    "The OT problem aims to minimize the displacement cost between two distributions. Formally, let $\\alpha = \\frac1n \\sum_{i=1}^n \\delta_{x_i}$ and $\\beta = \\frac1n \\sum_{j=1}^n \\delta_{z_j}$ represents the source and target distributions (i.e., they are sum of diracs). \n",
    "\n",
    "Optimal Transport is based on two key notions: $\\textbf{Transport}$ and $\\textbf{cost}$. Let's delve into the first concept.\n",
    "\n",
    "To transport a distribution towards another, we examine their probabilistic representation. The distribution $\\alpha$ and $\\beta$ are sum of diracs with uniform weights. This implies that we can represent them as tuples of a position and a probability weight (i.e., $(x_i, \\frac1n)_{i=1}^n$ and $(z_j, \\frac1n)_{j=1}^n$)). To move $\\alpha$ towards $\\beta$, the task is to transport each individual sample mass $\\frac1n$ from $\\alpha$ to $\\beta$ samples.\n",
    "\n",
    "To formalize this concept, we introduce a transport matrix, denoted as $\\Pi$. Each row of $\\Pi$ corresponds to a source sample, and each column represents a target sample. Thus, an element $\\Pi_{ij}$ element represents the quantity of mass transported from $x_i$ to $z_j$. \n",
    "\n",
    "To ensure that each individual $\\alpha$ sample is transported to $\\beta$ (and vice-versa), the transport plan $\\Pi$ must satisfy the following constraints: $\\sum_j \\Pi_{ij} = a_i = \\frac1n$ and $\\sum_i \\Pi_{ij} = b_j = \\frac1n$. It means that each sample's mass $\\frac1n$ has been moved to the other distribution. These constraints guarantee the conservation of mass during the transport process.\n",
    "\n",
    "Having established how to transport distributions, let's delve into the concept of a $\\textbf{displacement cost}$. \n",
    "\n",
    "In the context of optimal transport, which aims to minimize the distance cost from $\\alpha$ to $\\beta$, we need a notion of distance. That is why we evaluate the distance between the supports of the source and target distributions (*i.e.*, $(x_i)_{i=1}^n$ for $\\alpha$ and $(z_j)_{j=1}^n$ for $\\beta$). This involves introducing a ground cost matrix $C$, where $C_{ij} = \\|x_i - z_j\\|$ measures the distance between the support points $x_i$ and $z_j$.\n",
    "\n",
    "Bringing all the components together, the cost of moving $\\alpha$ to $\\beta$ is given by $\\sum_{ij} \\Pi_{ij} C_{ij} = \\langle \\Pi, C \\rangle_F$.\n",
    "\n",
    "As the OT problem seeks the minimal displacement cost, we solve the following problem: $$\\Pi^\\star=\\text{argmin}_{\\Pi \\in U(\\alpha, \\beta)} \\langle \\Pi, C \\rangle.$$\n",
    "\n",
    "where $U(\\alpha, \\beta)$ represents the set of admissible transport plans satisfying the mass conservation constraints.\n",
    "\n",
    "Therefore, to produce a faster inference process, **we choose the latent distribution $q(x_0, x_1) = \\Pi(x_0, x_1)$.**"
   ]
  },
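  {
   "cell_type": "markdown",
   "id": "e5a7c9d1",
   "metadata": {},
   "source": [
    "To make these definitions concrete, consider a small worked example. The points are hypothetical values chosen for illustration: $n = 2$, source points $x_1 = 0$, $x_2 = 3$ and target points $z_1 = 1$, $z_2 = 4$ on the real line. The ground cost matrix and a generic admissible plan, parametrized by $a \\in [0, \\tfrac12]$, are\n",
    "$$\n",
    "C = \\begin{pmatrix} 1 & 4 \\\\ 2 & 1 \\end{pmatrix}, \\qquad \\Pi_a = \\begin{pmatrix} a & \\tfrac12 - a \\\\ \\tfrac12 - a & a \\end{pmatrix}.\n",
    "$$\n",
    "Every row and every column of $\\Pi_a$ sums to $\\tfrac12$, so the mass conservation constraints hold. The transport cost is\n",
    "$$\n",
    "\\langle \\Pi_a, C \\rangle_F = a + 4\\,(\\tfrac12 - a) + 2\\,(\\tfrac12 - a) + a = 3 - 4a,\n",
    "$$\n",
    "which is minimal at $a = \\tfrac12$. The optimal plan matches $x_1$ with $z_1$ and $x_2$ with $z_2$ at a cost of $1$. For comparison, the independent coupling ($a = \\tfrac14$) costs $2$ and the crossing assignment ($a = 0$) costs $3$."
   ]
  },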
  {
   "cell_type": "markdown",
   "id": "e1f48194",
   "metadata": {},
   "source": [
    "Unfortunately, for large dataset, it is not possible to compute $\\Pi$ and we rely on a minibatch approximation instead (see [(Anonymous et al.)](https://proceedings.mlr.press/v108/fatras20a.html) for a reference on minibatch OT and its coupling). Nevertheless, the minibatch OT couplings leads to good performance as shown below and we refer to the TorchCFM notebook **The_unreasonable_performance_of_minibatch_OT** for a longer discussion on minibatch OT."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7831d15",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "import sys\n",
    "import os\n",
    "\n",
    "if 'torchcfm.optimal_transport' in sys.modules:\n",
    "    del sys.modules['torchcfm.optimal_transport']\n",
    "if 'torchcfm.optimal_transport' in sys.modules:\n",
    "    importlib.reload(sys.modules['torchcfm.optimal_transport'])\n",
    "\n",
    "sys.path.append('../../torchcfm')\n",
    "from optimal_transport import OTPlanSampler\n",
    "\n",
    "#ot_sampler = OTPlanSampler(method=\"exact\")\n",
    "#ot_sampler = OTPlanSampler(method=\"sinkhorn\", reg=1.0) # too low reg induces sinkhorn iteration failure\n",
    "ot_sampler = OTPlanSampler(method=\"unbalanced_knopp\",reg=1.0, reg_m=(float(\"inf\"), 1.0)) # float(\"inf\") can be used\n",
    "sigma = 0.1\n",
    "dim = 2\n",
    "batch_size = 256\n",
    "model = MLP(dim=dim, time_varying=True)\n",
    "optimizer = torch.optim.Adam(model.parameters())\n",
    "FM = ConditionalFlowMatcher(sigma=sigma)\n",
    "\n",
    "start = time.time()\n",
    "for k in range(20000):\n",
    "    optimizer.zero_grad()\n",
    "\n",
    "    x0 = sample_8gaussians(batch_size)\n",
    "    x1 = sample_moons(batch_size)\n",
    "\n",
    "    # Draw samples from OT plan\n",
    "    #x0, x1 = ot_sampler.sample_plan(x0, x1) # for exact OT\n",
    "    x0, x1, u, v = ot_sampler.sample_plan_with_weights(x0, x1) # for unbalanced OT\n",
    "    #print(f\"u, zero: {np.sum(u==0)}, not zero: {np.sum(u!=0)}\")\n",
    "    #print(f\"v, zero: {np.sum(v==0)}, not zero: {np.sum(v!=0)}\")\n",
    "\n",
    "    t = torch.rand(x0.shape[0]).type_as(x0)\n",
    "    xt = sample_conditional_pt(x0, x1, t, sigma=0.01)\n",
    "    ut = compute_conditional_vector_field(x0, x1)\n",
    "\n",
    "    vt = model(torch.cat([xt, t[:, None]], dim=-1))\n",
    "    loss = torch.mean((vt - ut) ** 2)\n",
    "\n",
    "    loss.backward()\n",
    "    optimizer.step()\n",
    "\n",
    "    if (k + 1) % 5000 == 0:\n",
    "        end = time.time()\n",
    "        print(f\"{k+1}: loss {loss.item():0.3f} time {(end - start):0.2f}\")\n",
    "        start = end\n",
    "        node = NeuralODE(\n",
    "            torch_wrapper(model), solver=\"dopri5\", sensitivity=\"adjoint\", atol=1e-4, rtol=1e-4\n",
    "        )\n",
    "        with torch.no_grad():\n",
    "            traj = node.trajectory(\n",
    "                sample_8gaussians(1024),\n",
    "                t_span=torch.linspace(0, 1, 100),\n",
    "            )\n",
    "            plot_trajectories(traj.cpu().numpy())"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "torchcfm",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}