In this notebook, we show how a topological loss can be combined with a non-linear embedding procedure, so as to regularize the embedding and better reflect the topological prior, in this case a bifurcation.
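Concretely, writing $\mathcal{L}_{\mathrm{emb}}$ for the embedding (UMAP) loss and $\mathcal{L}_{\mathrm{top}}$ for the topological loss defined further below, the optimized objective is of the form

$$\mathcal{L}(Y) = \mathcal{L}_{\mathrm{emb}}(Y) + \lambda_{\mathrm{top}}\,\mathcal{L}_{\mathrm{top}}(Y),$$

where $\lambda_{\mathrm{top}}$ trades off both terms; in the code below, the factor $\lambda_{\mathrm{top}}$ is folded into the topological loss function itself.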
We start by setting the working directory and importing the necessary libraries.
# Set working directory
import os
os.chdir("..")
# Handling arrays and data.frames
import pandas as pd
import numpy as np
# Loading R objects into python
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri
# Pytorch compatible topology layer and losses
import torch
from topologylayer.nn import AlphaLayer
from Code.losses import DiagramLoss, umap_loss
# Ordinary and topologically regularized UMAP embedding
from Code.topembed import UMAP
UMAP(np.array(np.random.rand(30, 5)), num_epochs=0) # Execute once for numba code compilation
# Plotting
import seaborn as sns
import matplotlib.pyplot as plt
# Quantitative evaluation
from sklearn.svm import SVC
from Code.evaluation import evaluate_embeddings
%matplotlib inline
We now load the data and visualize it by means of its ordinary UMAP embedding.
# Load the data
file_name = os.path.join("Data", "CellBifurcating.rds")
cell_info = ro.r["readRDS"](file_name)
cell_info = dict(zip(cell_info.names, list(cell_info)))
pandas2ri.activate()
data = ro.conversion.rpy2py(cell_info["expression"])
t = list(ro.conversion.rpy2py(cell_info["cell_info"])
.rename(columns={"milestone_id": "group_id"}).loc[:,"group_id"])
pandas2ri.deactivate()
print("Data shape: " + str(data.shape))
# Learning hyperparameters
num_epochs = 100
learning_rate = 1e-1
# Conduct ordinary UMAP embedding
Y_umap, losses_umap = UMAP(data, num_epochs=num_epochs, learning_rate=learning_rate, random_state=42)
# View the data through its UMAP embedding
fig, ax = plt.subplots()
sns.scatterplot(x=Y_umap[:,0], y=Y_umap[:,1], s=50, hue=t, palette="husl")
ax.get_legend().remove()
plt.show()
Data shape: (154, 1770)
[epoch 1] [emb. loss: 9376.855727, top. loss: 0.000000, total loss: 9376.855727]
[epoch 10] [emb. loss: 9103.008830, top. loss: 0.000000, total loss: 9103.008830]
[epoch 20] [emb. loss: 8850.547525, top. loss: 0.000000, total loss: 8850.547525]
[epoch 30] [emb. loss: 8824.020165, top. loss: 0.000000, total loss: 8824.020165]
[epoch 40] [emb. loss: 8703.961411, top. loss: 0.000000, total loss: 8703.961411]
[epoch 50] [emb. loss: 8450.941803, top. loss: 0.000000, total loss: 8450.941803]
[epoch 60] [emb. loss: 8749.852056, top. loss: 0.000000, total loss: 8749.852056]
[epoch 70] [emb. loss: 8618.780763, top. loss: 0.000000, total loss: 8618.780763]
[epoch 80] [emb. loss: 8421.186200, top. loss: 0.000000, total loss: 8421.186200]
[epoch 90] [emb. loss: 8466.879089, top. loss: 0.000000, total loss: 8466.879089]
[epoch 100] [emb. loss: 8419.206423, top. loss: 0.000000, total loss: 8419.206423]
Time for embedding: 00:00:00
We now show how we can bias a non-linear embedding using a loss function that captures our topological prior. This topological loss will be a linear combination of two separate losses:
- a connectedness loss, which penalizes the total (finite) persistence of the 0-dimensional features, pushing the embedding towards a single connected component;
- a flare loss, which rewards the persistence of the third most prominent gap among the points far from the embedding centroid, encouraging three branches that meet in a bifurcation.
To obtain these losses, we require an additional layer that constructs the alpha complex from the embedding, from which persistent homology is subsequently computed.
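Written out, with $(b_k, d_k)$ the $0$-dimensional persistence pairs of the embedding ordered by decreasing persistence ($k = 1$ being the essential component that never dies), and $(\tilde b_k, \tilde d_k)$ those of the points far from the embedding centroid, the loss constructed below can be read as

$$\mathcal{L}_{\mathrm{top}}(Y) = \lambda_{\mathrm{top}}\left(\sum_{k \geq 2} (d_k - b_k) - (\tilde d_3 - \tilde b_3)\right),$$

i.e., it penalizes the total finite persistence of the connected components while rewarding a large persistence for the third most prominent gap among the outer points.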
# Define topological optimization
def g(p): return p[1] - p[0] # function that returns the persistence d - b of a point (b, d)
TopLayer = AlphaLayer(maxdim=1) # alpha complex layer
ComponentPersistence = DiagramLoss(dim=0, i=2, g=g) # compute total (finite) persistence
Component3Persistence = DiagramLoss(dim=0, i=3, j=3, g=g) # compute persistence of third most prominent gap
lambda_top = 1e1 # scalar factor that trades off embedding and topological loss
tau_flare = 0.25 # threshold on the normalized distance to the embedding centroid; the flare loss is computed on the points beyond it
# Construct topological loss function
def top_loss(output):
    # Loss for connectedness
    dgminfo_connected = TopLayer(output)
    loss_connected = ComponentPersistence(dgminfo_connected)
    # Loss for flare
    f = torch.norm(output - torch.mean(output, dim=0), dim=1)
    f /= torch.max(f)
    dgminfo_flare = TopLayer(output[f > tau_flare,:])
    loss_flare = -Component3Persistence(dgminfo_flare)
    # Total loss
    loss = lambda_top * (loss_connected + loss_flare)
    return loss
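As a small side illustration (not part of the pipeline above, and using made-up data), the following snippet shows how the normalized centroid distance f selects the outer "flare" points that are fed into Component3Persistence.

# Toy illustration of the flare-point selection used in top_loss (made-up data)
toy = torch.randn(10, 2)                             # stand-in for a 2D embedding
f = torch.norm(toy - torch.mean(toy, dim=0), dim=1)  # distance of each point to the centroid
f /= torch.max(f)                                    # normalize the distances to [0, 1]
flare_points = toy[f > tau_flare, :]                 # keep only points beyond the threshold
print(flare_points.shape)                            # only these points enter the flare loss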
We can now conduct the topologically regularized embedding as follows.
# Learning hyperparameters
num_epochs = 100
learning_rate = 1e-1
# Conduct topological regularization
Y_top, losses_top = UMAP(data, top_loss=top_loss, num_epochs=num_epochs,
learning_rate=learning_rate, random_state=42)
# View topologically regularized embedding
fig, ax = plt.subplots()
sns.scatterplot(x=Y_top[:,0], y=Y_top[:,1], s=50, hue=t, palette="husl")
ax.get_legend().remove()
plt.show()
[epoch 1] [emb. loss: 9376.855727, top. loss: 1412.720459, total loss: 10789.576186]
[epoch 10] [emb. loss: 9424.549514, top. loss: 967.172241, total loss: 10391.721755]
[epoch 20] [emb. loss: 9237.509794, top. loss: 804.668701, total loss: 10042.178495]
[epoch 30] [emb. loss: 9163.613059, top. loss: 710.923767, total loss: 9874.536827]
[epoch 40] [emb. loss: 9133.507288, top. loss: 658.947815, total loss: 9792.455103]
[epoch 50] [emb. loss: 9239.509584, top. loss: 652.785522, total loss: 9892.295106]
[epoch 60] [emb. loss: 9136.785422, top. loss: 643.777466, total loss: 9780.562888]
[epoch 70] [emb. loss: 9038.252752, top. loss: 629.470581, total loss: 9667.723333]
[epoch 80] [emb. loss: 8974.596282, top. loss: 643.432922, total loss: 9618.029205]
[epoch 90] [emb. loss: 9085.446099, top. loss: 623.973328, total loss: 9709.419426]
[epoch 100] [emb. loss: 8981.926104, top. loss: 630.493286, total loss: 9612.419390]
Time for embedding: 00:00:07
For comparison, we also conduct the same topological optimization procedure directly on the initialized embedding.
# Learning hyperparameters
num_epochs = 100
learning_rate = 1e-1
# Conduct topological optimization (UMAP)
Y_opt, losses_opt = UMAP(Y_umap, emb_loss=False, top_loss=top_loss, num_epochs=num_epochs,
learning_rate=learning_rate, random_state=42)
# View topologically optimized embedding (UMAP)
fig, ax = plt.subplots()
sns.scatterplot(x=Y_opt[:,0], y=Y_opt[:,1], s=50, hue=t, palette="husl")
ax.get_legend().remove()
plt.show()
[epoch 1] [emb. loss: 0.000000, top. loss: 1167.495728, total loss: 1167.495728]
[epoch 10] [emb. loss: 0.000000, top. loss: 723.169067, total loss: 723.169067]
[epoch 20] [emb. loss: 0.000000, top. loss: 539.727295, total loss: 539.727295]
[epoch 30] [emb. loss: 0.000000, top. loss: 429.159058, total loss: 429.159058]
[epoch 40] [emb. loss: 0.000000, top. loss: 364.528503, total loss: 364.528503]
[epoch 50] [emb. loss: 0.000000, top. loss: 320.426392, total loss: 320.426392]
[epoch 60] [emb. loss: 0.000000, top. loss: 288.649963, total loss: 288.649963]
[epoch 70] [emb. loss: 0.000000, top. loss: 267.297821, total loss: 267.297821]
[epoch 80] [emb. loss: 0.000000, top. loss: 252.922058, total loss: 252.922058]
[epoch 90] [emb. loss: 0.000000, top. loss: 242.767914, total loss: 242.767914]
[epoch 100] [emb. loss: 0.000000, top. loss: 233.778214, total loss: 233.778214]
Time for embedding: 00:00:09
We observe that without the embedding loss, the represented topologies are more fragmented, and more of the interior points representing the bifurcation are pulled towards the ends.
First, we evaluate the different losses (embedding and topological) for all final embeddings.
print("\033[1mLosses for umap embedding: \033[0m")
print("Embedding: " + str(umap_loss(losses_umap["P"], torch.tensor(Y_umap).type(torch.float),
losses_umap["a"], losses_umap["b"]).item()))
print("Topological: " + str(top_loss(torch.tensor(Y_umap).type(torch.float)).item() / np.abs(lambda_top)) + "\n")
print("\033[1mLosses for topologically optimized umap embedding: \033[0m")
print("Embedding: " + str(umap_loss(losses_opt["P"], torch.tensor(Y_opt).type(torch.float),
losses_opt["a"], losses_opt["b"]).item()))
print("Topological: " + str(top_loss(torch.tensor(Y_opt).type(torch.float)).item() /
np.abs(lambda_top)) + "\n")
print("\033[1mLosses for topologically regularized umap embedding: \033[0m")
print("Embedding: " + str(umap_loss(losses_top["P"], torch.tensor(Y_top).type(torch.float),
losses_top["a"], losses_top["b"]).item()))
print("Topological: " + str(top_loss(torch.tensor(Y_top).type(torch.float)).item() / np.abs(lambda_top)))
Losses for umap embedding:
Embedding: 8576.364944816134
Topological: 116.74957275390625

Losses for topologically optimized umap embedding:
Embedding: 9932.898996191445
Topological: 23.302349853515626

Losses for topologically regularized umap embedding:
Embedding: 8871.473401997777
Topological: 62.9324462890625
Finally, we evaluate whether the topologically regularized embedding improves on the ordinary UMAP embedding for predicting the data point labels.
# Embeddings to be compared and machine learning model to be used for label prediction
Ys = {"umap": Y_umap, "top. opt.": Y_opt, "top. reg.": Y_top}
model = SVC()
scoring = "accuracy"
# Hyperparameters for quantitative evaluation
ntimes = 100
test_frac = 0.1
params = {"C":[0.01, 0.1, 1, 10, 100]}
# Obtain performances over multiple train-test splits
performances = evaluate_embeddings(Ys, t, model, scoring, params=params, stratify=t,
ntimes=ntimes, test_frac=test_frac, random_state=42)
# View resulting performances
pd.concat([pd.DataFrame({"mean":performances.mean(axis=0)}),
pd.DataFrame({"std":performances.std(axis=0)})], axis=1)\
.style.highlight_max(subset="mean", color="lightgreen", axis=0)
|           | mean     | std      |
|-----------|----------|----------|
| umap      | 0.787500 | 0.077951 |
| top. opt. | 0.811875 | 0.070786 |
| top. reg. | 0.820625 | 0.080763 |
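For reference, the repeated evaluation performed by evaluate_embeddings can be thought of along the lines of the sketch below (the helper sketch_evaluate is hypothetical; the actual implementation in Code.evaluation may differ in details such as the hyperparameter search).

from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import accuracy_score

def sketch_evaluate(Ys, labels, ntimes=100, test_frac=0.1, Cs=(0.01, 0.1, 1, 10, 100), seed=42):
    # Hypothetical sketch: repeated stratified train-test splits with a tuned SVC per embedding
    rows = []
    for i in range(ntimes):
        row = {}
        for name, Y in Ys.items():
            # Stratified split so every branch is represented in both sets
            Y_tr, Y_te, t_tr, t_te = train_test_split(
                Y, labels, test_size=test_frac, stratify=labels, random_state=seed + i)
            # Tune the SVC regularization parameter C on the training split only
            clf = GridSearchCV(SVC(), {"C": list(Cs)}, scoring="accuracy")
            clf.fit(Y_tr, t_tr)
            # Record the test accuracy for this embedding and split
            row[name] = accuracy_score(t_te, clf.predict(Y_te))
        rows.append(row)
    return pd.DataFrame(rows)  # one row per split, one column per embedding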
We now explore how the topological regularization reacts to different thresholds $\tau$ used when optimizing for a flare. The different embeddings are obtained and visualized as follows.
# Learning hyperparameters
num_epochs = 100
learning_rate = 1e-1
# Conduct topological regularization for flare thresholds tau
taus = [0.75, 0.5, 0.25] # note that tau (main paper) = 1 - tau (notebook)
Y_taus = []
for idx, tau in enumerate(taus):
    # Construct topological loss function
    def tau_top_loss(output):
        # Loss for connectedness
        dgminfo_connected = TopLayer(output)
        loss_connected = ComponentPersistence(dgminfo_connected)
        # Loss for flare
        f = torch.norm(output - torch.mean(output, dim=0), dim=1)
        f /= torch.max(f)
        dgminfo_flare = TopLayer(output[f > tau,:])
        loss_flare = -Component3Persistence(dgminfo_flare)
        # Total loss
        loss = lambda_top * (loss_connected + loss_flare)
        return loss
    # Conduct topological regularization
    print("\033[1mConducting topologically regularized embedding for tau = " + str(1 - tau) + "\033[0m")
    this_Y_tau, losses_tau = UMAP(data, top_loss=tau_top_loss, num_epochs=num_epochs,
                                  learning_rate=learning_rate, random_state=42)
    Y_taus.append(this_Y_tau)
    print("\n")
# View topologically regularized embeddings for different thresholds tau
fig, ax = plt.subplots(1, 3, figsize=(15, 3.5))
for idx, tau in enumerate(taus):
    sns.scatterplot(x=Y_taus[idx][:,0], y=Y_taus[idx][:,1], s=50, hue=t, palette="husl", ax=ax[idx])
    ax[idx].get_legend().remove()
    ax[idx].set_title(r"$\tau$ = " + str(1 - tau))
plt.show()
Conducting topologically regularized embedding for tau = 0.25
[epoch 1] [emb. loss: 9376.855727, top. loss: 1458.245239, total loss: 10835.100966]
[epoch 10] [emb. loss: 9415.376703, top. loss: 1013.097839, total loss: 10428.474542]
[epoch 20] [emb. loss: 9367.941448, top. loss: 872.273071, total loss: 10240.214519]
[epoch 30] [emb. loss: 9285.828090, top. loss: 775.673889, total loss: 10061.501979]
[epoch 40] [emb. loss: 9281.692348, top. loss: 737.991821, total loss: 10019.684169]
[epoch 50] [emb. loss: 9125.022434, top. loss: 728.142822, total loss: 9853.165257]
[epoch 60] [emb. loss: 9189.738371, top. loss: 721.931274, total loss: 9911.669646]
[epoch 70] [emb. loss: 9032.871499, top. loss: 710.517517, total loss: 9743.389016]
[epoch 80] [emb. loss: 8947.278904, top. loss: 697.761475, total loss: 9645.040379]
[epoch 90] [emb. loss: 9027.731164, top. loss: 686.090942, total loss: 9713.822106]
[epoch 100] [emb. loss: 8917.668728, top. loss: 682.858276, total loss: 9600.527004]
Time for embedding: 00:00:04

Conducting topologically regularized embedding for tau = 0.5
[epoch 1] [emb. loss: 9376.855727, top. loss: 1366.098755, total loss: 10742.954482]
[epoch 10] [emb. loss: 9180.298697, top. loss: 914.549255, total loss: 10094.847952]
[epoch 20] [emb. loss: 9265.636038, top. loss: 736.941467, total loss: 10002.577505]
[epoch 30] [emb. loss: 9180.415900, top. loss: 632.937561, total loss: 9813.353461]
[epoch 40] [emb. loss: 9252.427713, top. loss: 599.358093, total loss: 9851.785806]
[epoch 50] [emb. loss: 9105.864087, top. loss: 587.109436, total loss: 9692.973523]
[epoch 60] [emb. loss: 9207.009473, top. loss: 579.573669, total loss: 9786.583142]
[epoch 70] [emb. loss: 9173.441741, top. loss: 573.375793, total loss: 9746.817535]
[epoch 80] [emb. loss: 8865.755123, top. loss: 574.220093, total loss: 9439.975216]
[epoch 90] [emb. loss: 9053.634942, top. loss: 579.570435, total loss: 9633.205377]
[epoch 100] [emb. loss: 9162.937852, top. loss: 582.826111, total loss: 9745.763963]
Time for embedding: 00:00:05

Conducting topologically regularized embedding for tau = 0.75
[epoch 1] [emb. loss: 9376.855727, top. loss: 1412.720459, total loss: 10789.576186]
[epoch 10] [emb. loss: 9424.549514, top. loss: 967.172241, total loss: 10391.721755]
[epoch 20] [emb. loss: 9237.509794, top. loss: 804.668701, total loss: 10042.178495]
[epoch 30] [emb. loss: 9163.613059, top. loss: 710.923767, total loss: 9874.536827]
[epoch 40] [emb. loss: 9133.507288, top. loss: 658.947815, total loss: 9792.455103]
[epoch 50] [emb. loss: 9239.509584, top. loss: 652.785522, total loss: 9892.295106]
[epoch 60] [emb. loss: 9136.785422, top. loss: 643.777466, total loss: 9780.562888]
[epoch 70] [emb. loss: 9038.252752, top. loss: 629.470581, total loss: 9667.723333]
[epoch 80] [emb. loss: 8974.596282, top. loss: 643.432922, total loss: 9618.029205]
[epoch 90] [emb. loss: 9085.446099, top. loss: 623.973328, total loss: 9709.419426]
[epoch 100] [emb. loss: 8981.926104, top. loss: 630.493286, total loss: 9612.419390]
Time for embedding: 00:00:06
Different powers of the persistence may result in different behavior of the topological regularization. We explore this (keeping all other hyperparameters identical) as follows.
# Learning hyperparameters
num_epochs = 100
learning_rate = 1e-1
# Conduct topological regularization for different powers of the persistence lifetime
powers = [1 / 2, 1, 2] # p = 1 equals our previous case, included for easy comparison
Y_tops = []
for idx, p in enumerate(powers):
    # Construct topological loss function
    def gp(d): return (d[1] - d[0])**p # returns the p-th power of the persistence d - b of a point (b, d)
    pComponentPersistence = DiagramLoss(dim=0, i=2, g=gp) # for low total (finite) persistence
    pComponent3Persistence = DiagramLoss(dim=0, i=3, j=3, g=gp) # for high persistence of third most prominent gap
    def p_top_loss(output):
        # Loss for connectedness
        dgminfo_connected = TopLayer(output)
        loss_connected = pComponentPersistence(dgminfo_connected)
        # Loss for flare
        f = torch.norm(output - torch.mean(output, dim=0), dim=1)
        f /= torch.max(f)
        dgminfo_flare = TopLayer(output[f > tau_flare,:])
        loss_flare = -pComponent3Persistence(dgminfo_flare)
        # Total loss
        loss = lambda_top * (loss_connected + loss_flare)
        return loss
    # Conduct topological regularization
    print("\033[1mConducting topologically regularized embedding for lifetime power " + str(p) + "\033[0m")
    this_Y_top, losses_top = UMAP(data, top_loss=p_top_loss, num_epochs=num_epochs,
                                  learning_rate=learning_rate, random_state=42)
    Y_tops.append(this_Y_top)
    print("\n")
# View topologically regularized embeddings for different persistence lifetime powers
fig, ax = plt.subplots(1, 3, figsize=(15, 3.5))
for idx, p in enumerate(powers):
    sns.scatterplot(x=Y_tops[idx][:,0], y=Y_tops[idx][:,1], s=50, hue=t, palette="husl", ax=ax[idx])
    ax[idx].get_legend().remove()
    ax[idx].set_title("p = " + str(p))
plt.show()
Conducting topologically regularized embedding for lifetime power 0.5
[epoch 1] [emb. loss: 9376.855727, top. loss: 1400.352173, total loss: 10777.207900]
[epoch 10] [emb. loss: 9150.710593, top. loss: 1202.514771, total loss: 10353.225364]
[epoch 20] [emb. loss: 9151.445905, top. loss: 1126.345215, total loss: 10277.791120]
[epoch 30] [emb. loss: 9055.769724, top. loss: 1076.250000, total loss: 10132.019724]
[epoch 40] [emb. loss: 8940.097309, top. loss: 1046.674316, total loss: 9986.771625]
[epoch 50] [emb. loss: 8987.077069, top. loss: 1028.369385, total loss: 10015.446454]
[epoch 60] [emb. loss: 8903.144402, top. loss: 1028.905396, total loss: 9932.049798]
[epoch 70] [emb. loss: 8937.635326, top. loss: 1003.204895, total loss: 9940.840221]
[epoch 80] [emb. loss: 8786.045473, top. loss: 1004.524902, total loss: 9790.570375]
[epoch 90] [emb. loss: 8853.924331, top. loss: 984.087219, total loss: 9838.011550]
[epoch 100] [emb. loss: 8987.524060, top. loss: 987.737366, total loss: 9975.261426]
Time for embedding: 00:00:07

Conducting topologically regularized embedding for lifetime power 1
[epoch 1] [emb. loss: 9376.855727, top. loss: 1412.720459, total loss: 10789.576186]
[epoch 10] [emb. loss: 9424.549514, top. loss: 967.172241, total loss: 10391.721755]
[epoch 20] [emb. loss: 9237.509794, top. loss: 804.668701, total loss: 10042.178495]
[epoch 30] [emb. loss: 9163.613059, top. loss: 710.923767, total loss: 9874.536827]
[epoch 40] [emb. loss: 9133.507288, top. loss: 658.947815, total loss: 9792.455103]
[epoch 50] [emb. loss: 9239.509584, top. loss: 652.785522, total loss: 9892.295106]
[epoch 60] [emb. loss: 9136.785422, top. loss: 643.777466, total loss: 9780.562888]
[epoch 70] [emb. loss: 9038.252752, top. loss: 629.470581, total loss: 9667.723333]
[epoch 80] [emb. loss: 8974.596282, top. loss: 643.432922, total loss: 9618.029205]
[epoch 90] [emb. loss: 9085.446099, top. loss: 623.973328, total loss: 9709.419426]
[epoch 100] [emb. loss: 8981.926104, top. loss: 630.493286, total loss: 9612.419390]
Time for embedding: 00:00:07

Conducting topologically regularized embedding for lifetime power 2
[epoch 1] [emb. loss: 9376.855727, top. loss: 1620.952148, total loss: 10997.807875]
[epoch 10] [emb. loss: 9415.493040, top. loss: 460.171783, total loss: 9875.664824]
[epoch 20] [emb. loss: 9242.411157, top. loss: 153.157578, total loss: 9395.568734]
[epoch 30] [emb. loss: 9168.856423, top. loss: -5.797577, total loss: 9163.058846]
[epoch 40] [emb. loss: 9254.240796, top. loss: -21.707993, total loss: 9232.532803]
[epoch 50] [emb. loss: 9219.296454, top. loss: -116.257019, total loss: 9103.039435]
[epoch 60] [emb. loss: 9142.757277, top. loss: -162.312286, total loss: 8980.444990]
[epoch 70] [emb. loss: 9267.836763, top. loss: -182.079132, total loss: 9085.757631]
[epoch 80] [emb. loss: 9196.095215, top. loss: -146.404800, total loss: 9049.690415]
[epoch 90] [emb. loss: 9374.382597, top. loss: -141.650543, total loss: 9232.732054]
[epoch 100] [emb. loss: 9039.833428, top. loss: -109.867249, total loss: 8929.966180]
Time for embedding: 00:00:07
We observe that the topological regularization is reasonably stable against the choice of the power of the persistence lifetime.
Finally, we study how the topologically regularized embedding varies under a different, potentially wrong, prior. In particular, we study the topologically regularized embedding when the topological loss function is designed to make the persistence of the most prominent cycle high. All other hyperparameters are kept equal.
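In the notation used earlier, with $(b^{(1)}_1, d^{(1)}_1)$ the most persistent $1$-dimensional (cycle) pair of the embedding, the alternative loss below simply reads

$$\mathcal{L}^{\mathrm{cycle}}_{\mathrm{top}}(Y) = -\lambda_{\mathrm{top}}\,\big(d^{(1)}_1 - b^{(1)}_1\big),$$

so that minimizing it maximizes the persistence of the most prominent cycle.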
# Define different topological loss
def g(p): return p[1] - p[0] # function that returns the persistence d - b of a point (b, d)
TopLayer = AlphaLayer(maxdim=1) # alpha complex layer
CircularPersistence = DiagramLoss(dim=1, j=1, g=g) # compute persistence of most prominent cycle
lambda_top = 1e1 # scalar factor that trades off embedding and topological loss
# Construct the topological loss functions
def top_loss_other(output):
dgminfo = TopLayer(output)
loss = -lambda_top * (CircularPersistence(dgminfo))
return loss
We explore the topologically regularized embedding for different numbers of epochs.
# Learning hyperparameters
num_epochs = [100, 250, 1000]
learning_rate = 1e-1
# Conduct topological regularization for different numbers of epochs
Y_epochs = {}
for epochs in num_epochs:
    print("\033[1mConducting topologically regularized embedding for " + str(epochs) + " epochs\033[0m")
    Y_epochs[epochs], losses_top_other = UMAP(data, top_loss=top_loss_other, num_epochs=epochs,
                                              learning_rate=learning_rate, random_state=42)
    print("\n")
# View topologically regularized embeddings
fig, ax = plt.subplots(1, len(num_epochs), figsize=(len(num_epochs) * 5, 3.5))
for idx, epochs in enumerate(num_epochs):
    sns.scatterplot(x=Y_epochs[epochs][:,0], y=Y_epochs[epochs][:,1], s=50, hue=t, palette="husl", ax=ax[idx])
    ax[idx].get_legend().remove()
    ax[idx].set_title(str(epochs) + " epochs (with UMAP loss)")
plt.show()
Conducting topologically regularized embedding for 100 epochs
[epoch 1] [emb. loss: 9376.855727, top. loss: -15.400949, total loss: 9361.454778]
[epoch 10] [emb. loss: 9039.074319, top. loss: -26.457535, total loss: 9012.616785]
[epoch 20] [emb. loss: 8869.637188, top. loss: -33.650097, total loss: 8835.987091]
[epoch 30] [emb. loss: 8756.361980, top. loss: -37.516628, total loss: 8718.845352]
[epoch 40] [emb. loss: 8618.454895, top. loss: -39.182213, total loss: 8579.272682]
[epoch 50] [emb. loss: 8743.236561, top. loss: -44.038883, total loss: 8699.197678]
[epoch 60] [emb. loss: 8721.251276, top. loss: -38.449944, total loss: 8682.801333]
[epoch 70] [emb. loss: 8716.132112, top. loss: -43.078751, total loss: 8673.053362]
[epoch 80] [emb. loss: 8541.475153, top. loss: -41.121223, total loss: 8500.353930]
[epoch 90] [emb. loss: 8631.374361, top. loss: -40.123985, total loss: 8591.250375]
[epoch 100] [emb. loss: 8543.023783, top. loss: -37.061806, total loss: 8505.961977]
Time for embedding: 00:00:03

Conducting topologically regularized embedding for 250 epochs
[epoch 1] [emb. loss: 9376.855727, top. loss: -15.400949, total loss: 9361.454778]
[epoch 25] [emb. loss: 8742.524909, top. loss: -37.192924, total loss: 8705.331984]
[epoch 50] [emb. loss: 8743.236561, top. loss: -44.038883, total loss: 8699.197678]
[epoch 75] [emb. loss: 8599.089407, top. loss: -42.880722, total loss: 8556.208685]
[epoch 100] [emb. loss: 8543.023783, top. loss: -37.061806, total loss: 8505.961977]
[epoch 125] [emb. loss: 8472.216253, top. loss: -41.600071, total loss: 8430.616182]
[epoch 150] [emb. loss: 8426.644576, top. loss: -42.705544, total loss: 8383.939032]
[epoch 175] [emb. loss: 8596.689063, top. loss: -45.832348, total loss: 8550.856715]
[epoch 200] [emb. loss: 8512.560573, top. loss: -47.835533, total loss: 8464.725040]
[epoch 225] [emb. loss: 8489.846744, top. loss: -47.965004, total loss: 8441.881740]
[epoch 250] [emb. loss: 8378.663044, top. loss: -51.647968, total loss: 8327.015076]
Time for embedding: 00:00:09

Conducting topologically regularized embedding for 1000 epochs
[epoch 1] [emb. loss: 9376.855727, top. loss: -15.400949, total loss: 9361.454778]
[epoch 100] [emb. loss: 8543.023783, top. loss: -37.061806, total loss: 8505.961977]
[epoch 200] [emb. loss: 8512.560573, top. loss: -47.835533, total loss: 8464.725040]
[epoch 300] [emb. loss: 8365.334773, top. loss: -51.214813, total loss: 8314.119960]
[epoch 400] [emb. loss: 8355.265528, top. loss: -40.292225, total loss: 8314.973303]
[epoch 500] [emb. loss: 8446.866949, top. loss: -40.418922, total loss: 8406.448026]
[epoch 600] [emb. loss: 8570.864994, top. loss: -40.061611, total loss: 8530.803382]
[epoch 700] [emb. loss: 8522.397476, top. loss: -37.163643, total loss: 8485.233833]
[epoch 800] [emb. loss: 8517.306434, top. loss: -39.136063, total loss: 8478.170371]
[epoch 900] [emb. loss: 8544.170196, top. loss: -37.374268, total loss: 8506.795928]
[epoch 1000] [emb. loss: 8543.322056, top. loss: -37.489784, total loss: 8505.832272]
Time for embedding: 00:00:34
We observe that the cycle struggles to enlarge for higher numbers of epochs. That this is due to the inclusion of the UMAP loss can be confirmed by conducting the same topological optimization without the UMAP loss.
# Learning hyperparameters
num_epochs = [100, 250, 1000]
learning_rate = 1e-1
# Conduct topological optimization for different numbers of epochs
Y_opt_epochs = {}
for epochs in num_epochs:
    print("\033[1mConducting topological optimization for " + str(epochs) + " epochs\033[0m")
    Y_opt_epochs[epochs], losses_opt_other = UMAP(Y_umap, emb_loss=False, top_loss=top_loss_other,
                                                  num_epochs=epochs, learning_rate=learning_rate, random_state=42)
    print("\n")
# View topologically optimized embeddings
fig, ax = plt.subplots(1, len(num_epochs), figsize=(len(num_epochs) * 5, 3.5))
for idx, epochs in enumerate(num_epochs):
    sns.scatterplot(x=Y_opt_epochs[epochs][:,0], y=Y_opt_epochs[epochs][:,1],
                    s=50, hue=t, palette="husl", ax=ax[idx])
    ax[idx].get_legend().remove()
    ax[idx].set_title(str(epochs) + " epochs (without UMAP loss)")
plt.show()
Conducting topological optimization for 100 epochs
[epoch 1] [emb. loss: 0.000000, top. loss: -8.952889, total loss: -8.952889]
[epoch 10] [emb. loss: 0.000000, top. loss: -29.148804, total loss: -29.148804]
[epoch 20] [emb. loss: 0.000000, top. loss: -36.754570, total loss: -36.754570]
[epoch 30] [emb. loss: 0.000000, top. loss: -47.674465, total loss: -47.674465]
[epoch 40] [emb. loss: 0.000000, top. loss: -55.700363, total loss: -55.700363]
[epoch 50] [emb. loss: 0.000000, top. loss: -63.632767, total loss: -63.632767]
[epoch 60] [emb. loss: 0.000000, top. loss: -72.809006, total loss: -72.809006]
[epoch 70] [emb. loss: 0.000000, top. loss: -78.857086, total loss: -78.857086]
[epoch 80] [emb. loss: 0.000000, top. loss: -83.158836, total loss: -83.158836]
[epoch 90] [emb. loss: 0.000000, top. loss: -88.401649, total loss: -88.401649]
[epoch 100] [emb. loss: 0.000000, top. loss: -91.513374, total loss: -91.513374]
Time for embedding: 00:00:03

Conducting topological optimization for 250 epochs
[epoch 1] [emb. loss: 0.000000, top. loss: -8.952889, total loss: -8.952889]
[epoch 25] [emb. loss: 0.000000, top. loss: -42.277710, total loss: -42.277710]
[epoch 50] [emb. loss: 0.000000, top. loss: -63.632767, total loss: -63.632767]
[epoch 75] [emb. loss: 0.000000, top. loss: -79.726387, total loss: -79.726387]
[epoch 100] [emb. loss: 0.000000, top. loss: -91.513374, total loss: -91.513374]
[epoch 125] [emb. loss: 0.000000, top. loss: -97.180016, total loss: -97.180016]
[epoch 150] [emb. loss: 0.000000, top. loss: -110.392914, total loss: -110.392914]
[epoch 175] [emb. loss: 0.000000, top. loss: -116.347336, total loss: -116.347336]
[epoch 200] [emb. loss: 0.000000, top. loss: -128.504272, total loss: -128.504272]
[epoch 225] [emb. loss: 0.000000, top. loss: -129.867584, total loss: -129.867584]
[epoch 250] [emb. loss: 0.000000, top. loss: -142.175247, total loss: -142.175247]
Time for embedding: 00:00:08

Conducting topological optimization for 1000 epochs
[epoch 1] [emb. loss: 0.000000, top. loss: -8.952889, total loss: -8.952889]
[epoch 100] [emb. loss: 0.000000, top. loss: -91.513374, total loss: -91.513374]
[epoch 200] [emb. loss: 0.000000, top. loss: -128.504272, total loss: -128.504272]
[epoch 300] [emb. loss: 0.000000, top. loss: -156.839783, total loss: -156.839783]
[epoch 400] [emb. loss: 0.000000, top. loss: -183.337234, total loss: -183.337234]
[epoch 500] [emb. loss: 0.000000, top. loss: -210.123260, total loss: -210.123260]
[epoch 600] [emb. loss: 0.000000, top. loss: -238.513275, total loss: -238.513275]
[epoch 700] [emb. loss: 0.000000, top. loss: -263.181915, total loss: -263.181915]
[epoch 800] [emb. loss: 0.000000, top. loss: -289.233246, total loss: -289.233246]
[epoch 900] [emb. loss: 0.000000, top. loss: -314.791534, total loss: -314.791534]
[epoch 1000] [emb. loss: 0.000000, top. loss: -318.556366, total loss: -318.556366]
Time for embedding: 00:00:32
Naturally, by increasing the topological regularization strength, representations of false topological models may still be obtained. We explore how this can be diagnosed from the data as follows.
# Multipliers for the topological regularization strength (weaker, equal, stronger)
lambda_top_multipliers = [1 / 10, 1, 10]
# Learning hyperparameters
num_epochs = 1000
learning_rate = 1e-1
# Conduct topological regularization for different regularization strengths
Y_lambdas = {}
losses_lambdas = {}
for m in lambda_top_multipliers:
    def new_lambda_top_loss(output): return m * top_loss_other(output)
    print("\033[1mConducting topologically regularized embedding for lambda_top = " +
          str(m * lambda_top) + "\033[0m")
    Y_lambdas[m], losses_lambdas[m] = UMAP(data, top_loss=new_lambda_top_loss, num_epochs=num_epochs,
                                           learning_rate=learning_rate, random_state=42)
    print("\n")
# View topologically regularized embeddings
fig, ax = plt.subplots(1, len(lambda_top_multipliers), figsize=(len(lambda_top_multipliers) * 5, 3.5))
for idx, m in enumerate(lambda_top_multipliers):
    sns.scatterplot(x=Y_lambdas[m][:,0], y=Y_lambdas[m][:,1], s=50, hue=t, palette="husl", ax=ax[idx])
    ax[idx].get_legend().remove()
    ax[idx].set_title("$\lambda_{{{}}}$ = {} \n {} epochs (with UMAP loss)".
                      format("top", m * lambda_top, num_epochs))
plt.show()
Conducting topologically regularized embedding for lambda_top = 1.0
[epoch 1] [emb. loss: 9376.855727, top. loss: -1.540095, total loss: 9375.315632]
[epoch 100] [emb. loss: 8381.159537, top. loss: -2.912832, total loss: 8378.246706]
[epoch 200] [emb. loss: 8490.889921, top. loss: -2.164753, total loss: 8488.725168]
[epoch 300] [emb. loss: 8457.992684, top. loss: -2.143497, total loss: 8455.849187]
[epoch 400] [emb. loss: 8397.265691, top. loss: -3.013646, total loss: 8394.252045]
[epoch 500] [emb. loss: 8442.959480, top. loss: -2.163042, total loss: 8440.796438]
[epoch 600] [emb. loss: 8631.406087, top. loss: -2.433714, total loss: 8628.972373]
[epoch 700] [emb. loss: 8472.105310, top. loss: -2.030765, total loss: 8470.074545]
[epoch 800] [emb. loss: 8449.583733, top. loss: -1.350383, total loss: 8448.233349]
[epoch 900] [emb. loss: 8438.075491, top. loss: -1.218628, total loss: 8436.856862]
[epoch 1000] [emb. loss: 8342.530128, top. loss: -1.157551, total loss: 8341.372577]
Time for embedding: 00:00:34

Conducting topologically regularized embedding for lambda_top = 10.0
[epoch 1] [emb. loss: 9376.855727, top. loss: -15.400949, total loss: 9361.454778]
[epoch 100] [emb. loss: 8543.023783, top. loss: -37.061806, total loss: 8505.961977]
[epoch 200] [emb. loss: 8512.560573, top. loss: -47.835533, total loss: 8464.725040]
[epoch 300] [emb. loss: 8365.334773, top. loss: -51.214813, total loss: 8314.119960]
[epoch 400] [emb. loss: 8355.265528, top. loss: -40.292225, total loss: 8314.973303]
[epoch 500] [emb. loss: 8446.866949, top. loss: -40.418922, total loss: 8406.448026]
[epoch 600] [emb. loss: 8570.864994, top. loss: -40.061611, total loss: 8530.803382]
[epoch 700] [emb. loss: 8522.397476, top. loss: -37.163643, total loss: 8485.233833]
[epoch 800] [emb. loss: 8517.306434, top. loss: -39.136063, total loss: 8478.170371]
[epoch 900] [emb. loss: 8544.170196, top. loss: -37.374268, total loss: 8506.795928]
[epoch 1000] [emb. loss: 8543.322056, top. loss: -37.489784, total loss: 8505.832272]
Time for embedding: 00:00:35

Conducting topologically regularized embedding for lambda_top = 100.0
[epoch 1] [emb. loss: 9376.855727, top. loss: -154.009491, total loss: 9222.846236]
[epoch 100] [emb. loss: 8599.419618, top. loss: -834.255493, total loss: 7765.164125]
[epoch 200] [emb. loss: 8765.762889, top. loss: -1308.315674, total loss: 7457.447215]
[epoch 300] [emb. loss: 8617.362082, top. loss: -1559.152100, total loss: 7058.209982]
[epoch 400] [emb. loss: 8786.222871, top. loss: -2075.254395, total loss: 6710.968476]
[epoch 500] [emb. loss: 8793.820944, top. loss: -2353.928467, total loss: 6439.892477]
[epoch 600] [emb. loss: 8986.778646, top. loss: -2795.119873, total loss: 6191.658773]
[epoch 700] [emb. loss: 9019.626057, top. loss: -3065.746826, total loss: 5953.879231]
[epoch 800] [emb. loss: 9053.186037, top. loss: -3411.338379, total loss: 5641.847658]
[epoch 900] [emb. loss: 8861.325488, top. loss: -3748.851318, total loss: 5112.474170]
[epoch 1000] [emb. loss: 9075.385470, top. loss: -4178.979980, total loss: 4896.405489]
Time for embedding: 00:00:36
We see that for low regularization strengths, the topological prior has little to no impact on the embedding, whereas for too high regularization strengths, the cycle becomes an unnatural representation of the data, as most points remain clustered together. We can investigate the evolution of the losses during optimization for the different regularization strengths as follows.
for m in lambda_top_multipliers:
    # Summarize the topological and embedding losses in a data frame
    pd_losses_m = pd.DataFrame({"epoch":range(num_epochs),
                                "embedding (with top. reg.)":losses_lambdas[m]["losses"]["embedding"],
                                "topological (with UMAP)":losses_lambdas[m]["losses"]["topological"] /
                                                          (m * lambda_top)})
    # Plot the losses according to the number of epochs
    ax = pd_losses_m.plot(x="epoch", y="embedding (with top. reg.)", legend=False, figsize=(15, 3), color="#2c7fb8")
    ax.set_ylabel("embedding loss")
    ax2 = ax.twinx()
    pd_losses_m.plot(x="epoch", y="topological (with UMAP)", legend=False, color="#ff7f0e", ax=ax2)
    ax2.set_ylabel("topological loss")
    ax.figure.legend(bbox_to_anchor=(0.6, 0.875))
    ax.set_title("$\lambda_{{{}}}$ = {}".format("top", m * lambda_top))
    plt.show()