Topological regularization of bifurcating cell trajectory embedding

In this notebook, we show how a topological loss can be combined with a non-linear embedding procedure, so as to regularize the embedding and better reflect the topological prior (in this case, a bifurcation).

We start by setting the working directory and importing the necessary libraries.
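A minimal setup sketch follows; the working directory and the exact dependency set (numpy, pandas, matplotlib, torch, umap-learn, topologylayer) are assumptions, not necessarily the notebook's originals.

```python
import os

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
import umap

os.chdir("path/to/project")  # hypothetical working directory
np.random.seed(42)           # fix seeds for reproducibility
torch.manual_seed(42)
```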

Load data and view ordinary UMAP embedding

We load the data and visualize it by means of an ordinary UMAP embedding.
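A sketch of this step is given below; the file name and the assumption that the data is a cells-by-features matrix are hypothetical placeholders.

```python
# Hypothetical file name; assumed to contain a cells-by-features expression matrix.
data = pd.read_csv("bifurcating_cells.csv", index_col=0)
X = data.values

# Ordinary (unregularized) UMAP embedding for reference.
reducer = umap.UMAP(n_components=2, random_state=42)
embedding_umap = reducer.fit_transform(X)

plt.scatter(embedding_umap[:, 0], embedding_umap[:, 1], s=5)
plt.title("Ordinary UMAP embedding")
plt.show()
```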

Apply topological regularization to the embedding

We now show how we can bias a non-linear embedding using a loss function that captures our topological prior. This topological loss will be a linear combination of two separate losses:

To obtain these losses, we require an additional layer that constructs an alpha complex from the embedding, from which persistent homology is subsequently computed.
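A minimal sketch of such a layer, assuming the `topologylayer` package: `AlphaLayer` builds an alpha complex on the 2D coordinates and returns its persistence diagrams, and `BarcodePolyFeature` turns a diagram into a differentiable scalar (here, the sum of squared 0-dimensional lifetimes).

```python
from topologylayer.nn import AlphaLayer, BarcodePolyFeature

alpha_layer = AlphaLayer(maxdim=1)          # persistent homology up to dimension 1
poly_feature = BarcodePolyFeature(0, 2, 0)  # positional args (dim, power, ...) as in topologylayer

# Demonstrate one forward pass on the UMAP coordinates.
y0 = torch.tensor(embedding_umap, dtype=torch.float32, requires_grad=True)
dgminfo = alpha_layer(y0)        # persistence diagrams of the alpha complex
print(poly_feature(dgminfo))     # differentiable topological summary
```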

We can now compute the topologically regularized embedding as follows.
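The loop below is a schematic sketch rather than the notebook's exact code: the UMAP loss is stood in for by a simple pairwise-distance-preservation term, and the bifurcation (flare) prior by the connectivity term built from the layer above; `regularized_embedding`, `topo_loss_bifurcation`, and the hyperparameter values are assumptions.

```python
def regularized_embedding(X, topo_loss_fn, lam=0.1, epochs=500, lr=1e-2):
    """Gradient descent on the embedding coordinates for
    embedding loss + lam * topological loss (sketch)."""
    X_t = torch.tensor(X, dtype=torch.float32)
    dist_X = torch.cdist(X_t, X_t)
    # Initialize from the ordinary UMAP embedding computed above.
    y = torch.tensor(embedding_umap, dtype=torch.float32, requires_grad=True)
    opt = torch.optim.Adam([y], lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        # Stand-in for the UMAP loss: preserve pairwise distances.
        embedding_loss = ((torch.cdist(y, y) - dist_X) ** 2).mean()
        loss = embedding_loss + lam * topo_loss_fn(y)
        loss.backward()
        opt.step()
    return y.detach().numpy()

def topo_loss_bifurcation(y):
    # Placeholder for the notebook's bifurcation prior: penalizing the squared
    # 0-dimensional lifetimes pulls the embedding towards a single connected piece.
    return poly_feature(alpha_layer(y))

emb_topo_reg = regularized_embedding(X, topo_loss_bifurcation)

plt.scatter(emb_topo_reg[:, 0], emb_topo_reg[:, 1], s=5)
plt.title("Topologically regularized embedding (sketch)")
plt.show()
```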

Compare with ordinary topological optimization

For comparison, we also conduct the same topological optimization procedure directly on the initialized embedding.
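A sketch of this comparison, reusing the placeholder topological loss from above: the same optimization, but without the embedding term, starting from the same UMAP-initialized coordinates.

```python
y_topo = torch.tensor(embedding_umap, dtype=torch.float32, requires_grad=True)
opt = torch.optim.Adam([y_topo], lr=1e-2)

for _ in range(500):
    opt.zero_grad()
    loss = topo_loss_bifurcation(y_topo)  # topological term only
    loss.backward()
    opt.step()

emb_topo_only = y_topo.detach().numpy()
plt.scatter(emb_topo_only[:, 0], emb_topo_only[:, 1], s=5)
plt.title("Topological optimization only (sketch)")
plt.show()
```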

We observe that without the embedding loss, the represented topology is more fragmented, and more of the interior points representing the bifurcation are pulled towards the branch ends.

Quantitative evaluation

First, we evaluate the different losses (embedding and topological) for all final embeddings.
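A sketch of this evaluation, using the stand-in loss terms and variable names introduced above.

```python
X_t = torch.tensor(X, dtype=torch.float32)
dist_X = torch.cdist(X_t, X_t)

with torch.no_grad():
    for name, emb in [("ordinary UMAP", embedding_umap),
                      ("topologically regularized", emb_topo_reg),
                      ("topological optimization only", emb_topo_only)]:
        y_t = torch.tensor(emb, dtype=torch.float32)
        e_loss = ((torch.cdist(y_t, y_t) - dist_X) ** 2).mean()
        t_loss = topo_loss_bifurcation(y_t)
        print(f"{name}: embedding loss {e_loss.item():.4f}, "
              f"topological loss {t_loss.item():.4f}")
```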

Finally, we check whether the topologically regularized embedding improves on the ordinary UMAP embedding for predicting the data point labels.
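One possible evaluation (the label column name, classifier, and its settings are assumptions) is a cross-validated k-nearest-neighbour accuracy on the 2D coordinates.

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

labels = data["branch"].values  # hypothetical label column

for name, emb in [("ordinary UMAP", embedding_umap),
                  ("topologically regularized", emb_topo_reg)]:
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=10), emb, labels, cv=5)
    print(f"{name}: mean CV accuracy {scores.mean():.3f}")
```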

Varying the threshold $\tau$ in the flare loss

We explore how the topological regularization reacts to different thresholds $\tau$ used in the flare loss. The different embeddings are obtained and visualized as follows.
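A sketch of this sweep; `flare_loss(y, tau)` is a hypothetical handle to the notebook's flare loss (not reproduced here), and the grid of thresholds is an assumption.

```python
taus = [0.1, 0.25, 0.5, 1.0]  # assumed grid of thresholds

fig, axes = plt.subplots(1, len(taus), figsize=(4 * len(taus), 3))
for ax, tau in zip(axes, taus):
    # `flare_loss` is hypothetical; it stands for the notebook's flare loss.
    emb = regularized_embedding(X, lambda y, tau=tau: flare_loss(y, tau))
    ax.scatter(emb[:, 0], emb[:, 1], s=5)
    ax.set_title(f"tau = {tau}")
plt.show()
```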

Topological regularization for different powers of the persistence lifetime

Different powers of the persistence lifetime may result in different behavior of the topological regularization. We explore this (keeping all other hyperparameters identical) as follows.
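A sketch of this sweep, again using the placeholder connectivity term from above: `BarcodePolyFeature(0, p, 0)` sums the p-th powers of the 0-dimensional lifetimes (argument order as assumed earlier), and the grid of powers is an assumption.

```python
powers = [1, 2, 3]

fig, axes = plt.subplots(1, len(powers), figsize=(4 * len(powers), 3))
for ax, p in zip(axes, powers):
    feature_p = BarcodePolyFeature(0, p, 0)
    emb = regularized_embedding(X, lambda y, f=feature_p: f(alpha_layer(y)))
    ax.scatter(emb[:, 0], emb[:, 1], s=5)
    ax.set_title(f"lifetime power p = {p}")
plt.show()
```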

We observe that the topological regularization is reasonably stable against the choice of the power of the persistence lifetime.

Topological regularization for a different shape prior

Finally, we study how the topologically regularized embedding varies under a different, potentially wrong, prior. In particular, we consider the case where the topological loss function is designed to ensure that the persistence of the most prominent cycle is high. All other hyperparameters are kept the same.
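A sketch of this (deliberately wrong) cycle prior, assuming `topologylayer`'s `TopKBarcodeLengths`: the loss rewards a large lifetime for the single most prominent 1-dimensional feature (a cycle) in the embedding.

```python
from topologylayer.nn import TopKBarcodeLengths

top1_cycle = TopKBarcodeLengths(1, 1)  # positional args: dimension 1, top-1 bar

def cycle_loss(y):
    dgminfo = alpha_layer(y)
    return -top1_cycle(dgminfo).sum()  # maximize the most prominent cycle's lifetime

emb_cycle = regularized_embedding(X, cycle_loss)
plt.scatter(emb_cycle[:, 0], emb_cycle[:, 1], s=5)
plt.title("Cycle prior (sketch)")
plt.show()
```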

We explore the topologically regularized embedding for increasing numbers of epochs.
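A sketch of the epoch sweep for the cycle prior; the grid of epoch counts is an assumption.

```python
for n_epochs in [100, 250, 500, 1000]:
    emb = regularized_embedding(X, cycle_loss, epochs=n_epochs)
    plt.figure()
    plt.scatter(emb[:, 0], emb[:, 1], s=5)
    plt.title(f"{n_epochs} epochs")
plt.show()
```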

We observe that the cycle struggles to enlarge for higher numbers of epochs. That this is due to the inclusion of the UMAP loss can be confirmed by conducting the same topological optimization without the UMAP loss.

Naturally, by increasing the topological regularization strength, representations of false topological models may still be obtained. We explore how this can be diagnosed from the data as follows.
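A sketch of the strength sweep for the cycle prior; the grid of values is an assumption.

```python
lams = [0.01, 0.1, 1.0, 10.0]

fig, axes = plt.subplots(1, len(lams), figsize=(4 * len(lams), 3))
for ax, lam in zip(axes, lams):
    emb = regularized_embedding(X, cycle_loss, lam=lam)
    ax.scatter(emb[:, 0], emb[:, 1], s=5)
    ax.set_title(f"lambda = {lam}")
plt.show()
```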

We see that for low regularization strengths, the topological prior has little to no impact on the embedding, whereas for too high regularization strengths, the cycle becomes an unnatural representation of the data, as most points remain clustered together. We can investigate the evolution of the losses during optimization for the different regularization strengths as follows.
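A sketch of this diagnostic, using a history-tracking variant of the helper introduced earlier: both loss terms are recorded per epoch and the topological loss curves are compared across regularization strengths.

```python
def regularized_embedding_with_history(X, topo_loss_fn, lam=0.1, epochs=500, lr=1e-2):
    """Same schematic loop as before, but also returns per-epoch loss histories."""
    X_t = torch.tensor(X, dtype=torch.float32)
    dist_X = torch.cdist(X_t, X_t)
    y = torch.tensor(embedding_umap, dtype=torch.float32, requires_grad=True)
    opt = torch.optim.Adam([y], lr=lr)
    history = {"embedding": [], "topological": []}
    for _ in range(epochs):
        opt.zero_grad()
        e_loss = ((torch.cdist(y, y) - dist_X) ** 2).mean()
        t_loss = topo_loss_fn(y)
        (e_loss + lam * t_loss).backward()
        opt.step()
        history["embedding"].append(e_loss.item())
        history["topological"].append(t_loss.item())
    return y.detach().numpy(), history

for lam in lams:
    _, hist = regularized_embedding_with_history(X, cycle_loss, lam=lam)
    plt.plot(hist["topological"], label=f"lambda = {lam}")
plt.xlabel("epoch")
plt.ylabel("topological loss")
plt.legend()
plt.show()
```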