Abstract: Non-linear dimensionality reduction can be performed by manifold learning approaches, such as Stochastic Neighbour Embedding (SNE), Locally Linear Embedding (LLE) and Isometric Feature Mapping (ISOMAP). These methods aim to produce two- or three-dimensional latent embeddings, primarily to visualise the data in intelligible representations. This manuscript proposes extensions of Student's t-distributed SNE (t-SNE), LLE and ISOMAP for dimensionality reduction and visualisation of multi-view data. Multi-view data refers to multiple types of data generated from the same samples.
The proposed multi-view approaches provide more comprehensible projections of the samples compared to the ones obtained by visualising each data-view separately. Commonly, visualisation is used for identifying underlying patterns within the samples. By incorporating the obtained low-dimensional embeddings from the multi-view manifold approaches into the $K$-means clustering algorithm, it is shown that clusters of the samples are accurately identified. Through extensive comparisons of novel and existing multi-view manifold learning algorithms on real and synthetic data, the proposed multi-view extension of t-SNE, named multi-SNE, is found to have the best performance. We further illustrate the applicability of the multi-SNE approach for the analysis of multi-omics single-cell data, where the aim is to visualise and identify cell heterogeneity and cell types in biological tissues relevant to health and disease.
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: 1. We thank Reviewer gKV1 for taking the time to check the two manuscripts. As they pointed out, both manuscripts were submitted as preprints in January 2021. As the reviewer suggested, it is important to set the record straight and clarify that the two works were done in parallel. Following this recommendation, we have updated the following paragraph in the Introduction:
In parallel to our work, Do and Canzar (2021) proposed a multi-view extension of t-SNE, named j-SNE. Both multi-SNE and j-SNE first appeared as preprints in January 2021. j-SNE produces low-dimensional embeddings through an iterative procedure that assigns each data-view a weight, updated at every iteration through regularisation.
2. The description of the simulation strategy has been updated to clearly explain the logic and the steps implemented in obtaining the simulated data. The following text has been added to the manuscript at the relevant places:
- Section 3.1:
Noise, $\boldsymbol{\epsilon}$, increases the variability within a given data-view. The purpose of this additional variability is to assess whether the algorithms are able to equally capture information from all data-views and are not biased towards the data-view(s) with a higher variability. Thus, noise, $\boldsymbol{\epsilon}$, was only included in selected data-views and not in the rest. Although this strategy is equivalent to sampling once using a larger variance, the extra noise explicitly distinguishes the data-view with the higher variability from the rest.
- MMDS simulation:
In this scenario, only the third data-view contained an extra noise parameter, $\boldsymbol{\epsilon}$, resulting in a data-view with a higher variability than the other two data-views.
- MCS simulation:
Similarly to MMDS and NDS, in this scenario, only the third data-view contained an extra noise parameter, $\boldsymbol{\epsilon}$, resulting in a data-view with a higher variability than the other two data-views.
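For illustration, a minimal sketch of this noise-injection strategy is shown below. The sample sizes, variances and the use of \texttt{numpy} are illustrative assumptions, not the exact simulation settings of the manuscript.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 300, 100      # illustrative sizes, not the manuscript's settings

# Three data-views generated from the same samples (here: a shared signal matrix).
signal = rng.normal(size=(n_samples, n_features))
view_1 = signal + rng.normal(scale=1.0, size=(n_samples, n_features))
view_2 = signal + rng.normal(scale=1.0, size=(n_samples, n_features))

# Only the third data-view receives the extra noise term epsilon,
# which explicitly raises its variability above that of the other two views.
epsilon = rng.normal(scale=2.0, size=(n_samples, n_features))
view_3 = signal + rng.normal(scale=1.0, size=(n_samples, n_features)) + epsilon
```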
3. In the first paragraph of Section 4, the following was added to explain the process of applying the single-view manifold learning algorithms to the concatenated features:
Comparisons between the multi-view solutions and their respective single-view solutions are implemented. A trivial solution is to concatenate the features of all data-views into a single large data matrix and apply a single-view manifold learning algorithm to this dataset. Since each data-view is likely to have a different variability, each data-view was first normalised before concatenation to ensure the same variability across all data-views. Normalisation was achieved by removing the mean and dividing by the standard deviation of each feature in all data-views.
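As a rough illustration of this concatenation baseline, the sketch below standardises each data-view per feature and concatenates the results; applying scikit-learn's \texttt{TSNE} to the concatenated matrix is shown only as an example of a single-view method and is not the implementation used in the manuscript.

```python
import numpy as np
from sklearn.manifold import TSNE

def concatenate_views(views):
    """Z-score each data-view (per feature) and concatenate along the feature axis."""
    standardised = []
    for X in views:
        X = np.asarray(X, dtype=float)
        std = X.std(axis=0)
        std[std == 0] = 1.0                    # guard against constant features
        standardised.append((X - X.mean(axis=0)) / std)
    return np.concatenate(standardised, axis=1)

# Example usage (variable names are hypothetical):
# X_concat = concatenate_views([view_1, view_2, view_3])
# embedding = TSNE(n_components=2).fit_transform(X_concat)
```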
4. We have updated the last paragraph in Section 2.2 to include the following:
The original t-SNE implementation was applied in the presented work. All t-SNE results presented in this manuscript were based on the original \texttt{R} implementation (\url{https://cran.r-project.org/web/packages/tsne/}) and verified by the original \texttt{Python} implementation (\url{https://lvdmaaten.github.io/tsne/}).
5. The description of step 3 in Section 2.4 has been updated as follows:
The $i^{th}$ component of the low-dimensional embedding is given by $y_i = \sqrt{\lambda_p}u^i_p$, where $u^i_p$ is the $i^{th}$ component of the $p^{th}$ eigenvector and $\lambda_p$ is the $p^{th}$ eigenvalue, in decreasing order, of the matrix $\tau(D_G)$ \citep{tenenbaum_isomap}. The operator $\tau$ is defined by $\tau(D) = - \frac{HSH}{2}$, where $S$ is the matrix of squared distances defined by $S_{ij} = D_{ij}^2$, and $H$ is the centring matrix defined by $H_{ij} = \delta_{ij} - \frac{1}{N}$. This is equivalent to applying classical MDS to $D_G$, leading to a low-dimensional embedding that best preserves the manifold's estimated intrinsic geometry.
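A minimal numerical sketch of this step is given below, assuming a precomputed geodesic distance matrix \texttt{D\_G}; the function name and the use of \texttt{numpy} are illustrative.

```python
import numpy as np

def classical_mds(D_G, n_components=2):
    """Embed a geodesic distance matrix via the tau operator and its top eigenvectors."""
    N = D_G.shape[0]
    S = D_G ** 2                               # squared distances, S_ij = D_ij^2
    H = np.eye(N) - np.ones((N, N)) / N        # centring matrix, H_ij = delta_ij - 1/N
    tau = -H @ S @ H / 2                       # tau(D) = -HSH / 2
    eigvals, eigvecs = np.linalg.eigh(tau)
    order = np.argsort(eigvals)[::-1]          # eigenvalues in decreasing order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # i-th component of the embedding: y_i = sqrt(lambda_p) * u_p^i
    return eigvecs[:, :n_components] * np.sqrt(np.maximum(eigvals[:n_components], 0.0))
```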
6. The following statement was added in the caption of Table 1 to highlight how we defined high-dimensional data:
Real data are taken as heterogeneous, whereas the synthetic data are regarded as homogeneous. High-dimensional data contain many more features than samples ($p \gg N$).
7. Consensus matrix here refers to the combined weight matrix as described in the manuscript. A clearer definition is now included in the manuscript (see Section 2.3):
This solution minimises the cost function by assuming a consensus weight matrix across all data-views, as given in equation (9).
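As a rough, hedged sketch of what a consensus (combined) weight matrix can look like, the snippet below forms a single pairwise matrix from view-specific probability matrices via a simple weighted average; this weighting scheme is an illustrative assumption and is not claimed to reproduce equation (9) of the manuscript.

```python
import numpy as np

def consensus_weight_matrix(P_views, view_weights=None):
    """Combine view-specific pairwise weight matrices into one consensus matrix.

    P_views      : list of (N, N) pairwise probability/weight matrices, one per data-view.
    view_weights : optional non-negative weights over the views (defaults to equal weights).
    """
    M = len(P_views)
    if view_weights is None:
        view_weights = np.full(M, 1.0 / M)     # equal weighting as a simple default
    P = sum(w * np.asarray(Pm, dtype=float) for w, Pm in zip(view_weights, P_views))
    return P / P.sum()                         # renormalise so entries sum to one
```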
Assigned Action Editor: ~Seungjin_Choi1
Submission Number: 1327