Abstract: With the growing volume of omics data, there is a pressing need to integrate these datasets to build a more robust understanding of the underlying biological processes. We cast this problem as a noisy multiview independent component analysis (ICA) task by assuming that each observed dataset (view) is a linear mixture of independent latent biological processes. Furthermore, we assume that each view contains a mixture of shared and view-specific (individual) sources. To estimate the sources, we optimize a constrained form of the joint log-likelihood of the observed data across all views. Finally, we apply the proposed model to a challenging real-life application, in which the shared sources estimated from two large transcriptome datasets (observed data) provided by two different labs (two different views) yield a more plausible representation of the underlying graph structure than existing baselines.
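As a rough illustration of the generative assumption described above, the following minimal sketch simulates views that are each a noisy linear mixture of shared and individual sources. It is not the estimation procedure itself. The function name `simulate_views`, the Laplace source distribution, the square Gaussian mixing matrices, the additive Gaussian noise, and all parameter values are assumptions made for this example only and are not taken from the abstract.

```python
import numpy as np

def simulate_views(n_views=2, n_shared=3, n_individual=2,
                   n_samples=1000, noise_std=0.1, seed=0):
    """Simulate views as noisy linear mixtures of shared and individual sources.

    All distributional choices here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Shared sources: non-Gaussian (Laplace) and common to every view.
    shared = rng.laplace(size=(n_shared, n_samples))
    n_sources = n_shared + n_individual
    views, mixings = [], []
    for _ in range(n_views):
        # Individual sources: drawn independently for each view.
        individual = rng.laplace(size=(n_individual, n_samples))
        sources = np.vstack([shared, individual])
        # View-specific square mixing matrix (assumed Gaussian entries).
        mixing = rng.normal(size=(n_sources, n_sources))
        # Additive Gaussian observation noise (assumed noise model).
        noise = noise_std * rng.normal(size=(n_sources, n_samples))
        views.append(mixing @ sources + noise)
        mixings.append(mixing)
    return views, mixings, shared

views, mixings, shared = simulate_views()
```

Each element of `views` has one row per mixed signal and one column per sample. Only the `shared` block of sources is common to all views, and it is this block that a multiview ICA method would aim to recover.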