Abstract: We propose Network Automatic Relevance Determination (NARD), an extension of ARD for linear probabilistic models, to simultaneously model sparse relationships between inputs $X \in \mathbb R^{d \times N}$ and outputs $Y \in \mathbb R^{m \times N}$ while capturing the correlation structure among the outputs. NARD employs a matrix normal prior containing a sparsity-inducing parameter that identifies and discards irrelevant features, thereby promoting sparsity in the model. Algorithmically, it iteratively updates both the precision matrix and the relationship between $Y$ and the refined inputs. To mitigate the $\mathcal O(m^3 + d^3)$ cost per iteration, we introduce Sequential NARD, which evaluates features sequentially, and a Surrogate Function Method, which leverages an efficient approximation of the marginal likelihood and simplifies the calculation of the determinant and inverse of an intermediate matrix. Combining the sequential update with the Surrogate Function Method further reduces computational cost. The per-iteration complexity of these three methods (Sequential NARD, the Surrogate Function Method, and their combination) is reduced to $\mathcal O(m^3+p^3)$, $\mathcal O(m^3 + d^2)$, and $\mathcal O(m^3+p^2)$ respectively, where $p \ll d$ is the final number of features in the model. Our methods demonstrate significant improvements in computational efficiency with comparable performance on both synthetic and real-world datasets.
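For orientation, the classical single-output ARD that NARD extends can be sketched as below. This is not the NARD algorithm itself: it has no matrix normal prior and no precision matrix over outputs, and it uses the standard fixed-point updates for the per-feature prior precisions. The function name `ard_regression`, its parameters, and the pruning threshold are illustrative assumptions, and the design matrix is stored as $N \times d$ (samples by features), the transpose of the abstract's convention.

```python
import numpy as np

def ard_regression(X, y, n_iter=100, prune_threshold=1e6):
    """Classical single-output ARD with fixed-point hyperparameter updates.

    X: (N, d) design matrix, y: (N,) targets.
    Returns (w, keep): posterior-mean weights of shape (d,) and a boolean
    mask of the features that survived pruning.
    """
    N, d = X.shape
    alpha = np.ones(d)              # per-feature prior precisions
    beta = 1.0 / np.var(y)          # noise precision (crude initial guess)
    keep = np.ones(d, dtype=bool)   # features still in the model
    w = np.zeros(d)
    for _ in range(n_iter):
        idx = np.flatnonzero(keep)
        if idx.size == 0:
            break
        Xk = X[:, idx]
        # Posterior over the active weights; this inverse costs O(p^3)
        # for p active features.
        Sigma = np.linalg.inv(np.diag(alpha[idx]) + beta * Xk.T @ Xk)
        mu = beta * Sigma @ Xk.T @ y
        # gamma_i in [0, 1] measures how well the data determine weight i.
        gamma = np.clip(1.0 - alpha[idx] * np.diag(Sigma), 0.0, 1.0)
        alpha[idx] = gamma / (mu ** 2 + 1e-12)
        resid = y - Xk @ mu
        beta = (N - gamma.sum()) / (resid @ resid + 1e-12)
        # A very large prior precision pins the weight at zero: prune it.
        keep[idx] = alpha[idx] < prune_threshold
        w[:] = 0.0
        w[idx] = mu
    w[~keep] = 0.0
    return w, keep
```

Pruning shrinks the matrix that has to be inverted from $d \times d$ toward $p \times p$, which is the same source of savings that the $p$-dependent complexities in the abstract refer to.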
Lay Summary: Understanding how different biological features — like genes or molecular markers — are connected to physical traits or diseases is a key challenge in biomedical science. These relationships are often complex and involve many interacting factors, making it hard to find what truly matters.
We developed a method called Network Automatic Relevance Determination (NARD) to help identify which features are important and how they relate to each other. Unlike traditional approaches, NARD doesn’t just focus on one-to-one links but also captures how different traits are interrelated. This is especially useful when studying biological systems where multiple traits can be influenced by shared genetic or molecular factors.
To make this process more efficient, we designed optimized versions of our method that significantly reduce the time needed to analyze large datasets. This allows researchers to more quickly and effectively uncover meaningful associations in complex biological networks.
Our tools offer a scalable way to explore multi-dimensional biological data, supporting discoveries that could advance disease research or personalized medicine.
Primary Area: General Machine Learning->Scalable Algorithms
Keywords: Automatic Relevance Determination, Graphical Lasso
Submission Number: 3328