Integrated structural variation and point mutation signatures in cancer genomes using correlated topic models

Author summary: Over time, DNA accumulates mutations from a variety of sources. Some mutations result from external mutagens, such as UV radiation, while others result from processes occurring within the cell itself. Each of these sources can impart a characteristic pattern of mutations on the genome, known as a mutation signature, which can be detected using computational techniques. Loss of DNA repair mechanisms can leave specific mutation signatures in the genomes of cancer cells. To identify cancers with broken DNA-repair processes, accurate methods are needed for detecting mutation signatures and, in particular, their activities or probabilities within individual cancers. In this paper, we introduce a class of statistical models from natural language processing, known as “topic models”, and show that they outperform standard methods for signature analysis. Topic models that incorporate correlations between signature probabilities across cancers perform best, while jointly analyzing multiple mutation types improves robustness to low mutation counts.
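To make the idea of per-cancer signature probabilities concrete, the following is a minimal sketch of the generative view behind a topic model of mutations. It is an illustration under stated assumptions, not the implementation used in this paper: the function names, the toy signatures over three mutation categories, and the covariance values are all hypothetical. Signature probabilities are drawn from a logistic-normal distribution, the standard construction that lets a correlated topic model capture correlations between topics, and mutation counts are then drawn from the resulting mixture of signatures.

```python
import numpy as np


def sample_signature_probs(mean, cov, n_samples, seed=0):
    """Draw per-sample signature probabilities from a logistic-normal prior.

    A Gaussian draw is mapped through a softmax, so the covariance matrix
    induces correlations between signature probabilities across samples.
    """
    rng = np.random.default_rng(seed)
    eta = rng.multivariate_normal(mean, cov, size=n_samples)
    eta = eta - eta.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(eta)
    return weights / weights.sum(axis=1, keepdims=True)


def simulate_counts(signature_probs, signatures, n_mutations, seed=0):
    """Simulate mutation counts per category for each sample.

    signature_probs: (n_samples, n_signatures), each row sums to 1
    signatures:      (n_signatures, n_categories), each row sums to 1
    n_mutations:     total number of mutations to draw for each sample
    """
    rng = np.random.default_rng(seed)
    # Each sample's category distribution is a mixture of the signatures.
    category_probs = np.asarray(signature_probs) @ np.asarray(signatures)
    return np.vstack(
        [rng.multinomial(n, p) for n, p in zip(n_mutations, category_probs)]
    )


# Hypothetical toy setup: 2 signatures over 3 mutation categories.
signatures = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.3, 0.6]])
theta = sample_signature_probs(
    mean=np.zeros(2),
    cov=np.array([[1.0, 0.5], [0.5, 1.0]]),  # assumed covariance
    n_samples=4,
    seed=0,
)
counts = simulate_counts(theta, signatures, n_mutations=[50, 100, 200, 400], seed=1)
```

Signature analysis runs in the opposite direction: given only a count matrix like `counts`, it estimates both the signatures and the per-sample probabilities `theta`. Samples with few mutations, such as the first row here, carry less information about `theta`, which is the low-count setting the summary refers to.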