Track: long paper (up to 6 pages)
Keywords: somatic variant calling, cancer genomics, deep learning, transfer learning, FFPE tumor samples, clinical diagnostics
TL;DR: A deep transfer learning-based cancer variant caller for clinical FFPE tumor biopsies that outperforms existing methods
Abstract: Detecting somatic mutations in tumors is fundamental to improving our understanding of cancer and to recommending customized treatment strategies. In clinical settings, next-generation sequencing is performed predominantly on tumor biopsies that have undergone formalin fixation and paraffin embedding (FFPE), which allows laboratories to store and transport samples at room temperature without significantly affecting their morphology. However, the FFPE process is known to chemically alter DNA, which significantly degrades the accuracy of variant calling algorithms, most of which were developed for research-grade fresh frozen (FF) samples. Hence, there is a mismatch between the software tools used in research settings and those needed in clinical settings. Here we develop FFPENet, a convolutional neural network (CNN) trained to identify and remove FFPE artifacts from the output of somatic variant callers run on FFPE tumor samples. FFPENet encodes the raw reads overlapping each mutation as a multi-channel input tensor. It takes a CNN pre-trained on a large number of FF tumor whole-genome samples and fine-tunes it on a multi-sector cohort of patient-matched FF and FFPE tumor samples. We demonstrate the improved effectiveness of this approach over existing methods by benchmarking on a multi-sector whole-exome sequencing (WES) lung cancer cohort.
Submission Number: 18
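Illustrative sketch: The abstract describes encoding the raw reads that overlap each candidate mutation as a multi-channel tensor. The snippet below is a minimal sketch of one way such an encoding could look, not FFPENet's actual implementation. The `Read` type, the `encode_pileup` function, the choice of four channels (base identity, base quality, strand, mismatch against the reference), and the scaling constants are all assumptions made for illustration; indels and mapping quality are ignored.

```python
from typing import List, NamedTuple


class Read(NamedTuple):
    """Hypothetical, simplified aligned read (no indels or clipping)."""
    start: int        # 0-based reference position of the first base
    bases: str        # aligned bases
    quals: List[int]  # Phred quality score per base
    reverse: bool     # True if aligned to the reverse strand


# Assumed numeric code per base; 0.0 is reserved for "no coverage".
BASE_CODE = {"A": 0.25, "C": 0.5, "G": 0.75, "T": 1.0}


def encode_pileup(reads: List[Read], ref: str, window_start: int,
                  max_reads: int = 4) -> List[List[List[float]]]:
    """Encode reads over a reference window as [channel][read][position]."""
    width = len(ref)
    n_channels = 4  # base identity, base quality, strand, mismatch
    tensor = [[[0.0] * width for _ in range(max_reads)]
              for _ in range(n_channels)]
    for row, read in enumerate(reads[:max_reads]):
        for i, base in enumerate(read.bases):
            col = read.start + i - window_start
            if not 0 <= col < width:
                continue  # base falls outside the encoded window
            tensor[0][row][col] = BASE_CODE.get(base, 0.0)
            tensor[1][row][col] = min(read.quals[i], 40) / 40.0
            tensor[2][row][col] = 1.0 if read.reverse else 0.5
            tensor[3][row][col] = 1.0 if base != ref[col] else 0.0
    return tensor


# Toy window: the second read carries a C>T change at the second position,
# the substitution type commonly associated with FFPE cytosine deamination.
ref = "ACGTC"
reads = [
    Read(start=100, bases="ACGTC", quals=[40] * 5, reverse=False),
    Read(start=100, bases="ATG", quals=[20, 20, 20], reverse=True),
]
tensor = encode_pileup(reads, ref, window_start=100)
```

A tensor of this shape can be fed to a 2D CNN in the same way as a multi-channel image. Rows beyond the available reads stay zero, so every candidate site yields a fixed-size input regardless of coverage.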