An image representation based convolutional network for DNA classification

Bojian Yin, Marleen Balvert, Davide Zambrano, Alexander Schoenhuth, Sander Bohte

Feb 15, 2018 (modified: Feb 22, 2018) ICLR 2018 Conference Blind Submission readers: everyone
  • Abstract: The folding structure of the DNA molecule combined with helper molecules, also referred to as the chromatin, is highly relevant for the functional properties of DNA. The chromatin structure is largely determined by the underlying primary DNA sequence, though the interaction is not yet fully understood. In this paper we develop a convolutional neural network that takes an image representation of the primary DNA sequence as its input and predicts key determinants of chromatin structure. The method is designed to detect interactions between distal elements in the DNA sequence, which are known to be highly relevant. Our experiments show that the method outperforms several existing methods in both prediction accuracy and training time.
  • TL;DR: A method to transform DNA sequences into 2D images using space-filling Hilbert curves, so that the strengths of CNNs can be exploited
  • Keywords: DNA sequences, Hilbert curves, Convolutional neural networks, chromatin structure
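
The sequence-to-image idea summarized in the TL;DR can be illustrated with a minimal sketch. It is not the authors' implementation: the per-nucleotide one-hot encoding, the automatic choice of grid size, and the names `hilbert_d2xy`, `sequence_to_image`, and `ALPHABET` are assumptions made for illustration. The index-to-coordinate mapping is the standard iterative Hilbert curve construction, which keeps consecutive sequence positions on neighbouring pixels.

```python
ALPHABET = "ACGT"  # assumed channel order; one channel per nucleotide


def hilbert_d2xy(order, d):
    """Map position d along a Hilbert curve to (x, y) on a 2**order square grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate (and possibly flip) the current quadrant.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def sequence_to_image(seq):
    """Place a one-hot encoded DNA sequence along a Hilbert curve.

    Returns a nested list indexed as img[row][col][channel]. The grid is the
    smallest 2**order square that holds the sequence; unused pixels and
    unknown symbols (e.g. 'N') are left as all zeros.
    """
    order = 0
    while (1 << order) ** 2 < len(seq):
        order += 1
    side = 1 << order
    img = [[[0.0] * len(ALPHABET) for _ in range(side)] for _ in range(side)]
    for d, base in enumerate(seq.upper()):
        channel = ALPHABET.find(base)
        if channel < 0:
            continue  # skip symbols outside the assumed alphabet
        x, y = hilbert_d2xy(order, d)
        img[y][x][channel] = 1.0
    return img
```

For example, `sequence_to_image("ACGTAC")` yields a 4 x 4 grid with four channels in which six pixels are set. A nested list of this shape can be converted to an array and passed to a 2D convolutional network; positions that are adjacent in the sequence stay adjacent in the image, while some positions that are far apart in the sequence also become spatial neighbours.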