CGDM: collaborative genomic data model for molecular profiling data using NoSQL

Shicai Wang, Mihaela A. Mares, Yike Guo

Published: 2016, Last Modified: 15 Nov 2024, Bioinform. 2016, CC BY-SA 4.0

Abstract: High-throughput molecular profiling has greatly improved patient stratification and mechanistic understanding of diseases. With the increasing amount of data used in translational medicine studies in recent years, there is a need to improve the performance of data warehouses in terms of data retrieval and statistical processing. Both relational and key-value models have been used for managing molecular profiling data. Key-value models such as SeqWare have been shown to be particularly advantageous in terms of query processing speed for large datasets. However, further improvement can be achieved, particularly through better indexing techniques for key-value models that take advantage of the query types specific to high-throughput molecular profiling data.