SympGAN: A systematic knowledge integration system for symptom-gene associations network

Published: 01 Jan 2023, Last Modified: 05 Aug 2025, Venue: Knowl. Based Syst. 2023, License: CC BY-SA 4.0
Abstract: Highlights
• We have developed SympGAN, a high-quality knowledge graph-based system that encompasses the most comprehensive terminology set of 12,560 symptom phenotypes, together with their associations with genes, diseases, and drugs.
• The SympGAN knowledge graph comprises 401,126 symptom–gene triples. These triples and their accompanying data were collected through careful procedures to ensure quality. We use the RoBERTa-PubMed model for named entity recognition (NER) and mine the biomedical literature to gather relevant information, and we apply high-precision algorithms to infer phenotypic associations with genes.
• The website http://www.sympgan.org/ lets users run integrative searches and perform online knowledge inference and analysis. It serves as a centralized hub for exploring clinical knowledge about symptoms and their related diseases, genes, drugs, and molecular networks. The resource supports the interpretation and exploration of symptom phenotypes, particularly their genetic origins, and thereby contributes to precision health research and symptom science.
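
To make the idea of symptom–gene triples concrete, here is a minimal sketch of how such a knowledge graph could be stored and queried. It is an illustration only, not the SympGAN implementation: the `Triple` type, the relation names, the helper functions, and every identifier such as `symptom:S1` or `gene:G1` are hypothetical placeholders, not records from the actual dataset.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One (head, relation, tail) edge in a toy knowledge graph."""
    head: str
    relation: str
    tail: str


# Placeholder entries only (assumption): these are NOT SympGAN records.
TRIPLES = [
    Triple("symptom:S1", "associated_with_gene", "gene:G1"),
    Triple("symptom:S1", "associated_with_gene", "gene:G2"),
    Triple("symptom:S2", "associated_with_gene", "gene:G2"),
    Triple("symptom:S1", "manifestation_of", "disease:D1"),
]


def build_index(triples):
    """Index tail entities by (head, relation) for direct lookups."""
    index = defaultdict(set)
    for t in triples:
        index[(t.head, t.relation)].add(t.tail)
    return index


def genes_for_symptom(index, symptom):
    """Return the sorted genes linked to a symptom, or [] if none."""
    return sorted(index.get((symptom, "associated_with_gene"), set()))


def symptoms_sharing_gene(triples, gene):
    """Return the sorted symptoms that are linked to the given gene."""
    return sorted(
        {t.head for t in triples
         if t.relation == "associated_with_gene" and t.tail == gene}
    )


INDEX = build_index(TRIPLES)
```

With this toy data, `genes_for_symptom(INDEX, "symptom:S1")` looks up the genes attached to one symptom, and `symptoms_sharing_gene(TRIPLES, "gene:G2")` goes the other way, which is the kind of symptom-to-gene and gene-to-symptom search the highlights describe.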