Abstract: We present the information extraction system Text2SemRel. The system (semi-)automatically constructs knowledge bases from textual data; these knowledge bases consist of facts about entities, expressed as semantic relations. An integral part of the system is a graph-based interactive visualization and search layer. As a second contribution, we present a case study on the (semi-)automatic construction of a knowledge base of gene-disease associations. The resulting knowledge base, the Literature-derived Human Gene-Disease Network (LHGDN), is now an integral part of the Linked Life Data initiative and is currently the largest publicly available gene-disease repository. The LHGDN is compared against several curated state-of-the-art databases. A unique feature of the LHGDN is that the semantics of its associations cover a wide variety of biomolecular conditions.