Abstract: Information providers are faced with a critical challenge to process, retrieve and present information to their users in order to satisfy their complex information needs, because data has been increasing in an unprecedented manner, coming from diverse sources, and covering a variety of domains in heterogeneous formats. In this paper, we present Thomson Reuters' effort in developing a family of services for building and querying an enterprise knowledge graph in order to address this challenge. We first acquire data from various sources via different approaches. Furthermore, we mine useful information from the data by adopting a variety of techniques, including Named Entity Recognition and Relation Extraction; such mined information is further integrated with existing structured data (e.g., via Entity Linking techniques) in order to obtain relatively comprehensive descriptions of the entities. By modeling the data as an RDF graph model, we enable easy data management and the embedding of rich semantics in our data. Finally, in order to facilitate the querying of this mined and integrated data, i.e., the knowledge graph, we propose TR Discover, a natural language interface that allows users to ask questions of our knowledge graph in their own words; such natural language questions are then translated into executable queries for answer retrieval. We evaluate our services, i.e., named entity recognition, relation extraction, entity linking and natural language interface, on real-world datasets, and demonstrate and discuss their practicability and limitations.
0 Replies
Loading