Providing Database-like Access to the Web Using Queries Based on Textual Similarity

1998 (modified: 09 Sept 2021), SIGMOD Conference 1998, Readers: Everyone
Abstract: Most databases contain “name constants” like course numbers, personal names, and place names that correspond to entities in the real world. Previous work in integration of heterogeneous databases has assumed that local name constants can be mapped into an appropriate global domain by normalization. Here we assume instead that the names are given in natural language text. We then propose a logic for database integration called WHIRL which reasons explicitly about the similarity of local names, as measured using the vector-space model commonly adopted in statistical information retrieval. An implemented data integration system based on WHIRL has been used to successfully integrate information from several dozen Web sites in two domains.
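
The abstract's idea of measuring the similarity of local names with the vector-space model can be illustrated with a minimal Python sketch. This is not WHIRL itself: the tokenizer, the particular TF-IDF weighting, the function names, and the example names are all assumptions chosen for illustration, and the paper's actual weighting scheme may differ.

```python
import math
import re
from collections import Counter


def tokenize(name):
    # Assumed, simplistic tokenizer: lowercase alphanumeric runs.
    return re.findall(r"[a-z0-9]+", name.lower())


def tfidf_vectors(names):
    """Return one unit-length TF-IDF vector (a dict) per name.

    Weighting is one common variant, (1 + log tf) * log(1 + N / df),
    used here as an assumption rather than as the paper's formula.
    """
    docs = [Counter(tokenize(n)) for n in names]
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(doc.keys())  # document frequency of each term

    vectors = []
    for doc in docs:
        vec = {
            term: (1 + math.log(tf)) * math.log(1 + n_docs / df[term])
            for term, tf in doc.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(u, v):
    # Vectors are already unit length, so cosine is the dot product.
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(term, 0.0) for term, w in u.items())


# Hypothetical name constants, as they might appear in different sources.
names = [
    "Dept. of Computer Science, Carnegie Mellon University",
    "Carnegie Mellon Univ. Computer Science Department",
    "Rutgers University",
]
vecs = tfidf_vectors(names)
same_entity = cosine(vecs[0], vecs[1])
different_entity = cosine(vecs[0], vecs[2])
```

In this sketch the two differently written names for the same department score higher than the pair that shares only the common word "University", which is the kind of graded match that replaces exact equality on normalized name constants.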