Enhancing cross-language information retrieval by an automatic acquisition of bilingual terminology from comparable corpora

2003 (modified: 12 Nov 2022), SIGIR 2003, Readers: Everyone
Abstract: This paper presents an approach to bilingual lexicon extraction from comparable corpora and evaluates it on Cross-Language Information Retrieval. We explore bi-directional extraction of bilingual terminology, primarily from comparable corpora. We propose a combined statistics-based and linguistics-based model that selects the best translation candidates for phrasal translation. Evaluations on a large Japanese-English test collection showed that the proposed combination of bi-directional comparable corpora, bilingual dictionaries, and transliteration, augmented with linguistics-based pruning, is highly effective in Cross-Language Information Retrieval.