Bilingual Terminology Mining - Using Brain, not brawn comparable corpora

2007 (modified: 13 Nov 2022), ACL 2007
Abstract: Current research in text mining favours the quantity of texts over their quality. But for bilingual terminology mining, and for many language pairs, large comparable corpora are not available. More importantly, as terms are defined vis-a-vis a specific domain with a restricted register, it is expected that the quality rather than the quantity of the corpus matters more in terminology mining. Our hypothesis, therefore, is that the quality of the corpus is more important than the quantity and ensures the quality of the acquired terminological resources. We show how important the type of discourse is as a characteristic of the comparable corpus.