Mining and Using Key-Words and Key-Phrases to Identify the Era of an Anonymous Text

Published: 01 Jan 2017, Last Modified: 06 Jun 2025, Trans. Comput. Collect. Intell. 2017, CC BY-SA 4.0
Abstract: This study aims to determine the time-frame in which the author of a given document lived. The documents are rabbinic documents written in Hebrew-Aramaic languages. They are undated and contain no bibliographic section, which makes the task challenging. To estimate the time-frame, we define a set of key-phrases and formulate three types of rules: “Iron-clad”, Heuristic, and Greedy. These rules are based on key-phrases and key-words found in the authors' documents. Identifying an author's time-frame can help determine the generation in which specific documents were written, can support the examination of documents, i.e., to conclude whether documents were edited, and can also help identify an anonymous author. We tested these rules on two corpora of responsa documents. The results are promising, and they are better for the larger corpus than for the smaller one.