Turkish Rule-Based Official Document Type Detection

2020 (modified: 13 Nov 2021), SIU 2020
Abstract: This study is the first stage of an industrial application to be used in DATAMIN, a product being developed to help companies adapt to Personal Data Protection Law (DPL) No. 6698, which came into force in Turkey in 2016, by extracting and relating personal information in official documents. A rule-based method for detecting official document types was developed. It determines the distinctive effect values of the field names in the documents and matches those field names using flexible regular expressions and minimum edit distance. The proposed method was found to be highly effective and to model document types accurately when high-quality optical character recognition was available.
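The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: look for weighted field names in OCR text, first with a flexible regular expression and then with a minimum-edit-distance fallback. The rule table `RULES`, its field names and weights, the `tolerance` threshold, and the function names are all hypothetical and are not taken from the paper or from DATAMIN. Field names are written in plain ASCII to sidestep Turkish-specific case folding.

```python
import re

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

# Hypothetical rules: document type -> {field name: assumed weight}.
RULES = {
    "invoice": {"fatura no": 3.0, "vergi dairesi": 2.0, "toplam": 1.0},
    "payroll": {"bordro": 3.0, "net ucret": 2.0, "sgk": 1.0},
}

def field_present(field: str, text: str, tolerance: float = 0.2) -> bool:
    """Return True if the field name occurs in text, exactly or approximately."""
    # Flexible regex: allow any run of non-word characters between the words.
    pattern = r"\W*".join(re.escape(word) for word in field.split())
    if re.search(pattern, text):
        return True
    # Fallback: compare each window of words against the field name and
    # accept it when the edit distance stays within an assumed tolerance.
    words = re.findall(r"\w+", text)
    n = len(field.split())
    limit = int(len(field) * tolerance)
    for i in range(len(words) - n + 1):
        window = " ".join(words[i:i + n])
        if edit_distance(window, field) <= limit:
            return True
    return False

def detect_type(ocr_text: str):
    """Score every document type and return (best type or None, all scores)."""
    text = ocr_text.lower()
    scores = {
        doc_type: sum(weight for field, weight in fields.items()
                      if field_present(field, text))
        for doc_type, fields in RULES.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), scores
```

For example, `detect_type("FATURA N0: 12345  Vergi Dairesi: Kadikoy  TOPLAM 100 TL")` returns `"invoice"` as the best type even though OCR has turned "NO" into "N0": the regular expression misses that field, but the edit-distance fallback accepts it with a distance of 1. Summing weights per type is one simple way to let more distinctive field names count for more; the paper may combine its values differently.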