Domain-Independent Automated Processing of Free-Form Text Data in Telecom

2019 (modified: 02 Nov 2022), ICDE 2019, Readers: Everyone
Abstract: Free-form, unstructured, and semi-structured textual data has become increasingly prevalent in the telecommunications industry, among service and equipment providers alike. Typical examples include text from customer care tickets, machine logs, alarm and alerting systems, and diagnostics. There is a growing business need to rapidly and automatically understand the key topics and categories underlying these bulk collections of text. Because the present mode of operation relies on domain experts to analyze textual data, there is a clear need to apply text analytics to automate the process. Difficulties arise from the jargon-filled, fragmented, and incomplete nature of textual data in this field. In this paper, we propose a domain-agnostic, unsupervised approach that deploys a multi-stage text processing pipeline to automatically discover the key topics and categories in free-form text documents. Using anonymized datasets retrieved from actual customer care tickets and system logs, we show that our approach outperforms traditional text mining approaches and performs comparably to manual categorization undertaken by domain experts with full system knowledge.