Question Classification Using Universal Sentence Encoder and Deep Contextualized Transformer

Najam Arif, Seemab Latif, Rabia Latif

Published: 2021, Last Modified: 19 Feb 2025. Venue: DeSE 2021. License: CC BY-SA 4.0

Abstract: Question classification is one of the most vital steps in automatic question answering systems. It is also known as answer type classification, identification, or prediction. Precise identification of the answer type allows irrelevant candidates to be eliminated from the pool of answers available for a question, so a more accurate question classification phase leads to more accurate answers. This paper proposes an approach, named Question Sentence Embedding (QSE), that classifies questions using semantic features. Extracting a large number of features does not always solve the problem. Our approach simplifies the feature extraction stage by omitting two kinds of features: named entities, which appear in few questions because questions are short, and hypernyms and hyponyms of a word, which require the WordNet extension and therefore make the system more dependent on external sources. We encourage the use of the Universal Sentence Encoder with a Transformer encoder to obtain a fixed-size sentence-level embedding vector for each question, and then compute the semantic similarity among these vectors to classify questions into their predefined categories. Because the COVID-19 global pandemic has made people curious to ask questions about COVID, we use the publicly available COVID-Q dataset for our experiments. The approach achieves an accuracy of 69% on COVID questions and outperforms the baseline method, demonstrating the efficacy of QSE.
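The similarity-based classification step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the paper: the embedder is a toy bag-of-words stand-in rather than the Universal Sentence Encoder, the decision rule is a nearest-neighbour lookup by cosine similarity (the abstract does not say how similarities are mapped to categories), and the example questions, labels, and the names `tokenize`, `make_toy_embedder`, `cosine`, and `classify` are all hypothetical.

```python
import math

def tokenize(sentence):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (tok.strip("?.,!") for tok in sentence.lower().split())
    return [tok for tok in tokens if tok]

def make_toy_embedder(sentences):
    """Build a fixed-size bag-of-words embedder from a list of sentences.

    ASSUMPTION: this is only a stand-in for a real sentence encoder. Any
    function that maps a sentence to a fixed-size vector could be used instead.
    """
    vocab = sorted({tok for s in sentences for tok in tokenize(s)})
    index = {tok: i for i, tok in enumerate(vocab)}

    def embed(sentence):
        vec = [0.0] * len(index)
        for tok in tokenize(sentence):
            if tok in index:  # tokens outside the vocabulary are ignored
                vec[index[tok]] += 1.0
        return vec

    return embed

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def classify(question, labeled, embed):
    """Return (label, score) of the labeled question most similar to `question`.

    ASSUMPTION: nearest neighbour by cosine similarity is an illustrative
    decision rule, not necessarily the one used by QSE.
    """
    query_vec = embed(question)
    best_label, best_score = None, -1.0
    for text, label in labeled:
        score = cosine(query_vec, embed(text))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score

# Hypothetical labeled questions, invented for illustration (not from COVID-Q).
LABELED = [
    ("How does the virus spread?", "transmission"),
    ("Can the virus spread through food?", "transmission"),
    ("What are the symptoms of COVID?", "symptoms"),
    ("Is fever a symptom of COVID?", "symptoms"),
    ("How can I protect myself from infection?", "prevention"),
]

embed = make_toy_embedder([text for text, _ in LABELED])

if __name__ == "__main__":
    print(classify("What are the common symptoms?", LABELED, embed))
    print(classify("Does the virus spread by air?", LABELED, embed))
```

Because `classify` takes the embedder as a parameter, the toy function can be replaced by a real sentence encoder without changing the similarity or decision logic.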