SERCNN: Stacked Embedding Recurrent Convolutional Neural Network in Depression Detection on Twitter

Published: 28 Jan 2022, Last Modified: 13 Feb 2023, ICLR 2022 Submitted
Keywords: Social Media, Twitter, NLP, Depression, Mental Health
Abstract: The conventional self-reporting approach to depression screening is expensive, not scalable, and requires individuals to be fully aware of their own mental health. Motivated by previous studies that demonstrated the great potential of using social media posts to monitor and predict one's mental health status, this study applies natural language processing and machine learning techniques to social media data to predict an individual's risk of depression. Most existing works rely on handcrafted features, and the adoption of deep learning in this domain is still lacking. Social media texts are often unstructured, ill-formed, and contain typos, making handcrafted features and conventional feature extraction methods inefficient. Moreover, prediction models built on these features often require a large number of posts per individual to make accurate predictions. Therefore, this study proposes a Stacked Embedding Recurrent Convolutional Neural Network (SERCNN) that offers a better trade-off between the number of posts and accuracy. Feature vectors from two widely available pretrained embeddings, trained on two distinct corpora, are stacked to form a meta-embedding with a richer and more robust representation for any given word. We adapt the RCNN approach of Lai et al. (2015), which incorporates both the embedding vector and the context learned by the neural network, to form the final user representation before classification. We conducted our experiments on the Shen et al. (2017) depression Twitter dataset, the largest ground-truth dataset used in this domain. Our proposed SERCNN achieves a prediction accuracy of 78% when using only ten posts from each user, and the accuracy increases to 90% with an F1-measure of 0.89 when five hundred posts are analyzed.
One-sentence Summary: SERCNN exploits the rich representation extracted from two pretrained GloVe embeddings together with the context learned by the network, achieving 78% accuracy even when only 10 tweets are available per user.
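
The abstract describes SERCNN as a stacked meta-embedding fed into an RCNN-style (Lai et al., 2015) classifier. The sketch below is an illustrative PyTorch reading of that description, not the authors' implementation: the embedding dimensions, hidden size, concatenation-based stacking, and classifier head are all assumptions made for the example.

```python
# Minimal sketch of the SERCNN idea: two pretrained embeddings are stacked
# (concatenated) into a meta-embedding, an RCNN-style recurrent layer adds
# left/right context per token, and max-pooling forms the user representation
# for binary depression classification. Sizes and layers are illustrative only.

import torch
import torch.nn as nn


class SERCNN(nn.Module):
    def __init__(self, vocab_size, emb_dim_a=300, emb_dim_b=200,
                 hidden_size=128, num_classes=2):
        super().__init__()
        # Two embedding tables standing in for two pretrained GloVe variants;
        # in practice their weights would be loaded from pretrained vectors.
        self.emb_a = nn.Embedding(vocab_size, emb_dim_a)
        self.emb_b = nn.Embedding(vocab_size, emb_dim_b)
        meta_dim = emb_dim_a + emb_dim_b  # stacked meta-embedding size

        # Bidirectional LSTM captures contextual information for each token.
        self.rnn = nn.LSTM(meta_dim, hidden_size, batch_first=True,
                           bidirectional=True)
        # RCNN-style projection over [context; embedding] for every token.
        self.proj = nn.Linear(2 * hidden_size + meta_dim, hidden_size)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) word indices from a user's posts
        meta = torch.cat([self.emb_a(token_ids), self.emb_b(token_ids)], dim=-1)
        context, _ = self.rnn(meta)                    # (batch, seq, 2*hidden)
        combined = torch.cat([context, meta], dim=-1)  # context + embedding
        latent = torch.tanh(self.proj(combined))       # (batch, seq, hidden)
        pooled, _ = latent.max(dim=1)                  # max-pool over tokens
        return self.classifier(pooled)                 # class logits


if __name__ == "__main__":
    model = SERCNN(vocab_size=10_000)
    dummy_posts = torch.randint(0, 10_000, (4, 50))  # 4 users, 50 tokens each
    print(model(dummy_posts).shape)  # torch.Size([4, 2])
```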