Overview of IEEE BigData 2024 Cup Challenges: Suicide Ideation Detection on Social Media

Jun Li, Yifei Yan, Ziyan Zhang, Xiangmeng Wang, Hong Va Leong, Nancy Xiaonan Yu, Qing Li

Published: 2024, Last Modified: 23 Jan 2026. Venue: IEEE Big Data 2024. License: CC BY-SA 4.0

Abstract: This overview presents one of the cup challenges of IEEE BigData 2024, on the topic of suicide risk level detection in social media posts. Given a training set of 2,000 posts (500 labelled and 1,500 unlabelled) from the r/SuicideWatch subreddit, the task of this challenge is to develop a predictive model capable of classifying suicidal posts into four risk levels (i.e., indicator, ideation, behaviour, and attempt). The dataset simulates obstacles that exist in this field (e.g., model overfitting, data scarcity, and class imbalance); participating teams were expected to tackle these issues while exploring the effectiveness of various model architectures. We received submissions from 21 teams, and the work of 13 teams underwent final evaluation. Teams addressed key challenges in suicide risk detection, including limited suicidal data and imbalance across risk levels. They employed novel approaches to overcome these obstacles, leveraging a diverse range of models from foundational base language models (BLMs) to state-of-the-art large language models (LLMs). The highest weighted F1-score achieved in the final evaluation was 0.7605. The findings of this challenge have technical implications for suicide detection on social media and can contribute to the clinical effectiveness of machine learning applications in digital suicide or mental healthcare management.
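
For readers unfamiliar with the reported metric, the following is a minimal sketch of a weighted F1-score over the four risk levels. The abstract does not specify the exact scoring procedure, so this assumes the common definition: the per-class F1 averaged with weights proportional to each class's share of the true labels (the same convention as scikit-learn's `f1_score` with `average="weighted"`). The function name `weighted_f1` and the label strings are illustrative, not taken from the challenge materials.

```python
from collections import Counter

# Assumed label names, taken from the four levels listed in the abstract.
LEVELS = ["indicator", "ideation", "behaviour", "attempt"]


def weighted_f1(y_true, y_pred, labels=LEVELS):
    """Support-weighted average of per-class F1 (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1 = 2 * tp / denom if denom else 0.0
        # Weight each class by its share of the true labels.
        score += (support[label] / total) * f1
    return score


if __name__ == "__main__":
    truth = ["indicator", "indicator", "ideation", "behaviour"]
    preds = ["indicator", "ideation", "ideation", "behaviour"]
    print(weighted_f1(truth, preds))
```

Because each class is weighted by its frequency, a model can score well on this metric while still performing poorly on rare classes, which is one reason class imbalance matters for the task described above.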

External IDs: dblp:conf/bigdataconf/LiYZWLYL24