Leveraging Machine Learning and Large Language Models for Enhanced Occupational Stress Detection

Mohammad Junayed Hasan; Jannat Sultana; Silvia Ahmed; Sifat Momen

Published: 22 Sept 2025, Last Modified: 22 Sept 2025, Venue: WiML @ NeurIPS 2025, License: CC BY 4.0

Keywords: Advanced Feature Selection, Large Language Models, Machine Learning, Occupational Stress, Survey Data Analysis, Workplace Safety

Abstract: Motivation and Problem. Occupational stress significantly compromises workplace safety: stressed workers are substantially more likely to be involved in accidents because of decreased attention and impaired decision-making. Traditional stress detection methods are reactive and subjective, and they fail to provide the real-time insights needed for proactive safety management. Existing computational approaches achieve accuracy rates below 85% and often operate in isolation, missing synergies between different AI domains that could enhance stress detection for workplace safety applications.

Approach and Methodology. We developed an integrated AI framework combining machine learning, deep learning, and natural language processing for occupational stress detection using a recent and novel Malaysian workplace survey dataset (309 participants). Our methodology includes: (1) a hybrid feature selection pipeline integrating Recursive Feature Elimination with Cross-Validation (RFECV) and Analysis of Variance (ANOVA), identifying 39 key stress indicators; (2) training 11 machine learning algorithms, a custom 1D-CNN with 911,873 parameters, and five domain-specific BERT models; (3) creating an ensemble of the three best-performing models (Random Forest, Logistic Regression, SVC) using hard and soft voting; and (4) developing a novel algorithm to convert tabular survey data into natural language sentences with 100% information retention, enabling large language model analysis.

Technical Contributions. Our key innovations include: (1) the first demonstration that combining feature importance (RFECV) and feature ranking (ANOVA) techniques yields 5-10% performance improvements over individual methods for a reduced, manageable set of features; (2) a systematic data-to-text conversion algorithm enabling domain analysis with pre-trained language models; (3) comprehensive three-step validation through holdout, 10-fold cross-validation, and external validation using four synthetic data generation techniques (Gaussian Copula, CTGAN, TVAE, CopulaGAN).

Results and Impact. Our ensemble model achieved state-of-the-art performance with 90.32% accuracy and 89.20% macro-averaged F1-score, surpassing eight recent methods by 2-15%. Domain analysis revealed that occupational stress patterns align more closely with biomedical domains than clinical domains, with BioBERT achieving performance comparable to our best ensemble (90.32% accuracy). External validation on synthetic data demonstrated robust generalizability (89% accuracy on unseen scenarios). Explainable AI analysis identified excessive workload and ambiguity (27%), poor communication (17%), and work environment (16%) as primary stress factors.

Deployment and Practical Value. We deployed our model on Hugging Face with response times under 100ms, enabling real-time workplace stress assessment. The framework provides actionable insights for safety managers through interpretable predictions, facilitating proactive intervention before stress-related safety incidents occur.

Significance for Machine Learning. This work demonstrates the effectiveness of multi-domain AI integration for real-world applications, advancing ensemble learning methodologies and establishing new benchmarks for occupational health prediction. Our hybrid feature selection approach and data transformation techniques have broader implications for tabular data analysis and domain adaptation in healthcare applications. The comprehensive validation framework, including synthetic data generation, provides a robust methodology for evaluating model generalizability in safety-critical applications.

Broader Impact. Beyond technical contributions, this research addresses a critical societal need for workplace safety. Our findings enable evidence-based stress management policies and can inform occupational safety standards that incorporate psychosocial risk factors. The model’s interpretability supports targeted interventions, potentially reducing workplace accidents and improving employee wellbeing across diverse organizational contexts.
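
The abstract does not spell out the data-to-text conversion algorithm, so the following is only a minimal sketch of one way such a conversion could work, assuming a fixed sentence template in which every column contributes exactly one clause. The template, the function names `row_to_sentence` and `sentence_to_row`, and the column names in the example row are all hypothetical and are not taken from the paper or its dataset. The inverse function is included to illustrate what "100% information retention" could mean in practice: every column and value can be recovered from the sentence.

```python
from typing import Dict, Mapping, Sequence

# Assumed fixed template prefix; the paper's actual wording is not given in the abstract.
PREFIX = "The respondent reports that "


def row_to_sentence(row: Mapping[str, object], columns: Sequence[str]) -> str:
    """Render one survey response as a sentence with one clause per column."""
    # Every column/value pair becomes a clause, so no field is dropped.
    clauses = [f"{col.replace('_', ' ')} is {row[col]}" for col in columns]
    return PREFIX + "; ".join(clauses) + "."


def sentence_to_row(sentence: str) -> Dict[str, str]:
    """Recover column names and values (as strings) from a templated sentence.

    Assumes column names contain no spaces or ' is ', and values contain no '; '.
    """
    body = sentence[len(PREFIX):-1]  # drop the fixed prefix and the final period
    recovered = {}
    for clause in body.split("; "):
        name, value = clause.split(" is ", 1)
        recovered[name.replace(" ", "_")] = value
    return recovered


# Hypothetical survey row on an assumed 1-5 scale; not from the paper's dataset.
example_row = {"workload": 5, "role_ambiguity": 4, "communication": 2}
example_sentence = row_to_sentence(example_row, list(example_row))
print(example_sentence)
```

A round trip through `sentence_to_row(example_sentence)` returns the original column names with their values as strings, which is one simple way to check that the text form loses no information before passing it to a language model.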
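
The difference between the hard and soft voting mentioned in the abstract can be illustrated with a small sketch. This is a generic, dependency-free illustration of the two voting rules, not the authors' implementation; the probability values below are made up for the example, and class 1 is assumed to mean "stressed".

```python
from collections import Counter
from typing import Sequence


def hard_vote(labels: Sequence[int]) -> int:
    """Return the class predicted by the most models (majority vote)."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities: Sequence[Sequence[float]]) -> int:
    """Average per-class probabilities across models and return the top class."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    mean = [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


# Made-up [P(class 0), P(class 1)] outputs of three models for a single respondent.
model_probs = [[0.55, 0.45], [0.60, 0.40], [0.10, 0.90]]
# Each model's own label is its most probable class.
model_labels = [max(range(2), key=p.__getitem__) for p in model_probs]

print("hard vote:", hard_vote(model_labels))  # two of three models pick class 0
print("soft vote:", soft_vote(model_probs))   # averaged probabilities favor class 1
```

In this example the two rules disagree: two models lean weakly toward class 0, while one model is highly confident in class 1, so averaging probabilities flips the decision. With three models and two classes, the majority vote cannot tie.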

Submission Number: 53
