MACHINE LEARNING APPROACHES FOR ELECTRICITY THEFT AND NON-TECHNICAL LOSS DETECTION: A COMPARATIVE ANALYSIS OF LOGISTIC REGRESSION, GRADIENT BOOSTING, AND LSTM ARCHITECTURES

Fasasi Musa Olashile; Akinmade Faruq

Published: 03 May 2026, Last Modified: 03 May 2026. Venue: G-SPARK 1.0 (Oral). License: CC BY 4.0

Keywords: electricity theft detection; non-technical losses; LightGBM; LSTM; SMOTE; SHAP explainability; precision-recall; concept drift; billing audit; smart meter fraud

Abstract: Electricity theft and non-technical losses (NTL) cause annual global losses exceeding USD 96 billion. This paper proposes and evaluates a three-phase machine learning pipeline for electricity theft detection using a real-world dataset of 135,493 utility clients and 4,454,637 invoice records (2005–2019). We engineer 68 features across eight groups and introduce a novel billing arithmetic audit consisting of two flags, the index delta mismatch and the active-meter zero-consumption flag, which achieve fraud rate lifts of 2.08x and 2.16x, respectively, over the dataset baseline. Under rigorous time-aware evaluation, LightGBM achieves the highest test AUPRC of 0.1296, outperforming Logistic Regression (0.1123) and LSTM (0.1167). SHAP analysis identifies reading remark code 9 as the dominant predictor (22.55% gain importance) and reveals that account lifecycle features outperform raw consumption metrics. We quantify temporal concept drift, with fraud prevalence rising from 2.53% in the training period to 6.46% in the test period, and demonstrate that aggregated statistical features capture sequential patterns as effectively as LSTM modelling under time-aware conditions.
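To put the reported AUPRC values in context, the short sketch below relates them to the test-period fraud prevalence. It is not taken from the paper: it assumes the standard rule of thumb that a random classifier's AUPRC is approximately the positive-class prevalence, and the helper name `lift` is hypothetical. All numeric inputs are the figures quoted in the abstract.

```python
# Illustrative arithmetic only; not the authors' code.
# Assumption: a random classifier's AUPRC is roughly the positive-class
# prevalence, so prevalence serves as the chance-level baseline.

def lift(value: float, baseline: float) -> float:
    """Ratio of a rate or score to its baseline (hypothetical helper)."""
    return value / baseline

# Figures quoted in the abstract.
train_prevalence = 0.0253  # fraud rate in the training period
test_prevalence = 0.0646   # fraud rate in the test period
test_auprc = {
    "LightGBM": 0.1296,
    "Logistic Regression": 0.1123,
    "LSTM": 0.1167,
}

# Concept drift expressed as the growth factor of fraud prevalence.
drift_ratio = lift(test_prevalence, train_prevalence)

# AUPRC of each model relative to the chance-level baseline.
auprc_lift = {name: lift(score, test_prevalence) for name, score in test_auprc.items()}

if __name__ == "__main__":
    print(f"prevalence growth (test / train): {drift_ratio:.2f}x")
    for name, value in auprc_lift.items():
        print(f"{name}: {value:.2f}x over chance-level AUPRC")
```

Under that assumption, fraud prevalence grows by roughly 2.55x between the training and test periods, and LightGBM's test AUPRC is roughly twice the chance-level baseline. The absolute AUPRC values should be read against that shifted baseline.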

Submission Number: 2
