Machine learning discovery of regional and social disparities in electric vehicle charging reliability

Published: 22 Sept 2025, Last Modified: 22 Sept 2025 | WiML @ NeurIPS 2025 | CC BY 4.0
Keywords: Few-shot learning, large language models, infrastructure reliability, electric vehicle charging, spatial analysis, climate policy
Abstract: A growing body of literature has documented that unreliable electric vehicle (EV) charging poses a major barrier to public infrastructure for climate mitigation. Yet prior methods for detecting reliability have been inadequate for revealing regional and social disparities at scale. This study develops a machine learning pipeline that analyzes 838,785 unstructured consumer reviews to uncover disparities in EV charging performance. Using zero-shot and few-shot learning with iterative expert prompts, we substantially reduce training data needs while achieving new benchmarks for domain-aware reliability detection (F1 score = 0.97, SD = 0.02). We further combine station reliability detection with diversity indices (Shannon and Simpson) for spatial analysis to inform economic and policy decision-making. Our framework provides credible, evidence-based, and scalable measurement of infrastructure risks, supporting a more reliable and equitable transition to electric mobility.
Unreliable chargers have become an epidemic, posing a fundamental barrier to EV adoption. The issue has been covered in major outlets such as Forbes (2023), Bloomberg (2023), and the Wall Street Journal (2023), and documented in academic studies across the U.S., Europe, and Asia (Rempel et al., 2024; Liu et al., 2022, 2023; Karanam et al., 2023). Addressing this barrier is crucial as electrification is vital for decarbonizing transportation, the second-largest source of global emissions and the top contributor in many developed countries (UNEP, 2024; U.S. Environmental Protection Agency, 2025). Expanding charging infrastructure is considered cost-effective for boosting adoption through network effects (Li et al., 2017; Springel, 2021; Cole et al., 2023; Asensio et al., 2025), yet the benefits are undermined by poor reliability.
The urgency is amplified by the rapid growth of the EV charging market, projected to expand from USD 32.26 billion in 2024 to USD 125.39 billion by 2030 (Grand View Research, 2024). Machine learning has emerged as a key strategy for optimizing EV charging management, including algorithm-based decision-making, load balancing, and demand forecasting (Yaghoubi et al., 2024; Zhang et al., 2024). However, measuring reliability at scale remains challenging due to poor data interoperability. Most stations lack sub-metering, and decentralized infrastructure growth has produced siloed, incompatible datasets. Without mandatory reporting, providers have little incentive to share or standardize data, leaving consumers without real-time information. Traditional methods such as surveys, simulations, and dashboard data are limited in scale and fail to capture user experiences like failed charging attempts. Some studies integrate consumer perspectives: Asensio et al. (2020) and Yu et al. (2025) analyzed sentiment about charging experiences, and Ha et al. (2021) classified review topics only. However, automated detection is difficult because there are dozens of imbalanced failure classes and because issues are not always stated explicitly or in negative terms. Previous models required extensive expert training yet achieved only modest F1 scores (0.45–0.88). In this paper, we examine whether zero-shot and few-shot learning with large language models can reduce costs and improve accuracy in detecting charging reliability. We illustrate how machine learning, combined with geographic performance indices, captures regional and social disparities in consumer charging experiences. Building on earlier classification strategies (Asensio et al., 2020, 2025; Ha et al., 2021), we incorporate iterative expert feedback into prompt design, substantially reducing Type I and II errors and setting new benchmarks in this domain.
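To make the few-shot setup and the F1 metric concrete, the following is a minimal sketch rather than the authors' pipeline. The label names (`RELIABLE`, `UNRELIABLE`), the prompt wording, and both helper functions are hypothetical; the paper's actual prompts and failure classes are not given here, and no language model is called.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], review: str) -> str:
    """Assemble a few-shot prompt from (review text, label) pairs.

    The instruction text and label set are illustrative assumptions.
    """
    lines = ["Classify each EV charging review as RELIABLE or UNRELIABLE.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The unlabeled review goes last so the model completes its label.
    lines.append(f"Review: {review}")
    lines.append("Label:")
    return "\n".join(lines)


def f1_score(y_true: List[str], y_pred: List[str],
             positive: str = "UNRELIABLE") -> float:
    """Binary F1 for the chosen positive class: 2PR / (P + R)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical labeled examples; real ones would come from expert review.
    shots = [
        ("Plugged in twice and the session never started.", "UNRELIABLE"),
        ("Fast charge, no problems at all.", "RELIABLE"),
    ]
    print(build_few_shot_prompt(shots, "Screen was dark but charging worked."))
```

False positives and false negatives in `f1_score` correspond to the Type I and Type II errors mentioned above, so reducing either raises the score.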
We further demonstrate how machine learning approaches for station reliability detection can be combined with diversity indices for spatial analysis to inform economic and policy decision-making. For climate mitigation strategies in the transportation sector, we find that EV infrastructure reliability is currently highest in rural areas and less populated communities, whereas reliability issues are widespread in urban centers and metropolitan areas. In recent years, federal policies have prioritized charger installation at 50-mile intervals along designated EV corridors (Hanig et al., 2025). We find that these EV corridors not only have low reliability issues but also have the widest spread of station reliability. Current investment incentives focus on deployment rather than operational reliability, so these challenges are likely to persist without further incentives or policy interventions. Our contributions include generating behaviorally informed predictions of charging reliability from consumer voices, quantifying disparities overlooked by climate policy, and developing a scalable framework for ensemble learning.
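The Shannon and Simpson indices named in the abstract have standard definitions, sketched below for a single region. This is an illustration under assumptions: the text does not say which categories the indices are computed over, so the per-category station counts are hypothetical, and the Simpson index is shown in its Gini-Simpson form (1 minus the sum of squared proportions), one of several variants in use.

```python
import math
from typing import Sequence


def shannon_index(counts: Sequence[int]) -> float:
    """Shannon index H = -sum(p_i * ln p_i) over nonzero proportions."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:  # categories with zero count contribute nothing
            p = c / total
            h -= p * math.log(p)
    return h


def gini_simpson_index(counts: Sequence[int]) -> float:
    """Gini-Simpson index 1 - sum(p_i^2); 0 when one category dominates."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


if __name__ == "__main__":
    # Hypothetical counts of stations per reliability category in one region.
    region_counts = [12, 7, 1]
    print("Shannon:", shannon_index(region_counts))
    print("Gini-Simpson:", gini_simpson_index(region_counts))
```

Both indices rise as stations are spread more evenly across categories, which is one way a "spread of station reliability" could be compared across regions.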
Submission Number: 149