Peak-R1: Instruction-Tuned Large Language Models for Robust J-Peak Detection in Cardiomechanical Signals

Published: 23 Sept 2025, Last Modified: 01 Dec 2025
Venue: TS4H NeurIPS 2025 Poster
Readers: Everyone
License: CC BY 4.0
Keywords: Cardiomechanical signals, Ballistocardiography (BCG), Body seismography (BSG), J-peak detection, Large Language Model (LLM)
TL;DR: Peak-R1 converts BCG/BSG signals into compact peak sequences and uses an instruction-tuned LLM (SFT followed by GRPO-based RL) for J-peak detection. It achieves state-of-the-art F1 and low heart-rate (HR) error on the Kansas and Hospital-BSG datasets; peak extraction proves crucial, and RL improves robustness.
Abstract: Accurate peak detection across diverse cardiomechanical signals, including ballistocardiography (BCG) and body seismography (BSG), is fundamental for cardiovascular monitoring but is often hindered by artifacts and signal variability. Conventional algorithms are typically engineered with expert knowledge for a single signal modality, limiting their generalizability. Conversely, deep learning-based methods often lack interpretability, raising concerns about their clinical trustworthiness and hindering expert-computer interaction. To address these limitations, we introduce Peak-R1, a novel framework that leverages instruction-tuned Large Language Models (LLMs) for robust, cross-modal, and explainable peak detection. A core innovation of our framework is a "peak-representation" technique that transforms time-series data into a condensed format, preserving critical event information while significantly reducing signal length. This representation provides a crucial inductive bias, guiding the LLM to reason over physiologically meaningful events rather than raw, noisy data. The model is optimized through a two-stage process: supervised fine-tuning (SFT) followed by reinforcement learning (RL) with a multi-objective reward function. The model’s self-explanation capabilities are cultivated by fine-tuning on a custom-built Peak-Explanation dataset. Across four modalities, including BCG and BSG, and seven datasets (six public benchmarks plus one real-world cohort), Peak-R1 achieves the best or tied-best detection performance under a clinically relevant temporal tolerance. Beyond accuracy, the generated rationales surface failure modes and support human-in-the-loop annotation. Together, these results indicate a single, generalizable, and interpretable solution to the complex challenge of peak detection across multiple physiological signals.
Submission Number: 15
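Illustrative sketch (not from the paper): the abstract describes condensing a time series into a peak representation and scoring detections under a temporal tolerance. The Python below shows one plausible reading of both ideas. The function names, the (time in ms, amplitude) encoding, the text serialization, and the greedy matching rule are all assumptions made for illustration; the authors' actual representation, prompt format, and evaluation protocol are not specified on this page.

```python
from typing import List, Sequence, Tuple

Peak = Tuple[int, float]  # (time in milliseconds, amplitude); hypothetical encoding


def to_peak_sequence(signal: Sequence[float], fs: float) -> List[Peak]:
    """Keep only strict local maxima, discarding every other sample.

    Assumption: a "peak" is a sample strictly greater than both neighbours.
    Plateaus are ignored in this sketch.
    """
    peaks: List[Peak] = []
    for i in range(1, len(signal) - 1):
        if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]:
            peaks.append((round(1000 * i / fs), signal[i]))
    return peaks


def serialize_peaks(peaks: Sequence[Peak]) -> str:
    """Render peaks as compact text that could be placed in an LLM prompt.

    The "t=...ms a=..." layout is a made-up format, not the paper's.
    """
    return "; ".join(f"t={t}ms a={a:.2f}" for t, a in peaks)


def f1_with_tolerance(pred_ms: Sequence[int], true_ms: Sequence[int],
                      tol_ms: int) -> float:
    """F1 score where a prediction counts as a hit if it lies within
    tol_ms of a not-yet-matched reference peak (greedy one-to-one matching)."""
    pred, true = sorted(pred_ms), sorted(true_ms)
    i = j = tp = 0
    while i < len(pred) and j < len(true):
        diff = pred[i] - true[j]
        if abs(diff) <= tol_ms:
            tp += 1          # matched pair: consume both
            i += 1
            j += 1
        elif diff < 0:
            i += 1           # prediction too early: false positive
        else:
            j += 1           # reference peak missed: false negative
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(true) if true else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: 10 samples at 100 Hz collapse to 3 candidate peaks.
toy_signal = [0.0, 0.2, 1.0, 0.3, 0.1, 0.4, 0.2, 0.1, 0.9, 0.2]
toy_peaks = to_peak_sequence(toy_signal, fs=100.0)
toy_prompt = serialize_peaks(toy_peaks)
```

In this toy case the ten raw samples shrink to three (time, amplitude) pairs, which is the kind of length reduction the abstract attributes to the peak representation. A downstream model would then only need to decide which candidates are J-peaks.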