10 Million Particle Events: Enabling Foundation Models for Sparse 3D Inverse Problems

Published: 24 Sept 2025, Last Modified: 26 Dec 2025
Venue: NeurIPS2025-AI4Science Spotlight
License: CC BY 4.0
Track: Track 2: Dataset Proposal Competition
Keywords: scientific datasets, inverse problems, particle physics, 3D reconstruction, self-supervised learning, sparse data, multi-modal learning, foundation models
Abstract: Next-generation particle physics experiments require unprecedented machine learning capabilities to achieve their science goals. We propose generating 10 million particle detector events, the first dataset to provide raw sensor waveforms paired with 3D ground truth at scale. Generation is enabled by GPU-accelerated JAX simulations that achieve a speedup of two orders of magnitude over traditional CPU-based tools. This dataset will enable large-scale self-supervised training of foundation models for complex inverse problems in particle physics and beyond.
Submission Number: 445