GATS: A Time-Series Dataset for Addressing General Aviation Flight Safety

Published: 09 Jun 2025, Last Modified: 09 Jun 2025 | FMSD @ ICML 2025 | Everyone | CC BY 4.0
Keywords: General Aviation, Time-Series Data, Self-Supervised Learning
Abstract: Assessing general aviation operations has become critical for improving the safety of airspace systems worldwide. Yet machine learning research in this domain is nearly nonexistent because very little flight data is publicly available. To encourage research in airspace safety and general aviation as a whole, we release GATS, a dataset of more than 7,000 flights, totaling 10,641 hours of recordings, sampled anonymously and with permission from the privately held US National General Aviation Flight Information Database (NGAFID). The dataset sets itself apart from previous work by including two new aircraft types and 76 different flight data sensor parameters, among them navigational information and aircraft orientation. We benchmark the dataset on two aviation-domain tasks. The first is aircraft classification, a proof-of-concept problem to establish that advanced machine learning methods can be applied effectively to time-series flight data. The second is missing data reconstruction, a more rigorous, safety-critical task that matters in real-world environments where sensors can fail and information must be restored for flight analysis. We achieve near-perfect accuracy on aircraft classification, but the chosen models fail to generate meaningful reconstructions on the missing data task. This poor performance on the second task indicates an opportunity for future research into better techniques for understanding and improving flight safety using this dataset.
Submission Number: 70