SimbaML: Connecting Mechanistic Models and Machine Learning with Augmented Data

01 Mar 2023 (modified: 10 Jun 2023)
Submitted to Tiny Papers @ ICLR 2023
Readers: Everyone
Keywords: benchmarking, data augmentation, including prior knowledge, machine learning, open source, ordinary differential equation models, synthetic data generation, time series forecasting, transfer learning
TL;DR: SimbaML is an open-source, all-in-one solution for generating realistic data from mechanistic models and leveraging that data for improved ML experiments.
Abstract: Training sophisticated machine learning (ML) models requires large datasets that are difficult or expensive to collect for many applications. If prior knowledge about system dynamics is available, mechanistic representations can be used to supplement real-world data. We present SimbaML (Simulation-Based ML), an open-source tool that unifies the generation of realistic synthetic datasets from ordinary differential equation-based models with their direct analysis and inclusion in ML pipelines. SimbaML makes it convenient to investigate transfer learning from synthetic to real-world data, augment data, identify needs for data collection, and benchmark physics-informed ML approaches. SimbaML is available from
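To make the idea of generating synthetic data from an ordinary differential equation-based model concrete, the following is a minimal, standard-library-only sketch. It does not use SimbaML or reproduce its API. The SIR model, the parameter values, the fixed-step Runge-Kutta integrator, and the multiplicative Gaussian noise are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import random


def sir_rhs(state, beta, gamma):
    """Right-hand side of an SIR model (illustrative choice of ODE system)."""
    s, i, r = state
    n = s + i + r
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return (-new_infections, new_infections - recoveries, recoveries)


def rk4_step(state, dt, beta, gamma):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = sir_rhs(state, beta, gamma)
    k2 = sir_rhs(shifted(state, k1, dt / 2), beta, gamma)
    k3 = sir_rhs(shifted(state, k2, dt / 2), beta, gamma)
    k4 = sir_rhs(shifted(state, k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_noisy_sir(n_steps=100, dt=0.5, beta=0.4, gamma=0.1,
                       noise_sd=0.05, seed=0):
    """Return (clean, noisy) trajectories as lists of (S, I, R) tuples.

    The noisy series applies multiplicative Gaussian noise to each clean
    value and clips at zero; this noise model is an assumption.
    """
    rng = random.Random(seed)
    state = (990.0, 10.0, 0.0)  # assumed initial condition
    clean, noisy = [], []
    for _ in range(n_steps):
        clean.append(state)
        noisy.append(tuple(
            max(0.0, x * (1 + rng.gauss(0, noise_sd))) for x in state
        ))
        state = rk4_step(state, dt, beta, gamma)
    return clean, noisy


clean, noisy = simulate_noisy_sir()
```

The noisy series could then stand in for measurements when pretraining or augmenting a forecasting model, while the clean series serves as ground truth for evaluation.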