Keywords: benchmarking, data augmentation, including prior knowledge, machine learning, open source, ordinary differential equation models, synthetic data generation, time series forecasting, transfer learning
TL;DR: SimbaML is an open-source, all-in-one tool for generating realistic synthetic data from mechanistic models and using that data to improve ML experiments.
Abstract: Training sophisticated machine learning (ML) models requires large datasets that are difficult or expensive to collect for many applications. If prior knowledge about system dynamics is available, mechanistic representations can be used to supplement real-world data. We present SimbaML (Simulation-Based ML), an open-source tool that unifies the generation of realistic synthetic datasets from ordinary differential equation-based models with their direct analysis and inclusion in ML pipelines. SimbaML makes it convenient to investigate transfer learning from synthetic to real-world data, augment data, identify needs for data collection, and benchmark physics-informed ML approaches. SimbaML is available from https://pypi.org/project/simba-ml/.
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/simbaml-connecting-mechanistic-models-and/code)
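
The workflow the abstract describes, generating noisy synthetic time series from an ordinary differential equation model, can be illustrated with a minimal sketch. It deliberately does not use SimbaML's API. It relies only on the Python standard library, and the SIR model, the parameter ranges, the noise level, and every function name (`sir_rhs`, `rk4_step`, `simulate_sir`, `add_gaussian_noise`, `generate_dataset`) are assumptions chosen for illustration, not taken from the paper.

```python
import random
from typing import Callable, List, Sequence

State = List[float]


def sir_rhs(state: Sequence[float], beta: float, gamma: float) -> State:
    """Right-hand side of an SIR model on population fractions (illustrative choice)."""
    s, i, _r = state
    infection = beta * s * i
    recovery = gamma * i
    return [-infection, infection - recovery, recovery]


def rk4_step(f: Callable[[Sequence[float]], State], y: Sequence[float], dt: float) -> State:
    """Advance y by one classical fourth-order Runge-Kutta step."""
    k1 = f(y)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)])
    return [
        yi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def simulate_sir(beta: float, gamma: float, y0: Sequence[float],
                 n_steps: int, dt: float) -> List[State]:
    """Integrate the SIR model and return n_steps + 1 states, including y0."""
    def f(y: Sequence[float]) -> State:
        return sir_rhs(y, beta, gamma)

    trajectory = [list(y0)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(f, trajectory[-1], dt))
    return trajectory


def add_gaussian_noise(trajectory: List[State], sigma: float,
                       rng: random.Random) -> List[State]:
    """Add zero-mean Gaussian measurement noise, clipping values at zero."""
    return [[max(0.0, v + rng.gauss(0.0, sigma)) for v in row] for row in trajectory]


def generate_dataset(n_series: int, seed: int = 0) -> List[List[State]]:
    """Sample kinetic parameters per series, simulate, and add noise.

    The parameter ranges and noise level are assumptions for this sketch.
    """
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_series):
        beta = rng.uniform(0.2, 0.5)      # assumed infection-rate range
        gamma = rng.uniform(0.05, 0.15)   # assumed recovery-rate range
        clean = simulate_sir(beta, gamma, [0.99, 0.01, 0.0], n_steps=100, dt=1.0)
        dataset.append(add_gaussian_noise(clean, sigma=0.01, rng=rng))
    return dataset


if __name__ == "__main__":
    data = generate_dataset(n_series=5)
    # Number of series, time points per series, state variables per time point.
    print(len(data), len(data[0]), len(data[0][0]))
```

Each series in the resulting list could then serve as a training example for a forecasting model, for instance to pre-train on synthetic data before fine-tuning on scarce real-world measurements.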