MLSea: A Semantic Layer for Discoverable Machine Learning

Published: 29 May 2024, Last Modified: 25 Jan 2026, Venue: ESWC 2024, License: CC BY 4.0
Abstract: As the Machine Learning (ML) field rapidly evolves, ML pipelines continuously grow in number, complexity and components. Online platforms (e.g., OpenML, Kaggle) aim to gather and disseminate ML experiments. However, the available knowledge is fragmented: each platform represents distinct components of the ML process, or intersecting components in different ways. To address this problem, we leverage semantic web technologies to model and integrate ML datasets, experiments, software and scientific works into MLSea, a resource consisting of: (i) MLSO, an ontology that models ML datasets, pipelines and implementations; (ii) MLST, taxonomies with collections of ML knowledge formulated as controlled vocabularies; and (iii) MLSea-KG, an RDF graph containing ML datasets, pipelines, implementations and scientific works from diverse sources. MLSea paves the way for improving the search, explainability and reproducibility of ML pipelines.