xpandas - python data containers for structured types and structured machine learning tasks

29 Oct 2018 (modified: 05 May 2023)
NIPS 2018 Workshop MLOSS Paper10 Decision
Readers: Everyone
Keywords: python, data structure, machine learning, data pipelines, data container, scikit-learn
TL;DR: pandas-like data structures for complex data types
Abstract: Data science tasks involving structured data types (e.g., arrays, images, time series, text records) are one of the major challenge areas of contemporary machine learning and AI research beyond the "tabular" situation, that is, data that fits into a single classical data frame, together with learning tasks on it such as classical supervised learning, where one column is to be predicted from the others. With xpandas, we present a Python package that extends the pandas data container functionality to cope with arbitrary structured types (such as time series or images) as its column/slice elements, and which provides a transformer interface to scikit-learn's pipeline and composition workflows. We intend xpandas to be the first building block towards scikit-learn-like toolbox interfaces for advanced learning tasks such as supervised learning with structured features, structured output prediction, image segmentation, time series forecasting, and event risk modelling.
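The idea in the abstract, a column whose elements are themselves structured objects feeding a scikit-learn pipeline, can be illustrated with a minimal sketch. This sketch does not use the xpandas API; it relies only on plain pandas and scikit-learn. The class name `SeriesSummary`, the toy data, and the chosen summary features are all hypothetical and introduced purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


class SeriesSummary(TransformerMixin, BaseEstimator):
    """Hypothetical transformer (not part of xpandas): maps each nested
    time series to a fixed-length feature row (mean, std, last value)."""

    def fit(self, X, y=None):
        return self  # stateless: nothing to learn

    def transform(self, X):
        rows = [(ts.mean(), ts.std(ddof=0), ts.iloc[-1]) for ts in X]
        return np.asarray(rows, dtype=float)


rng = np.random.default_rng(0)
labels = [0, 0, 0, 1, 1, 1]

# Object-dtype column: every element is itself a pandas Series
# (a time series), and the series have different lengths.
nested = pd.Series(
    [pd.Series(rng.normal(loc=3.0 * lab, size=20 + i))
     for i, lab in enumerate(labels)],
    dtype=object,
)
y = np.array(labels)

# The transformer turns structured elements into tabular features,
# so a standard estimator can follow it in a pipeline.
pipe = Pipeline([("summary", SeriesSummary()), ("clf", LogisticRegression())])
pipe.fit(nested, y)

features = pipe.named_steps["summary"].transform(nested)
predictions = pipe.predict(nested)
```

The design point is that the structured-to-tabular step is an ordinary transformer, so it composes with any downstream scikit-learn estimator. How xpandas itself exposes this is not specified in the abstract.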
Decision: accept