xpandas - python data containers for structured types and structured machine learning tasks

Vitaly Davydov, Franz J. Király

Sep 30, 2018 · NIPS 2018 Workshop MLOSS Submission
  • Abstract: Data science tasks on structured data types, e.g., arrays, images, time series, or text records, are one of the major challenge areas of contemporary machine learning and AI research beyond the "tabular" situation, that is, data that fits into a single classical data frame, together with learning tasks on it such as classical supervised learning, where one column is to be predicted from the others. With xpandas, we present a Python package that extends the pandas data container functionality to cope with arbitrary structured types (such as time series or images) in its column/slice elements, and which provides a transformer interface to scikit-learn's pipeline and composition workflows. We intend xpandas to be the first building block towards scikit-learn-like toolbox interfaces for advanced learning tasks such as supervised learning with structured features, structured output prediction, image segmentation, time series forecasting, and event risk modelling.
  • TL;DR: pandas-like data structures for complex data types
  • Keywords: python, data structure, machine learning, data pipelines, data container, scikit-learn
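
The idea in the abstract, a data frame whose cells hold structured objects and a transformer that plugs into a scikit-learn pipeline, can be illustrated with plain pandas and scikit-learn. The sketch below does not use the xpandas API, which this page does not describe; the class `SeriesSummary`, the column name `signal`, and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


class SeriesSummary(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: maps each time-series cell to (mean, std, length)."""

    def __init__(self, column="signal"):
        self.column = column

    def fit(self, X, y=None):
        # Stateless: nothing is learned from the data.
        return self

    def transform(self, X):
        rows = [(cell.mean(), cell.std(), len(cell)) for cell in X[self.column]]
        return np.asarray(rows, dtype=float)


# Toy data (an assumption): six time series of different lengths,
# stored one per cell in an object-dtype column.
rng = np.random.default_rng(0)
cells = np.empty(6, dtype=object)
for i in range(6):
    cells[i] = rng.normal(loc=i % 2, size=5 + i)
frame = pd.DataFrame({"signal": cells})
labels = np.array([0, 1, 0, 1, 0, 1])

# The structured column is reduced to a tabular feature matrix,
# which an ordinary scikit-learn estimator can then consume.
pipeline = Pipeline([
    ("summarise", SeriesSummary()),
    ("classify", LogisticRegression()),
])
pipeline.fit(frame, labels)

features = pipeline.named_steps["summarise"].transform(frame)
predictions = pipeline.predict(frame)
```

Here the structured-to-tabular step is a hand-written transformer on an object-dtype column; a dedicated container such as the one the abstract proposes would presumably make that step typed and reusable, but the details are not given on this page.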