WOODS: Benchmarks for Out-of-Distribution Generalization in Time Series

Jean-Christophe Gagnon-Audet; Kartik Ahuja; Mohammad Javad Darvishi Bayazi; Pooneh Mousavi; Guillaume Dumas; Irina Rish

Published: 02 Sept 2023, Last Modified: 17 Sept 2024. Accepted by TMLR.

Event Certifications: iclr.cc/ICLR/2024/Journal_Track

Abstract: Deep learning models often fail to generalize well under distribution shifts. Understanding and overcoming these failures have led to a new research field on Out-of-Distribution (OOD) generalization. Despite being extensively studied for static computer vision tasks, OOD generalization has been severely underexplored for time series tasks. To shine a light on this gap, we present WOODS: 10 challenging time series benchmarks covering a diverse range of data modalities, such as videos, brain recordings, and smart device sensory signals. We revise the existing OOD generalization algorithms for time series tasks and evaluate them using our systematic framework. Our experiments show a large room for improvement for empirical risk minimization and OOD generalization algorithms on our datasets, thus underscoring the new challenges posed by time series tasks.

Certifications: Featured Certification

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission:

Revision 1 (blue)
1. We have added a forecasting dataset called PedCount to WOODS.
2. We have added two additional baselines, Conditional Contrastive Domain Generalization (CCDG) and Diversify, to our set of baselines.
3. We have added Section 6.3 discussing the challenges involved in time series OOD generalization tasks and provided potential avenues for future research directions.
4. We have included proper definitions for "domain generalization" and "subpopulation shift" in Section 2.1.
5. We have consolidated the technical descriptions and the backbone used for each dataset from the Appendix into the new Table 2, which we have added to the main body of the manuscript.
6. We made minor formatting changes to stay within 12 pages (Figures 3 to 12 and Tables 3, 4, 5).

Revision 2 (red)
1. Updated Figure 1 with the PedCount dataset.
2. Added references and improved reading clarity for the domain generalization and subpopulation shift definitions of Section 2.1.
3. Made the citation style uniform.
4. Added hyperparameter ranges for CCDG and Diversify in Table 52 of Appendix F.2.

Revision 3 (Camera Ready)
1. De-anonymized the paper.
2. Removed revision tracking colors.

Assigned Action Editor: ~Antoni_B._Chan1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 1047
