Visible to everyone since 04 Oct 2024. License: CC BY 4.0.
We introduce the Time Series Optimized Transformer for Observability (Toto), a foundation model designed for time series forecasting with a focus on observability metrics. Toto features a novel proportional factorized attention mechanism and a Student-T mixture model head, enabling it to efficiently handle high-dimensional, sparse, and non-stationary data. Trained on one trillion time series data points, including 75% proprietary observability data, Toto demonstrates state-of-the-art zero-shot performance on standard benchmarks such as electricity and weather forecasting. Furthermore, it significantly outperforms existing models in observability-specific tasks, making it an ideal solution for real-time system monitoring and anomaly detection. Toto’s architectural innovations make it a versatile tool for both general-purpose forecasting and domain-specific applications, setting a new benchmark for scalability and accuracy in time series analysis.
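To make the Student-T mixture output head concrete, the sketch below evaluates the log-density and mean negative log-likelihood of a mixture of location-scale Student-T components. It is a minimal illustration, not Toto's implementation: the abstract does not give the parameterization, the number of components, or the training loss. The function names (`student_t_logpdf`, `mixture_logpdf`, `mixture_nll`) and all parameter names are assumptions introduced here.

```python
import math


def student_t_logpdf(x, df, loc, scale):
    """Log-density of a location-scale Student-t distribution at x."""
    z = (x - loc) / scale
    return (
        math.lgamma((df + 1.0) / 2.0)
        - math.lgamma(df / 2.0)
        - 0.5 * math.log(df * math.pi)
        - math.log(scale)
        - (df + 1.0) / 2.0 * math.log1p(z * z / df)
    )


def mixture_logpdf(x, weights, dfs, locs, scales):
    """Log-density of a Student-t mixture at x.

    Assumes `weights` are positive and sum to 1 (hypothetical convention;
    a model head would typically produce them with a softmax).
    """
    terms = [
        math.log(w) + student_t_logpdf(x, df, loc, scale)
        for w, df, loc, scale in zip(weights, dfs, locs, scales)
    ]
    # Log-sum-exp for numerical stability.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def mixture_nll(observations, weights, dfs, locs, scales):
    """Mean negative log-likelihood of observations under the mixture."""
    total = sum(mixture_logpdf(x, weights, dfs, locs, scales) for x in observations)
    return -total / len(observations)


if __name__ == "__main__":
    # Two-component mixture: a narrow component near 0 and a
    # heavy-tailed component near 5 (illustrative values only).
    weights = [0.7, 0.3]
    dfs = [5.0, 2.5]
    locs = [0.0, 5.0]
    scales = [1.0, 2.0]
    print(mixture_nll([0.1, -0.4, 4.8, 12.0], weights, dfs, locs, scales))
```

Low degrees of freedom give a component heavy tails, so an occasional extreme value such as a metric spike is penalized less sharply than it would be under a Gaussian likelihood.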