Preprint · arXiv (Cornell University) · Nov 27, 2022 · Green OA

A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

Indexed in: arXiv, DataCite

Abstract

We propose an efficient design of Transformer-based models for multivariate time series forecasting and self-supervised representation learning. It is based on two key components: (i) segmentation of time series into subseries-level patches, which serve as input tokens to the Transformer; (ii) channel-independence, where each channel contains a single univariate time series and all series share the same embedding and Transformer weights. The patching design naturally has a three-fold benefit: local semantic information is retained in the embedding; computation and memory usage of the attention maps are quadratically reduced given the same look-back window; and the model can attend to a longer history. Our…
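
The two components named in the abstract can be illustrated with a short NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: the function name `make_patches`, the sizes (look-back 512, patch length 16, stride 8, 7 channels, embedding width 32), and the random linear embedding are all hypothetical choices made for the example, and no padding is applied.

```python
import numpy as np


def make_patches(series, patch_len, stride):
    """Split a (channels, length) array into patches along time.

    Returns an array of shape (channels, num_patches, patch_len).
    Hypothetical helper for illustration; no padding is applied.
    """
    series = np.asarray(series)
    channels, length = series.shape
    if length < patch_len:
        raise ValueError("series is shorter than patch_len")
    num_patches = (length - patch_len) // stride + 1
    # Start index of every patch, then the offsets inside a patch.
    starts = np.arange(num_patches) * stride
    idx = starts[:, None] + np.arange(patch_len)[None, :]
    return series[:, idx]


# Illustrative sizes (assumptions, not values from the abstract).
look_back, patch_len, stride = 512, 16, 8
channels, d_model = 7, 32

x = np.arange(channels * look_back, dtype=float).reshape(channels, look_back)
patches = make_patches(x, patch_len, stride)

# Channel independence: every channel is handled as its own univariate
# sequence, and one embedding matrix is shared by all of them.
rng = np.random.default_rng(0)
w_embed = rng.standard_normal((patch_len, d_model))
tokens = patches @ w_embed  # (channels, num_patches, d_model)

# The attention map is num_tokens x num_tokens, so fewer tokens means
# a quadratically smaller map for the same look-back window.
num_patches = patches.shape[1]
attn_entries_pointwise = look_back ** 2
attn_entries_patched = num_patches ** 2
```

With these sizes each channel yields 63 tokens instead of 512, so the per-channel attention map shrinks from 512 × 512 to 63 × 63 entries. Because `w_embed` is applied identically to every channel, adding channels grows the batch dimension but not the parameter count.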

Citation impact

543
total citations
References
0

Authors

4

Topics & keywords

Keywords
  • Computer science
  • Transformer
  • Embedding
  • Univariate
  • Segmentation
  • Computation
  • Artificial intelligence
  • Machine learning