For time series containing $C$ variates (univariate when $C=1$, multivariate otherwise), given historical data $\mathcal{X} = \{X_1^t, \dots, X_C^t\}_{t=1}^{L}$, wherein $L$ is the look-back window size and $X_i^t$ is the value of the $i$-th variate at the $t$-th time step, the time series forecasting task is to predict the values $\hat{\mathcal{X}} = \{\hat{X}_1^t, \dots, \hat{X}_C^t\}_{t=L+1}^{L+T}$ at the $T$ future time steps.
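To make the notation concrete, here is a minimal sketch of how the look-back window $L$ and horizon $T$ turn a raw series into training pairs (the helper `make_windows` and the array shapes are my own illustration, not taken from any of the papers below):

```python
import numpy as np

def make_windows(series: np.ndarray, lookback: int, horizon: int):
    """Slice a (time, channels) array into (input, target) pairs.

    Each input covers `lookback` steps, each target the next `horizon` steps.
    """
    xs, ys = [], []
    for start in range(len(series) - lookback - horizon + 1):
        xs.append(series[start : start + lookback])
        ys.append(series[start + lookback : start + lookback + horizon])
    return np.stack(xs), np.stack(ys)

# Toy usage: 1000 steps, 7 variates, L=96, T=24 (values are random placeholders).
data = np.random.randn(1000, 7)
X, Y = make_windows(data, lookback=96, horizon=24)
print(X.shape, Y.shape)  # (881, 96, 7) (881, 24, 7)
```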
When $T > 1$, there are two forecasting strategies:
- IMS: iterated multi-step forecasting learns a single-step forecaster and applies it autoregressively to obtain multi-step predictions. IMS predictions have smaller variance thanks to the autoregressive estimation procedure, but they inevitably suffer from error accumulation. Consequently, IMS is preferable when a highly accurate single-step forecaster is available and $T$ is relatively small.
- DMS: direct multi-step forecasting optimizes the multi-step forecasting objective at once. In contrast, DMS generates more accurate predictions when it is hard to obtain an unbiased single-step forecasting model, or when $T$ is large (see the sketch after this list).
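A rough sketch contrasting the two strategies, with toy stand-in models (the names `ims_forecast`/`dms_forecast` and the persistence-style forecasters are illustrative assumptions, not any paper's method):

```python
import numpy as np

def ims_forecast(one_step_model, history: np.ndarray, horizon: int) -> np.ndarray:
    """IMS: apply a single-step forecaster autoregressively for `horizon` steps."""
    window = history.copy()
    preds = []
    for _ in range(horizon):
        next_val = one_step_model(window)                      # predict one step ahead
        preds.append(next_val)
        window = np.concatenate([window[1:], next_val[None]])  # slide the look-back window
    return np.stack(preds)

def dms_forecast(multi_step_model, history: np.ndarray) -> np.ndarray:
    """DMS: one forward call that emits all future steps at once."""
    return multi_step_model(history)

# Toy stand-ins: a persistence one-step model and a "repeat last value" multi-step model.
one_step = lambda w: w[-1]
multi_step = lambda w: np.tile(w[-1], (24, 1))

history = np.random.randn(96, 7)                   # look-back window L=96, 7 variates
print(ims_forecast(one_step, history, 24).shape)   # (24, 7)
print(dms_forecast(multi_step, history).shape)     # (24, 7)
```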
- So far, little progress has been made on exploiting pre-trained or foundation models for time series analysis.
- One main challenge is the lack of a large amount of data on which to train a foundation model for time series analysis.
- The largest datasets for time series analysis are smaller than 10 GB, which is much smaller than the corpora used in NLP.
Common setting:
- whole data: the largest benchmark datasets are Weather, Traffic, and Electricity
- these are the popular datasets used for benchmarking
Currently the field consists of small models: models under ~100M parameters trained on a single GPU, with the main bottleneck being the amount of data. Mainstream work in recent years:
Year | Venue | Paper | Description |
---|---|---|---|
2023 | NeurIPS | One Fits All (GPT4TS) | a GPT pretrained on NLP/CV and fine-tuned on time-series sequences, keeping only six layers; transfer learning |
2023 | ICLR | TimesNet | treats time series as a 2D signal and uses a convolution-based Inception-style backbone as a general-purpose time series analysis model; provides a benchmark library (3k+ GitHub stars) |
2023 | ICLR | PatchTST | divides a sequence into patches to increase the effective input length and reduce information redundancy, with channel independence; preprocessing essentially the same as Waveformer (a minimal patching sketch is given after the table) |
2023 | ICLR | DLinear | a simple linear model (series decomposition plus linear layers) that validates channel independence works well in time series forecasting |
2022 | ICML | FEDformer | uses a Fourier-enhanced structure to improve computational efficiency and achieves linear complexity |
2022 | NeurIPS | Non-stationary Transformers | proposes a generic framework with two interdependent modules: Series Stationarization and De-stationary Attention |
2022 | ICLR (rejected) | ETSformer | proposes two novel attention mechanisms: exponential smoothing attention and frequency attention |
2022 | / | LightTS | a light deep learning architecture based merely on simple MLP structures |
2021 | AAAI (Best Paper) | Informer | provides the ETT benchmark dataset; proposes the ProbSparse self-attention mechanism and a generative-style decoder that predicts long time-series sequences in a single forward pass |
2021 | NeurIPS | Autoformer | replaces the attention module with an Auto-Correlation mechanism |
2022 | ICLR | Pyraformer | applies a pyramidal attention module with inter-scale and intra-scale connections, also achieving linear complexity |
2020 | ICLR | Reformer | improves Transformer efficiency by replacing dot-product attention with locality-sensitive hashing attention and using reversible residual layers instead of standard residuals |
2019 | NeurIPS | LogTrans | uses convolutional self-attention layers with a LogSparse design to capture local information and reduce space complexity |
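The patching-plus-channel-independence idea from the PatchTST row can be sketched as follows. This is a simplified illustration that replaces the Transformer encoder with a single linear embedding and head; the class name, layer sizes, and patch hyperparameters are my own assumptions, not the official implementation.

```python
import torch
import torch.nn as nn

class PatchLinearHead(nn.Module):
    """Channel-independent patching: every variate is split into patches and
    mapped to the forecast horizon by the same shared weights."""

    def __init__(self, lookback: int, horizon: int, patch_len: int = 16, stride: int = 8):
        super().__init__()
        n_patches = (lookback - patch_len) // stride + 1
        self.patch_len, self.stride = patch_len, stride
        self.embed = nn.Linear(patch_len, 128)           # per-patch embedding
        self.head = nn.Linear(n_patches * 128, horizon)  # flatten patches -> horizon

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, lookback, channels); treat each channel as an independent series
        b, L, c = x.shape
        x = x.permute(0, 2, 1).reshape(b * c, L)             # (b*c, L)
        patches = x.unfold(-1, self.patch_len, self.stride)  # (b*c, n_patches, patch_len)
        z = self.embed(patches).flatten(1)                   # (b*c, n_patches*128)
        y = self.head(z)                                     # (b*c, horizon)
        return y.reshape(b, c, -1).permute(0, 2, 1)          # (batch, horizon, channels)

model = PatchLinearHead(lookback=96, horizon=24)
out = model(torch.randn(32, 96, 7))
print(out.shape)  # torch.Size([32, 24, 7])
```

In the real PatchTST, a Transformer encoder sits between the patch embedding and the prediction head, but the channel handling (flatten channels into the batch dimension, share weights across channels) is the same trick shown here.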