# Time Series Forecasting for the M5 Competition
This project uses `tsfresh` for automated feature engineering of time series data, together with `altair`, `vega_datasets`, `category_encoders`, `mxnet`, `gluonts`, `kats`, `lightgbm`, `hyperopt`, and `pandarallel`. Note that `kats` requires Python 3.7 or higher.

To maximize parallelization performance, we avoided the `apply` function of the pandas DataFrame and used `pandarallel` instead.

The Prophet model was implemented with the `Kats` library. In this case, Tweedie was applied as the loss function. Below is the hyperparameter tuning result.

seasonality_prior_scale | changepoint_prior_scale | changepoint_range | n_changepoints | holidays_prior_scale | seasonality_mode |
---|---|---|---|---|---|
0.01 | 0.046 | 0.93 | 5 | 100.00 | multiplicative |
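The tuned values above can be collected into a single parameter dictionary. Below is a minimal sketch of how they would be passed to Kats; the `train_df` DataFrame and the forecasting calls are illustrative assumptions and are shown commented out, since `kats` is an optional, heavy dependency.

```python
# Tuned Prophet hyperparameters from the table above. The keys match the
# arguments of kats.models.prophet.ProphetParams (which mirrors Prophet's
# constructor), so the dict can be unpacked directly into it.
prophet_params = {
    "seasonality_prior_scale": 0.01,
    "changepoint_prior_scale": 0.046,
    "changepoint_range": 0.93,
    "n_changepoints": 5,
    "holidays_prior_scale": 100.00,
    "seasonality_mode": "multiplicative",
}

# Sketch of the Kats calls (train_df is a hypothetical DataFrame with a
# 'time' column and one value column):
#
# from kats.consts import TimeSeriesData
# from kats.models.prophet import ProphetModel, ProphetParams
#
# params = ProphetParams(**prophet_params)
# model = ProphetModel(TimeSeriesData(train_df), params)
# model.fit()
# forecast = model.predict(steps=28)  # the M5 horizon is 28 days
```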
We also used `tsfresh` to convert the time series into structured data features, which consumes a lot of computational resources even with minimal settings. The LightGBM hyperparameters were tuned with the `hyperopt` library. The following is the hyperparameter tuning result.

boosting | learning_rate | num_iterations | num_leaves | min_data_in_leaf | min_sum_hessian_in_leaf | bagging_fraction | bagging_freq | feature_fraction | extra_trees | lambda_l1 | lambda_l2 | path_smooth | max_bin |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
gbdt | 0.01773 | 522 | 11 | 33 | 0.0008 | 0.5297 | 4 | 0.5407 | False | 2.9114 | 0.2127 | 217.3879 | 1023 |
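The tuned LightGBM configuration above can be written down with LightGBM's native parameter names. This is a sketch only: `train_set` and the training call are assumptions (not part of the table) and are shown commented out.

```python
# Tuned LightGBM hyperparameters from the table above, keyed by LightGBM's
# native parameter names.
lgb_params = {
    "boosting": "gbdt",
    "learning_rate": 0.01773,
    "num_iterations": 522,
    "num_leaves": 11,
    "min_data_in_leaf": 33,
    "min_sum_hessian_in_leaf": 0.0008,
    "bagging_fraction": 0.5297,
    "bagging_freq": 4,
    "feature_fraction": 0.5407,
    "extra_trees": False,
    "lambda_l1": 2.9114,
    "lambda_l2": 0.2127,
    "path_smooth": 217.3879,
    "max_bin": 1023,
}

# Sketch of the training call (train_set would be a lgb.Dataset built from
# the engineered features):
#
# import lightgbm as lgb
# booster = lgb.train(lgb_params, train_set)
```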
The overall evaluation results of all models are as follows:

Algorithm | WRMSSE | sMAPE | MAE | MASE | RMSE |
---|---|---|---|---|---|
DeepAR | 0.7513 | 1.4200 | 0.8795 | 0.9269 | 1.1614 |
LightGBM | 1.0701 | 1.4429 | 0.8922 | 0.9394 | 1.1978 |
Prophet | 1.0820 | 1.4174 | 1.1014 | 1.0269 | 1.4410 |
VAR | 1.2876 | 2.3818 | 1.5545 | 1.6871 | 1.9502 |
Naive Method | 1.3430 | 1.5074 | 1.3730 | 1.1077 | 1.7440 |
Mean Method | 1.5984 | 1.4616 | 1.1997 | 1.0708 | 1.5352 |
DeepVAR | 4.6933 | 4.6847 | 1.9201 | 1.3683 | 2.3195 |
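For reference, the per-series metrics in the table can be sketched as follows. This is a minimal NumPy implementation on toy data, not the competition's official scorer; the toy arrays are invented for illustration, and the MASE here scales by the in-sample one-step naive error.

```python
import numpy as np

def smape(y_true, y_pred):
    # Symmetric MAPE: absolute error over the mean magnitude of each pair.
    return np.mean(2 * np.abs(y_pred - y_true) / (np.abs(y_true) + np.abs(y_pred)))

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mase(y_true, y_pred, y_train):
    # Scale the forecast error by the in-sample one-step naive forecast error.
    naive_error = np.mean(np.abs(np.diff(y_train)))
    return mae(y_true, y_pred) / naive_error

# Toy example: the naive method forecasts the last observed training value.
y_train = np.array([10.0, 12.0, 11.0, 13.0])
y_true = np.array([14.0, 12.0])
y_pred = np.full_like(y_true, y_train[-1])  # naive forecast: [13.0, 13.0]

print(mae(y_true, y_pred))   # 1.0
print(rmse(y_true, y_pred))  # 1.0
print(mase(y_true, y_pred, y_train))
```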
As a result, DeepAR was selected as the final model, and its predictions were submitted to Kaggle, achieving a WRMSSE of 0.8112 on the private leaderboard.
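For context, WRMSSE is the Weighted Root Mean Squared Scaled Error, the headline accuracy metric of the M5 competition. For a series $i$ with history $Y_1, \dots, Y_n$ and forecast horizon $h$ ($h = 28$ days in M5), it is defined as:

```latex
\mathrm{RMSSE}_i =
  \sqrt{\frac{\frac{1}{h}\sum_{t=n+1}^{n+h}\left(Y_t - \hat{Y}_t\right)^2}
             {\frac{1}{n-1}\sum_{t=2}^{n}\left(Y_t - Y_{t-1}\right)^2}},
\qquad
\mathrm{WRMSSE} = \sum_{i} w_i \,\mathrm{RMSSE}_i
```

where the weights $w_i$ sum to one and, in M5, are proportional to each series' recent dollar sales, so high-revenue items dominate the score.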