Chapter 13 Forecasting II (Not Covered)

Data for this chapter:

  • The msales and qales data are used from the MKT4320BGSU course package. Load the package and use the data() function to load the data.

    # Load the course package
    library(MKT4320BGSU)
    # Load the data
    data(msales)
    data(qsales)

13.1 Introduction

While Base R does have some time-series forecasting capabilities, there are several packages that make working with the data and running the analysis easier. For our class, we will be mainly using the fpp3 package inside of some user-defined functions. The user-defined functions are described briefly below, and then in detail in their own sections.

13.1.1 User-Defined Functions for Forecasting II

  • acplots creates ACF and PACF plots needed for ARIMA forecasting
  • tswh.noise creates a cumulative periodogram and white noise test needed for ARIMA forecasting
  • autoarima runs an ARIMA model as specified by the user and an automatic search for the “best” model if requested
  • tsplot produces a time series plot of the data (covered in Forecasting I)
  • fccompare compares models from saved results after running the methods functions (covered in Forecasting I)

13.1.2 Packages

You must have the following packages installed to use all of the user-defined functions. If you do not have them installed, you should do so now.

  • fpp3
  • slider
  • purrr
  • dplyr
  • ggplot2
  • ggfortify
  • stringr
  • cowplot
# Load necessary packages
library(fpp3)
library(slider)
library(purrr)
library(ggplot2)
library(dplyr)
library(ggfortify)
library(stringr)
library(cowplot)

13.2 acplots User Defined Function

  • Usage: acplots(data, tvar="", obs="", datetype=c("ym", "yq", "yw"), h= , lags=, d=, D=c(0,1))
    • data is the name of the dataframe with the time variable and measure variable
    • tvar is the variable that identifies the time period (in quotes)
    • obs is the variable that identifies the measure (in quotes)
    • datetype can be one of three options:
      • "ym" if the time period is year-month
      • "yq" if the time period is year-quarter
      • "yw" if the time period is year-week
    • h is an integer indicating the number of holdout/forecast periods
    • lags is an integer indicating how many lags should be shown on the plot; default is 25
    • d is the order of non-seasonal differencing; default is 0
    • D is the order of seasonal differencing; default is 0
  • NOTE: The results of this function can be saved to an object. When doing so, the following objects are returned:
    • $acf contains the ACF plot
    • $pacf contains the PACF plot
out <- acplots(msales, "t", "sales", "ym", 12,
               lags=25,   # Number of lags to plot
               d=0,       # Non-seasonal differencing
               D=1)       # Seasonal differencing
out$acf

out$pacf

13.3 tswh.noise User Defined Function

  • Usage: tswh.noise(data, tvar="", obs="", datetype=c("ym", "yq", "yw"), h= , arima=c(p,d,q), sarima=c(P,D,Q))
    • data is the name of the dataframe with the time variable and measure variable
    • tvar is the variable that identifies the time period (in quotes)
    • obs is the variable that identifies the measure (in quotes)
    • datetype can be one of three options:
      • "ym" if the time period is year-month
      • "yq" if the time period is year-quarter
      • "yw" if the time period is year-week
    • h is an integer indicating the number of holdout/forecast periods
    • lags is an integer indicating how many lags should be shown on the plot; default is 25
    • arima=(p,d,q) provides the non-seasonal autoregressive (\(p\)), differencing (\(d\)), and moving average (\(q\)) parameters, in the form of c(p,d,q) where the parameters are integers
    • sarima=(P,D,Q) provides the seasonal autoregressive (\(P\)), differencing (\(D\)), and moving average (\(Q\)) parameters, in the form of c(P,D,Q) where the parameters are integers
  • The function will return a cumulative periodogram and results of the Ljung-Box White Noise test
tswh.noise(msales, "t", "sales", "ym", 12,
           arima=c(0,0,0),     # Seasonal parameters
           sarima=c(0,0,0))    # Non-seasonal parameters

tswh.noise(msales, "t", "sales", "ym", 12, c(2,1,0), c(2,1,0))

13.4 autoarima User Defined Function

  • Usage: tswh.noise(data, tvar="", obs="", datetype=c("ym", "yq", "yw"), h= , arima=c(p,d,q), sarima=c(P,D,Q), auto="")
    • data is the name of the dataframe with the time variable and measure variable
    • tvar is the variable that identifies the time period (in quotes)
    • obs is the variable that identifies the measure (in quotes)
    • datetype can be one of three options:
      • "ym" if the time period is year-month
      • "yq" if the time period is year-quarter
      • "yw" if the time period is year-week
    • h is an integer indicating the number of holdout/forecast periods
    • lags is an integer indicating how many lags should be shown on the plot; default is 25
    • arima=(p,d,q) provides the non-seasonal autoregressive (\(p\)), differencing (\(d\)), and moving average (\(q\)) parameters, in the form of c(p,d,q) where the parameters are integers
    • sarima=(P,D,Q) provides the seasonal autoregressive (\(P\)), differencing (\(D\)), and moving average (\(Q\)) parameters, in the form of c(P,D,Q) where the parameters are integers
    • auto can be one of two options:
      • "Y" an automatic search is performed to find the “best” model
      • "N" an automatic search is not performed
  • NOTE 1: The results of this function should be saved to an object. When doing so, the following objects are returned:
    • $plot contains the ARIMA methods plot
    • $acc contains the accuracy measures
    • $fcresplot contains a plot of the holdout period residuals
    • $fcresid is a dataframe of the holdout period residuals (seldom used separately)
    • $acresid contains a plot of the ACF/PACF residuals for the model(s)
    • $wn contains cumulative periodogram(s) and results of the Ljung-Box White Noise test(s)
  • Note 2: If auto="Y" is specified, this function can take one to two minutes to run
  • Note 3: The model names are:
    • Self for the model specified by the user
    • Auto for the “best” model if requested by the user
selfonly <- autoarima(msales, "t", "sales", "ym", 12,
                      arima=c(2,1,0),     # Seasonal parameters
                      sarima=c(2,1,0),    # Non-seasonal parameters
                      auto="N")           # Do not perform auto search
selfonly$plot

selfonly$acc
                       Model   RMSE    MAE  MAPE     AICc
1 Self: A(2,1,0)S(2,1,0)[12] 43.166 37.388 0.135 1359.636
selfonly$fcresplot

selfonly$acresid

selfonly$wn

arima <- autoarima(msales, "t", "sales", "ym", 12,
                   arima=c(2,1,0),     # Seasonal parameters
                   sarima=c(2,1,0),    # Non-seasonal parameters
                   auto="Y")           # Do not perform auto search
arima$plot

arima$acc
                       Model   RMSE    MAE  MAPE     AICc
1 Auto: A(1,0,1)S(0,1,1)[12] 50.531 45.393 0.163 1347.420
2 Self: A(2,1,0)S(2,1,0)[12] 43.166 37.388 0.135 1359.636
arima$fcresplot

arima$acresid

arima$wn