Teaching(授课) - 孙 建永 - 教师个人主页.pdf
Times Series Analysis (XVI) – Seasonal Models Jianyong Sun School of Mathematics and Statistics Xi’an Jiaotong University 2nd Nov., 2017 Prefix we saw how seasonal deterministic trends might be modeled. However, in many areas in which time series are used, particularly business and economics, the assumption of any deterministic trend is quite suspect even though cyclical tendencies are very common in such series. This figure displays the monthly CO2 levels from January 1994 through December 2004. Prefix carbon dioxide levels are higher during the winter months and much lower in the summer. Deterministic seasonal models such as seasonal means plus linear time trend or sums of cosine curves at various frequencies plus linear time trend could certainly be considered here. Prefix we discover that such models do not explain the behavior of this time series. For this series and many others, it can be shown that the residuals from a seasonal means plus linear time trend model are highly autocorrelated at many lags. In contrast, we will see that the stochastic seasonal models developed in this chapter do work well for this series. Seasonal ARIMA Models We begin by studying stationary models and then consider nonstationary generalizations. Let s denote the known seasonal period; for monthly series s = 12 and for quarterly series s = 4. Consider the time series generated according to Yt = et − Θet−12 Notice that Cov (Yt , Yt−1 ) = Cov (et − Θet−12 , et−1 − Θet−13 ) = 0 but that Cov (Yt , Yt−12 ) = Cov (et − Θet−12 , et−12 − Θet−24 ) = −Θσe2 It is easy to see that such a series is stationary and has nonzero autocorrelations only at lag 12. Seasonal ARIMA Models Generalizing these ideas, we define a seasonal MA(Q) model of order Q with seasonal period s by Yt = et − Θ1 et−s − Θ2 et−2s − · · · − ΘQ et−Qs with seasonal MA characteristic polynomial: Θ(x) = 1 − Θ1 x s − Θ2 x 2s − · · · − ΘQ x Qs It is evident that such a series is always stationary and that the autocorrelation function will be nonzero only at the seasonal lags of s, 2s, 3s, · · · , Qs. Seasonal ARIMA Models In particular, ρks = −Θk + Θ1 Θk+1 + Θ2 Θk+2 + · · · + ΘQ−k ΘQ 1 + Θ21 + Θ22 + · · · + Θ2Q for k = 1, 2, · · · , Q. For the model to be invertible, the roots of Θ(x) = 0 must all exceed 1 in absolute value. It is useful to note that the seasonal MA(Q) model can also be viewed as a special case of a nonseasonal MA model of order q = Qs but with all θ-values zero except at the seasonal lags s, 2s, 3s, · · · , Qs. Seasonal ARIMA Models Seasonal autoregressive models can also be defined. Consider Yt = ΦYt−12 + et where |Φ| < 1 and et is independent of Yt−1 , Yt−2 , · · · . It can be shown that |Φ| < 1 ensures stationarity. Thus it is easy to argue that E (Yt ) = 0; multiplying the equation by Yt−k ,taking expectations, and dividing by γ0 yields ρk = Φρk−12 for k ≥ 1 Clearly, ρ12 = Φρ0 = Φ and ρ24 = Φρ12 = Φ2 More generally, ρ12k = Φk for k = 1, 2, · · · Seasonal ARIMA Models Furthermore, setting k = 1 and then k = 11 and using ρk = ρ−k gives us ρ1 = Φρ11 and ρ11 = Φρ1 which implies that ρ1 = ρ11 = 0. Similarly, one can show that ρk = 0 except at the seasonal lags 12, 24, 36, · · · . At those lags, the autocorrelation function decays exponentially like an AR(1) model. Seasonal ARIMA Models With this example in mind, we define a seasonal AR(P) model of order P and seasonal period s by Yt = Φ1 Yt−s + Φ2 Yt−2s + · · · + ΦP Yt−Ps + et with seasonal characteristic polynomial: Φ(x) = 1 − Φ1 x s − Φ2 x 2s − · · · − ΦP x Ps As always, we require et to be independent of Yt−1 , Yt−2 , · · · , and, for stationarity, that the roots of Φ(x) = 0 be greater than 1 in absolute value. Again, this equation can be seen as a special AR(p) model of order p = Ps with nonzero Φ-coefficients only at the seasonal lags s, 2s, 3s, ..., Ps. Seasonal ARIMA Models It can be shown that the autocorrelation function is nonzero only at lags s, 2s, 3s, · · · , where it behaves like a combination of decaying exponentials and damped sine functions. In particular, previous equations easily generalize to the general seasonal AR(1) model to give ρks = Φk for k = 1, 2, · · · with zero correlation at other lags. Multiplicative Seasonal ARMA Models Rarely shall we need models that incorporate autocorrelation only at the seasonal lags. By combining the ideas of seasonal and nonseasonal ARMA models, we can develop parsimonious models that contain autocorrelation for the seasonal lags but also for low lags of neighboring series values. Consider a model whose MA characteristic polynomial is given by (1 − θx)(1 − Θx 12 ) = 1 − θx − Θx 12 + θΘx 13 Thus the corresponding time series satisfies Yt = et − θet−1 − Θet−12 + θΘet−13 Multiplicative Seasonal ARMA Models For this model, we can check that the autocorrelation function is nonzero only at lags 1, 11, 12, and 13. We find γ0 = (1 + θ2 )(1 + Θ2 )σe2 θ ρ1 = − 1 + θ2 θΘ ρ11 = ρ13 = , and 2 (1 + θ )(1 + Θ2 ) Θ ρ12 = − 1 + Θ2 Multiplicative Seasonal ARMA Models Figure: Autocorrelation from previous equations Of course, we could also introduce both short-term and seasonal autocorrelations by defining an MA model of order 12 with only θ1 and θ12 nonzero Multiplicative Seasonal ARMA Models In general, then, we define a multiplicative seasonal ARMA(p, q)×(P, Q)s model with seasonal period s as a model with AR characteristic polynomial φ(x)Φ(x) and MA characteristic polynomial θ(x)Θ(x), where φ(x) = 1 − φ1 x − φ2 x 2 − · · · − φp x p Φ(x) = 1 − Φ1 x s − Φ2 x 2s − · · · − ΦP x Ps and θ(x) = 1 − θ1 x − θ2 x 2 − · · · − θq x q Θ(x) = 1 − Θ1 x s − Θ2 x 2s − · · · − ΘQ x Qs Multiplicative Seasonal ARMA Models The model may also contain a constant term θ0 . Note once more that we have just a special ARMA model with AR order p + Ps and MA order q + Qs, but the coefficients are not completely general, being determined by only p + P + q + Q coefficients. If s = 12, p + P + q + Q will be considerably smaller than p + Ps + q + Qs and will allow a much more parsimonious model. Multiplicative Seasonal ARMA Models As another example, suppose P = q = 1 and p = Q = 0 with s = 12. The model is then Yt = ΦYt−12 + et − θet−1 Using our standard techniques, we find that γ1 = Φγ11 − θσe2 γk = Φγk−12 for k ≥ 2 Multiplicative Seasonal ARMA Models After considering the equations implied by various choices for k, we arrive at 1 + θ2 2 σ 1 − Φ2 e = Φk for k ≥ 1 = ρ12k+1 = − γ0 = ρ12k ρ12k−1 θ Φk 1 + θ2 for k = 0, 1, 2, · · · , with autocorrelations for all other lags equal to zero. Multiplicative Seasonal ARMA Models Figure: Autocorrelation from previous equations This figure displays the autocorrelation functions for two of these seasonal ARIMA processes with period 12: one with Φ = 0.75 and θ = 0.4, the other with Φ = 0.75 and θ = −0.4. The shape of these autocorrelations is somewhat typical of the sample autocorrelation functions for numerous seasonal time series. The even simpler autocorrelation function given by previous equations and displayed in previous figure also seems to occur frequently in practice (perhaps after differencing). Nonstationary Seasonal ARIMA Models An important tool in modeling nonstationary seasonal processes is the seasonal difference. The seasonal difference of period s for the series {Yt } is denoted ∇s Yt and is defined as ∇s Yt = Yt − Yt−s For example, for monthly series we consider the changes from January to January, February to February, and so forth for successive years. Note that for a series of length n, the seasonal difference series will be of length n − s; that is, s data values are lost due to seasonal differencing. Nonstationary Seasonal ARIMA Models As an example where seasonal differencing is appropriate, consider a process generated according to Yt = St + et with St = St−s + t where {et } and {t } are independent white noise series. Here {St } is a “seasonal random walk", and if σ σe , {St } would model a slowly changing seasonal component. Nonstationary Seasonal ARIMA Models Due to the nonstationarity of {St }, clearly {Yt } is nonstationary. However, if we seasonally difference {Yt }, we find ∇s Yt = St − St−s + et − et−s = t + et − et−s An easy calculation shows that ∇s Yt is stationary and has the autocorrelation function of an MA(1)s model. Nonstationary Seasonal ARIMA Models The model described by previous equations could also be generalized to account for a nonseasonal, slowly changing stochastic trend. Consider Yt = Mt + St + et St = St−s + t Mt = Mt−1 + ξt with and where {et }, {t }, {ξt } are mutually independent white noise series. Nonstationary Seasonal ARIMA Models Here we take both a seasonal difference and an ordinary nonseasonal difference to obtain (It should be noted that ∇s Yt will in fact be stationary and ∇∇s Yt will be noninvertible. We use previous equations merely to help motivate multiplicative seasonal ARIMA models.) ∇∇s Yt = ∇(Mt − Mt−s + t + et − et−s ) = (ξt + t + et ) − (t−1 + et−1 ) − (ξt−s + et−s ) + et−s−1 The process defined here is stationary and has nonzero autocorrelation only at lags 1, s − 1, s, and s + 1, which agrees with the autocorrelation structure of the multiplicative seasonal model ARMA(0,1)×(0,1) with seasonal period s. Nonstationary Seasonal ARIMA Models These examples lead to the definition of nonstationary seasonal models. A process {Yt } is said to be a multiplicative seasonal ARIMA model with nonseasonal (regular) orders p, d, and q, seasonal orders P, D, and Q, and seasonal period s if the differenced series: Wt = ∇d ∇D s Yt satisfies an ARMA(p, q) × (P, Q)s model with seasonal period s. We say that {Yt } is an ARIMA(p, d, q) × (P, D, Q)s model with seasonal period s. Clearly, such models represent a broad, flexible class from which to select an appropriate model for a particular time series. It has been found empirically that many series can be adequately fit by these models, usually with a small number of parameters, say three or four. Model Specification, Fitting, and Checking we shall simply highlight the application of these ideas specifically to seasonal models and pay special attention to the seasonal lags. As always, a careful inspection of the time series plot is the first step. Figure: Sample ACF of CO2 Levels This figure shows the sample autocorrelation function for that series. The seasonal autocorrelation relationships are shown quite prominently in this display. Notice the strong correlation at lags 12, 24, 36, and so on. In addition, there is substantial other correlation that needs to be modeled. Model Specification, Fitting, and Checking Figure: Time Series Plot of the First Differences of CO2 Levels Figure: Sample ACF of First Differences of CO2 Levels Model Specification, Fitting, and Checking Figure: Time Series Plot of First and Seasonal Differences of CO2. The figure displays the time series plot of the CO2 levels after taking both a first difference and a seasonal difference. It appears that most, if not all, of the seasonality is gone now. Model Specification, Fitting, and Checking Figure: Sample ACF of First and Seasonal Differences of CO2. This figure confirms that very little autocorrelation remains in the series after these two differences have been taken. This plot also suggests that a simple model which incorporates the lag 1 and lag 12 autocorrelations might be adequate. Model Specification, Fitting, and Checking We will consider specifying the multiplicative, seasonal ARIMA(0,1,1)×(0, 1, 1)12 model: ∇12 ∇Yt = et − θet−1 − Θet−12 + θΘet−13 which incorporates many of these requirements. As usual, all models are tentative and subject to revision at the diagnostics stage of model building. Model Fitting Having specified a tentative seasonal model for a particular time series, we proceed to estimate the parameters of that model as efficiently as possible. As we have remarked earlier, multiplicative seasonal ARIMA models are just special cases of our general ARIMA models. As such, all of our work on parameter estimation carries over to the seasonal case. Model Fitting Figure: Parameter Estimates for the CO2 Model. This table gives the maximum likelihood estimates and their standard errors for the ARIMA(0,1,1)×(0, 1, 1)12 model for CO2 levels. The coefficient estimates are all highly significant, and we proceed to check further on this model. Diagnostic Checking To check the estimated the ARIMA(0,1,1)×(0, 1, 1)12 model, we first look at the time series plot of the residuals. Figure: Residuals from the ARIMA(0,1,1)×(0, 1, 1)12 Model. Th gives this plot for standardized residuals. Other than some strange behavior in the middle of the series, this plot does not suggest any major irregularities with the model, although we may need to investigate the model further for outliers, as the standardized residual at September 1998 looks suspicious. Diagnostic Checking Figure: ACF of Residuals from the ARIMA(0,1,1)×(0, 1, 1)12 Model. The only “statistically significant" correlation is at lag 22, and this correlation has a value of only -0.17, a very small correlation. Furthermore, we can think of no reasonable interpretation for dependence at lag 22. Finally, we should not be surprised that one autocorrelation out of the 36 displayed is statistically significant. This could easily happen by chance alone. Diagnostic Checking The Ljung-Box test for this model gives a chi-squared value of 25.59 with 22 degrees of freedom, leading to a p-value of 0.27 — a further indication that the model has captured the dependence in the time series. Figure: Residuals from the ARIMA(0,1,1)×(0, 1, 1)12 Model. The figure displays the histogram of the residuals. The shape is somewhat “bell-shaped" but certainly not ideal. Perhaps a quantile-quantile plot will tell us more. Diagnostic Checking Figure: Residuals: ARIMA(0,1,1)×(0, 1, 1)12 Model. Here we again see the one outlier in the upper tail, but the Shapiro-Wilk test of normality has a test statistic of W = 0.982, leading to a p-value of 0.11, and normality is not rejected at any of the usual significance levels. Diagnostic Checking As one further check on the model, we consider overfitting with an ARIMA(0,1,2)×(0, 1, 1)12 model. Figure: ARIMA(0,1,1)×(0, 1, 2)12 Overfitted Model. we see that the estimates of θ1 and Θ have changed very little — especially when the size of the standard errors is taken into consideration. In addition, the estimate of the new parameter, θ2 , is not statistically different from zero. Note also that the estimate σe2 and the log-likelihood have not changed much while the AIC has actually increased. Diagnostic Checking The ARIMA(0,1,1)×(0, 1, 1)12 model was popularized in the first edition of the seminal book of Box and Jenkins (1976) when it was found to characterize the logarithms of a monthly airline passenger time series. This model has come to be known as the airline model. Forecasting Seasonal Models Computing forecasts with seasonal ARIMA models is, as expected, most easily carried out recursively using the difference equation form for the model. For example, consider the model ARIMA(0,1,1)×(1, 0, 1)12 . Yt − Yt−1 = Φ(Yt−12 − Yt−13 ) + et − θet−1 − Θet−12 + θΘet−13 which we rewrite as Yt = Yt−1 + Yt−12 − ΦYt−13 + et − θet−1 − Θet−12 + θΘet−13 The one-step-ahead forecast from origin t is then Ŷt (1) = Yt + ΦYt−11 − ΦYt−12 − θet − Θet−11 + θΘet−12 Forecasting Seasonal Models and the next one is Ŷt (2) = Ŷt (1) + ΦYt−10 − ΦYt−11 − Θet−10 + θΘet−11 ans so forth. The noise terms et−13 , et−12 , et−11 , · · · , et (as residuals) will enter into the forecasts for lead times ` = 1, 2, · · · , 13, but for ` > 13 the autoregressive part of the model takes over and we have Ŷt (`) = Ŷt (` − 1) + ΦŶt (` − 12) − ΦŶt (` − 13) for ` > 13 To understand the general nature of the forecasts, we consider several special cases. Forecasting Seasonal Models Seasonal AR(1)12 : The seasonal AR(1)12 model is Yt = ΦYt−12 + et . Clearly, we have Ŷt (`) = ΦŶt (` − 12) However, iterating back on `, we can also write Ŷt (`) = Φk+1 Yt+r −11 where k and r are defined by ` = 12k + r + 1 with 0 ≤ r < 12 and k = 0, 1, 2, · · · . In other words, k is the integer part of (` − 1)/12 and r /12 is the fractional part of (` − 1)/12. Forecasting Seasonal Models If our last observation is in December, then the next January value is forecast as Φ times the last observed January value, February is forecast as Φ times the last observed February value, and so on. Two Januarys ahead is forecast as Φ2 times the last observed January. Looking just at January values, the forecasts into the future will decay exponentially at a rate determined by the magnitude of Φ. All of the forecasts for each month will behave similarly but with different initial forecasts depending on the particular month under consideration. Forecasting Seasonal Models Note the fact that the Ψ-weights are nonzero only for multiple of 12, namely, j/12 Φ for j = 0, 12, 24, · · · , Ψj = 0 otherwise we have that the forecast error variance can be written as 1 − Φ2k+2 2 Var (et (`)) = σe 1 − Φ2 where as before, k is the integer part of (` − 1)/12. Forecasting Seasonal Models Seasonal MA(1)12 : For the seasonal MA(1)12 , we have Yt = et − Θet−12 + θ0 . In this case, we see that Ŷt (1) = −Θet−11 + θ0 Ŷt (2) = −Θet−10 + θ0 .. . Ŷt (12) = −Θet + θ0 and Ŷt (`) = θ0 for ` > 12 Here we obtain different forecasts for the months of the first year, but from then on all forecasts are given by the process mean. Forecasting Seasonal Models For this model, Ψ0 = 1, Ψ12 = −Θ, and Ψj = 0 otherwise. Thus, we have 2 σe 1 ≤ ` ≤ 12 Var (et (`)) = (1 + Θ2 )σe2 12 < ` Forecasting Seasonal Models ARIMA(0,0,0)×(0, 1, 1)12 The ARIMA(0,0,0)×(0, 1, 1)12 model is Yt − Yt−12 = et − Θet−12 or Yt+` = Yt+`−12 + et+` − Θet+`−12 , so that Ŷt (1) = Yt−11 − Θet−11 Ŷt (2) = Yt−10 − Θet−10 .. . Ŷt (12) = Yt − Θet and then Ŷt (`) = Ŷt (` − 12) for ` > 12 It follows that all Januarys will forecast identically, all Februarys identically, and so forth. Forecasting Seasonal Models If we invert this model, we find that Yt = (1 − Θ)(Yt−12 + ΘYt−24 + Θ2 Yt−36 + · · · ) + et Consequently, we can write Ŷt (1) = (1 − Θ) Ŷt (2) = (1 − Θ) ∞ X j=0 ∞ X Θj Yt−11−12j Θj Yt−10−12j j=0 .. . Ŷt (12) = (1 − Θ) ∞ X j=0 Θj Yt−12j Forecasting Seasonal Models From this representation, we see that the forecast for each January is an EWMA of all observed Januarys, and similarly for each of the other months. In this case, we have Ψj = 1 − Θ for j = 12, 24, · · · , and zero otherwise. The forecast error variance is then Var (et (`)) = [1 + k(1 − Θ)2 ]σe2 where k is the integer part of (` − 1)/12. Forecasting Seasonal Models ARIMA(0,1,1)×(0, 1, 1)12 : Yt = Yt−1 + Yt−12 − Yt−13 + et − θet−1 − Θet−12 + θΘet−13 the forecast satisfy Ŷt (1) = Yt + Yt−11 − Yt−12 − θet − Θet−11 + θΘet−12 Ŷt (2) = Ŷt (1) + Yt−10 − Yt−11 − Θet−10 + θΘet−11 .. . Ŷt (12) = Ŷt (11) + Yt − Yt−1 − Θet + θΘet−1 Ŷt (13) = Ŷt (12) + Ŷt (1) − Yt + θΘet and Ŷt (`) = Ŷt (` − 1) + Ŷt (` − 12) − Ŷt (` − 13) for ` > 13 Forecasting Seasonal Models To understand the general pattern of these forecasts, we can use the representation Ŷt (`) = A1 + A2 ` + 6 X j=0 B1j cos 2πj` 12 + B2j sin 2πj` 12 where the A0 s and B 0 s are dependent on Yt , Yt−1 , · · · , or, alternatively, determined from the initial forecasts Ŷt (1), Ŷt (2), · · · , Ŷt (13). This result follows from the general theory of difference equations and involves the roots of (1 − x)(1 − x 12 ) = 0. Forecasting Seasonal Models Notice that the equation reveals that the forecasts are composed of a linear trend in the lead time plus a sum of periodic components. However, the coefficients Ai and Bij are more dependent on recent data than on past data and will adapt to changes in the process as our forecast origin changes and the forecasts are updated. This is in stark contrast to forecasting with deterministic time trend plus seasonal components, where the coefficients depend rather equally on both recent and past data and remain the same for all future forecasts. Prediction Limits Prediction limits are obtained precisely as in the nonseasonal case. We illustrate this with the carbon dioxide time series. Figure: Forecasts and Forecast Limits for the CO2 Model This figure shows the forecasts and 95% forecast limits for a lead time of two years for the ARIMA(0,1,1)×(0, 1, 1)12 model that we fit. The last two years of observed data are also shown. The forecasts mimic the stochastic periodicity in the data quite well, and the forecast limits give a good feeling for the precision of the forecasts. Prediction Limits Figure: Long-Term Forecasts for the CO2 Model This figure displays the last year of observed data and forecasts out four years. At this lead time, it is easy to see that the forecast limits are getting wider, as there is more uncertainty in the forecasts Summary Multiplicative seasonal ARIMA models provide an economical way to model time series whose seasonal tendencies are not as regular as we would have with a deterministic seasonal trend model. Fortunately, these models are simply special ARIMA models so that no new theory is needed to investigate their properties. We illustrated the special nature of these models with a thorough modeling of an actual time series. Questions?