Vector autoregression


Vector autoregression (VAR) is an econometric model used to capture the evolution of, and the interdependencies among, multiple time series, generalizing the univariate AR model. All the variables in a VAR are treated symmetrically: each variable has an equation explaining its evolution based on its own lags and the lags of all the other variables in the model. Based on this feature, Christopher Sims advocates the use of VAR models as a theory-free method to estimate economic relationships, and thus as an alternative to the "incredible identification restrictions" in structural models [Christopher A. Sims, 1980, "Macroeconomics and Reality", Econometrica 48].



A VAR model describes the evolution of a set of "k" variables (called "endogenous variables") over the same sample period ("t" = 1, ..., "T") as a linear function of only their past evolution. The variables are collected in a "k" × 1 vector "y_t", whose "i"th element "y_{i,t}" is the time "t" observation of variable "y_i". For example, if the "i"th variable is GDP, then "y_{i,t}" is the value of GDP at "t".

A "(reduced) p-th order VAR", denoted "VAR(p)", is

:y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + e_t,

where "c" is a "k" × 1 vector of constants (the intercept), "A_i" is a "k" × "k" matrix (for every "i" = 1, ..., "p") and "e_t" is a "k" × 1 vector of error terms satisfying

#\mathrm{E}(e_t) = 0, so every error term has mean zero;
#\mathrm{E}(e_t e_t') = \Omega, so the contemporaneous covariance matrix of the error terms is Ω (a "k" × "k" positive definite matrix);
#\mathrm{E}(e_t e_{t-s}') = 0 for any non-zero "s", so there is no correlation across time; in particular, there is no serial correlation in individual error terms.

The "l"-periods-back observation "y_{t-l}" is called the "l"-th "lag" of "y". Thus, a "p"th-order VAR is also called a VAR with "p" lags.
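The definition can be made concrete by simulating a small VAR("p"). The following is a minimal NumPy sketch, not a reference implementation: the function name `simulate_var`, the Gaussian errors, the zero presample values and all numerical coefficients are assumptions chosen for illustration (the definition itself only imposes the three moment conditions on the errors).

```python
import numpy as np

def simulate_var(c, A_list, Omega, T, seed=0):
    """Simulate T observations of y_t = c + A_1 y_{t-1} + ... + A_p y_{t-p} + e_t."""
    rng = np.random.default_rng(seed)
    c = np.asarray(c, dtype=float)
    A_list = [np.asarray(A, dtype=float) for A in A_list]
    k, p = c.shape[0], len(A_list)
    # Errors with mean zero and covariance Omega, drawn independently across time
    # (normality is an assumption made only for this sketch).
    e = rng.multivariate_normal(np.zeros(k), Omega, size=T + p)
    y = np.zeros((T + p, k))            # the first p rows act as zero presample values
    for t in range(p, T + p):
        y[t] = c + e[t]
        for i, A in enumerate(A_list, start=1):
            y[t] += A @ y[t - i]        # contribution of the i-th lag
    return y[p:]                        # shape (T, k)

# Hypothetical bivariate VAR(2)
c = [0.1, -0.2]
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
Omega = np.array([[1.0, 0.3], [0.3, 0.5]])
y = simulate_var(c, [A1, A2], Omega, T=200)
print(y.shape)  # (200, 2)
```

Each row of `y` is one observation "y_t"; the off-diagonal entry of `Omega` lets the two error terms be contemporaneously correlated, while draws at different dates remain independent.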

Order of integration of the variables

All the variables used have to be of the same order of integration. The following cases arise:

*All the variables are I(0) (stationary): this is the standard case, i.e. a VAR in levels.
*All the variables are I(d) (non-stationary) with d > 0:
**The variables are cointegrated: the error correction term has to be included in the VAR. The model becomes a vector error correction model (VECM), which can be seen as a restricted VAR.
**The variables are not cointegrated: the variables first have to be differenced d times, which gives a VAR in differences.
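For the non-cointegrated case, differencing "d" times can be sketched with `numpy.diff`. The data here are hypothetical simulated random walks (each I(1)), used only to show the mechanics; deciding the order of integration of real data requires unit-root testing, which this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(0)
shocks = rng.normal(size=(100, 2))      # stationary increments (hypothetical data)
levels = np.cumsum(shocks, axis=0)      # each column is a random walk, hence I(1)

d = 1                                   # assumed order of integration
differenced = np.diff(levels, n=d, axis=0)   # difference d times along the time axis
print(differenced.shape)  # (99, 2)
```

Each round of differencing costs one observation, so a VAR in differences is fitted on "T" − "d" data points.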

Concise matrix notation

One can write a VAR("p") with a concise matrix notation:

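:Y = BZ + U,

where, in the usual convention (for example Lütkepohl 2005), "Y" = [y_1, ..., y_T] is the "k" × "T" matrix of observations, "B" = [c, A_1, ..., A_p] is the "k" × ("kp" + 1) matrix of coefficients, "Z" is the ("kp" + 1) × "T" matrix whose "t"-th column stacks a 1 and the lags y_{t-1}, ..., y_{t-p}, and "U" = [e_1, ..., e_T] is the "k" × "T" matrix of error terms.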
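As an illustration of the concise notation, the sketch below builds "Y" and "Z" from a data array, assuming the common convention in which observations are stored column by column and each column of "Z" stacks a constant followed by the "p" most recent lags. The helper name `build_YZ` is hypothetical, and the first "p" rows of the data are used as presample values.

```python
import numpy as np

def build_YZ(y, p):
    """y: array of shape (n_obs, k), rows ordered in time.
    Returns Y (k x T) and Z ((k*p + 1) x T), with T = n_obs - p usable dates."""
    y = np.asarray(y, dtype=float)
    n_obs, k = y.shape
    T = n_obs - p
    Y = y[p:].T                               # columns are y_t for the usable dates
    blocks = [np.ones((1, T))]                # row of ones for the intercept
    for i in range(1, p + 1):
        blocks.append(y[p - i:n_obs - i].T)   # block of i-th lags, aligned with Y
    Z = np.vstack(blocks)
    return Y, Z

y = np.arange(12, dtype=float).reshape(6, 2)  # toy data: 6 dates, 2 variables
Y, Z = build_YZ(y, p=2)
print(Y.shape, Z.shape)  # (2, 4) (5, 4)
```

With this layout the coefficient matrix "B" has one row per equation and "kp" + 1 columns (intercept first, then the lag matrices side by side).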



A VAR(1) in two variables can be written in matrix form (more compact notation) as

:\begin{bmatrix}y_{1,t} \\ y_{2,t}\end{bmatrix} = \begin{bmatrix}c_{1} \\ c_{2}\end{bmatrix} + \begin{bmatrix}A_{1,1}&A_{1,2} \\ A_{2,1}&A_{2,2}\end{bmatrix}\begin{bmatrix}y_{1,t-1} \\ y_{2,t-1}\end{bmatrix} + \begin{bmatrix}e_{1,t} \\ e_{2,t}\end{bmatrix},

or, equivalently, as the following system of two equations

:y_{1,t} = c_{1} + A_{1,1}y_{1,t-1} + A_{1,2}y_{2,t-1} + e_{1,t},
:y_{2,t} = c_{2} + A_{2,1}y_{1,t-1} + A_{2,2}y_{2,t-1} + e_{2,t}.

Note that there is one equation for each variable in the model. Also note that the current (time "t") observation of each variable depends on its own lags as well as on the lags of each other variable in the VAR.
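A quick numerical check that the matrix form and the two-equation system describe the same step. All numbers are hypothetical, and the error terms are set to zero so that only the deterministic part is compared.

```python
import numpy as np

c = np.array([0.5, 1.0])
A = np.array([[0.7, 0.2],
              [0.1, 0.4]])
y_prev = np.array([2.0, -1.0])     # (y_{1,t-1}, y_{2,t-1})
e = np.zeros(2)                    # errors switched off for the comparison

# Matrix form
y_matrix = c + A @ y_prev + e

# Equation-by-equation form
y1 = c[0] + A[0, 0] * y_prev[0] + A[0, 1] * y_prev[1] + e[0]
y2 = c[1] + A[1, 0] * y_prev[0] + A[1, 1] * y_prev[1] + e[1]

print(np.allclose(y_matrix, [y1, y2]))  # True
```

Here the first variable moves to 0.5 + 0.7·2 + 0.2·(−1) = 1.7 and the second to 1.0 + 0.1·2 + 0.4·(−1) = 0.8, each depending on the lag of both variables.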

Writing VAR("p") as VAR(1)

A VAR with "p" lags can always be equivalently rewritten as a VAR with only one lag by appropriately redefining the dependent variable. The transformation amounts to merely stacking the lags of the VAR("p") variable in the new VAR(1) dependent variable and appending identities to complete the number of equations.

For example, the VAR(2) model

:y_{t}=c + A_{1}y_{t-1} + A_{2}y_{t-2} + e_{t}

can be recast as the VAR(1) model

:\begin{bmatrix}y_{t} \\ y_{t-1}\end{bmatrix} = \begin{bmatrix}c \\ 0\end{bmatrix} + \begin{bmatrix}A_{1}&A_{2} \\ I&0\end{bmatrix}\begin{bmatrix}y_{t-1} \\ y_{t-2}\end{bmatrix} + \begin{bmatrix}e_{t} \\ 0\end{bmatrix},

where "I" is the identity matrix.

The equivalent VAR(1) form is more convenient for analytical derivations and allows more compact statements.
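The stacking can be sketched for general "p" as follows. The helper name `companion` and the numerical values are hypothetical; the check at the end confirms that one step of the stacked VAR(1) reproduces the VAR(2) step in its first "k" rows and the identity "y_{t-1}" = "y_{t-1}" in the remaining rows.

```python
import numpy as np

def companion(A_list):
    """Stack A_1, ..., A_p (each k x k) into the (k*p) x (k*p) VAR(1) coefficient matrix."""
    p = len(A_list)
    k = A_list[0].shape[0]
    F = np.zeros((k * p, k * p))
    F[:k, :] = np.hstack(A_list)          # top block row: A_1 ... A_p
    F[k:, :-k] = np.eye(k * (p - 1))      # identity blocks that shift the lags down
    return F

c = np.array([0.1, -0.2])
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
F = companion([A1, A2])

y_tm1 = np.array([1.0, 2.0])              # y_{t-1}
y_tm2 = np.array([0.5, -1.0])             # y_{t-2}

direct = c + A1 @ y_tm1 + A2 @ y_tm2                       # VAR(2) step, errors omitted
stacked = np.concatenate([c, np.zeros(2)]) + F @ np.concatenate([y_tm1, y_tm2])

print(np.allclose(stacked[:2], direct), np.allclose(stacked[2:], y_tm1))  # True True
```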

Structural vs. reduced form

Structural VAR

A "structural VAR with p lags" is

:B_0 y_t = c_0 + B_1 y_{t-1} + B_2 y_{t-2} + \cdots + B_p y_{t-p} + \epsilon_t,

where "c_0" is a "k" × 1 vector of constants, "B_i" is a "k" × "k" matrix (for every "i" = 0, ..., "p") and "ε_t" is a "k" × 1 vector of error terms. The main diagonal terms of the "B_0" matrix (the coefficients on the "i"th variable in the "i"th equation) are scaled to 1.

The error terms "ε_t" ("structural shocks") satisfy conditions (1) to (3) in the definition above, with the particularity that all off-diagonal elements of the covariance matrix \mathrm{E}(\epsilon_t\epsilon_t') = \Sigma are zero. That is, the structural shocks are uncorrelated.

For example, a two variable structural VAR(1) is:

:\begin{bmatrix}1&B_{0;1,2} \\ B_{0;2,1}&1\end{bmatrix}\begin{bmatrix}y_{1,t} \\ y_{2,t}\end{bmatrix} = \begin{bmatrix}c_{0;1} \\ c_{0;2}\end{bmatrix} + \begin{bmatrix}B_{1;1,1}&B_{1;1,2} \\ B_{1;2,1}&B_{1;2,2}\end{bmatrix}\begin{bmatrix}y_{1,t-1} \\ y_{2,t-1}\end{bmatrix} + \begin{bmatrix}\epsilon_{1,t} \\ \epsilon_{2,t}\end{bmatrix},

where

:\Sigma = \mathrm{E}(\epsilon_t \epsilon_t') = \begin{bmatrix}\sigma_{1}^2&0 \\ 0&\sigma_{2}^2\end{bmatrix};

that is, the variances of the structural shocks are denoted \mathrm{var}(\epsilon_i) = \sigma_i^2 ("i" = 1, 2) and the covariance is \mathrm{cov}(\epsilon_1,\epsilon_2) = 0.

Writing the first equation explicitly and moving "y_{2,t}" to the right-hand side, one obtains

:y_{1,t} = c_{0;1} - B_{0;1,2}y_{2,t} + B_{1;1,1}y_{1,t-1} + B_{1;1,2}y_{2,t-1} + \epsilon_{1,t}.

Note that "y_{2,t}" can have a contemporaneous effect on "y_{1,t}" if "B_{0;1,2}" is not zero. This differs from the case when "B_0" is the identity matrix (all off-diagonal elements are zero, as in the initial definition), in which "y_{2,t}" can directly affect "y_{1,t+1}" and subsequent future values, but not "y_{1,t}".

Because of the parameter identification problem, ordinary least squares estimation of the structural VAR would yield inconsistent parameter estimates. This problem can be overcome by rewriting the VAR in reduced form.

From an economic point of view it is considered that, if the joint dynamics of a set of variables can be represented by a VAR model, then the structural form is a depiction of the underlying, "structural", economic relationships. Two features of the structural form make it the preferred candidate to represent the underlying relations:

:1. "Error terms are not correlated". The structural, economic shocks which drive the dynamics of the economic variables are assumed to be independent, which implies zero correlation between error terms as a desired property. This is helpful for separating out the effects of economically unrelated influences in the VAR. For instance, there is no reason why an oil price shock (as an example of a supply shock) should be related to a shift in consumers' preferences towards a style of clothing (as an example of a demand shock); therefore one would expect these factors to be statistically independent.

:2. "Variables can have a contemporaneous impact on other variables". This is a desirable feature especially when using low frequency data. For example, an indirect tax rate increase would not affect tax revenues the day the decision is announced, but one could find an effect in that quarter's data.

Reduced VAR

By premultiplying the structural VAR by the inverse of "B_0",

:y_t = B_0^{-1}c_0 + B_0^{-1} B_1 y_{t-1} + B_0^{-1} B_2 y_{t-2} + \cdots + B_0^{-1} B_p y_{t-p} + B_0^{-1}\epsilon_t,

and denoting

:B_{0}^{-1} c_0 = c,\quad B_{0}^{-1}B_i = A_{i} \text{ for } i = 1, \dots, p \text{ and } B_{0}^{-1}\epsilon_t = e_t

one obtains the "p"th order reduced VAR

:y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + e_t

Note that in the reduced form all right hand side variables are predetermined at time "t". As there are no time "t" endogenous variables on the right hand side, no variable has a "direct" contemporaneous effect on other variables in the model.

However, the error terms in the reduced VAR are composites of the structural shocks: "e_t" = "B_0^{-1} ε_t". Thus, the occurrence of one structural shock "ε_{i,t}" can potentially lead to shocks in all error terms "e_{j,t}", creating contemporaneous movement in all endogenous variables. Consequently, the covariance matrix of the reduced VAR

:\Omega = \mathrm{E}(e_t e_t') = \mathrm{E}(B_0^{-1} \epsilon_t \epsilon_t' (B_0^{-1})') = B_0^{-1}\Sigma(B_0^{-1})',

can have non-zero off-diagonal elements, thus allowing non-zero correlation between error terms.
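The mapping from structural to reduced-form parameters can be checked numerically. The values of `B0`, `B1`, `c0` and `Sigma` below are hypothetical; the point of the sketch is that a diagonal Σ generally produces a non-diagonal Ω once "B_0" has non-zero off-diagonal entries.

```python
import numpy as np

# Hypothetical structural VAR(1) in two variables
B0 = np.array([[1.0, 0.4],
               [0.2, 1.0]])          # unit diagonal, contemporaneous effects off the diagonal
B1 = np.array([[0.5, 0.1],
               [0.0, 0.3]])
c0 = np.array([0.1, 0.2])
Sigma = np.diag([1.0, 0.25])         # uncorrelated structural shocks

B0_inv = np.linalg.inv(B0)
c = B0_inv @ c0                      # reduced-form intercept
A1 = B0_inv @ B1                     # reduced-form lag matrix
Omega = B0_inv @ Sigma @ B0_inv.T    # reduced-form error covariance

print(np.round(Omega, 4))            # off-diagonal entries are non-zero
```

Going in the other direction, from estimates of "A_i" and Ω back to "B_0" and Σ, is not possible without further restrictions, which is the identification problem mentioned above.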


Estimation of the regression parameters

Starting from the concise matrix notation introduced above:

:Y = BZ + U,

*Ordinary least squares (OLS) estimation of each equation in the reduced VAR is both consistent and asymptotically efficient. With Gaussian errors it furthermore coincides with the conditional maximum likelihood estimator (MLE) (Hamilton 1994, p. 293).

The OLS estimator for B is given by:

:\hat B = YZ'(ZZ')^{-1}

*Generalized least squares (GLS) yields the same estimate, as was shown by Zellner (1962).

The GLS estimator for B is given by:

:\mbox{Vec}(\hat B) = ((ZZ')^{-1} Z \otimes I_{k})\, \mbox{Vec}(Y)

where \otimes denotes the Kronecker product and Vec the vectorization (column stacking) of its matrix argument.
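Both formulas can be tried on simulated data. This is a sketch under stated assumptions: a hypothetical bivariate VAR(1) with standard normal errors, "Z" built as a row of ones stacked on the first lag, and Vec taken column-major (`order="F"` in NumPy). The explicit inverse mirrors the formula; in practice a least-squares solver is usually preferred numerically.

```python
import numpy as np

rng = np.random.default_rng(1)
k, p, n_obs = 2, 1, 300
c_true = np.array([0.5, -0.2])
A_true = np.array([[0.6, 0.1],
                   [0.2, 0.3]])

# Simulate a VAR(1); standard normal errors are an assumption of this demo.
y = np.zeros((n_obs, k))
for t in range(1, n_obs):
    y[t] = c_true + A_true @ y[t - 1] + rng.normal(size=k)

Y = y[p:].T                                           # k x T
Z = np.vstack([np.ones((1, n_obs - p)), y[:-1].T])    # (k*p + 1) x T: constant, then lag 1

# OLS: B_hat = Y Z' (Z Z')^{-1}, a k x (k*p + 1) matrix [c, A_1]
ZZ_inv = np.linalg.inv(Z @ Z.T)
B_hat = Y @ Z.T @ ZZ_inv

# Vectorised form: Vec(B_hat) = ((Z Z')^{-1} Z  kron  I_k) Vec(Y)
vec_B = np.kron(ZZ_inv @ Z, np.eye(k)) @ Y.flatten(order="F")
B_vec = vec_B.reshape((k, k * p + 1), order="F")

print(np.allclose(B_hat, B_vec))  # True
print(np.round(B_hat, 2))         # first column estimates c, the rest A_1
```

The Kronecker matrix has "k"("kp" + 1) rows and "kT" columns, so the vectorised form is mainly of theoretical interest; the direct formula gives the same numbers at a fraction of the cost.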

Estimation of the covariance matrix of the errors

As in the standard case, the maximum likelihood estimator of the covariance matrix differs from the OLS estimator. Here \hat\epsilon_t denotes the residuals of the estimated reduced-form VAR.

MLE estimator: \hat\Sigma = \frac{1}{T} \sum_{t=1}^T \hat\epsilon_t \hat\epsilon_t'

OLS estimator: \hat\Sigma = \frac{1}{T-kp-1} \sum_{t=1}^T \hat\epsilon_t \hat\epsilon_t' for a model with a constant, "k" variables and "p" lags

In matrix notation, this gives:

:\hat\Sigma = \frac{1}{T-kp-1} (Y-\hat{B}Z)(Y-\hat{B}Z)'.

Note that for the GLS estimator the covariance matrix of the errors becomes:

:\boldsymbol{\hat\Sigma}_\epsilon = I_T \otimes \hat\Sigma_\epsilon.
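The two estimators differ only in the divisor, which the following sketch makes explicit. The function name `error_covariances` and the randomly generated "Y" and "Z" are hypothetical; "Z" is assumed to contain the constant row, so its number of rows equals "kp" + 1.

```python
import numpy as np

def error_covariances(Y, Z):
    """Return (MLE, OLS) estimates of the error covariance for Y = B Z + U."""
    T = Y.shape[1]
    n_regressors = Z.shape[0]                   # k*p + 1 when Z includes a constant
    B_hat = Y @ Z.T @ np.linalg.inv(Z @ Z.T)
    resid = Y - B_hat @ Z                       # k x T matrix of residuals
    S = resid @ resid.T                         # sum over t of resid_t resid_t'
    return S / T, S / (T - n_regressors)

rng = np.random.default_rng(2)
T = 50
Z = np.vstack([np.ones((1, T)), rng.normal(size=(2, T))])   # stands in for k = 2, p = 1
B = np.array([[0.5, 0.6, 0.1],
              [-0.2, 0.2, 0.3]])
Y = B @ Z + rng.normal(size=(2, T))

Sigma_mle, Sigma_ols = error_covariances(Y, Z)
print(np.allclose(Sigma_ols * (T - 3), Sigma_mle * T))  # True
```

The OLS version is larger by the factor "T" / ("T" − "kp" − 1), a degrees-of-freedom correction that vanishes as "T" grows.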

Estimation of the covariance matrix of the parameters

The covariance matrix of the parameters can be estimated as

:\widehat{\mbox{Cov}}(\mbox{Vec}(\hat B)) = (ZZ')^{-1} \otimes \hat\Sigma.
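Standard errors of the individual coefficients follow from the diagonal of this matrix. A minimal sketch, assuming the OLS divisor for \hat\Sigma, column-major ordering of Vec, and hypothetical random data; the function name `coefficient_std_errors` is not from any library.

```python
import numpy as np

def coefficient_std_errors(Y, Z):
    """Return B_hat, its standard errors (same shape), and Cov(Vec(B_hat))."""
    k, T = Y.shape
    m = Z.shape[0]                                  # k*p + 1 regressors per equation
    ZZ_inv = np.linalg.inv(Z @ Z.T)
    B_hat = Y @ Z.T @ ZZ_inv
    resid = Y - B_hat @ Z
    Sigma_hat = resid @ resid.T / (T - m)           # OLS estimate of the error covariance
    cov_vec_B = np.kron(ZZ_inv, Sigma_hat)          # (k*m) x (k*m)
    # Vec stacks columns, so the diagonal is reshaped column-major back to k x m.
    se = np.sqrt(np.diag(cov_vec_B)).reshape((k, m), order="F")
    return B_hat, se, cov_vec_B

rng = np.random.default_rng(3)
T = 80
Z = np.vstack([np.ones((1, T)), rng.normal(size=(2, T))])
Y = np.array([[0.5, 0.6, 0.1], [-0.2, 0.2, 0.3]]) @ Z + rng.normal(size=(2, T))

B_hat, se, cov_vec_B = coefficient_std_errors(Y, Z)
print(se.shape, cov_vec_B.shape)  # (2, 3) (6, 6)
```

Entry (i, j) of `se` is the standard error of the coefficient in row i, column j of "B", so t-ratios can be formed as `B_hat / se`.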


* Walter Enders. "Applied Econometric Time Series", 2nd edition. John Wiley & Sons. 2003. ISBN 0-471-23065-0.
* James D. Hamilton. "Time Series Analysis". Princeton University Press. 1994.
* Helmut Lütkepohl. "New Introduction to Multiple Time Series Analysis". Springer. 2005.
* Arnold Zellner. "An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests for Aggregation Bias". Journal of the American Statistical Association, Vol. 57, No. 298 (June 1962), pp. 348-368.


Wikimedia Foundation. 2010.
