Re: AugmentedDickeyFuller and different result in R


OK. I see what the problem is.

The short answer is that R gives wrong p-values for the ADF test.

The long answer requires knowing a bit about how ADF distributions are constructed and how R works. Theoretically, the “true” ADF distribution for a finite sample depends on 1) the trend type, 2) the sample size, 3) the lag order, and some other parameters. See for example

In practice, however, R hard-codes only a few critical values for one ADF distribution (probably the infinite-sample case, i.e. the asymptotic distribution, as it is the easiest to code up). For any value it does not have, R linearly interpolates between the tabulated ones. Here is an example.

R outputs 0.3278, which is wrong!

Here is how R computes this value (Numerical Method Inc.’s internal memo):

    * The p-value is obtained from the linear interpolation
    * (0.9-0.1)/(3.24-1.14) = (x-0.1)/(3.24-2.642)
    * x = 0.8/(3.24-1.14)*(3.24-2.642)+0.1 = 0.3278
    * -3.24 and -1.14 are from table 4.2c of A. Banerjee, J. J. Dolado, J. W. Galbraith, and D. F. Hendry (1993): Cointegration, Error Correction, and the Econometric Analysis of Non-Stationary Data, Oxford University Press, Oxford.
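The arithmetic in the memo can be checked with a few lines of Java. This is only a sketch of the interpolation as quoted above, not R's actual source. The class and method names are made up for illustration, and the sign of the test statistic (-2.642) is my assumption, inferred from the magnitudes in the formula.

```java
// Sketch only: reproduces the interpolation quoted in the memo, not R's source code.
// Class and method names are hypothetical.
final class PValueInterpolationSketch {

    /**
     * Linearly interpolates a p-value for `stat` between two tabulated
     * (critical value, probability) pairs.
     */
    static double interpolate(double stat,
                              double cvLow, double pLow,
                              double cvHigh, double pHigh) {
        double slope = (pHigh - pLow) / (cvHigh - cvLow);
        return pLow + slope * (stat - cvLow);
    }

    public static void main(String[] args) {
        // Table entries from the memo: -3.24 (p = 0.1) and -1.14 (p = 0.9).
        // Assumption: the test statistic is -2.642.
        double p = interpolate(-2.642, -3.24, 0.1, -1.14, 0.9);
        System.out.println(p); // approximately 0.3278
    }
}
```

So the 0.3278 is just a point on a straight line drawn between two table entries. It does not come from the distribution for your actual sample size and lag order.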

The “true” value, as computed by SuanShu, should be 0.1 for this finite sample case, no time trend, lag order 4.

Unlike R, SuanShu does not hard-code any critical values. It computes the true ADF distribution table for a given set of parameters in real time, which makes it slower but more accurate. You can, however, cache the SuanShu results to speed things up.
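One way to do the caching is to memoize the computed table by the parameters the distribution depends on. The sketch below is generic Java and does not use any SuanShu API. The class name, the key layout, and the `compute` supplier (which stands in for whatever call produces the table) are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Sketch only: a parameter-keyed cache for expensive distribution tables.
// Nothing here is SuanShu API; `compute` is a placeholder for the real computation.
final class AdfTableCacheSketch {

    private final Map<String, double[]> cache = new ConcurrentHashMap<>();

    /** Returns the cached table for these parameters, computing it on first use. */
    double[] get(String trendType, int sampleSize, int lagOrder, Supplier<double[]> compute) {
        String key = trendType + "|" + sampleSize + "|" + lagOrder;
        return cache.computeIfAbsent(key, k -> compute.get());
    }

    public static void main(String[] args) {
        AdfTableCacheSketch tables = new AdfTableCacheSketch();
        int[] calls = {0};
        // Placeholder supplier: a real one would run the slow distribution computation.
        Supplier<double[]> slow = () -> { calls[0]++; return new double[0]; };

        tables.get("constant", 100, 4, slow);
        tables.get("constant", 100, 4, slow); // served from the cache
        System.out.println("computations run: " + calls[0]);
    }
}
```

The slow computation then runs once per distinct combination of trend type, sample size, and lag order.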

For details, see these references.

  • D. A. Dickey and W. A. Fuller, “Distribution of the Estimators for Autoregressive Time Series with a Unit Root,” J. Amer. Stat. Assoc., vol. 74, pp. 427–431, 1979.
  • S. E. Said and D. A. Dickey, “Testing for Unit Roots in Autoregressive Moving Average Models of Unknown Order,” Biometrika, vol. 71, pp. 599–607, 1984.
  • A. Banerjee et al., “Cointegration, Error Correction, and the Econometric Analysis of Non-Stationary Data,” Oxford, Oxford University Press, 1993, ch. 4, pp. 99-135.

One more thing to note. Your R code and SuanShu code are not doing the same thing.

Your R code is using the stationary distribution and 0 lag.

Your SuanShu code is using the stationary distribution but non-zero lag. The lag formula for SuanShu and R is the same. It is:

nLag = (int) Math.pow(series.length - 1, 1.0 / 3.0);
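If you want to see which lag the formula picks for a given series length, here is a small self-contained sketch. The class and method names are made up for illustration. The only thing taken from above is the formula itself: the cube root of (length - 1), truncated to an integer.

```java
// Sketch only: evaluates the default lag formula for a few series lengths.
// Class and method names are hypothetical.
final class LagOrderSketch {

    /** Default lag order: cube root of (n - 1), truncated toward zero. */
    static int defaultLag(int seriesLength) {
        return (int) Math.pow(seriesLength - 1, 1.0 / 3.0);
    }

    public static void main(String[] args) {
        for (int n : new int[] {50, 100, 1000}) {
            System.out.println("n = " + n + ", lag = " + defaultLag(n));
        }
    }
}
```

For a series of length 100 this gives a lag of 4. To compare the two programs like for like, pass the same lag explicitly to both.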

Hope this helps.