Re: Different ADF test results between SAS and Python

SASCom1 · Posted 04-01-2026 07:57 PM

Algorithm differences between software packages can produce slightly different p-values for the ADF test. The choice of model and the number of augmenting lags matter far more, so make sure both are appropriate, and identical in both packages, before you compare results.
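For reference, the ADF regression with an intercept and p augmenting lags, which is the form the PROC REG step later in this post fits, can be written as follows. The symbols are notation assumed here for illustration, not taken from the paper or from either package's documentation.

```latex
\Delta y_t = \alpha + \rho\, y_{t-1} + \sum_{j=1}^{p} \phi_j\, \Delta y_{t-j} + e_t
```

The tau statistic is the t ratio for the hypothesis that rho = 0, and p is the number of augmenting lags. "Choosing the model" means deciding which deterministic terms (none, an intercept, or an intercept plus trend) appear on the right-hand side.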

I experimented a bit with the data you provided. To choose the number of augmenting lags, I followed the method described in Section 7 of Dr. Dickey's paper:

https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/192-30.pdf

The code below regresses the differenced series on the lagged level and four lagged differences:

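data b;
   set yourdata;
   lagy = lag(y);       /* lagged level y(t-1) */
   det  = dif(y);       /* first difference y(t) - y(t-1), the dependent variable */
   det1 = lag(det);     /* lagged differences used as augmenting terms */
   det2 = lag(det1);
   det3 = lag(det2);
   det4 = lag(det3);
run;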

proc reg data = b outest = d2_lag2;
   model det = lagy det1 det2 det3 det4;
   /* joint F test that all four lagged differences have zero coefficients */
   test det1 = 0, det2 = 0, det3 = 0, det4 = 0;
run;
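For readers following along in Python, here is a minimal sketch of the same regression and joint F test using NumPy only. The function name `adf_regression` and the synthetic random-walk series are assumptions for illustration; the original data is not reproduced here. The series length of 44 is chosen only so that the residual degrees of freedom come out at 33, matching the Denominator DF in the F test output further down.

```python
import numpy as np

def adf_regression(y, p):
    """OLS of dy_t on [1, y_{t-1}, dy_{t-1}, ..., dy_{t-p}], with p >= 1."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                       # dy[i] = y[i+1] - y[i]
    target = dy[p:]                       # rows where all p lagged differences exist
    lag_level = y[p:-1]                   # level just before each difference (lagy)
    lagged = [dy[p - j:len(dy) - j] for j in range(1, p + 1)]   # det1..detp
    X = np.column_stack([np.ones_like(target), lag_level] + lagged)

    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    df_resid = len(target) - X.shape[1]
    rss_full = float(resid @ resid)
    se = np.sqrt(np.diag(rss_full / df_resid * np.linalg.inv(X.T @ X)))

    # Restricted model drops the lagged differences (intercept and lagy only)
    Xr = X[:, :2]
    beta_r, *_ = np.linalg.lstsq(Xr, target, rcond=None)
    resid_r = target - Xr @ beta_r
    rss_restricted = float(resid_r @ resid_r)
    f_stat = ((rss_restricted - rss_full) / p) / (rss_full / df_resid)

    return {"beta": beta, "se": se, "tau": beta[1] / se[1],
            "F": f_stat, "df_resid": df_resid}

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=44))        # synthetic random walk, NOT the original data
result = adf_regression(y, p=4)
print(result["tau"], result["F"], result["df_resid"])
```

The t ratio on the lagged level is the tau statistic. It has to be compared against Dickey-Fuller critical values, not the ordinary t distribution, so the Pr > |t| that a plain regression reports for lagy is not the unit-root p-value.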

The parameter estimates:

Parameter Estimates
Variable	DF	Parameter Estimate	Standard Error	t Value	Pr > |t|
Intercept	1	1.01485	1.11803	0.91	0.3706
lagy	1	-0.25648	0.16629	-1.54	0.1325
det1	1	0.13969	0.16457	0.85	0.4021
det2	1	-0.09814	0.16308	-0.60	0.5514
det3	1	0.07849	0.15074	0.52	0.6061
det4	1	-0.50851	0.15089	-3.37	0.0019

Notice the strong significance of the det4 coefficient (t = -3.37, p = 0.0019).

The F test results:

Test 1 Results for Dependent Variable det
Source	DF	Mean Square	F Value	Pr > F
Numerator	4	120.04426	3.97	0.0097
Denominator	33	30.21161

The F test strongly rejects the null hypothesis that all four lagged differences have zero coefficients. I also ran additional F tests on subsets of det1 through det4, and the results all suggest that 4 augmenting lags may be needed.
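As a quick sanity check on the two output tables above, the printed t values should equal estimate divided by standard error, and the F value should equal the ratio of the two mean squares. The short sketch below redoes that arithmetic with the numbers copied from the output; the variable names are just for illustration.

```python
# (estimate, standard error, printed t value) copied from the Parameter Estimates table
rows = {
    "Intercept": (1.01485, 1.11803, 0.91),
    "lagy":      (-0.25648, 0.16629, -1.54),
    "det1":      (0.13969, 0.16457, 0.85),
    "det2":      (-0.09814, 0.16308, -0.60),
    "det3":      (0.07849, 0.15074, 0.52),
    "det4":      (-0.50851, 0.15089, -3.37),
}

# Recompute each t ratio and measure how far it is from the printed value
t_errors = {name: abs(est / se - t) for name, (est, se, t) in rows.items()}

# F = numerator mean square / denominator mean square
f_value = 120.04426 / 30.21161

# Denominator DF plus the 6 estimated parameters gives the rows used in the regression
rows_used = 33 + len(rows)

print(t_errors)
print(round(f_value, 2), rows_used)
```

Every recomputed t ratio agrees with the table to two decimals, and the ratio of mean squares reproduces the printed F of 3.97. The regression uses 39 rows, because one observation is lost to differencing and four more to the lagged differences.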

I also tried the autolag = 'AIC' option in your Python code. It selected 4 lags as well, which confirms the choice above.

So, if we use 4 augmenting lags instead of the 0 in your original code and then compare the ADF test results, the tau statistic is -1.54 (the t value on lagy in the regression above), with a p-value of 0.5021 in PROC ARIMA and 0.51246 in Python. In both packages the test is nowhere near rejecting the unit-root null, so the data give no evidence of stationarity. The slight difference in p-values between the packages is of no practical importance here.

I hope this helps.
