Hi,
I'm a new SAS user. I have been trying to replicate the tests I conducted in Python (e.g. normality, the Dickey-Fuller test) in SAS. However, I have noticed differences between SAS and Python in both the ADF statistic and the p-value (Pr < Tau). For clarity, I used PROC ARIMA for the ADF test in SAS and the adfuller function (statsmodels) in Python. May I know why the results differ?
I'd appreciate it if anyone can respond to this.
Thank you.
Generic question gets a generic response: differences arise from differing implementation details and often from the machines the code executes on. Things like the number of decimal places maintained internally for computations can easily affect results, even when the same algorithm is used. And just because a statistical test has the same name in different packages, there is no guarantee that the algorithm was programmed the same way at every step.
HOW much difference might be the question that you need an answer to. So perhaps sharing the results in question is a place to start.
Also, some programs report results using different defaults. For instance, in a logistic regression with a true/false outcome, one program may default to modeling the "true" value and the other the "false". So the same data would tend to report something that looks like the complement of the other (70% true vs. 30% false, for example).
For serious details you might need to include:
Your data
Your code for both approaches
The output
The research question you want answered, so we can validate that the SAS approach (at least) is using appropriate options.
Some SAS procedures have a section in the online help called Details that may include some of the computational details.
Thank you for your response.
Further details on this:
1. Data (as per Excel attachment)
2. You may find the code as below:
a. SAS code:
/* Stationarity test: ADF with zero augmenting lags (= Dickey-Fuller) */
proc arima data=WORK.QUARTERLY;
   identify var=&&var&i stationarity=(adf=(0)); /* &&var&i resolves inside a surrounding macro loop */
   ods output stationaritytests=stationary_data;
run;
b. Python code:
from statsmodels.tsa.stattools import adfuller  # import needed for the calls below

adf_result_var1 = adfuller(raw_ln_hfa[combo[0]], maxlag=0, regression='c', autolag=None)
adf_result_var2 = adfuller(raw_ln_hfa[combo[1]], maxlag=0, regression='c', autolag=None)
adf_var1 = adf_result_var1[0]  # ADF statistic
adf_var2 = adf_result_var2[0]
adf_pval_var1 = adf_result_var1[1]  # p-value
adf_pval_var2 = adf_result_var2[1]
3. Output:
a. SAS p-value (from pr < tau): 0.51376756197933
b. Python p-value: 0.521899648105788
4. Research question: determine whether the data is stationary, i.e. reject or fail to reject the null hypothesis of a unit root.
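For what it's worth, the zero-lag ADF regression in the 'c' (constant-only) case is simple enough to reproduce with plain NumPy, which can help pin down where the two tau statistics come from. This is a sketch under my own naming (df_tau is not a function from either package); its result should match adfuller(y, maxlag=0, regression='c', autolag=None)[0] up to numerical precision:

```python
import numpy as np

def df_tau(y):
    """Dickey-Fuller tau statistic, constant-only ('c') case --
    identical to an ADF test with zero augmenting lags:
        dy_t = alpha + gamma * y_{t-1} + e_t,   tau = gamma_hat / se(gamma_hat)
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                                   # first differences
    X = np.column_stack([np.ones(len(dy)), y[:-1]])   # intercept + lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)     # OLS fit
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - 2)                # OLS error variance
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])   # std. error of gamma_hat
    return beta[1] / se                               # t-ratio on gamma
```

If df_tau agrees with both packages (it should, since the test statistic itself is just an OLS t-ratio), then the discrepancy you see is confined to how each package maps tau to a p-value.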
You used the adfuller function in Python, right?
Link to the Python documentation:
https://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.adfuller.html
I think it's best you put your Python code AND your SAS code in the reply.
BR, Koen
adf_result_var1 = adfuller(raw_ln_hfa[combo[0]], maxlag=0, regression='c', autolag=None)
adf_result_var2 = adfuller(raw_ln_hfa[combo[1]], maxlag=0, regression='c', autolag=None)
adf_var1 = adf_result_var1[0]  # ADF statistic
adf_var2 = adf_result_var2[0]
adf_pval_var1 = adf_result_var1[1]  # p-value
adf_pval_var2 = adf_result_var2[1]
/*stationarity test*/
proc arima data=WORK.QUARTERLY;
identify var= &&var&i stationarity=(adf=(0));
ods output stationaritytests=stationary_data;
run;
Hi,
These are my codes for your reference.
An Augmented Dickey-Fuller (ADF) test with zero augmenting lags is equivalent to the original Dickey-Fuller (DF) test.
The ADF test performed in PROC ARIMA is based on the description of this test in Hamilton (1994).
Hamilton, J. D. (1994). Time Series Analysis. Princeton, NJ: Princeton University Press.
Note there are different ways in SAS to perform an ADF test (e.g. PROC ARIMA, PROC AUTOREG, and the PROBDF function).
I assume they all provide the same p-values.
No idea why you notice a difference between SAS and Python in your p-values for DF.
a. SAS p-value (from pr < tau): 0.51376756197933
b. Python p-value: 0.521899648105788
Both p-values are very close to each other, and there is no doubt about the conclusion (the same conclusion for both), but they may still be far enough apart to be odd.
It could be due to all sorts of things. Please note that the p-values are derived from a huge number of simulation replications.
Maybe @SASCom1 can help further?
If you want to open a Technical Support ticket,
here is a link for your convenience: https://support.sas.com/en/technical-support.html#contact
(SAS Technical Support)
BR, Koen
Sorry for my late response.
Here is the documentation on the computation of ADF test p values in SAS:
SAS Help Center: PROBDF Function for Dickey-Fuller Tests
**********************************************************************************************************
The PROBDF function is calculated from approximating functions fit to empirical quantiles that are produced by a Monte Carlo simulation that employs 10^8 replications for each simulation. Separate simulations were performed for selected values of n and for d = 1, 2, 4, 6, 12
(where n and d are the second and third arguments to the PROBDF function).
The maximum error of the PROBDF function is approximately ±10^-3 for d in the set {1, 2, 4, 6, 12} and can be slightly larger for other d values. Because the number of simulation replications used to produce the PROBDF function is much greater than the 60,000 replications used by Dickey and colleagues (Dickey and Fuller 1979; Dickey, Hasza, and Fuller 1984), the PROBDF function can be expected to produce results that are substantially more accurate than the critical values reported in those papers.
**************************************************************************************************************
Different software packages may implement different algorithms, and even with similar algorithms, there can still be minor differences in the implementation details, so the computed p values may not be identical, as in the case you observe.
I hope this helps.
There are three kinds of tests under the ADF tests: rho test, tau test, and F test.
For more information about test statistics under the ADF tests, see the section
SAS Help Center: Stationarity Tests
And here's a paper by David A. Dickey himself.
>> SAS Global Forum 2016 proceedings
>> Paper 7080-2016
>> What’s the Difference?
>> David A. Dickey, NC State University
>> https://support.sas.com/resources/papers/proceedings16/7080-2016.pdf
Prof. Dickey claims on p.14: "The taus and their associated pvalues are the most commonly used of these tests."
BR, Koen
Thank you for your response. Since we cannot compare p-values directly between the two platforms, how do we at least determine whether the pass/fail stationarity conclusion is consistent between them? Is there a way to extract critical values from SAS, so I can determine whether the tau statistic passes or fails at the 1%, 5%, and 10% thresholds? And can I extract the coefficients SAS uses to approximate the p-value?
SAS only outputs the rho, tau, and F test statistics together with their corresponding p values; it does not print the 1%, 5%, or 10% critical values. You make the conclusion to reject or not reject the null using the computed p values, comparing them with your desired significance level.
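As a workaround, you can compare tau against published critical values yourself. A minimal sketch, using the standard large-sample (asymptotic) critical values for the constant-only ('c') tau test as tabulated in textbooks such as Hamilton (1994); the dictionary and function names here are mine, and finite-sample critical values differ slightly from these asymptotic ones:

```python
# Asymptotic critical values for the DF tau test, constant-only case,
# from standard tables (e.g. Hamilton 1994). Keys are significance levels.
DF_CRIT_C = {0.01: -3.43, 0.05: -2.86, 0.10: -2.57}

def df_reject(tau, alpha=0.05):
    """Reject the unit-root null when tau falls below the critical value."""
    return tau < DF_CRIT_C[alpha]
```

For example, a tau of -2.88 would reject at the 5% and 10% levels but not at the 1% level under these asymptotic values.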
Noted on this response, but my issue arises when the p-value results lead to different conclusions (e.g. Python rejects the null but SAS does not). How can I reconcile these differences?
I hope it happens extremely rarely, but of course – at some point – such situations (different conclusions) will arise.
You could say that you’re in a sort of meta-analysis scenario (a bit of a stretch, I agree) and you can opt for a weighted p-value. With equal weights, you get an average p-value.
It’s a tricky topic, ... but you’ll have to choose a reconciliation scenario if you continue to work with both SAS and Python to investigate stationarity.
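The equal-weight average can be sketched in a couple of lines; note this is a crude reconciliation heuristic rather than a formal meta-analysis method, and the p-values below are placeholders, not your actual results:

```python
def weighted_p(p_values, weights=None):
    """Weighted combination of p-values; equal weights give a plain average."""
    if weights is None:
        weights = [1.0 / len(p_values)] * len(p_values)  # equal weights
    return sum(w * p for w, p in zip(weights, p_values))

# Hypothetical SAS and Python p-values straddling the 0.05 threshold:
combined = round(weighted_p([0.047, 0.055]), 3)  # -> 0.051
```

With equal weights and these placeholder values, the combined p-value sits just above 0.05, so the averaged decision would be "do not reject".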
Good luck,
Koen
What p values did you get from the two software packages for this case, when they lead to different conclusions? Are they using the same codes as you provided earlier in this thread? Can you provide more details on the output and the example data?
Yes, I used the same codes as shown earlier in the thread. I have attached the data and added the output for your reference.
Results:
Tau (Python): -2.88491230787933
Tau (SAS): -2.88
The p-values (compared against the 0.05 significance level):
Python: 0.0471308321494495
SAS: 0.0551742457504444
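For what it's worth, even a crude piecewise-linear interpolation between the published asymptotic critical values gives a p-value between the two reported ones, which suggests the discrepancy is simply two different approximation schemes landing on opposite sides of the 5% boundary. This is only an illustration: the true p-value curve is nonlinear, and neither SAS nor statsmodels computes p-values this way.

```python
# Standard asymptotic critical values for the constant-only DF tau test
# (e.g. Hamilton 1994), as (significance level, critical value) pairs.
CRIT = [(0.01, -3.43), (0.05, -2.86), (0.10, -2.57)]

def rough_p(tau):
    """Piecewise-linear interpolation of the p-value from tabled quantiles.
    Crude illustration only -- not how SAS or statsmodels compute p-values."""
    for (a_lo, c_lo), (a_hi, c_hi) in zip(CRIT, CRIT[1:]):
        if c_lo <= tau <= c_hi:
            return a_lo + (tau - c_lo) / (c_hi - c_lo) * (a_hi - a_lo)
    return None  # tau outside the tabled range

p = rough_p(-2.88)  # roughly 0.049, between the SAS and Python p-values
```

Since a tau near -2.86 sits almost exactly on the 5% critical value, tiny differences in the p-value approximation are enough to flip the reject/do-not-reject decision at alpha = 0.05.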