Hi,
I run the following PROC AUTOREG step:
proc autoreg data=have maxiter=1000 outest=status;
  by id;
  model var1 = var2 var3 var4 / garch=(p=1, q=1, type=exp);
  ods output fitsummary=summary_1_1;
  output out=want_1_1;
run;
The following is part of the summary_1_1 table:
ID | Model | Label1 | cValue1 | nValue1 | Label2 | cValue2 | nValue2 |
10001 | Model1 | SSE | 4.7669483 | 4.766948 | DFE | 6696 | 6696 |
10001 | Model1 | MSE | 0.0007119 | 0.000712 | Root MSE | 0.02668 | 0.026682 |
10001 | Model1 | SBC | -29513.633 | -29514 | AIC | -29540.872 | -29541 |
10001 | Model1 | MAE | 0.01695687 | 0.016957 | AICC | -29540.866 | -29541 |
10001 | Model1 | MAPE | 605.877731 | 605.877731 | HQC | -29531.465 | -29531 |
10001 | Model1 | Durbin-Watson | 2.546 | 2.545994 | Total R-Square | 0.0088 | 0.008755 |
10001 | Model1 | SSE | 4.77694598 | 4.776946 | Observations | 6700 | 6700 |
10001 | Model1 | MSE | 0.000713 | 0.000713 | Uncond Var | . | . |
10001 | Model1 | Log Likelihood | 15532.8915 | 15533 | Total R-Square | 0.0067 | 0.006676 |
10001 | Model1 | SBC | -30995.304 | -30995 | AIC | -31049.783 | -31050 |
10001 | Model1 | MAE | 0.01687189 | 0.016872 | AICC | -31049.761 | -31050 |
10001 | Model1 | MAPE | 371.798568 | 371.798568 | HQC | -31030.969 | -31031 |
10001 | Model1 | | | 0 | Normality Test | 3677982.25 | 3677982 |
10001 | Model1 | | | 0 | Pr > ChiSq | <.0001 | 1.00E-08 |
In this table, two AIC values are reported for a single ID. What is the difference between these two AIC values?
Thanks.
Are you sure you are seeing what you describe? You don't show the ID variable, nor do you show all the observations.
Print your output data set like this and see if there are really two AIC values for one ID.
proc print; by id; id id; run;
Unless you are asking about AIC versus AICC. If so, see page 382 of the documentation.
https://support.sas.com/documentation/onlinedoc/ets/142/autoreg.pdf
Yes, this is the output I get for one specific ID. The file itself is too big to post here, as I have more than 200K IDs. This table is for only one ID.
I just find it hard to believe that you are really seeing what you say you are seeing. Since you omit the ID from the display, it makes it hard to say what is going on. How about instead of using a BY statement, use a WHERE statement that selects just the ID of interest. If you still think you are getting two values for one ID, post the WHERE clause, info about the ID variable (numeric, character, format, length, etc.) and the results for that one ID.
Sorry, I should have said to add the WHERE statement in addition to the BY statement not instead of.
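As a rough sketch of that suggestion, the step below reruns the original model for a single BY group and prints what ODS captures for it. The value 10001 and the assumption that id is numeric are taken from the table in the question; the data set name summary_one_id is made up for illustration.

```sas
/* Rerun the model for one ID only (assumes id is numeric). */
proc autoreg data=have maxiter=1000;
  where id = 10001;
  by id;
  model var1 = var2 var3 var4 / garch=(p=1, q=1, type=exp);
  ods output fitsummary=summary_one_id;  /* hypothetical data set name */
run;

/* Show every row captured for that ID, grouped by ID. */
proc print data=summary_one_id;
  by id;
  id id;
run;
```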
Thank you for your reply. But I cannot see how this observation is related to using a BY or WHERE statement. To make it clearer, I updated the table to include the ID. As I said, I estimate the EGARCH model for more than 200K IDs, so this table shows the fitsummary output for only one ID. I have the same problem with the other IDs as well: the fitsummary table provides two values for each reported label for each ID.
I was just trying to isolate the problem.
Are the IDs integers? Are you sure? If they are the result of some calculation they might not be. Is there a format that you use when you print that you did not use for the analysis? Or does PROC PRINT choose a format when you print that masks the fact that the IDs are different? I am guessing that the problem is something like that.
If you can manufacture a WHERE clause that selects just the right group(s), something like this might help.
proc print;
var id;
where .. substitute where clause ..;
format ID hex16.;
run;
It's not a PROC PRINT problem. The table I copied here is part of the table I get from fitsummary; I exported it to Excel so that I could copy part of it here.
I am not saying that it is a PROC PRINT problem. I am saying that given the information that you provide, I suspect that your ID values are such that they appear different to BY processing (hence two tables) but are similar enough that they appear the same when you print the output data set. It is not a problem--just the facts of life of floating point arithmetic. The first PROC PRINT below suggests that I and Y are identical. The second shows that they are not.
data x;
  do i = 1 to 10;
    do j = 1 to 3;
      y = 1e-12 * j + i;
      output;
    end;
  end;
run;
proc print; run;
proc print; format _numeric_ hex16.; run;
I'm afraid that without seeing more information about your ID variable, I can't help.
https://support.sas.com/documentation/onlinedoc/ets/142/autoreg.pdf
OK. I am going to take one more shot then. The page 383 flow chart lists a series of steps. Does that explain it? It does look like it can hit the goodness of fit statistics more than once. If I were the developer of that PROC, and if it were possible for a table to appear more than once in an iteration, I would have ensured that there was a variable in the output data set that distinguished between the tables. But I am not.
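If the table really is written more than once per BY group, a distinguishing variable can be added after the fact. The sketch below assumes, based on the rows posted in the question, that each copy of the table starts with a row whose Label1 is SSE and that the data set is in ID order; the names summary_tagged and block are hypothetical.

```sas
/* Number each copy of the fit summary within an ID.               */
/* Assumption: every copy begins with the row where Label1 = 'SSE'. */
data summary_tagged;
  set summary_1_1;
  by id;
  if first.id then block = 0;
  if strip(Label1) = 'SSE' then block + 1;  /* sum statement retains block */
run;

/* Compare the AIC rows from each copy side by side. */
proc print data=summary_tagged;
  where Label2 = 'AIC';
  var id block Label2 cValue2 nValue2;
run;
```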
My advice about using a WHERE statement or clause that isolates the problem still might be good advice. Perhaps if you use a WHERE clause and looked at both the output data set and the printed output, the answer might be obvious.
Since no one else is chiming in with an answer, if this does not help, please contact technical support. They are the pros. Best of luck with this!
Thank you so much for your comments and suggestions.
I contacted SAS Technical Support and they explained why I get two sets of results. I am copying their response here in case someone faces the same issue.
"The two sets of fitsummary results are obtained from OLS estimation results and GARCH model estimation results respectively. The results from OLS and from GARCH estimation share the same ODS table name but different path names. Since you did not specify their own path in the ODS table name, the results from both OLS and GARCH model estimation are written to the same output data set."
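To see the two path names that technical support mentions, ODS TRACE can be switched on for a single ID. This is only a sketch: the WHERE value is taken from the table in the question and assumes id is numeric, and the actual paths are whatever the log reports, so none are spelled out here.

```sas
ods trace on;   /* write the name, label and path of each output object to the log */
proc autoreg data=have maxiter=1000;
  where id = 10001;
  by id;
  model var1 = var2 var3 var4 / garch=(p=1, q=1, type=exp);
run;
ods trace off;
```

The log should list FitSummary twice, each with its own Path. Using one of those full paths in place of the bare name fitsummary in the ODS OUTPUT statement should restrict the output data set to that set of estimates.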