DavyJones
Obsidian | Level 7

Hi, 

I run the following proc autoreg:

 

proc autoreg data=have maxiter=1000 outest=status;
   by id;
   model var1 = var2 var3 var4 / garch=(p=1, q=1, type=exp);
   ods output fitsummary=summary_1_1;
   output out=want_1_1;
run;

 

Following is part of summary_1_1 table:

 

ID     Model   Label1          cValue1      nValue1     Label2          cValue2      nValue2
10001  Model1  SSE             4.7669483    4.766948    DFE             6696         6696
10001  Model1  MSE             0.0007119    0.000712    Root MSE        0.02668      0.026682
10001  Model1  SBC             -29513.633   -29514      AIC             -29540.872   -29541
10001  Model1  MAE             0.01695687   0.016957    AICC            -29540.866   -29541
10001  Model1  MAPE            605.877731   605.877731  HQC             -29531.465   -29531
10001  Model1  Durbin-Watson   2.546        2.545994    Total R-Square  0.0088       0.008755
10001  Model1  SSE             4.77694598   4.776946    Observations    6700         6700
10001  Model1  MSE             0.000713     0.000713    Uncond Var      .            .
10001  Model1  Log Likelihood  15532.8915   15533       Total R-Square  0.0067       0.006676
10001  Model1  SBC             -30995.304   -30995      AIC             -31049.783   -31050
10001  Model1  MAE             0.01687189   0.016872    AICC            -31049.761   -31050
10001  Model1  MAPE            371.798568   371.798568  HQC             -31030.969   -31031
10001  Model1                               0           Normality Test  3677982.25   3677982
10001  Model1                               0           Pr > ChiSq      <.0001       1.00E-08

 

In this table, two AIC values are reported for a single id. What is the difference between these two AIC values?

 

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
DavyJones
Obsidian | Level 7

Thank you so much for your comments and suggestions.

 

I contacted SAS Technical Support, and they explained why I get two sets of results. I am copying their response here in case someone else faces the same issue.

 

"The two sets of fitsummary results are obtained from OLS estimation results and GARCH model estimation results respectively. The results from OLS and from GARCH estimation share the same ODS table name but different path names. Since you did not specify their own path in the ODS table name, the results from both OLS and GARCH model estimation are written to the same output data set."
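For anyone who already has the combined data set, here is a minimal sketch of one way to tell the two sets of rows apart after the fact. It is not part of the Technical Support reply. It assumes, as in the table posted above, that each block of statistics starts with an SSE row and that the OLS block is written before the GARCH block within each id. The names summary_flagged, block, and stage are made up for illustration.

```sas
/* Sketch: tag each FitSummary row with the estimation stage it came from. */
/* Assumptions: every block starts with an SSE row, and within each id the */
/* OLS block precedes the GARCH block, as in the posted table.             */
data summary_flagged;
   set summary_1_1;
   by id;
   length stage $5;
   if first.id then block = 0;               /* restart the counter for each id */
   if strip(Label1) = 'SSE' then block + 1;  /* an SSE row opens a new block    */
   if block = 1 then stage = 'OLS';
   else stage = 'GARCH';
run;

/* List the AIC rows side by side with their stage. */
proc print data=summary_flagged;
   where strip(Label2) = 'AIC';
   var id stage cValue2 nValue2;
run;
```

If those assumptions hold, each id should show one AIC row tagged OLS and one tagged GARCH. It is worth checking a few ids by hand before relying on the flag.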


11 REPLIES
WarrenKuhfeld
Ammonite | Level 13

Are you sure you are seeing what you describe?  You don't show the ID variable, nor do you show all the observations.

Print your output data set like this and see if there are really two AIC values for one ID.

 

proc print; by id; id id; run;

 

Unless you are asking about AIC versus AICC.  If so, see page 382 of the documentation.

https://support.sas.com/documentation/onlinedoc/ets/142/autoreg.pdf

DavyJones
Obsidian | Level 7

Yes, this is the output I get for one specific id. The file itself is too big to post here, as I have more than 200K IDs. This table is for one ID only.

WarrenKuhfeld
Ammonite | Level 13

I just find it hard to believe that you are really seeing what you say you are seeing.  Since you omit the ID from the display, it makes it hard to say what is going on.  How about instead of using a BY statement, use a WHERE statement that selects just the ID of interest.  If you still think you are getting two values for one ID, post the WHERE clause, info about the ID variable (numeric, character, format, length, etc.) and the results for that one ID.

WarrenKuhfeld
Ammonite | Level 13

Sorry, I should have said to add the WHERE statement in addition to the BY statement not instead of.
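A minimal sketch of that suggestion, reusing the model from the original post. The id value 10001 is taken from the posted table and is assumed to be numeric, and summary_one_id is a made-up data set name. The ODS TRACE statements are an addition not mentioned above: they write the name, label, and path of every output object to the log, which can help show whether one table name is produced more than once.

```sas
/* Sketch: rerun the model for a single id and trace the ODS tables. */
ods trace on;                        /* log the name and path of each table */

proc autoreg data=have maxiter=1000;
   where id = 10001;                 /* assumes id is numeric               */
   by id;
   model var1 = var2 var3 var4 / garch=(p=1, q=1, type=exp);
   ods output fitsummary=summary_one_id;
run;

ods trace off;

proc print data=summary_one_id;
run;
```

Comparing the printed output, the log, and summary_one_id for that one id should make it easier to see where each row in the data set comes from.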

DavyJones
Obsidian | Level 7

Thank you for your reply. But I cannot see how this observation is related to using a "by" or "where" statement. To make it clearer, I updated the table to include the id. As I said, I estimate the EGARCH model for more than 200K ids, and this table shows the "fitsummary" output for only one ID. I have the same problem with other IDs as well: the fitsummary table provides two values for each reported label for each specific ID.

WarrenKuhfeld
Ammonite | Level 13

I was just trying to isolate the problem.

 

Are the IDs integers?  Are you sure? If they are the result of some calculation they might not be.  Is there a format that you use when you print that you did not use for the analysis?  Or does PROC PRINT choose a format when you print that masks the fact that the IDs are different?  I am guessing that the problem is something like that.

 

If you can manufacture a WHERE clause that selects just the right group(s), something like this might help.

proc print;
   var id;
   where .. substitute where clause ..;
   format ID hex16.;
run;

 

DavyJones
Obsidian | Level 7

It's not a proc print problem. The table I copied here is part of the table I get from fitsummary; I exported it to Excel so that I could copy part of it here.

WarrenKuhfeld
Ammonite | Level 13

I am not saying that it is a PROC PRINT problem.  I am saying that given the information that you provide, I suspect that your ID values are such that they appear different to BY processing (hence two tables) but are similar enough that they appear the same when you print the output data set.  It is not a problem--just the facts of life of floating point arithmetic.  The first PROC PRINT below suggests that I and Y are identical.  The second shows that they are not.

 

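data x;
   do i = 1 to 10;
      do j = 1 to 3;
         y = 1e-12 * j + i;   /* y differs from i only by a tiny amount */
         output;
      end;
   end;
run;

proc print; run;                            /* default format: i and y look the same */
proc print; format _numeric_ hex16.; run;   /* hex16. shows the stored values differ */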

I'm afraid that without seeing more information about your ID variable, I can't help.

DavyJones
Obsidian | Level 7
Thanks for your reply. The ID variable is an integer assigned by the data provider. It is constant over time and specific to each individual. It is commonly used in this literature, including in my own studies, so I'm pretty confident that the ID is not the reason for this issue. Moreover, the p-value and normality test in the last two lines are reported only once, which shows that the other estimates come from running the EGARCH model on only one individual.
WarrenKuhfeld
Ammonite | Level 13

https://support.sas.com/documentation/onlinedoc/ets/142/autoreg.pdf

 

OK. I am going to take one more shot then. The page 383 flow chart lists a series of steps.  Does that explain it?  It does look like it can hit the goodness of fit statistics more than once.  If I were the developer of that PROC, and if it were possible for a table to appear more than once in an iteration, I would have ensured that there was a variable in the output data set that distinguished between the tables.  But I am not.


My advice about using a WHERE statement or clause that isolates the problem still might be good advice.  Perhaps if you use a WHERE clause and looked at both the output data set and the printed output, the answer might be obvious.

 

Since no one else is chiming in with an answer, if this does not help, please contact technical support.  They are the pros.  Best of luck with this!

