ESV
Fluorite | Level 6

I'm calling out to @SteveDenham because I've often gleaned very useful guidance from your posts for analytical questions that I've had and your area of expertise seems relevant to my issue.

(For reference, I'm using SAS 9.4 TS Level 1M5 on an X64_10Pro platform.)

For some time, I have been using BIC for model selection in Proc Mixed (e.g., to compare heterogeneous and homogeneous variance models, and to compare covariance structures in repeated measures analyses). From my understanding, BIC should impose a greater penalty for an increasing number of parameters than AIC or AICC does. In some recent analyses, I realized that using BIC was actually more likely than the other information criteria to lead to selection of more complex models. This behavior appears to be related to a couple of issues with the reported BIC values (in Proc Mixed when using REML, at least). The SAS/STAT 14.3 user's guide indicates that, in agreement with the listed reference (Schwarz 1978, The Annals of Statistics 6(2):461-464), BIC is calculated as:

 

                BIC = -2L + d log n

 

where d represents the number of estimated covariance parameters and n is the sample size

 

However, this is apparently not the formula that is actually used to compute the reported BIC values. From multiple simulations, it appears that what Proc Mixed is actually reporting is:

 

                BIC = -2L + d log n - d

 

which would greatly discount the influence of additional covariance parameters, precisely the opposite of the expected behavior.
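
To make the comparison concrete, here is a minimal DATA step sketch of the kind of check I have been running (the values of m2rl, d, and n below are illustrative placeholders; substitute the ones from your own Fit Statistics and Dimensions tables):

data _null_;
	m2rl = 17.0;    /* -2 Res Log Likelihood from the Fit Statistics table */
	d    = 11;      /* number of estimated covariance parameters           */
	n    = 4;       /* sample size used by BIC                             */
	bic_documented = m2rl + d*log(n);      /* formula in the user's guide  */
	bic_observed   = m2rl + d*log(n) - d;  /* what I appear to be seeing   */
	put bic_documented= bic_observed=;
run;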

 

Is this a bug in the code, or am I missing something here?

 

As a side note, the user's ability to understand BIC, as compared with the other information criteria, is greatly compromised by the differing (and quite complicated) definitions of n that are used. For example, the simulated data set below represents a randomized complete block design experiment with repeated measures on individual subjects. In this situation, n* (as used to determine AICC) is calculated as n - rank(X) and equals 48, whereas n (as used to determine BIC) is the number of levels of the blocking factor (the first, and only, specified random effect), which equals 4. It is difficult to reconcile the use of BIC given these shifting definitions of n.

As above, I'm wondering if this is the intended behavior, as it does not seem to agree with the definition of BIC in the other sources I have read. As things now stand, I am discontinuing the use of BIC in favor of AICC, and now questioning a lot of published data (mine and others'). Any insight would be much appreciated.
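
To make the n bookkeeping concrete for the data set below (a worked example, assuming rank(X) = 16 for the trt|time model, i.e., 4 treatments x 4 times, and 16 subjects x 4 times = 64 observations):

                n* (AICC) = n_obs - rank(X) = 64 - 16 = 48

                n (BIC) = number of levels of Block = 4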

 

Simulated data to exemplify the issues described:

data sim;
input ID Block Trt$ time Resp;
cards;
1	1	A	1	1.726894073
2	1	B	1	2.092813457
3	1	C	1	2.72720316
4	1	D	1	3.414096871
5	2	A	1	1.616575827
6	2	B	1	2.448749843
7	2	C	1	3.197066773
8	2	D	1	4.218032532
9	3	A	1	2.250938172
10	3	B	1	3.167495069
11	3	C	1	3.463748709
12	3	D	1	4.300078354
13	4	A	1	2.800758394
14	4	B	1	3.408282671
15	4	C	1	3.932933885
16	4	D	1	5.291471566
1	1	A	2	1.68963511
2	1	B	2	2.1472738
3	1	C	2	2.573448703
4	1	D	2	3.853775305
5	2	A	2	2.234276687
6	2	B	2	2.554918183
7	2	C	2	2.852388857
8	2	D	2	3.941084149
9	3	A	2	2.728948071
10	3	B	2	3.203471147
11	3	C	2	3.620032518
12	3	D	2	4.17829973
13	4	A	2	2.442563804
14	4	B	2	3.440094189
15	4	C	2	4.049735686
16	4	D	2	4.785880324
1	1	A	3	1.73878971
2	1	B	3	1.61274462
3	1	C	3	2.809355931
4	1	D	3	3.553253027
5	2	A	3	1.549664697
6	2	B	3	2.464931929
7	2	C	3	3.219585446
8	2	D	3	3.958149394
9	3	A	3	2.621761763
10	3	B	3	3.129957787
11	3	C	3	3.79504407
12	3	D	3	4.649424914
13	4	A	3	3.000206689
14	4	B	3	3.469261887
15	4	C	3	3.987422634
16	4	D	3	5.245902679
1	1	A	4	1.491970466
2	1	B	4	1.982509609
3	1	C	4	2.964069871
4	1	D	4	3.164638885
5	2	A	4	1.693782424
6	2	B	4	2.553224482
7	2	C	4	3.172781663
8	2	D	4	4.075483509
9	3	A	4	2.336685246
10	3	B	4	3.002418224
11	3	C	4	3.767810137
12	3	D	4	4.881787515
13	4	A	4	2.880757109
14	4	B	4	3.865221255
15	4	C	4	4.01673442
16	4	D	4	4.823744484
;
*Model 1 using unstructured covariance;
proc mixed data=sim;
	class id block trt time;
	model resp = trt|time /ddfm=kr2;
	random block;
	repeated time/ sub=id type=un;
run;
*Model 2 using variance components covariance structure;
proc mixed data=sim;
	class id block trt time;
	model resp = trt|time /ddfm=kr2;
	random block;
	repeated time/ sub=id type=vc;
run;
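
If it helps anyone reproduce the comparison, the fit statistics from both models can be captured and printed side by side with ODS OUTPUT (a sketch; the data set names fit_un, fit_vc, and fit_both are arbitrary, and the IC option on the PROC MIXED statement requests the extended information-criteria table, which also shows CAIC):

*Capture the Fit Statistics table from each model;
ods output FitStatistics=fit_un;
proc mixed data=sim ic;
	class id block trt time;
	model resp = trt|time /ddfm=kr2;
	random block;
	repeated time/ sub=id type=un;
run;
ods output FitStatistics=fit_vc;
proc mixed data=sim ic;
	class id block trt time;
	model resp = trt|time /ddfm=kr2;
	random block;
	repeated time/ sub=id type=vc;
run;
*Stack and print for a side-by-side comparison;
data fit_both;
	length model $2;
	set fit_un(in=un) fit_vc;
	model = ifc(un, 'UN', 'VC');
run;
proc print data=fit_both noobs;
run;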

 


4 REPLIES
SteveDenham
Jade | Level 19

Hi @ESV, I hope someone on the tech side at SAS picks up on this. I spent a while digging through what I thought I knew about BIC, and it turns out I don't know as much as I hoped. Burnham and Anderson (2002), in their book Model Selection and Multimodel Inference, go through several ICs and conclude (sort of) that AICc does the best job of selecting models. Now, if the algorithm for calculating BIC is incorrect, that may change. Anyhow, AICc is the method I have used for the last couple of decades for frequentist mixed modeling, and DIC for Bayesian mixed modeling (which I am really just getting started with).

 

So in the end, I am not much help here, and I will @ call some others:

@jiltao  @Rick_SAS  @Kathleen  @StatDave 

 

SteveDenham

ESV
Fluorite | Level 6

Thanks, @SteveDenham. As pointed out in my response to @jiltao, I was inadvertently using the published equation for CAIC instead of BIC when trying to replicate the SAS calculation, and I apologize to all for the wild goose chase. I have tended to use BIC (in my frequentist approach) because I have understood (partly from Burnham and Anderson papers) that it is more conservative with respect to inclusion of additional parameters, and my tendency is generally toward parsimony in model selection. However, I believe I will now start emphasizing AICC instead, partly because I don't quite follow the rationale behind defining n based on the number of levels of the first listed random effect, and partly because in my recent experience, use of BIC has been less conservative than AICC with respect to model complexity.

jiltao
SAS Super FREQ (Accepted Solution)

The PROC MIXED documentation below has detailed information on how n and d are determined for various situations --

https://go.documentation.sas.com/doc/en/pgmsascdc/v_014/statug/statug_mixed_syntax01.htm#statug.mixe...

So, n equals the number of effective subjects as displayed in the "Dimensions" table, unless this value equals 1, in which case n equals the number of levels of the first random effect you specify in a RANDOM statement. In your case, n=4 for BIC.

For restricted likelihood estimation, d equals q, the effective number of estimated covariance parameters. In your case, d=11 for the UN model and d=2 for the VC model.
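
For reference, a quick breakdown of those d counts: a 4x4 unstructured covariance matrix has 4(4+1)/2 = 10 parameters, plus 1 variance for the Block random effect, giving d = 11; the VC model has 1 residual variance plus 1 Block variance, giving d = 2.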

The BIC value for model 1 = -2L + d*log(n) = 17.0 + 11*log(4) = 32.2

The BIC value for model 2 = -2L + d*log(n) = 27.1 + 2*log(4) = 29.9

They are consistent with PROC MIXED output.
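
As a quick arithmetic check (a sketch; note that SAS's log() is the natural log):

data _null_;
	bic_un = 17.0 + 11*log(4);   /* 32.25, reported as 32.2 */
	bic_vc = 27.1 +  2*log(4);   /* 29.87, reported as 29.9 */
	put bic_un= 6.2 bic_vc= 6.2;
run;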

As far as the "inconsistency" between AICC and BIC goes -- I do not compare these two fit statistics with each other. I compare the same fit statistic (AICC or BIC) across different models, and I assume that is how these fit statistics are meant to be used.

I am not sure what other sources you have read about BIC. I am personally not aware of a reference where n and d are defined explicitly for models with RANDOM and REPEATED statements for all these fit statistics, but it could just be my ignorance. Our developers did their research and came up with these calculations for various data and modeling situations. If you have good references for different computations, please share them with me; I would appreciate that.

 

Thanks,

Jill

 

ESV
Fluorite | Level 6

Thanks, @jiltao, for clarifying my error (it turns out I was using the equation for CAIC rather than BIC when trying to replicate the SAS calculation). Also, to clarify: I was not suggesting comparing values between AICC and BIC. The point I was trying to make is that, from the resources I've read, BIC is supposed to impose a greater penalty for each added parameter, and thus should be more likely than AICC to select models with fewer parameters. That has not been my experience. However, when trying to understand the difference in model selection with AICC vs. BIC, it is challenging to parse out because of the different definitions of n used for the different information criteria.
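
For anyone else who trips over the same thing: per the same documentation page, CAIC carries exactly one extra unit of penalty per parameter relative to BIC, which accounts for the stray "- d" term I thought I was seeing:

                CAIC = -2L + d (log n + 1) = (-2L + d log n) + d = BIC + d

so comparing hand-computed CAIC values against reported BIC values makes BIC look like it is "missing" a d.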
