edhuang
Obsidian | Level 7

Hi,

 

I'm wondering if someone can help me understand how to interpret the covariance parameters from the code below.  What do the estimates for Variance and AR(1) mean, and how are they related?  My response (adeno_yes) is binary, so I'm having a harder time grasping this.

 

proc glimmix data=newadr_ge200_avg_2012_v2 noclprint method=laplace;
class nMD procyear nMDsex nMDloc age_newcat2 nsex nrace1_newcat2 charlson_newcat2 bmi_newcat2 nsmoke3
	  nprepquality2 endocuff_newcat mean_vol_cat expyear_cat/ref=first;
model adeno_yes(event=last)=procyear 
							age_newcat2 nsex nrace1_newcat2 charlson_newcat2 bmi_newcat2 nsmoke3    /*patient related characteristics*/
							nMDsex nMDloc nprepquality2 endocuff_newcat mean_vol_cat expyear_cat/cl dist=binary link=logit solution oddsratio;
random procyear/sub=nMD type=ar(1) s CL;
covtest/wald;
run;

My covariance parameter output is below.

Covariance Parameter Estimates
Cov Parm   Subject   Estimate   Standard Error   Z Value   Pr Z
Variance   nMD       0.2813     0.04826          5.83      <.0001
AR(1)      nMD       0.6921     0.05996          11.54     <.0001
SteveDenham
Jade | Level 19

In the documentation for the type= option for the RANDOM statement, you'll find the parameterization.  Two parameters are included: sigma squared and rho.  Sigma squared is the variance at each time point (labeled Variance in the table), and rho is the correlation between the random effects at any two adjacent time points (labeled AR(1) in the table).
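
As a sketch of what that parameterization implies, assuming the standard AR(1) form with equally spaced years (the symbol \gamma_{it}, for the random effect of physician i in year t, is my own notation and does not appear in the SAS output):

```latex
\operatorname{Cov}(\gamma_{it}, \gamma_{is}) = \sigma^2 \rho^{|t-s|},
\qquad
\operatorname{Corr}(\gamma_{it}, \gamma_{is}) = \rho^{|t-s|}
```

With the estimates posted above, the effects for the same physician in adjacent years have a correlation of about 0.69, and effects two years apart have a correlation of about 0.6921^2, or roughly 0.48. The correlation keeps shrinking geometrically as the years get further apart.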

 

SteveDenham

edhuang
Obsidian | Level 7
Hi Steve,

Thanks. That's helpful.
jiltao
SAS Super FREQ

AR(1) is the correlation and variance is the variance estimate. They are in the Logit scale. Like @SteveDenham mentioned, the documentation has the parameterizations for different covariance structures and you can find the meanings for the parameters there.
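
To make "on the logit scale" a little more concrete, here is a minimal sketch that converts the posted Variance estimate into a standard deviation in log-odds units and into an odds multiplier. The data set and variable names are made up for illustration, and the interpretation is conditional on the covariates in the model.

```sas
/* Illustration only: names below are hypothetical, values are the
   estimates from the Covariance Parameter Estimates table. */
data covparm_interp;
  sigma2   = 0.2813;          /* "Variance" estimate (logit scale)          */
  rho      = 0.6921;          /* "AR(1)" estimate (a correlation, unitless) */
  sd_logit = sqrt(sigma2);    /* SD of the physician-year effect, log-odds  */
  or_1sd   = exp(sd_logit);   /* odds multiplier for an effect 1 SD above 0 */
run;

proc print data=covparm_interp noobs;
run;
```

Here or_1sd is the factor by which the odds of adeno_yes change for a physician-year whose random effect sits one standard deviation above average, with all covariates held fixed.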

However, your model is unusual to me. You are assuming the G-side random effect itself has the AR(1) structure, which is not common. I suspect that you want to model the R-side (residual) covariance with the AR(1) structure? Then you need to add the RESIDUAL option in the RANDOM statement and take out the METHOD=LAPLACE option in the PROC GLIMMIX statement.
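
A rough way to see the difference between the two placements, written in my own notation (\pi_{it} for the event probability, \gamma for the random effects, A_i for the diagonal matrix of variance functions, \phi for the residual scale) and assuming equally spaced years:

```latex
% G-side (original code): AR(1) on the random effects inside the linear predictor
\operatorname{logit}(\pi_{it}) = x_{it}'\beta + \gamma_{it},
\qquad
\operatorname{Cov}(\gamma_{it}, \gamma_{is}) = \sigma^2 \rho^{|t-s|}

% R-side (RESIDUAL option): AR(1) on the conditional covariance of the responses
\operatorname{Var}[\, y_i \mid \gamma_i \,] = A_i^{1/2} R_i A_i^{1/2},
\qquad
(R_i)_{ts} = \phi \, \rho^{|t-s|}
```

In the first form the correlation lives among latent effects on the logit scale. In the second it is imposed on the responses themselves, which is why it is only available with the linearization (pseudo-likelihood) methods.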

edhuang
Obsidian | Level 7

Hi jiltao,

 

Your comment is interesting to me.

So I would like to model the clustering effect of physicians (labeled as nMD) on patient outcome (adeno_yes), but also account for the correlation they have over time (procyear).

 

Are you suggesting my RANDOM statements should be as follows instead?

One random statement for the random effects (random int/sub=nMD) and another for the repeated measure (random procyear/sub=nMD type=ar(1) residual)

 

proc glimmix data=newadr_ge200_avg_2012_v2;
class nMD procyear nMDsex nMDloc_new age_newcat2 nsex nrace1_newcat2 charlson_newcat2 bmi_newcat2 nsmoke3
	  nprepquality2 endocuff_newcat mean_vol_cat expyear_cat/ref=first;
model adeno_yes(event=last)=procyear age_newcat2 nsex nrace1_newcat2 charlson_newcat2 bmi_newcat2 nsmoke3    
                            nMDsex nMDloc_new nprepquality2 endocuff_newcat mean_vol_cat expyear_cat/cl dist=binary link=logit solution;
random int/sub=nMD;
random procyear/sub=nMD type=ar(1) residual;
run;
SteveDenham
Jade | Level 19

@edhuang , that is what @jiltao was suggesting.  It is in line with a paper by Stroup and Claassen that talks about the linearized method (RSPL) with an R side variance for repeated measures as being usually less biased than the integral methods (Laplace and adaptive quadrature), where the RESIDUAL option is not supported.  What you fit in the original code would be the equivalent of the glme or glmmTMB package in R.

 

SteveDenham

edhuang
Obsidian | Level 7

Hi Steve,

 

Thanks.  So I tried method=rspl (the linearization method) as opposed to the integral method, and I got the warning shown below.  At first, I thought my model might be too complex, so I took out all of my covariates, but the same thing occurred.  Any pointers on how to troubleshoot this?  I will add @jiltao as well for suggestions.

 

NOTE: The GLIMMIX procedure is modeling the probability that adeno_yes='1'.
WARNING: Obtaining minimum variance quadratic unbiased estimates as starting values for the covariance parameters failed.
NOTE: PROCEDURE GLIMMIX used (Total process time):
real time 9.62 seconds
cpu time 9.51 seconds

SteveDenham
Jade | Level 19

Hi @edhuang - This warning is, for me, the most frustrating of the "can't get started" errors.  It is saying that the procedure could not compute its default starting values for the covariance parameters, which it tries to obtain by MIVQUE0.  This means using the PARMS statement to feed in better values.  The question then arises, "Where do I get better values, and how close do they have to be to get things started?"  I think you are lucky in that your integral method converged and gave you values (which might be biased, like most MLE variance estimates).  You could plug those in as starting values.

 

And if that doesn't help, you can grid search for starting values.  @jiltao or @STAT_Kathleen  might have better ideas.

 

SteveDenham

edhuang
Obsidian | Level 7

Hi Steve,

 

Excuse my ignorance, but exactly which parameter values from the Laplace results do I put in the PARMS statement for the RSPL model?

 

SteveDenham
Jade | Level 19

The covariance parameters.  Be sure they are in the correct order.  Take a look at the examples for the PARMS statement in the GLIMMIX documentation.

 

Based on what you started this with, I would try:

 

PARMS (0.2813) (0.6921);

 

SteveDenham

edhuang
Obsidian | Level 7

Hi Steve,

 

I just changed my covariance structure to type=VC and the model appears to converge.  AR(1) may be too complex.

I tried your method, but it appears that I need three parameters and not just two.  In my original model, I estimated only the G-side.  It may be that I need more because the newer model estimates both the G-side and the R-side.

jiltao
SAS Super FREQ

This type of convergence issue is often model/data dependent. You can use the PARMS statement to provide your own starting values. Trial and error might not be a bad idea. Sometimes if your model is not appropriate for your data, convergence can be an issue. If you can send in your data set I will take a look to see if I can provide any suggestions.
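
As a starting point for that trial and error, here is a minimal sketch of a PARMS statement for the model with both a random intercept and an R-side AR(1) structure, which has three covariance parameters. The starting values are guesses, and the order of the three entries is an assumption: it has to match the order of the Cov Parm column in the "Covariance Parameter Estimates" table for your model. Supplying a list of values for a parameter asks the procedure to evaluate a grid of candidate starting points.

```sas
/* Sketch only: starting values are guesses, and the ORDER of the PARMS
   entries is an assumption to be checked against the Cov Parm column
   of the "Covariance Parameter Estimates" table. */
proc glimmix data=newadr_ge200_avg_2012_v2 method=rspl;
  class nMD procyear;
  /* Covariates are dropped here only to keep the sketch short */
  model adeno_yes(event=last) = procyear / dist=binary link=logit solution;
  random int / sub=nMD;                           /* G-side: physician intercept */
  random procyear / sub=nMD type=ar(1) residual;  /* R-side: AR(1) over years    */
  /* Assumed order: intercept variance, AR(1) correlation, residual scale.
     The value lists are crossed to form the grid of starting points. */
  parms (0.1 to 0.5 by 0.2) (0.3 to 0.9 by 0.3) (1);
run;
```

If a particular grid point gets the optimization going, you can then replace the lists with that single set of values.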

jiltao
SAS Super FREQ

Yes, your PROC GLIMMIX code looks reasonable to me.

