Programming the statistical procedures from SAS

Proc Glimmix

Reply
Regular Contributor
Posts: 173

Proc Glimmix

Hi,

I am working on hierarchical models using PROC GLIMMIX. This is my first time working on this proc.

I tried following code but the log returned an error that says "ERROR: Invalid or missing data."

I am not sure if my code is incorrect ot there is anything missing in the dataset. 

Also, what do Insert in place of n in th e=model statement.


proc glimmix data=  xxxxx ;

class hospid female dm dmcx htn_c  aids alcohol ANEMDEF arth race1   ZIPINC_QRTL hosp_location h_contrl hosp_teach

bldloss chf chrnlung coag depress drug hypothy liver lymph lytes mets neuro obese para perivasc psych pulmcirc renlfail tumor

ulcer valve wghtloss cararrhythmia;

model died/n = female  dm dmcx htn_c aids alcohol ANEMDEF arth race1 ZIPINC_QRTL hosp_location h_contrl hosp_teach

bldloss chf chrnlung coag depress drug hypothy liver lymph lytes mets neuro obese para perivasc psych pulmcirc renlfail tumor

ulcer valve wghtloss cararrhythmia / solution;

random intercept / subject=hospid;

run;


Any help will be greatly appreciate!


Thanks

Ashwini

Respected Advisor
Posts: 2,655

Re: Proc Glimmix

Hi Ashwini,

That error indicates that something is missing in the dataset, and if you have the highlighted "n" in the model statement, then it is very likely that is what is missing.

So, if we can answer that question, then it will probably solve some of your problems.  The events/trials syntax that you are using implies that you know how many events (in this case, died, assuming that is a summed value across the many class variables) and how many trials are available.  For this it could be died plus lived, or if it is an intent to treat analysis, it could be number enrolled.  Just be sure that the number reflects the cell "total".

With this many categorical variables, I would watch out for quasi-separation.  You may need to collapse categories or eliminate some variables if the crosstabs reveal zeroes.

Steve Denham

Regular Contributor
Posts: 173

Re: Proc Glimmix

Hi Steve,

Thanks for your response.

I read a little more and ran another code which is as follows:

proc glimmix data=library.nismicathcabg1_3;

class hospid;

model died (event=last) = female  dm dmcx htn_c  aids alcohol ANEMDEF arth race1 ZIPINC_QRTL hosp_location h_contrl hosp_teach bldloss chf chrnlung coag depress drug hypothy liver lymph lytes mets neuro obese para perivasc psych pulmcirc renlfail tumor  ulcer valve wghtloss cararrhythmia / dist=binary link=logit ddfm=bw solution ;

random intercept / subject=hospid solution;

run;



It returned me incomplete output, where at the end it said "Did not converge".

I am attaching the output for your reference to the original post.

I am not sure why the output is incomplete and showing "did not converge".


Appreciate your advice !

Thanks

Ashwini


Respected Advisor
Posts: 2,655

Re: Proc Glimmix

The default number of iterations in GLIMMIX is twenty.  I see that the objective function looks to be leveling out, but that there is still some iterating to do.

Add the following line to your GLIMMIX code:

nloptions maxiter=200;

If it still does not converge, try adding pconv=1e-6 to the PROC GLIMMIX statement.

Steve Denham

Regular Contributor
Posts: 173

Re: Proc Glimmix

Hi Steve,

Thanks again for your advice. i tried both but to no avail. I still get the same output that has "Did not converge" in it.

I also tried using your suggestions in a very parsimonious model but that didn't help either.

Not sure what is wrong with the code...

Respected Advisor
Posts: 2,655

Re: Proc Glimmix

I really don't think anything is wrong with your code.  Could you please share the output, especially the iteration history part?  I can be more helpful if I see how that is behaving.

Steve Denham

Regular Contributor
Posts: 173

Re: Proc Glimmix

These are the codes

proc glimmix data=library.nismicathcabg1_3;

class hospid;

model died (event=last) = female  dm dmcx htn_c  aids alcohol ANEMDEF arth race1 ZIPINC_QRTL hosp_location h_contrl hosp_teach

bldloss chf chrnlung coag depress drug hypothy liver lymph lytes mets neuro obese para perivasc psych pulmcirc renlfail tumor

ulcer valve wghtloss cararrhythmia / dist=binary link=logit ddfm=bw solution ;

random intercept / subject=hospid solution;

nloptions maxiter=200;

run;


proc glimmix data=library.nismicathcabg1_3 pconv=1e-6;

class hospid;

model died (event=last) = female  dm dmcx htn_c  aids alcohol ANEMDEF arth race1 ZIPINC_QRTL hosp_location h_contrl hosp_teach

bldloss chf chrnlung coag depress drug hypothy liver lymph lytes mets neuro obese para perivasc psych pulmcirc renlfail tumor

ulcer valve wghtloss cararrhythmia / dist=binary link=logit ddfm=bw solution ;

random intercept / subject=hospid solution;

run;

Valued Guide
Valued Guide
Posts: 679

Re: Proc Glimmix

You have a very large number of predictor variables in your model. I am guessing that the log likelihood function for your data and model is fairly flat in the region of the maximum value, so that the optimization method is bouncing around and not converging at the actual maximum. I suggest you try a much simpler fixed effect part of your model. Start with just a few predictor variables so that you can obtain convergence. Then add more predictor variables to see if you can find the circumstances where problems develop.

Other comments: GLMMs with a binary conditional distribution (your situation) can give biased results. You should try METHOD=LAPLACE in the procedure statement. You should also try fitting the model without a random effect, just to look at the parameter values (to give you a sense of the magnitude and sign of the coefficient estimates). This would be just for exploratory work (you should not ignore the random effect for the final analysis). You should be able to see that many of the predictors will likely not be significant (due to an overfit).

Respected Advisor
Posts: 2,655

Re: Proc Glimmix

In addition, presenting the iteration history helps me in interpreting/troubleshooting.  Note that from iteration 5 onward, the objective function moves by a nearly fixed quantity (20 and a tiny fraction) and from iteration 13 on the tiny fraction goes away.  This is a sign of an over parameterized model, with a collinearity problem.  Some of these predictors are nearly completely confounded, such that the effective X matrix has only 17 linearly independent columns (I think that's right.  Hopefully, @lvm will correct me if I messed this up), and the 20 redundant columns just keep adding on with each iteration.

So, use subject matter knowledge to boil the independent variables down to a reasonable list.  If this is an exploratory study, you could try PROC GLMSELECT and use the LASSO option to get a handle on the independent variables.  At least run PROC REG and check the collinearity diagnostics.

Steve Denham


Regular Contributor
Posts: 173

Re: Proc Glimmix

Thanks @Ivm and Steve for your useful suggestions! I tried to minimize the number of predictor variables and I ran the following model. And I could get some output which seems right. But it gives me only estimates and p-values. I am attaching the output for your reference. Is there any way I can also get the odds ratio by using proc glimmix?

proc glimmix data=library.data;

class hospid female dm htn_c  ANEMDEF arth race1   ZIPINC_QRTL bldloss chf chrnlung coag depress drug hypothy

liver lymph lytes mets neuro obese para perivasc pulmcirc renlfail  wghtloss cararrhythmia;

model died (event='1') = female  dm htn_c   ANEMDEF arth race1 ZIPINC_QRTL bldloss chf chrnlung coag depress drug hypothy liver lymph lytes mets neuro obese para perivasc  pulmcirc renlfail wghtloss cararrhythmia /dist=binary link=logit ddfm=bw solution;

random intercept / subject=hospid;

run;

Regular Contributor
Posts: 173

Re: Proc Glimmix

Hello Steve,

As discussed in this post, I finally got the right code for Proc Glimmix and I have been using it thus far. Only today when I started using it on much larger dataset (over 250000 records) . It is the same code as I have posted above. The Proc Glimmix worked fine on the 100000 cases. But on the new dataset witha around 250000 cases, the output I am getting is not complete. Here is the code.I am not sure why it is not yielding a complete. Any help is greatly appreciated. I am also attaching the output for your reference.

proc glimmix data=library.nismi091011_gender5 ; *using oddsratio*/;

class hospid gender racenew ZIPINC_QRTL_1  dm_all   cm_htn_c cm_ANEMDEF cm_arth cm_bldloss cm_chf

cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para

cm_perivasc  cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia;

model died (event='1') = age gender  racenew ZIPINC_QRTL_1  dm_all   cm_htn_c cm_ANEMDEF cm_arth

cm_bldloss cm_chf cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para cm_perivasc  cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia /oddsratio dist=binary link=logit ddfm=bwsolution;

random intercept / subject=hospid;

where pcionly=1 and stemi=0;

weight discwt;

title 'Logi Reg in-hosp mortality vs gender in POST-PCI MI patients using "where" option with RACE +income for nonstemi with pcionly';

run;

quit;

Regular Contributor
Posts: 173

Re: Proc Glimmix

Thanks @Ivm and Steve for your useful suggestions! I tried to minimize the number of predictor variables and I ran the following model. And I could get some output which seems right. But it gives me only estimates and p-values. I am attaching the output for your reference. Is there any way I can also get the odds ratio by using proc glimmix?

proc glimmix data=library.data;

class hospid female dm htn_c  ANEMDEF arth race1   ZIPINC_QRTL bldloss chf chrnlung coag depress drug hypothy

liver lymph lytes mets neuro obese para perivasc pulmcirc renlfail  wghtloss cararrhythmia;

model died (event='1') = female  dm htn_c   ANEMDEF arth race1 ZIPINC_QRTL bldloss chf chrnlung coag depress drug hypothy liver lymph lytes mets neuro obese para perivasc  pulmcirc renlfail wghtloss cararrhythmia /dist=binary link=logit ddfm=bw solution;

random intercept / subject=hospid;

run;

Valued Guide
Valued Guide
Posts: 679

Re: Proc Glimmix

Just add the ODDSRATIO option on the model statement. See the documentation for suboptions, but this should give you what you want.

Regular Contributor
Posts: 173

Re: Proc Glimmix

Hello lvm,

As discussed in this post, I finally got the right code for Proc Glimmix and I have been using it thus far. Only today when I started using it on much larger dataset (over 250000 records) . It is the same code as I have posted above. The Proc Glimmix worked fine on the 100000 cases. But on the new dataset witha around 250000 cases, the output I am getting is not complete. Here is the code.I am not sure why it is not yielding a complete. Any help is greatly appreciated. I am also attaching the output for your reference.

proc glimmix data=library.nismi091011_gender5 ; *using oddsratio*/;

class hospid gender racenew ZIPINC_QRTL_1  dm_all   cm_htn_c cm_ANEMDEF cm_arth cm_bldloss cm_chf

cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para

cm_perivasc  cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia;

model died (event='1') = age gender  racenew ZIPINC_QRTL_1  dm_all   cm_htn_c cm_ANEMDEF cm_arth 

cm_bldloss cm_chf cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para cm_perivasc  cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia /oddsratio dist=binary link=logit ddfm=bw solution;

random intercept / subject=hospid;

where pcionly=1 and stemi=0;

weight discwt;

title 'Logi Reg in-hosp mortality vs gender in POST-PCI MI patients using "where" option with RACE +income for nonstemi with pcionly';

run;

quit;

Valued Guide
Valued Guide
Posts: 679

Re: Proc Glimmix

You may simply be running out of memory. What does the Log say?

Ask a Question
Discussion stats
  • 37 replies
  • 2405 views
  • 12 likes
  • 3 in conversation