BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lvm
Rhodochrosite | Level 12 lvm
Rhodochrosite | Level 12

By the way, the output says you have 87,303 observations. You may have a problem with your datafile. It comes back tot he Log: what does it say?

Ashwini_uci
Obsidian | Level 7

The number of observations is right. I am working on a subset (87,303) of the whole sample (250,000) and after weighting this number of sub-sample goes up to almost 400,000.

Ashwini_uci
Obsidian | Level 7

Strange enough, none of the models that worked successfully on the earlier small dataset is working on this new larger dataset.

Ashwini_uci
Obsidian | Level 7

The log is showing me the warning as below:

WARNING: Obtaining minimum variance quadratic unbiased estimates as starting values for the covariance parameters failed.

Never encountered it before and not sure what it means. Do you have any idea what it exactly means and how to fix it?

Thanks for your time and help!

Ashwini

lvm
Rhodochrosite | Level 12 lvm
Rhodochrosite | Level 12

The default parameters for your covariance matrix (a single variance in your case) are not allowable, given your data.This problem can be difficult to solve. If you search for the the exact warning message in Google, you get plenty of hits. You can try specifying the initial variance value with:

     parms (#);

where # is some constant for the initial guess of the variance. Try putting in the estimate you were getting with your smaller sample (before you had trouble). Or try a much smaller number or much larger.

SteveDenham
Jade | Level 19

I still think the model is over-parameterized, so that for this particular dataset, there is a quasi-separation problem.

With 26 class variables, assuming they are all binary, you have ( I think) 2 to the 26th power possible cells in a cross-tab.  That is a little over 67 million cells, and with only 400000 (really only 87000) cases, you are bound to have massive numbers of cells with no observations.  Thus, when GLIMMIX is trying to get started, there isn't a way to estimate the variance due to hospid.

You are going to have to attack this in chunks, looking at smaller numbers of independent variables in multiple fits of the data.

Steve Denham

lvm
Rhodochrosite | Level 12 lvm
Rhodochrosite | Level 12

I am pretty sure that Steve is right, your model is overparameterized. I suggest you start with a simple fixed-effects part of the model. You could have just a few predictor variables. Then add terms in steps to see when things blow up.

Ashwini_uci
Obsidian | Level 7

Thank you Steve and @lvm for your time and ideas abotu this particular problem.

I have already tried to run the model with bare minimum predictors and it just won't give me a complete output. And these are the models that worked just fine with the smaller number of cases. This is actaully the NIS-HCUP dataset. And now I have merged the 3 years worth data to get some valid results and I thought it would work fine with a large sample size. But unfortunately I am running into this particular warning. I shall keep trying..

Thanks again!

Ashwini

SteveDenham
Jade | Level 19

So it works with all of those independent variables on a smaller dataset?  That is interesting.

Consider how the data are sorted.  The sweep operator for matrix inversion can often be influenced by the data order, especially in these very complex models.  If you "stacked" the 3 years, this may be the cause.  Try resorting, and make sure the class statement and model statement reflect the sort order.

Steve Denham

Ashwini_uci
Obsidian | Level 7

Yes, it works with all of those independent variables on a smaller dataset..

Yes, I have "stacked" the 3 years of data

What do you exactly mean by making sure that the class statement and model statement reflect the sort order? I am not qute clear about this. Would you please elaborate on this?

lvm
Rhodochrosite | Level 12 lvm
Rhodochrosite | Level 12

Are you saying that you can't even fit an intercept-only model?

model died (event=last) =/ dist=binary link=logit solution ;

I am thinking that you have a problem with your new data file that is confusing GLIMMIX. There is no reason why this model could not be fitted.


Ashwini_uci
Obsidian | Level 7

The intercept model works and parsimonious models work too.

I am running total 10 models on different sub-samples from this data. For Instance..the sample with STEMI type of heart attack and those who received no procedure or STEMI with PCI- Stenting procedure and such.. Below are some examples..

And now each of these and rest of the similar models are running with a very few and different set of predictor variables. Out of total 10 models, only 5 worked after I removed race and income from them. I am not sure how I can get the rest 5 give me complete output.

proc glimmix data=library.nismi091011_gender5 ; *using oddsratio*/;

class hospid gender dm_all   cm_htn_c  cm_aids cm_alcohol cm_ANEMDEF cm_arth  cm_bldloss cm_chf

cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para cm_perivasc  cm_psych cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia;

model died (event='1') = age gender  dm_all   cm_htn_c cm_aids cm_alcohol cm_ANEMDEF cm_arth cm_bldloss cm_chf

cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para cm_perivasc  cm_psych cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia /oddsratio dist=binary link=logit ddfm=bw solution;

random intercept / subject=hospid;

where noproc=1 and stemi=0;

weight discwt;

title 'proc glimmix for in-hosp mortality NSTEMI patients with no procedure';

run;

quit;



proc glimmix data=library.nismi091011_gender5 ; *using oddsratio*/;

class hospid gender dm_all   cm_htn_c  cm_aids cm_alcohol cm_ANEMDEF cm_arth  cm_bldloss cm_chf

cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para cm_perivasc  cm_psych cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia;

model died (event='1') = age gender  dm_all cm_htn_c  cm_aids cm_alcohol cm_ANEMDEF cm_arth  cm_bldloss cm_chf

cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para cm_perivasc  cm_psych cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia /oddsratio dist=binary link=logit ddfm=bw solution;

random intercept / subject=hospid;

where noproc=1 and stemi=1;

weight discwt;

title 'proc glimmix for in-hosp mortality  in STEMI patients with no procedure ';

run;

proc glimmix data=library.nismi091011_gender5 ; *using oddsratio*/;

class hospid gender dm_all   cm_htn_c   cm_ANEMDEF cm_arth cm_bldloss cm_chf

cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para cm_perivasc  cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia;

model died (event='1') = age gender    dm_all cm_htn_c   cm_ANEMDEF cm_arth 

cm_bldloss cm_chf cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para cm_perivasc  cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia /oddsratio dist=binary link=logit ddfm=bw solution;

random intercept / subject=hospid;

where pcionly=1 and stemi=0;

title 'proc glimmix for in-hosp mortality  in POST-PCIonly NSTEMI patients ';

weight discwt;

run;

quit;



proc glimmix data=library.nismi091011_gender5 ; *using oddsratio*/;

class hospid gender dm_all cm_htn_c   cm_ANEMDEF cm_arth cm_bldloss cm_chf

cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para cm_perivasc  cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia;

model died (event='1') = age gender  dm_all cm_htn_c   cm_ANEMDEF cm_arth 

cm_bldloss cm_chf cm_chrnlung cm_coag cm_depress cm_drug cm_hypothy cm_liver cm_lymph cm_lytes cm_mets cm_neuro cm_obese cm_para cm_perivasc  cm_pulmcirc cm_renlfail  cm_wghtloss cararrhythmia /oddsratio dist=binary link=logit ddfm=bw solution;

random intercept / subject=hospid;

where pcionly=1 and stemi=1;

weight discwt;

title 'proc glimmix for in-hosp mortality  in POST-PCIonly STEMI patients ';

run;

quit;

SteveDenham
Jade | Level 19

Is there any way to collapse the data to the events/trial syntax, where dist=binomial can be used? I would collapse on hospid, but it is going to involve the independent variables.  Out put from PROC FREQ crosstabs should give what you need.  Many times this method can overcome starting value problems encountered with dist=binary.

Steve Denham

Ashwini_uci
Obsidian | Level 7

The hospid is the hospital ID variable with 700-800 levels, depending on the sub-sample. Not sure how I can collapse that. All the independent variables, except race and income are binary variables...

SteveDenham
Jade | Level 19

OK.  To get events/trial syntax numbers, you need the number of events, at each hospid, for the cross-tabbed independent variables.  Something like:

proc means noprint nway;

class (same as above);

var died;

output out=eventtrial;

run;

You would then have the counts in the _freq_ variable in this dataset.  Combine the 0 and 1 counts to get the number of trials.

The covariate that most concerns me is age.  If this is handled as a continuous variable, all of this will be for naught, I think, as there would be too few 'trials' for a given age-hospid combination.  However, if reasonable discrete ranges for age can be used, then this should lead to an events/trial dataset that makes sense.

Steve Denham

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 37 replies
  • 8118 views
  • 12 likes
  • 3 in conversation