OJohn_StaT
Obsidian | Level 7

Hello Everyone,

I ran the code below. In the GEE parameter estimates table, some coefficients have an estimate of 0 and missing Z values.

 

The data are patient-level, collected once a year for each patient from 2014 to 2018.

 

Number of observations = 2789036

 

Dependent variable:

hospital_visit=0-1

Independent variables:

plan_1_enrolled=0-1
patient_race=White Black Asian Hispanic
urban_area=0-1
patient_age=16-64
patient_gender=0-1
care_type=1-2-3-4-5
calender_year=2014-2015-2016-2017-2018
plan_2_enrolled=0-1
plan_3_enrolled=0-1
Patient_ID=12 digit id number

 

The Code 

 

 

proc genmod data = online ;
format calender_year patient_gender care_type patient_age plan_1_enrolled patient_race care_type plan_2_enrolled plan_3_enrolled urban_area;

class Patient_ID calender_year patient_gender (ref='1') care_type(ref='1') patient_race(ref='WHITE') plan_1_enrolled(ref='0') plan_2_enrolled(ref='0') plan_3_enrolled (ref='0') urban_area (ref='0') ;

model hospital_visit (event='1') = plan_1_enrolled*patient_race plan_1_enrolled urban_area patient_race patient_age patient_gender care_type calender_year plan_2_enrolled plan_3_enrolled / dist=normal link=IDENTITY type3  ;

repeated subject=Patient_ID/type=un;

lsmeans plan_1_enrolled*patient_race urban_area plan_1_enrolled patient_race patient_gender care_type calender_year plan_2_enrolled plan_3_enrolled / cl diff ilink exp;


run;

 


[Attached image: result pic jpeg.jpg]

 


5 REPLIES
StatDave
SAS Super FREQ

All of the zero parameter estimates, except those for patient_race and plan_1_enrolled, are proper, since they are for the reference levels of those CLASS variables. You should investigate the values of those two variables in your data to see if something is wrong there. You should also check the SAS log for any messages that might point to a problem causing those parameters to be zero.
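One possible way to carry out that check is a pair of frequency tables. This is only a sketch, not a prescribed method: it assumes the data set `online` and the variable names from the original post, and it looks for empty or very sparse cells in the plan_1_enrolled by patient_race combination. It also shows the exact stored values of patient_race, which is useful for confirming that they match the `ref='WHITE'` level named in the CLASS statement.

```sas
/* Sketch only: data set and variable names are taken from the original post. */
proc freq data=online;
  /* How many observations fall in each plan-by-race cell? */
  tables plan_1_enrolled*patient_race / missing norow nocol nopercent;

  /* Same cells split by the binary response, to spot cells with
     no events or no non-events. */
  tables plan_1_enrolled*patient_race*hospital_visit / missing norow nocol nopercent;
run;
```

Cells with zero or very few observations in these tables would be candidates for the problem.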

 

However, if your response is binary then it is decidedly not normally distributed (and the EVENT= response option will be ignored). You should specify DIST=BIN.
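A minimal sketch of that change, assuming everything else in the original call stays the same: only the DIST= and LINK= options differ, and the FORMAT and LSMEANS statements are left out here for brevity.

```sas
/* Sketch only: same CLASS, MODEL effects, and REPEATED statement as the
   original post, with a binomial distribution and logit link. */
proc genmod data=online;
  class Patient_ID calender_year patient_gender(ref='1') care_type(ref='1')
        patient_race(ref='WHITE') plan_1_enrolled(ref='0')
        plan_2_enrolled(ref='0') plan_3_enrolled(ref='0') urban_area(ref='0');

  model hospital_visit(event='1') = plan_1_enrolled*patient_race plan_1_enrolled
        urban_area patient_race patient_age patient_gender care_type
        calender_year plan_2_enrolled plan_3_enrolled
        / dist=bin link=logit type3;

  repeated subject=Patient_ID / type=un;
run;
```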

OJohn_StaT
Obsidian | Level 7

Hello @StatDave_sas,

 

Thank you for taking the time to help me solve this issue. I changed the distribution option to dist=bin and the link option to link=logit, but I got the same problem in the results. I also got the warning "WARNING: Iteration limit exceeded." after switching to dist=bin and link=logit.

 

I also investigated these two problematic variables, but there are no missing values or anything like that. The variables seem fine.

I do not know what the problem is.

 

Thank you in advance.

 

StatDave
SAS Super FREQ

Always check the log for error or warning messages and mention all such messages when you post questions. This warning indicates that the procedure was not able to converge to a proper maximum likelihood solution. When that happens, GENMOD just displays the last iteration which may not be something you'd want to use. Model fitting problems are quite common with binary response models, usually because the data are too sparse to support the complexity of the specified model. I suggest you start with a simpler model. First, either remove the TYPE3 option or add the WALD option. Also, don't use the most complex correlation structure - replace TYPE=UN with TYPE=IND or TYPE=EXCH. If you still get errors or warnings, then simplify the model by reducing the number of predictors in the model, probably starting with the most important variable or two. If that fits successfully, then add a variable at a time. You might not be able to successfully use all of them. 
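As a rough illustration of such a starting point, the sketch below fits a deliberately small model. The choice of plan_1_enrolled and patient_race as the first two predictors is an assumption made for the example, not something stated in the thread; the data set and variable names come from the original post.

```sas
/* Sketch only: a small first model with two predictors, no interaction,
   Wald Type 3 tests, and an exchangeable working correlation. */
proc genmod data=online;
  class Patient_ID plan_1_enrolled(ref='0') patient_race(ref='WHITE');

  model hospital_visit(event='1') = plan_1_enrolled patient_race
        / dist=bin link=logit type3 wald;

  /* TYPE=EXCH (or TYPE=IND) in place of the unstructured TYPE=UN */
  repeated subject=Patient_ID / type=exch;
run;
```

If this runs without warnings in the log, the remaining predictors can be added back one at a time, checking the log after each run.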

OJohn_StaT
Obsidian | Level 7

Thank you for your help again. I will try your suggestions and share the results. Thank you!

OJohn_StaT
Obsidian | Level 7

I made progress. I removed the interaction between patient_race and plan_1_enrolled, and I used TYPE=IND instead of TYPE=UN. Now I get the estimates, the Z values, and Pr>|Z|. Thank you for the help!
