Fluorite | Level 6

Hi, I am using PROC GENMOD to analyze the outcome variable 'has insurance', contrasting people with pre-existing conditions against those without, before and after a policy was implemented, using the code below:

ods graphics on;
proc genmod data=thesis.mydata;
class preex_status year_class / ref=first;
weight perweight1;
model coverage_status(event= "1")= sex_dummy age_dummy race_dummy employment_dummy marital_status poverty_status year_class preex_status year_class*preex_status / dist=binomial link=identity;
estimate "Diff in Diff" preex_status*year_class 1 -1 -1 1;
lsmeans preex_status*year_class;
lsmestimate preex_status*year_class "Diff in Diff" 1 -1 -1 1;
run;
ods graphics off;


In the model statement I am controlling for confounders with dummy variables, and I get the output below. My question is: whom does this 0.9% represent? I know it is those without pre-existing conditions, but beyond that, how do I know their other demographic characteristics (race, age, sex, income, etc.)? I know I am controlling for White, female, employed, married, etc. So for whom are the LS-means output and the contrast estimates computed (what does SAS take as the default here)? This is the first time I am doing DiD and GENMOD, so any input is highly appreciated.


The output of the above code is attached. Thanks a lot.




Accepted Solutions
Jade | Level 19

Content removed as it was totally incorrect




Fluorite | Level 6

Thank you. This is a very precise and helpful answer, along with alternative solutions. I really appreciate this help. @SteveDenham


See this note on estimating the difference in difference. For responses that are typically modeled using a generalized linear model, such as a logistic model in this case, the macros shown there can be used to estimate the difference in difference. While your identity-linked model can also be used, the identity link does not ensure that predicted values are in the valid range (between 0 and 1 for a binary response) and often results in model fitting errors.
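To make the point about the link function concrete, here is a short sketch. The notation is not from the thread and is only illustrative: $p_{ij}(x)$ is the probability of coverage for pre-existing status $i$ and period $j$ at covariate values $x$, and $\eta_{ij}(x)$ is the corresponding linear predictor.

```latex
\begin{align*}
\text{identity link:}\quad & p_{ij}(x) = \eta_{ij}(x)
  && \text{(not constrained to } [0,1]\text{)} \\
\text{logit link:}\quad & p_{ij}(x) = \frac{1}{1 + e^{-\eta_{ij}(x)}}
  && \text{(always in } (0,1)\text{)} \\
\text{contrast } (1,-1,-1,1):\quad & \eta_{11}(x) - \eta_{12}(x) - \eta_{21}(x) + \eta_{22}(x)
\end{align*}
```

With the identity link, that contrast is already a difference in differences of probabilities. With the logit link, the same contrast is a difference in differences of log odds, so a probability-scale estimate has to be computed from the predicted probabilities instead.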


In your model, the estimated difference in difference is the estimate at the mean of the dummy variable covariates that you included in your model. To see this, add the E option in your LSMEANS statement to see the coefficients of the linear combinations of model parameters which define the individual LS-means. Note that the mean of each of your dummy variables is used.

Typically, you would include binary (or categorical) variables in the CLASS statement so that dummy variables are internally created for you. In that case, the E option will show that each LS-mean is estimated balanced across the levels of each covariate.

Using the Margins macro to estimate the difference in difference based on predictive margins avoids associating the estimate with one particular value of each covariate. The predictive margins are the average predicted values computed using the actual values of the covariates rather than fixing them at one value.
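As a minimal sketch of the E option mentioned above, the steps below reuse the data set and variable names from the original question. The second step is an assumption: it treats every covariate as a categorical variable suitable for the CLASS statement, which the thread does not confirm.

```sas
/* Step 1: original model, with E added to display the coefficients */
/* that define each LS-mean and the difference-in-difference.       */
proc genmod data=thesis.mydata;
  class preex_status year_class / ref=first;
  weight perweight1;
  model coverage_status(event="1") = sex_dummy age_dummy race_dummy
        employment_dummy marital_status poverty_status
        year_class preex_status year_class*preex_status
        / dist=binomial link=identity;
  lsmeans preex_status*year_class / e;
  lsmestimate preex_status*year_class "Diff in Diff" 1 -1 -1 1 / e;
run;

/* Step 2 (assumes the covariates are categorical): list them in    */
/* CLASS so the E output shows balanced coefficients across levels. */
proc genmod data=thesis.mydata;
  class preex_status year_class sex_dummy age_dummy race_dummy
        employment_dummy marital_status poverty_status / ref=first;
  weight perweight1;
  model coverage_status(event="1") = sex_dummy age_dummy race_dummy
        employment_dummy marital_status poverty_status
        year_class preex_status year_class*preex_status
        / dist=binomial link=identity;
  lsmeans preex_status*year_class / e;
run;
```

Comparing the coefficient tables from the two steps should show the covariate columns set to each dummy's mean in the first step and to equal weights across levels in the second.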

Jade | Level 19

Shoot. My answer is totally wrong. I thought of the dummy variables as CLASS variables, even though I wrote that they were continuous. No excuses - what I wrote was flat wrong. @drteju, I think you are the only one who can change how that response is marked.


My compliments to @StatDave for providing the CORRECT interpretation.



