lousam
Obsidian | Level 7

Hello,

I am trying to conduct a difference-in-differences (DID) analysis to examine the effect of an intervention on the prevalence of smoking, using a national cross-sectional survey dataset (2010-2015). The outcome of interest is a binary variable (smoking: yes/no), and I am comparing states with and without the policy of interest.

I have been asked to run both a linear probability model (using proc surveyreg) and a logit model (using proc surveylogistic) to examine the effect of the policy in treated versus reference states.

I have to get the following estimates:

        1) The average difference in the probability of smoking (treated versus reference states) from a linear probability model (i.e., estimate [95% CI])

        2) The pre-post changes in the odds of smoking (treated versus reference states) from a logit model (i.e., odds ratio [95% CI])

 

Question 1: Do the estimates obtained from the LSMESTIMATE and ESTIMATE statements represent the DID estimate (i.e., the average difference in the probability of smoking)?

If not, how can I get the DID estimate as a probability with a 95% CI? Here is the code I was using:

proc surveyreg data=survey_data;
   /* Survey design: strata, clusters, and sampling weights */
   stratum stratum_var;
   cluster cluster_var;
   weight weight_var;
   domain gender;   /* also analyze each gender as a subpopulation */
   class policy_time intervention covar1 covar2 covar3;
   model smoking = policy_time intervention policy_time*intervention covar1 covar2 covar3 / clparm solution vadjust=none;
   /* DID contrast over the four policy_time*intervention cells; its sign depends on the CLASS level order */
   estimate "Diff in Diff" policy_time*intervention 1 -1 -1 1;
   lsmeans policy_time*intervention;
   lsmestimate policy_time*intervention "Diff in Diff" 1 -1 -1 1;
run;

Question 2: How can I obtain the change in the odds of smoking (treated versus reference states) as an odds ratio with a 95% CI? Here is my sample code:

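proc surveylogistic data=survey_data;
   /* Same survey design statements as in the linear probability model */
   stratum stratum_var;
   cluster cluster_var;
   weight weight_var;
   domain gender;
   class policy_time intervention covar1 covar2 covar3;
   /* The interaction term carries the policy effect on the log-odds scale */
   model smoking = policy_time intervention policy_time*intervention covar1 covar2 covar3;
run;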

 

I appreciate any insight you can offer.

 

 

 

1 ACCEPTED SOLUTION
SteveDenham
Jade | Level 19

I think your code for Question 1 will give what you need (although the ESTIMATE and LSMESTIMATE statements are redundant). The trick then is to include at least the LSMEANS and LSMESTIMATE statements in the PROC SURVEYLOGISTIC code. You will need to output the results using ODS and then call the %NLMeans macro to get the differences. The documentation for the %NLMeans macro is in this note: https://support.sas.com/kb/62/362.html

 

SteveDenham
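
A minimal sketch of that workflow, assuming the %NLMeans macro (and any macro it depends on) has already been downloaded from the note above and compiled in the current session. The data set and variable names come from the question. The item-store name did_fit, the table name coeffs, the smoking(event='1') coding, and the macro parameters shown (instore=, coef=, link=, contrasts=, title=) are assumptions to verify against the macro documentation. PARAM=GLM is used because the LSMEANS statement expects GLM parameterization of the CLASS variables. The DOMAIN statement is left out of the sketch; how a domain analysis interacts with STORE should be checked separately.

```sas
proc surveylogistic data=survey_data;
   stratum stratum_var;
   cluster cluster_var;
   weight weight_var;
   class policy_time intervention covar1 covar2 covar3 / param=glm;
   /* event='1' is an assumption about how smoking is coded */
   model smoking(event='1') = policy_time intervention policy_time*intervention
                              covar1 covar2 covar3;
   /* E prints the LS-means coefficient matrix; ILINK adds predicted probabilities */
   lsmeans policy_time*intervention / e ilink;
   ods output coef=coeffs;   /* save the coefficient matrix for the macro */
   store did_fit;            /* hypothetical item-store name for the fitted model */
run;

/* Difference in differences of the four LS-mean probabilities.        */
/* The contrast 1 -1 -1 1 follows the row order of the LS-means table, */
/* so check that order before interpreting the sign.                   */
%NLMeans(instore=did_fit, coef=coeffs, link=logit,
         contrasts=1 -1 -1 1,
         title=Difference in differences on the probability scale)
```

The macro output should then report the DID on the probability scale with a standard error and confidence limits, which can be compared with the PROC SURVEYREG estimate from Question 1.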

StatDave
SAS Super FREQ

It's not entirely clear what comparison you want in Question 2, but since you say you want an odds ratio, I assume you want to compare one or more pairs of the four policy_time*intervention combinations. In that case, use the LSMEANS statement with the DIFF and ODDSRATIO options (and CL if you want confidence intervals), which gives each of the pairwise comparisons and the corresponding odds ratios. For example:

lsmeans policy_time*intervention / ilink diff oddsratio cl;
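
For context, here is a sketch of where that LSMEANS statement could sit in the PROC SURVEYLOGISTIC step from the question. It assumes smoking is coded so that '1' means "yes" and that the CLASS variables use PARAM=GLM, which the LSMEANS statement expects. The DOMAIN statement is omitted for brevity.

```sas
proc surveylogistic data=survey_data;
   stratum stratum_var;
   cluster cluster_var;
   weight weight_var;
   class policy_time intervention covar1 covar2 covar3 / param=glm;
   /* event='1' is an assumption about how smoking is coded */
   model smoking(event='1') = policy_time intervention policy_time*intervention
                              covar1 covar2 covar3;
   /* ILINK: LS-means as probabilities; DIFF: pairwise differences on the */
   /* logit scale; ODDSRATIO: those differences exponentiated; CL: limits */
   lsmeans policy_time*intervention / ilink diff oddsratio cl;
run;
```

Each row of the resulting differences table compares two of the four policy_time*intervention cells, so the pre-versus-post odds ratio within treated states and within reference states can be read off separately.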

 

If you again want an estimate of the DID on the mean scale, then see the second section of this note that shows how to obtain the DID on the means using the NLMeans macro. Or if pairwise differences in means are needed, that can also be done with NLMeans as shown in this note (though not shown in the context of a model with interaction).

lousam
Obsidian | Level 7

I apologize if my questions were unclear. For my second question, I wanted to get a DID estimate as an odds ratio from PROC SURVEYLOGISTIC. This odds ratio (95% CI) would represent the pre-post change in smoking among individuals in treated states relative to individuals in untreated states.

I believe there are several articles that provide a DID estimate using the following approaches:

        - DID estimate as an odds ratio (obtained using logistic regression)

        - DID estimate as the average difference in the probability of having the outcome of interest (obtained using linear regression)

StatDave
SAS Super FREQ

"DID" means *difference* in difference. Odds ratios, being ratios, are not differences. So, again, if you want to estimate the difference in differences of the means, use the NLMeans macro as shown in the note I referred to. If you truly want odds ratios instead of differences, use the LSMEANS statement with the DIFF and ODDSRATIO options.
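
If the quantity wanted is the pre-post odds ratio in treated states divided by the pre-post odds ratio in reference states, one possible sketch (not taken from the replies above) is to exponentiate the interaction contrast with an ESTIMATE statement. The label, the event='1' coding, and the assumption that both factors have two levels are illustrative. The result is a ratio of odds ratios, not a difference on the probability scale.

```sas
proc surveylogistic data=survey_data;
   stratum stratum_var;
   cluster cluster_var;
   weight weight_var;
   class policy_time intervention covar1 covar2 covar3 / param=glm;
   /* event='1' is an assumption about how smoking is coded */
   model smoking(event='1') = policy_time intervention policy_time*intervention
                              covar1 covar2 covar3;
   /* Interaction contrast on the log-odds scale; EXP reports it as a   */
   /* ratio of odds ratios and CL adds confidence limits. Which ratio   */
   /* is in the numerator depends on the CLASS level order.             */
   estimate "Ratio of odds ratios" policy_time*intervention 1 -1 -1 1 / exp cl;
run;
```

A ratio of 1 would indicate that the odds of smoking changed by the same factor in treated and reference states. This is on a multiplicative scale, so it will generally not match the sign-and-size interpretation of the DID from the linear probability model.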
