cjpsas
Calcite | Level 5

Hi all,

I am trying to learn how to estimate a difference in difference in models with covariates, using the example from this SAS note (https://support.sas.com/kb/61/830.html). In the example (pasted below), I understand that the input k1-k12 corresponds to the number of means estimated by the LSMEANS statement, but I don't fully understand how the datalines (1 -1 -1 1 0 0 0 0, etc.) represent the three levels of the covariate of interest. I want to make sure I fully understand this format so I can modify the code for datasets with more covariates or for covariates with more than 3 levels. It seems very basic, but any explanation would be greatly appreciated!

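/* One row per contrast; k1-k12 multiply the 12 LSMEANS estimates */
data difdif;
   input k1-k12;
   set=1;
   datalines;
1 -1 -1 1   0 0 0 0   0 0 0 0
0 0 0 0   1 -1 -1 1   0 0 0 0
0 0 0 0   0 0 0 0   1 -1 -1 1
;

/* Estimate each contrast of the means on the mean (inverse logit) scale */
%NLMeans(instore=log, coef=coeffs, link=logit, contrasts=difdif,
         title=Difference in Difference of Means)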

 

StatDave
SAS Super FREQ

As described in that note, C has 3 levels (and A and B each have 2), so the LSMEANS statement that is shown produces 3 x 2 x 2 = 12 estimates. Hence there are 12 values in each data row of the DIFDIF DATA step. The 12 values in a row multiply the 12 mean estimates from the LSMEANS statement, in the order in which LSMEANS lists them. So if you want a difference in difference estimate at each level of C, you need one row of 12 values for each level of C. Because C is first in the specification, C*A*B, in the LSMEANS statement, the first 4 estimates are for the first level of C, the next 4 are for the second level, and the last 4 are for the third level. Within each group of 4, the values 1 -1 -1 1 compute (A1B1 - A1B2) - (A2B1 - A2B2), which is the difference in difference. So the 4 non-zero values at the start of the first data row compute the difference in difference for C=1, and the zeros in the final 8 positions ignore the C=2 and C=3 levels. Similarly, the second data row picks up just the C=2 estimates to compute the difference in difference for C=2, and the third data row does the same for C=3.
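
To illustrate how the same pattern extends, here is a minimal sketch for a hypothetical model in which C has 4 levels while A and B still have 2 levels each, and the LSMEANS statement still specifies C*A*B. That would give 4 x 2 x 2 = 16 estimates, so each row needs 16 values and there is one row per level of C. The data set name DIFDIF4 is only illustrative and is not from the SAS note; SET=1 simply mirrors the original example.

```sas
/* Hypothetical case: C has 4 levels, A and B have 2 levels each, */
/* so LSMEANS C*A*B yields 4*2*2 = 16 estimates (k1-k16).         */
data difdif4;
   input k1-k16;
   set=1;   /* same SET value as in the original example */
   datalines;
1 -1 -1 1   0 0 0 0   0 0 0 0   0 0 0 0
0 0 0 0   1 -1 -1 1   0 0 0 0   0 0 0 0
0 0 0 0   0 0 0 0   1 -1 -1 1   0 0 0 0
0 0 0 0   0 0 0 0   0 0 0 0   1 -1 -1 1
;

/* Print the coefficients to check that each row has one 1 -1 -1 1 block */
proc print data=difdif4 noobs;
run;
```

Assuming the stored model and coefficient data sets come from the 4-level model, this data set would be passed to the macro as contrasts=difdif4 in place of contrasts=difdif. The blocks line up this neatly only because C is listed first in C*A*B; with a different ordering of the effects, the 4 estimates for a given level of C would not be adjacent, and the 1 and -1 values would have to be placed to match the order shown in the LSMEANS table.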

acordes
Rhodochrosite | Level 12
This is about design matrices.
Run proc reg or proc glm with the option print all, use difference statements and study the design matrix.
You'll see which 1 belongs to the intercept and how the differences are constructed.
This depends on your choice of parameterization.
Read this blog and things will become clearer.
https://blogs.sas.com/content/iml/2016/02/24/create-a-design-matrix-in-sas.html
cjpsas
Calcite | Level 5

Thank you for the clear explanation!
