Hello,
I am working on a difference-in-difference (D-I-D) analysis of longitudinal data. The goal is to compare health care cost between an intervention group and a control group. We collect the response variable at three time points: baseline, first year, and second year. I know that the D-I-D hypothesis test can be specified as follows when only pre and post (baseline and first year) are involved. My question is how to specify the D-I-D hypothesis test when one more year of data is added. Your help is greatly appreciated.
proc genmod data=data_set;
   class id treatment(ref='0') post(ref='0');
   /* Gamma GEE with log link; treatment*post is the D-I-D term */
   model cost = treatment post treatment*post / dist=gamma link=log type3;
   repeated subject=id / type=un;
   /* Both statements test the interaction contrast on the log scale */
   estimate "DID Post-Pre" treatment*post 1 -1 -1 1;
   lsmestimate treatment*post "DID Post-Pre" 1 -1 -1 1;
run;
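For reference, with a log link the interaction contrast in the ESTIMATE statement is a difference in differences of log means, which is the same as a ratio of ratios of means. The symbol $\mu_{t,p}$ (mean cost in treatment group $t$ at period $p$) is notation introduced here for illustration, and the sign of the estimate depends on how the CLASS levels are ordered:

```latex
H_0:\; \left(\log\mu_{1,\text{post}} - \log\mu_{1,\text{pre}}\right)
      - \left(\log\mu_{0,\text{post}} - \log\mu_{0,\text{pre}}\right) = 0
\quad\Longleftrightarrow\quad
\frac{\mu_{1,\text{post}} / \mu_{1,\text{pre}}}{\mu_{0,\text{post}} / \mu_{0,\text{pre}}} = 1
```

This is not the same hypothesis as a difference in differences of the means themselves, $(\mu_{1,\text{post}} - \mu_{1,\text{pre}}) - (\mu_{0,\text{post}} - \mu_{0,\text{pre}}) = 0$.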
Accepted Solutions
In this note, see "Difference in Difference Analysis in a Pre/Post Longitudinal Study" in the "Generalized Linear Models with a Non-Identity Link" section. As shown there, you can use the Margins macro to estimate and test a hypothesis on the response means. You haven't stated exactly what you want to test. Assuming the hypothesis is that the pre mean minus the average of the two post means is the same in the two treatment groups, the code would look like the Margins macro call in that section, provided the POST variable has three levels (pre, 1yr, 2yr) in that order. Note that the contrast coefficients defining the hypothesis are applied to the margin estimates as displayed, so the ordering is important for proper interpretation.
/* Contrast data set: coefficients apply to the margins in displayed order */
data c;
   length label f $32767;
   infile datalines delimiter='|';
   input label f;
   datalines;
DID pre-avg.post | 1 -.5 -.5 -1 .5 .5
;
/* trt corresponds to the treatment variable in the question */
%Margins(data       = data_set,
         response   = cost,
         class      = trt post,
         model      = trt|post,
         link       = log,
         dist       = gamma,
         geesubject = id,
         margins    = trt post,
         contrasts  = c,
         options    = cl)
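Written out, the hypothesis encoded by the contrast row is shown below. This assumes the six margins are displayed in the order trt 0 (pre, 1yr, 2yr) followed by trt 1 (pre, 1yr, 2yr); the symbol $\mu_{t,p}$ for the mean cost in treatment group $t$ at period $p$ is notation introduced here, not taken from the note:

```latex
H_0:\; \left[\mu_{0,\text{pre}} - \tfrac{1}{2}\left(\mu_{0,\text{1yr}} + \mu_{0,\text{2yr}}\right)\right]
      - \left[\mu_{1,\text{pre}} - \tfrac{1}{2}\left(\mu_{1,\text{1yr}} + \mu_{1,\text{2yr}}\right)\right] = 0
```

Reading the coefficients of each mean from left to right in that order gives $(1,\ -0.5,\ -0.5,\ -1,\ 0.5,\ 0.5)$, which matches the row in data set C. If the margins are displayed in a different order, the coefficients must be rearranged to match.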
Thank you, Dave. This is very helpful. If I want to test the difference between year 2 and baseline across the treatment and control groups, should I specify it as follows?
DID pre-Year 2 post | 1 0 -1 -1 0 1
I am not sure how to specify a test by position and can't find an easy-to-follow tutorial or documentation. Thanks.
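One way to read a contrast by position: each coefficient multiplies the margin in the same position of the displayed margins table. The sketch below assumes the margins appear in the order trt 0 (pre, 1yr, 2yr) followed by trt 1 (pre, 1yr, 2yr), as described in the answer above. It also assumes that the CONTRASTS= data set can hold one contrast per row, which should be confirmed against the Margins macro documentation.

```sas
/* Sketch: one row per contrast. Coefficient positions follow the displayed */
/* margin order, assumed to be trt 0: pre, 1yr, 2yr; trt 1: pre, 1yr, 2yr.  */
data c;
   length label f $32767;
   infile datalines delimiter='|';
   input label f;
   datalines;
DID pre-1yr | 1 -1 0 -1 1 0
DID pre-2yr | 1 0 -1 -1 0 1
;
```

Under that ordering, the second row is (pre minus 2yr in trt 0) minus (pre minus 2yr in trt 1), so the 1yr margins get a coefficient of 0 in both groups.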
Hi, I have a study in which the program (RJ) has 4 years of treatment and one pre-treatment year (exposure; coded as 0-4). I want to compare the odds of getting suspended (1/0) for students during the pre-treatment year, and each subsequent year of exposure to the RJ program with the odds of getting suspended for students who didn't receive the treatment (RJ=0).
In other words,
Students exposed to treatment compared with their pre-treatment year.
Students not exposed to treatment compared with their pre-treatment year.
Students exposed compared with students not exposed.
I'm wondering what the difference would be between the following two pieces of code:
class RJ (ref='0') / param=ref;
class exposure (ref='0') / param=ref;
model SUSPENDED =
RJ exposure exposure*RJ
/ clodds=wald ORPVALUE;
oddsratio RJ / diff=ref;
oddsratio exposure / diff=ref;
RUN;
Note: Under full-rank parameterizations, Type 3 effect tests are replaced by joint tests. The joint test for an effect is a test that all the parameters associated with that effect are zero. Such joint tests might not be equivalent to Type 3 effect tests under GLM parameterization.
Hello,
The topic thread where you added your new question was already solved last year.
Hence, very few people will read your question (the thread participants will get notified, though).
Can you start a new topic in the "Statistical Procedures" board (under the "Analytics" header)?
Thanks, Koen
The description of your study implies that your subjects are observed repeatedly over time. Neither of your analyses takes the resulting correlation among the repeated measures into account. See the last section, titled "Treated and Control Groups, Binary Response", in this note. The estimate named "1 month change diff" produced using the Margins macro, or the estimate named "adjusted exp change" produced using the NLMeans macro, compares the change from pre to post in the exposed vs. unexposed groups, which is what you want to do.
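As a rough illustration of accounting for the repeated measures, a GEE fit in PROC GENMOD could look like the sketch below. The data set name `rj_data` and the student identifier `student_id` are hypothetical; SUSPENDED, RJ, and exposure are the variables from the question. The exchangeable working correlation is an assumption made for the example, not a recommendation taken from the note.

```sas
/* Sketch only: rj_data and student_id are hypothetical names */
proc genmod data=rj_data descending;   /* model the probability that SUSPENDED=1 */
   class student_id RJ(ref='0') exposure(ref='0');
   model SUSPENDED = RJ exposure RJ*exposure / dist=binomial link=logit type3;
   /* GEE: observations from the same student are treated as correlated */
   repeated subject=student_id / type=exch;
run;
```

The parameter estimates from this model are on the log-odds scale; the macros mentioned above are what produce estimates and contrasts on the probability scale.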
Thank you very much, this SAS documentation is helpful. I should have mentioned that many of the subjects are not the same from year to year because of new students who enter the school and students who graduate. Given that the measures repeat, but the subjects differ, would I still need to do the interrupted time series analysis?
If "many of the subjects" means that some subjects do have repeated measures then you still have correlation. Of course, it is up to you if you want to ignore the correlation and assume that all measurements are independent.