Hi all,

I am currently running a DID (difference-in-differences) analysis to examine the effects of policy interventions on depression. I have a few questions about the control variables.

The current data format (example) is as follows.

Depression, age, income: continuous variables
Sex, work: dummy variables

pid  post  policy  depression  sex  age  income  work
1    0     1       16          1    50   200     1
1    1     1       15          1    50   300     1
2    0     0       20          0    30   500     1
2    1     0       35          0    30   400     0
3    0     0       17          0    20   600     0
3    1     0       50          0    20   900     1
4    0     1       35          1    40   800     0
4    1     1       25          1    40   400     0

1. In the data format above, should the time-varying control variables (income, work) be recoded so that the post-period value is replaced with the pre-period value (re-format 1), or so that the pre-period value is replaced with the post-period value (re-format 2)? Or should I use the data in its current format?

RE-FORMAT (1)

pid  post  policy  depression  sex  age  income  Income_re  work  Work_re
1    0     1       16          1    50   200     200        1     1
1    1     1       15          1    50   300     200        1     1
2    0     0       20          0    30   500     500        1     1
2    1     0       35          0    30   400     500        0     1
3    0     0       17          0    20   600     600        0     0
3    1     0       50          0    20   900     600        1     0
4    0     1       35          1    40   800     800        0     0
4    1     1       25          1    40   400     800        0     0

RE-FORMAT (2)

pid  post  policy  depression  sex  age  income  Income_re  work  Work_re
1    0     1       16          1    50   200     300        1     1
1    1     1       15          1    50   300     300        1     1
2    0     0       20          0    30   500     400        1     0
2    1     0       35          0    30   400     400        0     0
3    0     0       17          0    20   600     900        0     1
3    1     0       50          0    20   900     900        1     1
4    0     1       35          1    40   800     400        0     0
4    1     1       25          1    40   400     400        0     0

2. If using the data in its current format is correct, how should I specify the control variables (income, work) in the DID regression? And how should I interpret the results for the dummy variables? My current syntax is below.

PROC MIXED DATA=LONG;
  CLASS POST(REF="0") POLICY(REF="0") SEX(REF="0") WORK(REF="0");
  MODEL DEPRESSION = POST|POLICY SEX AGE INCOME WORK / SOLUTION;
  LSMEANS POST|POLICY / DIFF;
  ESTIMATE 'D-I-D' POST*POLICY 1 -1 -1 1;
  RANDOM INT / SUBJECT=PID TYPE=UN;
RUN;

3. Also, how would I write the syntax if I want to control for changes in the control variables (income, work) over time?

I apologize in advance for my poor English. Thanks!