skclHA
Calcite | Level 5

Hi, all.

I am currently running a difference-in-differences (DID) analysis to examine the effect of a policy intervention on depression.

I have a few questions about the control variables.

The current data format (example) is as follows.

 

Depression, age, income: continuous variables

Sex, work: dummy variables

 

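| pid | post | policy | depression | sex | age | income | work |
|-----|------|--------|------------|-----|-----|--------|------|
| 1   | 0    | 1      | 16         | 1   | 50  | 200    | 1    |
| 1   | 1    | 1      | 15         | 1   | 50  | 300    | 1    |
| 2   | 0    | 0      | 20         | 0   | 30  | 500    | 1    |
| 2   | 1    | 0      | 35         | 0   | 30  | 400    | 0    |
| 3   | 0    | 0      | 17         | 0   | 20  | 600    | 0    |
| 3   | 1    | 0      | 50         | 0   | 20  | 900    | 1    |
| 4   | 0    | 1      | 35         | 1   | 40  | 800    | 0    |
| 4   | 1    | 1      | 25         | 1   | 40  | 400    | 0    |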

 

1. In the data format above, should I recode the time-varying control variables (income, work) so that the post-period value is replaced by the pre-period value (re-format 1), or so that the pre-period value is replaced by the post-period value (re-format 2)? Or should I use the data in its current format?

 

RE-FORMAT (1)

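| pid | post | policy | depression | sex | age | income | Income_re | work | Work_re |
|-----|------|--------|------------|-----|-----|--------|-----------|------|---------|
| 1   | 0    | 1      | 16         | 1   | 50  | 200    | 200       | 1    | 1       |
| 1   | 1    | 1      | 15         | 1   | 50  | 300    | 200       | 1    | 1       |
| 2   | 0    | 0      | 20         | 0   | 30  | 500    | 500       | 1    | 1       |
| 2   | 1    | 0      | 35         | 0   | 30  | 400    | 500       | 0    | 1       |
| 3   | 0    | 0      | 17         | 0   | 20  | 600    | 600       | 0    | 0       |
| 3   | 1    | 0      | 50         | 0   | 20  | 900    | 600       | 1    | 0       |
| 4   | 0    | 1      | 35         | 1   | 40  | 800    | 800       | 0    | 0       |
| 4   | 1    | 1      | 25         | 1   | 40  | 400    | 800       | 0    | 0       |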

 

RE-FORMAT (2)

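| pid | post | policy | depression | sex | age | income | Income_re | work | Work_re |
|-----|------|--------|------------|-----|-----|--------|-----------|------|---------|
| 1   | 0    | 1      | 16         | 1   | 50  | 200    | 300       | 1    | 1       |
| 1   | 1    | 1      | 15         | 1   | 50  | 300    | 300       | 1    | 1       |
| 2   | 0    | 0      | 20         | 0   | 30  | 500    | 400       | 1    | 0       |
| 2   | 1    | 0      | 35         | 0   | 30  | 400    | 400       | 0    | 0       |
| 3   | 0    | 0      | 17         | 0   | 20  | 600    | 900       | 0    | 1       |
| 3   | 1    | 0      | 50         | 0   | 20  | 900    | 900       | 1    | 1       |
| 4   | 0    | 1      | 35         | 1   | 40  | 800    | 400       | 0    | 0       |
| 4   | 1    | 1      | 25         | 1   | 40  | 400    | 400       | 0    | 0       |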
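
To make the two re-formats concrete, here is a minimal sketch of how INCOME_RE and WORK_RE could be derived from the current layout. It assumes the data set is named LONG (as in the PROC MIXED call in question 2) and that every PID has exactly one pre row (POST=0) and one post row (POST=1). REFORMAT1 and REFORMAT2 are hypothetical output data set names, and this only illustrates the two layouts; it does not say which one is appropriate.

```sas
/* Sketch only: assumes one POST=0 row and one POST=1 row per PID in LONG. */
/* REFORMAT1 and REFORMAT2 are hypothetical output data set names.         */

/* RE-FORMAT (1): carry the pre-period value forward to the post row. */
proc sort data=long out=reformat1;
  by pid post;
run;

data reformat1;
  set reformat1;
  by pid;
  retain income_re work_re;
  if first.pid then do;      /* first row per PID is the pre row */
    income_re = income;
    work_re   = work;
  end;
run;

/* RE-FORMAT (2): carry the post-period value back to the pre row. */
proc sort data=long out=reformat2;
  by pid descending post;
run;

data reformat2;
  set reformat2;
  by pid;
  retain income_re work_re;
  if first.pid then do;      /* first row per PID is now the post row */
    income_re = income;
    work_re   = work;
  end;
run;

/* Restore the original pre/post row order. */
proc sort data=reformat2;
  by pid post;
run;
```

With the example rows above, PID 1 would get INCOME_RE = 200 on both rows in REFORMAT1 and INCOME_RE = 300 on both rows in REFORMAT2, matching the two tables.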

 

 

2. If using the data in its current format is correct, how should I write the syntax for the control variables (income, work) in the DID regression? And how should I interpret the results for the dummy variables? My current syntax is below.

 

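/* DID model: POLICY is the treatment-group indicator, POST the period indicator */
PROC MIXED DATA = LONG;
CLASS POST(REF="0") POLICY(REF="0") SEX(REF="0") WORK(REF="0");
/* POST|POLICY expands to POST, POLICY and POST*POLICY; the rest are controls */
MODEL DEPRESSION = POST|POLICY SEX AGE INCOME WORK / SOLUTION;
/* Cell means for the four POST-by-POLICY groups (the variable is POLICY, not EXPOSED) */
LSMEANS POST*POLICY / DIFF;
/* DID contrast on the interaction */
ESTIMATE 'D-I-D' POST*POLICY 1 -1 -1 1;
/* Random intercept per person to account for the repeated measures */
RANDOM INT / SUBJECT=PID TYPE=UN;
RUN;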

 

3. Also, how should I write the syntax if I want to control for changes in the control variables (income, work) over time?

 

I apologize in advance for my poor English. Thanks! 
