BookmarkSubscribeRSS Feed
r4321
Pyrite | Level 9

Hello all,

 

I have a dataset where I am hoping to indicator variables that help me better organize a matched sample from my data. Essentially, I have data on counties observed each week over the course of a year, some of them experienced a focal event (i.e., if P =1) and some did not and will serve as controls. Additionally, some counties experienced the focal event more than once. I want to add an indicator variable that identifies before and after these focal events at different durations (e.g., 2 weeks or 3 weeks) for both the treatments and the non-treated counties. Also, it would be ideal if once a county experienced a focal event they can no longer be in the control set from there forward, but it’s okay if they stay in the dataset and experience a focal event again.

 

I believe I need some variables such as a before/after variable at different cuts (i.e., 2 weeks before and after as well as 3 weeks before and after), a duration variable at different cuts (i.e., 2 weeks before and after as well as 3 weeks before and after), and a variable (treated) that indicates that a county has been treated (after it has been treated).

 

So I would like to create a set of variables that looks something so:

 

Have:

 

Time

ID

County

FocalEvent

1

222

ABC

0

2

222

ABC

0

3

222

ABC

0

4

222

ABC

0

5

222

ABC

1

6

222

ABC

0

7

222

ABC

0

8

222

ABC

0

9

222

ABC

0

10

222

ABC

0

11

222

ABC

0

12

222

ABC

0

13

222

ABC

0

1

333

BBB

0

2

333

BBB

0

3

333

BBB

0

4

333

BBB

0

5

333

BBB

0

6

333

BBB

0

7

333

BBB

0

8

333

BBB

0

9

333

BBB

0

10

333

BBB

0

11

333

BBB

0

12

333

BBB

0

13

333

BBB

0

1

444

CCC

0

2

444

CCC

0

3

444

CCC

0

4

444

CCC

0

5

444

CCC

1

6

444

CCC

0

7

444

CCC

0

8

444

CCC

0

9

444

CCC

0

10

444

CCC

1

11

444

CCC

0

12

444

CCC

0

13

444

CCC

0

1

555

DDD

0

2

555

DDD

0

3

555

DDD

0

4

555

DDD

0

5

555

DDD

0

6

555

DDD

0

7

555

DDD

0

 

 

Want:

 

Time

ID

County

FocalEvent

BeforeAfter2

Duration2

BeforeAfter3

Duration3

treated

1

222

ABC

0

0

0

0

0

0

2

222

ABC

0

0

0

0

1

0

3

222

ABC

0

0

1

0

1

0

4

222

ABC

0

0

1

0

1

0

5

222

ABC

1

1

1

1

1

1

6

222

ABC

0

1

1

1

1

1

7

222

ABC

0

0

0

1

1

1

8

222

ABC

0

0

0

0

0

1

9

222

ABC

0

0

0

0

0

1

10

222

ABC

0

0

0

0

0

1

11

222

ABC

0

0

0

0

0

1

12

222

ABC

0

0

0

0

0

1

13

222

ABC

0

0

0

0

0

1

1

333

BBB

0

0

0

0

0

0

2

333

BBB

0

0

0

0

1

0

3

333

BBB

0

0

1

0

1

0

4

333

BBB

0

0

1

0

1

0

5

333

BBB

0

1

1

1

1

0

6

333

BBB

0

1

1

1

1

0

7

333

BBB

0

0

0

1

1

0

8

333

BBB

0

0

1

0

1

0

9

333

BBB

0

0

1

0

1

0

10

333

BBB

0

1

1

1

1

0

11

333

BBB

0

1

1

1

1

0

12

333

BBB

0

0

0

1

1

0

13

333

BBB

0

0

0

0

0

0

1

444

CCC

0

0

0

0

0

0

2

444

CCC

0

0

0

0

1

0

3

444

CCC

0

0

1

0

1

0

4

444

CCC

0

0

1

0

1

0

5

444

CCC

1

1

1

1

1

1

6

444

CCC

0

1

1

1

1

1

7

444

CCC

0

0

0

1

1

1

8

444

CCC

0

0

1

0

1

1

9

444

CCC

0

0

1

0

1

1

10

444

CCC

1

1

1

1

1

1

11

444

CCC

0

1

1

1

1

1

12

444

CCC

0

0

0

1

1

1

13

444

CCC

0

0

0

0

0

1

1

555

DDD

0

0

0

0

0

0

2

555

DDD

0

0

0

0

1

0

3

555

DDD

0

0

1

0

1

0

4

555

DDD

0

0

1

0

1

0

5

555

DDD

0

1

1

1

1

0

6

555

DDD

0

1

1

1

1

0

7

555

DDD

0

0

0

1

1

0

 

 

Thanks for your time and attention. Really appreciate it.

 

Kindly,

 

R

 

4 REPLIES 4
arthurcavila
Obsidian | Level 7

I can easily get half of what you need using LAG and RETAIN. The other indicators are tricky. You can sort the database backwards in time and repeat the process to get an indicator for the leading periods.

 

Data have;
	input Time ID County $ FocalEvent;
	cards;
1 222 ABC 0
2 222 ABC 0
3 222 ABC 0
4 222 ABC 0
5 222 ABC 1
6 222 ABC 0
7 222 ABC 0
8 222 ABC 0
9 222 ABC 0
10 222 ABC 0
11 222 ABC 0
12 222 ABC 0
13 222 ABC 0
1 333 BBB 0
2 333 BBB 0
3 333 BBB 0
4 333 BBB 0
5 333 BBB 0
6 333 BBB 0
7 333 BBB 0
8 333 BBB 0
9 333 BBB 0
10 333 BBB 0
11 333 BBB 0
12 333 BBB 0
13 333 BBB 0
1 444 CCC 0
2 444 CCC 0
3 444 CCC 0
4 444 CCC 0
5 444 CCC 1
6 444 CCC 0
7 444 CCC 0
8 444 CCC 0
9 444 CCC 0
10 444 CCC 1
11 444 CCC 0
12 444 CCC 0
13 444 CCC 0
1 555 DDD 0
2 555 DDD 0
3 555 DDD 0
4 555 DDD 0
5 555 DDD 0
6 555 DDD 0
7 555 DDD 0
;
run;

data want_step1;
	set have;
	retain treated;
	by ID;
	if first.id then do;
		treated = 0;
		BeforeAfter2 = FocalEvent;
		BeforeAfter3 = FocalEvent;
	end;
	else do;
		BeforeAfter2 = FocalEvent + lag(FocalEvent);
		BeforeAfter3 = FocalEvent + lag(BeforeAfter2);
		if BeforeAfter2 = . then BeforeAfter2 = 0;
		if BeforeAfter3 = . then BeforeAfter3 = 0;
	end;
	treated = min(treated + FocalEvent, 1);
run;
proc print; run;

r4321
Pyrite | Level 9
Hm... thanks for taking a look!

I am not having much luck with this code, however. It seems to do okay with creating the indicators for the counties that experienced the focal event, but it doesn't do it for the ones that didnt. Also, brainstorming how to best deal with that duration variable as well.

Appreciate your help.
arthurcavila
Obsidian | Level 7

This?

 

Data have;
	input Time ID County $ FocalEvent;
	cards;
1 222 ABC 0
2 222 ABC 0
3 222 ABC 0
4 222 ABC 0
5 222 ABC 1
6 222 ABC 0
7 222 ABC 0
8 222 ABC 0
9 222 ABC 0
10 222 ABC 0
11 222 ABC 0
12 222 ABC 0
13 222 ABC 0
1 333 BBB 0
2 333 BBB 0
3 333 BBB 0
4 333 BBB 0
5 333 BBB 0
6 333 BBB 0
7 333 BBB 0
8 333 BBB 0
9 333 BBB 0
10 333 BBB 0
11 333 BBB 0
12 333 BBB 0
13 333 BBB 0
1 444 CCC 0
2 444 CCC 0
3 444 CCC 0
4 444 CCC 0
5 444 CCC 1
6 444 CCC 0
7 444 CCC 0
8 444 CCC 0
9 444 CCC 0
10 444 CCC 1
11 444 CCC 0
12 444 CCC 0
13 444 CCC 0
1 555 DDD 0
2 555 DDD 0
3 555 DDD 0
4 555 DDD 0
5 555 DDD 0
6 555 DDD 0
7 555 DDD 0
;
run;

proc means data=have nway noprint;
	class time;
	var FocalEvent;
	output out=time_indicator(keep=Time FocalEvent) max = FocalEvent;
run;

proc means data=have nway noprint;
	class ID;
	var FocalEvent;
	output out=control_indicator(keep=ID NotControl) max = NotControl;
run;

data want_step1_control;
	set time_indicator;
	retain treated;
	if _N_ = 1 then do;
		treated = 0;
		BeforeAfter2 = FocalEvent;
		BeforeAfter3 = FocalEvent;
	end;
	else do;
		BeforeAfter2 = FocalEvent + lag(FocalEvent);
		BeforeAfter3 = FocalEvent + lag(BeforeAfter2);
		if BeforeAfter2 = . then BeforeAfter2 = FocalEvent;
		if BeforeAfter3 = . then BeforeAfter3 = BeforeAfter2;
	end;
	treated = min(treated + FocalEvent, 1);
run;

proc sort data=want_step1_control;
	by descending time;
run;

data want_step2_control;
	set want_step1_control;
	if _N_ =1 then do;
		Before1 = FocalEvent;
		Before2 = FocalEvent;
		Before3 = FocalEvent;
	end;
	else do;
		Before1 = FocalEvent + lag(FocalEvent);
		Before2 = FocalEvent + lag(Before1);
		Before3 = FocalEvent + lag(Before2);
	end;
	Duration2 = min(Before2 + BeforeAfter2, 1);
	Duration3 = min(Before3 + BeforeAfter3, 1);
	drop before1-before3 FocalEvent;
run;
proc sort data=want_step2_control;
	by time;
title "indicator for control";
proc print; run;



data want_step1_not_control;
	merge have control_indicator;
	by ID;
	retain treated;
	if NotControl then do;
		if first.id then do;
			treated = 0;
			BeforeAfter2 = FocalEvent;
			BeforeAfter3 = FocalEvent;
		end;
		else do;
			BeforeAfter2 = FocalEvent + lag(FocalEvent);
			BeforeAfter3 = FocalEvent + lag(BeforeAfter2);
			if BeforeAfter2 = . then BeforeAfter2 = 0;
			if BeforeAfter3 = . then BeforeAfter3 = 0;
		end;
	end;
	else delete;
	treated = min(treated + FocalEvent, 1);
run;

proc sort data=want_step1_not_control;
	by id descending time;
run;

data want_step2_not_control;
	set want_step1_not_control;
	by ID;
	retain treated;
	if NotControl then do;
		if first.id then do;
			Before1 = FocalEvent;
			Before2 = FocalEvent;
			Before3 = FocalEvent;
		end;
		else do;
			Before1 = FocalEvent + lag(FocalEvent);
			Before2 = FocalEvent + lag(Before1);
			Before3 = FocalEvent + lag(Before2);
		end;
		Duration2 = min(Before2 + BeforeAfter2, 1);
		Duration3 = min(Before3 + BeforeAfter3, 1);
	end;
	treated = min(treated + FocalEvent, 1);
	drop before1-before3;
run;

title "indicators for not control";
proc print; run;	


data want_step3_control;
	merge have control_indicator;
	by ID;
	if NotControl then delete;
	treated = 0;
proc sort data= want_step2_control;
	by time;
proc sort data= want_step3_control;
	by time;
data want_step4_control;
	merge want_step2_control want_step3_control;
	by time;
data want;
	set want_step2_not_control want_step4_control;
proc sort data=want;
	by id time;
title "Merged results";
proc print; run;	
r4321
Pyrite | Level 9

 

Arthur, thanks for your attention and help. Still not quite getting what I need out of the code. I think the fault is on my end though as I need to rethink how to approach my dataset. Instead of trying to add variables to widen the dataset and keep it the same number of observations, I think I need to adapt a different approach. Essentially, I think I need to duplicate sets of observations to create windows of treated and non-treated counties around the focal events I am interested in examining. I also like to add a few other variables: (1) a variable that indicates which treatment number this is for a county (i.e., first treatment, second treatment, …) (2) a variable that indicates whether a county was treated within a short time window of a previous time it was treated (=1 if treated within <=5 time periods, 0 otherwise) and (3) a variable that indicates the degree of treatment (I want the focal event variable to be binary in the wantdata but in the have data it is a actually >1 sometimes since it is a count variable), so if focal event was 2, for example, this variable would = 2. See below for example.

 

Also, I should note that is a focal occurred before week 6, the durations for those observations may not be able to get up to the full 5 weeks before and after. For example, if a focal event occurred in a county in week 2, I'll only have the 1 week duration for that focal event. 

 

 

 

 

Thanks again

 

 

 

 

 

Have:

 

Time

ID

County

FocalEvent

1

222

ABC

0

2

222

ABC

0

3

222

ABC

0

4

222

ABC

0

5

222

ABC

1

6

222

ABC

0

7

222

ABC

0

8

222

ABC

0

9

222

ABC

0

10

222

ABC

0

11

222

ABC

0

12

222

ABC

0

13

222

ABC

0

1

333

BBB

0

2

333

BBB

0

3

333

BBB

0

4

333

BBB

0

5

333

BBB

0

6

333

BBB

0

7

333

BBB

0

8

333

BBB

0

9

333

BBB

0

10

333

BBB

0

11

333

BBB

0

12

333

BBB

0

13

333

BBB

0

1

444

CCC

0

2

444

CCC

0

3

444

CCC

0

4

444

CCC

0

5

444

CCC

1

6

444

CCC

0

7

444

CCC

0

8

444

CCC

0

9

444

CCC

0

10

444

CCC

1

11

444

CCC

0

12

444

CCC

0

13

444

CCC

0

1

555

DDD

0

2

555

DDD

0

3

555

DDD

0

4

555

DDD

0

5

555

DDD

0

6

555

DDD

0

7

555

DDD

0

 

 

Want:

 

Time

ID

County

FocalEvent

EventWeek

Duration

BeforeAfter

Treated

Tnumber

Trecent

Tdegree

4

222

ABC

0

5

1

0

1

1

0

0

5

222

ABC

1

5

1

1

1

1

0

0

4

333

BBB

0

5

1

0

0

0

0

0

5

333

BBB

0

5

1

1

0

0

0

0

4

444

CCC

0

5

1

0

1

1

0

0

5

444

CCC

1

5

1

1

1

1

0

0

4

555

DDD

0

5

1

0

0

0

0

0

5

555

DDD

0

5

1

1

0

0

0

0

9

444

CCC

0

10

1

0

1

2

0

0

10

444

CCC

1

10

1

1

1

2

0

0

9

333

BBB

0

10

1

0

0

0

0

0

10

333

BBB

0

10

1

1

0

0

0

0

3

222

ABC

0

5

2

0

1

1

0

0

4

222

ABC

0

5

2

0

1

1

0

0

5

222

ABC

1

5

2

1

1

1

0

0

6

222

ABC

0

5

2

1

1

1

0

0

3

333

BBB

0

5

2

0

0

0

0

0

4

333

BBB

0

5

2

0

0

0

0

0

5

333

BBB

0

5

2

1

0

0

0

0

6

333

BBB

0

5

2

1

0

0

0

0

3

444

CCC

0

5

2

0

1

1

0

0

4

444

CCC

0

5

2

0

1

1

0

0

5

444

CCC

1

5

2

1

1

1

0

0

6

444

CCC

0

5

2

1

1

1

0

0

3

555

DDD

0

5

2

0

0

0

0

0

4

555

DDD

0

5

2

0

0

0

0

0

5

555

DDD

0

5

2

1

0

0

0

0

6

555

DDD

0

5

2

1

0

0

0

0

8

444

CCC

0

10

1

0

1

2

0

0

9

444

CCC

0

10

1

0

1

2

0

0

10

444

CCC

1

10

1

1

1

2

0

0

11

444

CCC

1

10

1

1

1

2

0

0

8

333

BBB

0

10

1

0

0

0

0

0

9

333

BBB

0

10

1

0

0

0

0

0

10

333

BBB

0

10

1

1

0

0

0

0

11

333

BBB

0

10

1

1

0

0

0

0

 

 

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1228 views
  • 2 likes
  • 2 in conversation