Hello all, I have code that allows me to reorganize my data (companies observed each year 2006-2015) based on a variable. The variable is an event-based variable that takes on values from 0-1 (1 if it happened). When (and/or when) the variable takes on a 1, it collapses the rows and sum the values on other relevant variables. It also creates different indicators where: (1) one variable indicates when time starts (2) one variable that indicates when the time ends, and (3) a variable that indicates 0/1 whether an event took place at all. Essentially, each observation of a case becomes a start-stop as to when the event-based variable was observed. I Illustrate below what I have: Have: Time ID County FocalEvent X1mva X2police 1 222 ABC 1 10 15 2 222 ABC 0 5 10 3 222 ABC 0 15 5 4 222 ABC 0 2 4 5 222 ABC 1 5 10 6 222 ABC 0 10 5 7 222 ABC 0 5 10 8 222 ABC 0 4 4 9 222 ABC 0 5 5 10 222 ABC 1 10 10 1 333 BBB 0 3 4 2 333 BBB 0 2 1 3 333 BBB 0 5 5 4 333 BBB 1 10 5 5 333 BBB 0 15 15 6 333 BBB 0 10 10 7 333 BBB 0 10 10 8 333 BBB 1 5 5 9 333 BBB 0 10 10 10 333 BBB 0 15 15 1 444 CCC 0 1 1 2 444 CCC 0 5 5 3 444 CCC 0 4 4 4 444 CCC 0 5 5 5 444 CCC 0 1 1 6 444 CCC 1 5 5 7 444 CCC 0 10 10 8 444 CCC 0 5 5 9 444 CCC 0 5 5 10 444 CCC 0 5 5 1 555 DDD 0 1 1 2 555 DDD 0 1 1 3 555 DDD 0 1 1 Want: ID County X1mva X2police Time1 Time2 Event 222 ABC 10 15 1 1 1 222 ABC 27 29 1 5 1 222 ABC 34 34 5 10 1 333 BBB 20 15 1 4 1 333 BBB 40 40 4 8 1 333 BBB 25 25 8 10 0 444 CCC 21 21 1 6 1 444 CCC 25 25 6 10 0 555 DDD 3 3 1 3 0 Code: data have;
input Time ID County :$ FocalEvent X1mva X2police;
cards;
1 222 ABC 1 10 15
2 222 ABC 0 5 10
3 222 ABC 0 15 5
4 222 ABC 0 2 4
5 222 ABC 1 5 10
6 222 ABC 0 10 5
7 222 ABC 0 5 10
8 222 ABC 0 4 4
9 222 ABC 0 5 5
10 222 ABC 1 10 10
1 333 BBB 0 3 4
2 333 BBB 0 2 1
3 333 BBB 0 5 5
4 333 BBB 1 10 5
5 333 BBB 0 15 15
6 333 BBB 0 10 10
7 333 BBB 0 10 10
8 333 BBB 1 5 5
9 333 BBB 0 10 10
10 333 BBB 0 15 15
1 444 CCC 0 1 1
2 444 CCC 0 5 5
3 444 CCC 0 4 4
4 444 CCC 0 5 5
5 444 CCC 0 1 1
6 444 CCC 1 5 5
7 444 CCC 0 10 10
8 444 CCC 0 5 5
9 444 CCC 0 5 5
10 444 CCC 0 5 5
1 555 DDD 0 1 1
2 555 DDD 0 1 1
3 555 DDD 0 1 1
;
run;
proc sort data = have;
by ID County Time;
run;
data want1;
set have;
by ID;
category + first.ID;
output;
category + FocalEvent;
run;
proc sql;
create table want2 as
select
ID, County,
sum(X1mva) as X1mva,
sum(X2police) as X2police,
max(min(Time)-1,1) as Time1,
max(Time) as Time2,
max(FocalEvent) as Event
from want1
group by
ID, County, category
order by
ID, County, Time1, Time2
;
quit;
proc print;
run; In practice, the variables (e.g., X1MVA, X2POLICE) that I sum have a lot of zeros on them and the time based events are spread out and more sparse. Is there a way I can add to my code such that I can count the time since the latest occurrence of the X1MVA or X2POLICE variables that I sum. For instance: Have: Time ID County FocalEvent X1mva X2police 1 222 ABC 1 0 1 2 222 ABC 0 1 0 3 222 ABC 0 0 1 4 222 ABC 0 0 0 5 222 ABC 1 0 0 6 222 ABC 0 0 1 7 222 ABC 0 0 0 8 222 ABC 0 0 0 9 222 ABC 0 1 0 10 222 ABC 1 1 0 1 333 BBB 0 0 1 2 333 BBB 0 0 1 3 333 BBB 0 1 0 4 333 BBB 1 0 0 5 333 BBB 0 1 0 6 333 BBB 0 0 0 7 333 BBB 0 0 0 8 333 BBB 1 0 1 9 333 BBB 0 0 1 10 333 BBB 0 1 0 Want: ID County X1mva X2police timesinceX1 timesinceX2 Time1 Time2 Event 222 ABC 0 1 0 0 1 1 1 222 ABC 1 1 3 2 1 5 1 222 ABC 2 1 0 4 5 10 1 333 BBB 1 2 1 2 1 4 1 333 BBB 1 1 3 0 4 8 1 333 BBB 1 1 0 1 8 10 0 Thanks in advance and I greatly appreciate you taking a look.
... View more