Hi everyone,
I have a problem with counting the number of observations within by group and based on the change in variable value. Let's me clarify it by the data set example:
I want to create a new variable - wanted - based on the number of values 0 and 1 in the variable indicator. The indicator values can change from 0 to 1 and vice versa. I want to count the number of each batch of value 0 or 1. The count need to be based on the by group (ID) as well. If I can add variables showing the start_time and end_time of each batch within by group, it would be awesome.
ID | time | indicator | wanted | start_time | end_time |
1 | 1 | 1 | 4 | 1 | 4 |
1 | 2 | 1 | 4 | 1 | 4 |
1 | 3 | 1 | 4 | 1 | 4 |
1 | 4 | 1 | 4 | 1 | 4 |
1 | 5 | 0 | 3 | 5 | 7 |
1 | 6 | 0 | 3 | 5 | 7 |
1 | 7 | 0 | 3 | 5 | 7 |
1 | 8 | 1 | 4 | 8 | 11 |
1 | 9 | 1 | 4 | 8 | 11 |
1 | 10 | 1 | 4 | 8 | 11 |
1 | 11 | 1 | 4 | 8 | 11 |
2 | 1 | 1 | 6 | 1 | 6 |
2 | 2 | 1 | 6 | 1 | 6 |
2 | 3 | 1 | 6 | 1 | 6 |
2 | 4 | 1 | 6 | 1 | 6 |
2 | 5 | 1 | 6 | 1 | 6 |
2 | 6 | 1 | 6 | 1 | 6 |
2 | 7 | 0 | 5 | 7 | 11 |
2 | 8 | 0 | 5 | 7 | 11 |
2 | 9 | 0 | 5 | 7 | 11 |
2 | 10 | 0 | 5 | 7 | 11 |
2 | 11 | 0 | 5 | 7 | 11 |
2 | 12 | 1 | 4 | 12 | 15 |
2 | 13 | 1 | 4 | 12 | 15 |
2 | 14 | 1 | 4 | 12 | 15 |
2 | 15 | 1 | 4 | 12 | 15 |
Thank you so much in advance for all your support.
Kind regards,
Windy.
Here's one approach
data have ;
input ID Time Indicator ;
cards ;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 0
1 6 0
1 7 0
1 8 1
1 9 1
1 10 1
1 11 1
2 1 1
2 2 1
2 3 1
2 4 1
2 5 1
2 6 1
2 7 0
2 8 0
2 9 0
2 10 0
2 11 0
2 12 1
2 13 1
2 14 1
2 15 1
;
run ;
proc sort data=have ;
by ID Time Indicator ;
data want ;
do until(Last.Indicator) ;
set have ;
by Id Indicator notsorted;
Wanted=sum(Wanted,1) ;
if first.Indicator then Start_Time=Time ;
if last.Indicator then End_Time=Time ;
end ;
do until(Last.Indicator) ;
set Have ;
by Id Indicator notsorted;
output ;
end ;
run ;
Here's one approach
data have ;
input ID Time Indicator ;
cards ;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 0
1 6 0
1 7 0
1 8 1
1 9 1
1 10 1
1 11 1
2 1 1
2 2 1
2 3 1
2 4 1
2 5 1
2 6 1
2 7 0
2 8 0
2 9 0
2 10 0
2 11 0
2 12 1
2 13 1
2 14 1
2 15 1
;
run ;
proc sort data=have ;
by ID Time Indicator ;
data want ;
do until(Last.Indicator) ;
set have ;
by Id Indicator notsorted;
Wanted=sum(Wanted,1) ;
if first.Indicator then Start_Time=Time ;
if last.Indicator then End_Time=Time ;
end ;
do until(Last.Indicator) ;
set Have ;
by Id Indicator notsorted;
output ;
end ;
run ;
data have ;
input ID Time Indicator ;
cards ;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 0
1 6 0
1 7 0
1 8 1
1 9 1
1 10 1
1 11 1
2 1 1
2 2 1
2 3 1
2 4 1
2 5 1
2 6 1
2 7 0
2 8 0
2 9 0
2 10 0
2 11 0
2 12 1
2 13 1
2 14 1
2 15 1
;
run ;
data temp;
set have ;
by Id Indicator notsorted;
if first.Indicator then group+1;
run;
proc sql;
create table want as
select *,count(*) as want,
min(Time) as start_time,
max(Time) as end_time
from temp
group by group;
quit;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.