Hi everyone,
I have a problem with counting the number of observations within by group and based on the change in variable value. Let's me clarify it by the data set example:
I want to create a new variable - wanted - based on the number of values 0 and 1 in the variable indicator. The indicator values can change from 0 to 1 and vice versa. I want to count the number of each batch of value 0 or 1. The count need to be based on the by group (ID) as well. If I can add variables showing the start_time and end_time of each batch within by group, it would be awesome.
ID | time | indicator | wanted | start_time | end_time |
1 | 1 | 1 | 4 | 1 | 4 |
1 | 2 | 1 | 4 | 1 | 4 |
1 | 3 | 1 | 4 | 1 | 4 |
1 | 4 | 1 | 4 | 1 | 4 |
1 | 5 | 0 | 3 | 5 | 7 |
1 | 6 | 0 | 3 | 5 | 7 |
1 | 7 | 0 | 3 | 5 | 7 |
1 | 8 | 1 | 4 | 8 | 11 |
1 | 9 | 1 | 4 | 8 | 11 |
1 | 10 | 1 | 4 | 8 | 11 |
1 | 11 | 1 | 4 | 8 | 11 |
2 | 1 | 1 | 6 | 1 | 6 |
2 | 2 | 1 | 6 | 1 | 6 |
2 | 3 | 1 | 6 | 1 | 6 |
2 | 4 | 1 | 6 | 1 | 6 |
2 | 5 | 1 | 6 | 1 | 6 |
2 | 6 | 1 | 6 | 1 | 6 |
2 | 7 | 0 | 5 | 7 | 11 |
2 | 8 | 0 | 5 | 7 | 11 |
2 | 9 | 0 | 5 | 7 | 11 |
2 | 10 | 0 | 5 | 7 | 11 |
2 | 11 | 0 | 5 | 7 | 11 |
2 | 12 | 1 | 4 | 12 | 15 |
2 | 13 | 1 | 4 | 12 | 15 |
2 | 14 | 1 | 4 | 12 | 15 |
2 | 15 | 1 | 4 | 12 | 15 |
Thank you so much in advance for all your support.
Kind regards,
Windy.
Here's one approach
data have ;
input ID Time Indicator ;
cards ;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 0
1 6 0
1 7 0
1 8 1
1 9 1
1 10 1
1 11 1
2 1 1
2 2 1
2 3 1
2 4 1
2 5 1
2 6 1
2 7 0
2 8 0
2 9 0
2 10 0
2 11 0
2 12 1
2 13 1
2 14 1
2 15 1
;
run ;
proc sort data=have ;
by ID Time Indicator ;
data want ;
do until(Last.Indicator) ;
set have ;
by Id Indicator notsorted;
Wanted=sum(Wanted,1) ;
if first.Indicator then Start_Time=Time ;
if last.Indicator then End_Time=Time ;
end ;
do until(Last.Indicator) ;
set Have ;
by Id Indicator notsorted;
output ;
end ;
run ;
Here's one approach
data have ;
input ID Time Indicator ;
cards ;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 0
1 6 0
1 7 0
1 8 1
1 9 1
1 10 1
1 11 1
2 1 1
2 2 1
2 3 1
2 4 1
2 5 1
2 6 1
2 7 0
2 8 0
2 9 0
2 10 0
2 11 0
2 12 1
2 13 1
2 14 1
2 15 1
;
run ;
proc sort data=have ;
by ID Time Indicator ;
data want ;
do until(Last.Indicator) ;
set have ;
by Id Indicator notsorted;
Wanted=sum(Wanted,1) ;
if first.Indicator then Start_Time=Time ;
if last.Indicator then End_Time=Time ;
end ;
do until(Last.Indicator) ;
set Have ;
by Id Indicator notsorted;
output ;
end ;
run ;
data have ;
input ID Time Indicator ;
cards ;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 0
1 6 0
1 7 0
1 8 1
1 9 1
1 10 1
1 11 1
2 1 1
2 2 1
2 3 1
2 4 1
2 5 1
2 6 1
2 7 0
2 8 0
2 9 0
2 10 0
2 11 0
2 12 1
2 13 1
2 14 1
2 15 1
;
run ;
data temp;
set have ;
by Id Indicator notsorted;
if first.Indicator then group+1;
run;
proc sql;
create table want as
select *,count(*) as want,
min(Time) as start_time,
max(Time) as end_time
from temp
group by group;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.