Hi everyone,
I have a problem with counting the number of observations within by group and based on the change in variable value. Let's me clarify it by the data set example:
I want to create a new variable - wanted - based on the number of values 0 and 1 in the variable indicator. The indicator values can change from 0 to 1 and vice versa. I want to count the number of each batch of value 0 or 1. The count need to be based on the by group (ID) as well. If I can add variables showing the start_time and end_time of each batch within by group, it would be awesome.
ID | time | indicator | wanted | start_time | end_time |
1 | 1 | 1 | 4 | 1 | 4 |
1 | 2 | 1 | 4 | 1 | 4 |
1 | 3 | 1 | 4 | 1 | 4 |
1 | 4 | 1 | 4 | 1 | 4 |
1 | 5 | 0 | 3 | 5 | 7 |
1 | 6 | 0 | 3 | 5 | 7 |
1 | 7 | 0 | 3 | 5 | 7 |
1 | 8 | 1 | 4 | 8 | 11 |
1 | 9 | 1 | 4 | 8 | 11 |
1 | 10 | 1 | 4 | 8 | 11 |
1 | 11 | 1 | 4 | 8 | 11 |
2 | 1 | 1 | 6 | 1 | 6 |
2 | 2 | 1 | 6 | 1 | 6 |
2 | 3 | 1 | 6 | 1 | 6 |
2 | 4 | 1 | 6 | 1 | 6 |
2 | 5 | 1 | 6 | 1 | 6 |
2 | 6 | 1 | 6 | 1 | 6 |
2 | 7 | 0 | 5 | 7 | 11 |
2 | 8 | 0 | 5 | 7 | 11 |
2 | 9 | 0 | 5 | 7 | 11 |
2 | 10 | 0 | 5 | 7 | 11 |
2 | 11 | 0 | 5 | 7 | 11 |
2 | 12 | 1 | 4 | 12 | 15 |
2 | 13 | 1 | 4 | 12 | 15 |
2 | 14 | 1 | 4 | 12 | 15 |
2 | 15 | 1 | 4 | 12 | 15 |
Thank you so much in advance for all your support.
Kind regards,
Windy.
Here's one approach
data have ;
input ID Time Indicator ;
cards ;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 0
1 6 0
1 7 0
1 8 1
1 9 1
1 10 1
1 11 1
2 1 1
2 2 1
2 3 1
2 4 1
2 5 1
2 6 1
2 7 0
2 8 0
2 9 0
2 10 0
2 11 0
2 12 1
2 13 1
2 14 1
2 15 1
;
run ;
proc sort data=have ;
by ID Time Indicator ;
data want ;
do until(Last.Indicator) ;
set have ;
by Id Indicator notsorted;
Wanted=sum(Wanted,1) ;
if first.Indicator then Start_Time=Time ;
if last.Indicator then End_Time=Time ;
end ;
do until(Last.Indicator) ;
set Have ;
by Id Indicator notsorted;
output ;
end ;
run ;
Here's one approach
data have ;
input ID Time Indicator ;
cards ;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 0
1 6 0
1 7 0
1 8 1
1 9 1
1 10 1
1 11 1
2 1 1
2 2 1
2 3 1
2 4 1
2 5 1
2 6 1
2 7 0
2 8 0
2 9 0
2 10 0
2 11 0
2 12 1
2 13 1
2 14 1
2 15 1
;
run ;
proc sort data=have ;
by ID Time Indicator ;
data want ;
do until(Last.Indicator) ;
set have ;
by Id Indicator notsorted;
Wanted=sum(Wanted,1) ;
if first.Indicator then Start_Time=Time ;
if last.Indicator then End_Time=Time ;
end ;
do until(Last.Indicator) ;
set Have ;
by Id Indicator notsorted;
output ;
end ;
run ;
data have ;
input ID Time Indicator ;
cards ;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 0
1 6 0
1 7 0
1 8 1
1 9 1
1 10 1
1 11 1
2 1 1
2 2 1
2 3 1
2 4 1
2 5 1
2 6 1
2 7 0
2 8 0
2 9 0
2 10 0
2 11 0
2 12 1
2 13 1
2 14 1
2 15 1
;
run ;
data temp;
set have ;
by Id Indicator notsorted;
if first.Indicator then group+1;
run;
proc sql;
create table want as
select *,count(*) as want,
min(Time) as start_time,
max(Time) as end_time
from temp
group by group;
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.