Hello,
I am new to SAS (using SAS Enterprise 7.1) and I struggle to solve the following problem:
Example Dataset:
data have;
input ID Diff Duration;
datalines;
1 . 10
1 2 13
1 25 14
1 5 5
1 2 7
1 2 9
2 . 3
2 2 5
2 10 7
3 . 4
;
Task: I would like to sum up Duration and give an enumeration variable for a Sequence but only if IDs are similar and Diff is not . and ❤️ (so if Diff<3 I call it a Sequence). What makes the task difficult is that when comparing to subsequent rows the enumeration variable 'Sequence' should only continue counting if the next row does not belong to the current sequence (so if Diff is not . and <3) whereas the summation 'Cummulative_Duration' should start form the beginning in that case.
So the result should look like the following table:
ID Diff Duration Sequence Cummulative_Duration ;
1 . 10 1 10
1 2 13 1 23
1 25 14 2 14
1 5 5 3 5
1 2 7 3 12
1 2 9 3 21
2 . 3 1 3
2 2 5 1 8
2 10 7 2 7
3 . 4 1 4
;
Thanks in advance for taking your time to help me solving this (I tried already half a day ...)!
data have;
input ID Diff Duration;
datalines;
1 . 10
1 2 13
1 25 14
1 5 5
1 2 7
1 2 9
2 . 3
2 2 5
2 10 7
3 . 4
;
run;
data want;
set have;
by id;
retain Sequence Cummulative_Duration;
if first.id then do;
Sequence =1;
Cummulative_Duration=Duration;
end;
else if Diff < 3 then do;
Sequence =Sequence;
Cummulative_Duration+Duration;
end;
else do;
Sequence+1;
Cummulative_Duration=Duration;
end;
run;
data have;
input ID Diff Duration;
datalines;
1 . 10
1 2 13
1 25 14
1 5 5
1 2 7
1 2 9
2 . 3
2 2 5
2 10 7
3 . 4
;
run;
data want;
set have;
by id;
retain Sequence Cummulative_Duration;
if first.id then do;
Sequence =1;
Cummulative_Duration=Duration;
end;
else if Diff < 3 then do;
Sequence =Sequence;
Cummulative_Duration+Duration;
end;
else do;
Sequence+1;
Cummulative_Duration=Duration;
end;
run;
Assuming i understand your req:
data have;
input ID Diff Duration;
datalines;
1 . 10
1 2 13
1 25 14
1 5 5
1 2 7
1 2 9
2 . 3
2 2 5
2 10 7
3 . 4
;
run;
data want;
if 0 then set have;
sequence=1;
cumulative=0;
do until(last.id);
set have;
by id;
if diff>=3 then do; sequence+1;cumulative=0;end;
cumulative+Duration;
output;
end;
run;
x
data have;
input ID Diff Duration;
datalines;
1 . 10
1 2 13
1 25 14
1 5 5
1 2 7
1 2 9
2 . 3
2 2 5
2 10 7
3 . 4
;
data want;
set have;
by id;
if diff<=3 then cumulative_duration+duration;
else cumulative_duration=duration;
sequence+(diff>3);
if first.id then do;
sequence=1;
cumulative_duration=duration;
end;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.