BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
haukewiegand
Fluorite | Level 6

Hello,

I am new to SAS (using SAS Enterprise 7.1) and I struggle to solve the following problem:

 

Example Dataset:

data have;
input ID Diff Duration;
datalines;
1 . 10
1 2 13
1 25 14
1 5 5
1 2 7
1 2 9
2 . 3
2 2 5
2 10 7
3 . 4
;

 

Task: I would like to sum up Duration and give an enumeration variable for a Sequence but only if IDs are similar and Diff is not . and ❤️ (so if Diff<3 I call it a Sequence). What makes the task difficult is that when comparing to subsequent rows the enumeration variable 'Sequence' should only continue counting if the next row does not belong to the current sequence (so if Diff is not . and <3) whereas the summation 'Cummulative_Duration' should start form the beginning in that case.

So the result should look like the following table:

 

ID Diff Duration Sequence Cummulative_Duration ;
1 . 10 1 10
1 2 13 1 23
1 25 14 2 14
1 5 5 3 5
1 2 7 3 12
1 2 9 3 21
2 . 3 1 3
2 2 5 1 8
2 10 7 2 7
3 . 4 1 4
;

Thanks in advance for taking your time to help me solving this (I tried already half a day ...)!

1 ACCEPTED SOLUTION

Accepted Solutions
r_behata
Barite | Level 11
data have;
input ID Diff Duration;
datalines;
1 . 10
1 2 13
1 25 14
1 5 5
1 2 7
1 2 9
2 . 3
2 2 5
2 10 7
3 . 4
;
run;


data want;


set have;
	by id;

retain Sequence Cummulative_Duration;

if first.id then do;
	Sequence =1;
	Cummulative_Duration=Duration;
end;

else if Diff < 3 then do;
	Sequence =Sequence;
	Cummulative_Duration+Duration;
end;
	else do;
		Sequence+1;
		Cummulative_Duration=Duration;
	end;


run;

View solution in original post

4 REPLIES 4
r_behata
Barite | Level 11
data have;
input ID Diff Duration;
datalines;
1 . 10
1 2 13
1 25 14
1 5 5
1 2 7
1 2 9
2 . 3
2 2 5
2 10 7
3 . 4
;
run;


data want;


set have;
	by id;

retain Sequence Cummulative_Duration;

if first.id then do;
	Sequence =1;
	Cummulative_Duration=Duration;
end;

else if Diff < 3 then do;
	Sequence =Sequence;
	Cummulative_Duration+Duration;
end;
	else do;
		Sequence+1;
		Cummulative_Duration=Duration;
	end;


run;
novinosrin
Tourmaline | Level 20

Assuming i understand your req:

 

data have;
input ID Diff Duration;
datalines;
1 . 10
1 2 13
1 25 14
1 5 5
1 2 7
1 2 9
2 . 3
2 2 5
2 10 7
3 . 4
;
run;

data want;
if 0 then set have;
sequence=1;
cumulative=0;
do until(last.id);
set have;
by id;
if diff>=3 then do; sequence+1;cumulative=0;end;
cumulative+Duration;
output;
end;
run;
mkeintz
PROC Star

x

 

data have;
input ID Diff Duration;
datalines;
1 . 10
1 2 13
1 25 14
1 5 5
1 2 7
1 2 9
2 . 3
2 2 5
2 10 7
3 . 4
;
data want;
  set have;
  by id;
  if diff<=3 then cumulative_duration+duration;
  else cumulative_duration=duration;
  sequence+(diff>3);
  if first.id then do;
    sequence=1;
    cumulative_duration=duration;
  end;
run;

 

  1. Notice that the "if diff<=3 then cumulative_duration+duration;" statement doesn't have an equal sign, typically needed to assign a value.   This means the expression "cumulative_duration+duration" is a summing expression, not an assignment.  The summing expression means (1) perform the summation, (2) using a RETAINED value of cumulative_duration, and (3) if cumulative_duration is missing, treat it as a zero.
  2. The "sequence+(diff>3)" is a summing statement, adding 1 to sequence whenever diff>3).
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
haukewiegand
Fluorite | Level 6
Thanks for the explanations! Probably that was where my trials failed ...

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 926 views
  • 0 likes
  • 4 in conversation