DATA Step, Macro, Functions and more

Calculated 'end' and use as 'start' for next observation

Accepted Solution Solved
Reply
Contributor
Posts: 39
Accepted Solution

Calculated 'end' and use as 'start' for next observation

Hi,

 

I could really use some help. I have the following Excel-data

 Input

date id vendorid start in out end
01-01-2017 1000001 001 177893 4800 3100  
01-02-2017 1000001 001   4400 2800  
01-03-2017 1000001 001   4100 2800  
01-04-2017 1000001 001   4300 2700  
01-05-2017 1000001 001   3500 2500  
01-01-2017 1000001 002        
01-02-2017 1000001 002        
01-03-2017 1000001 002        
01-04-2017 1000001 002        
01-05-2017 1000001 002        
01-01-2017 1000002 001 5000 550 500  
01-02-2017 1000002 001   152 250  
01-03-2017 1000002 001   354 354  
01-04-2017 1000002 001   1234 986  
01-05-2017 1000002 001   854 785  

 

I want to:

1. Calculate 'end' --> = start + (in - out)

2. use the 'end' as 'start' in the next observation (within the same id & vendorid)

 

Output 

date id vendorid start in out end
01-01-2017 1000001 001 177893 4800 3100 179593
01-02-2017 1000001 001 179593 4400 2800 181193
01-03-2017 1000001 001 181193 4100 2800 182493
01-04-2017 1000001 001 182493 4300 2700 184093
01-05-2017 1000001 001 184093 3500 2500 185093
01-01-2017 1000001 002        
01-02-2017 1000001 002        
01-03-2017 1000001 002        
01-04-2017 1000001 002        
01-05-2017 1000001 002        
01-01-2017 1000002 001 5000 550 500 5050
01-02-2017 1000002 001 5050 152 250 4952
01-03-2017 1000002 001 4952 354 354 4952
01-04-2017 1000002 001 4952 1234 986 5200
01-05-2017 1000002 001 5200 854 785 5269

 

 

What is the best way to do this in SAS? Normally i would provide my code, but my results so far aren't even close.

 

Thank you for your time and advise.

 

Matthijs


Accepted Solutions
Solution
‎06-24-2016 09:29 AM
Super User
Posts: 7,773

Re: Calculated 'end' and use as 'start' for next observation

data want (drop=prev_end);
set have;
by id;
retain prev_end;
if not first.id
then start = prev_end;
end = start + in - out;
prev_end = end;
run;
  
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Solution
‎06-24-2016 09:29 AM
Super User
Posts: 7,773

Re: Calculated 'end' and use as 'start' for next observation

data want (drop=prev_end);
set have;
by id;
retain prev_end;
if not first.id
then start = prev_end;
end = start + in - out;
prev_end = end;
run;
  
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 39

Re: Calculated 'end' and use as 'start' for next observation

Posted in reply to KurtBremser

Thank you!

Now I know what i did wrong in my attempts. I made it way too complex.

Occasional Contributor
Posts: 13

Re: Calculated 'end' and use as 'start' for next observation

Posted in reply to KurtBremser
I was too slow on the draw.

The only thing I would add to the solution is to make sure your data is properly sorted. Specifically be careful with date sorting. Based on the current format 02-03-2016 would be sorted as earlier than 11-03-2010.
Super Contributor
Posts: 308

Re: Calculated 'end' and use as 'start' for next observation

Hello,

 

Assuming your data is correctly sorted:

 

data want;
set have;
by id vendorid;

if first.vendorid then end_keep=start;
end_keep+in-out;

run;
Occasional Contributor
Posts: 13

Re: Calculated 'end' and use as 'start' for next observation

Here's a solution that works for the situation.  One thing to note about it: when reading in the date variable I converted it to numeric to make  sure it would always be sorted chronologically in the sort statement (otherwise we would have, for example, 02-03-2016 being considered earlier than 11-03-2010).

 

data ds1;
	infile datalines missover;
	input date id vendorid in out start;
	informat date mmddyy10.;
	format date date9.;

datalines;
01-01-2017	1000001	001 4800 3100 177893	 
01-02-2017	1000001	001 4400 2800	 
01-03-2017	1000001	001 4100 2800	 
01-04-2017	1000001	001 4300 2700	 
01-05-2017	1000001	001 3500 2500	 
01-01-2017	1000001	002	 	 	 	 
01-02-2017	1000001	002	 	 	 	 
01-03-2017	1000001	002	 	 	 	 
01-04-2017	1000001	002	 	 	 	 
01-05-2017	1000001	002	 	 	 	 
01-01-2017	1000002	001 550 500 5000 
01-02-2017	1000002	001 152 250	 
01-03-2017	1000002	001 354	354	 
01-04-2017	1000002	001 1234 986	 
01-05-2017	1000002	001 854	785	 
;
run;

proc sort data=ds1; by id vendorid date; run;

data ds2(drop=currend);
	retain currend;
	set ds1;
	by id vendorid;

	if first.vendorid then
		do;
			currend=.;
		end;
	else
		do;
			start=currend;
		end;

	if ^missing(in) and ^missing(out) then 
		do;
			currend=start+(in-out);
			end=currend;
		end;

	output;
run;
	
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 347 views
  • 1 like
  • 4 in conversation