Hi,
I could really use some help. I have the following Excel-data
Input
date | id | vendorid | start | in | out | end |
01-01-2017 | 1000001 | 001 | 177893 | 4800 | 3100 | |
01-02-2017 | 1000001 | 001 | 4400 | 2800 | ||
01-03-2017 | 1000001 | 001 | 4100 | 2800 | ||
01-04-2017 | 1000001 | 001 | 4300 | 2700 | ||
01-05-2017 | 1000001 | 001 | 3500 | 2500 | ||
01-01-2017 | 1000001 | 002 | ||||
01-02-2017 | 1000001 | 002 | ||||
01-03-2017 | 1000001 | 002 | ||||
01-04-2017 | 1000001 | 002 | ||||
01-05-2017 | 1000001 | 002 | ||||
01-01-2017 | 1000002 | 001 | 5000 | 550 | 500 | |
01-02-2017 | 1000002 | 001 | 152 | 250 | ||
01-03-2017 | 1000002 | 001 | 354 | 354 | ||
01-04-2017 | 1000002 | 001 | 1234 | 986 | ||
01-05-2017 | 1000002 | 001 | 854 | 785 |
I want to:
1. Calculate 'end' --> = start + (in - out)
2. use the 'end' as 'start' in the next observation (within the same id & vendorid)
Output
date | id | vendorid | start | in | out | end |
01-01-2017 | 1000001 | 001 | 177893 | 4800 | 3100 | 179593 |
01-02-2017 | 1000001 | 001 | 179593 | 4400 | 2800 | 181193 |
01-03-2017 | 1000001 | 001 | 181193 | 4100 | 2800 | 182493 |
01-04-2017 | 1000001 | 001 | 182493 | 4300 | 2700 | 184093 |
01-05-2017 | 1000001 | 001 | 184093 | 3500 | 2500 | 185093 |
01-01-2017 | 1000001 | 002 | ||||
01-02-2017 | 1000001 | 002 | ||||
01-03-2017 | 1000001 | 002 | ||||
01-04-2017 | 1000001 | 002 | ||||
01-05-2017 | 1000001 | 002 | ||||
01-01-2017 | 1000002 | 001 | 5000 | 550 | 500 | 5050 |
01-02-2017 | 1000002 | 001 | 5050 | 152 | 250 | 4952 |
01-03-2017 | 1000002 | 001 | 4952 | 354 | 354 | 4952 |
01-04-2017 | 1000002 | 001 | 4952 | 1234 | 986 | 5200 |
01-05-2017 | 1000002 | 001 | 5200 | 854 | 785 | 5269 |
What is the best way to do this in SAS? Normally i would provide my code, but my results so far aren't even close.
Thank you for your time and advise.
Matthijs
data want (drop=prev_end);
set have;
by id;
retain prev_end;
if not first.id
then start = prev_end;
end = start + in - out;
prev_end = end;
run;
data want (drop=prev_end);
set have;
by id;
retain prev_end;
if not first.id
then start = prev_end;
end = start + in - out;
prev_end = end;
run;
Thank you!
Now I know what i did wrong in my attempts. I made it way too complex.
Hello,
Assuming your data is correctly sorted:
data want;
set have;
by id vendorid;
if first.vendorid then end_keep=start;
end_keep+in-out;
run;
Here's a solution that works for the situation. One thing to note about it: when reading in the date variable I converted it to numeric to make sure it would always be sorted chronologically in the sort statement (otherwise we would have, for example, 02-03-2016 being considered earlier than 11-03-2010).
data ds1;
infile datalines missover;
input date id vendorid in out start;
informat date mmddyy10.;
format date date9.;
datalines;
01-01-2017 1000001 001 4800 3100 177893
01-02-2017 1000001 001 4400 2800
01-03-2017 1000001 001 4100 2800
01-04-2017 1000001 001 4300 2700
01-05-2017 1000001 001 3500 2500
01-01-2017 1000001 002
01-02-2017 1000001 002
01-03-2017 1000001 002
01-04-2017 1000001 002
01-05-2017 1000001 002
01-01-2017 1000002 001 550 500 5000
01-02-2017 1000002 001 152 250
01-03-2017 1000002 001 354 354
01-04-2017 1000002 001 1234 986
01-05-2017 1000002 001 854 785
;
run;
proc sort data=ds1; by id vendorid date; run;
data ds2(drop=currend);
retain currend;
set ds1;
by id vendorid;
if first.vendorid then
do;
currend=.;
end;
else
do;
start=currend;
end;
if ^missing(in) and ^missing(out) then
do;
currend=start+(in-out);
end=currend;
end;
output;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.