I would like sas to subtract the second start date minus the first end date based on ID. For example there is an ID with the number 1 with two different start dates and end dates. I would like for ID 1 to subtract the SECOND start date (17May2017) from the FIRST end date (09MAR2015). This was from an excel file. Please see file below. I would prefer SAS code as opposed to sql. I am new to sas and I do not know where to begin. Also the original data has about 4K IDs. I have SAS EG 7.15. Any help would be greatly appreciated. Thanks.
startYear | ID | Duplicates | startDate | endDate | 2nd Start Date-First End Date | |
1 | 2012 | 1 | 2 | 04Sep2014 | 09Mar2015 | D2-E1 |
2 | 2018 | 1 | 17May2017 | 26Nov2017 | ||
3 | 2012 | 2 | 2 | 03Jan2014 | 15Jun2014 | D5-E4 |
4 | 2018 | 2 | 26May2017 | 08Nov2017 | ||
5 | 2016 | 3 | 2 | 03Jan2015 | 16Jun2015 | D7-E6 |
6 | 2017 | 3 | 02May2017 | 14Oct2017 |
Assuming you have only two records for each ID and everything is sorted properly
UNTESTED CODE
data want;
set have;
by id;
prev_enddate=lag(enddate);
if last.id then delta=startdate-prev_enddate;
run;
Are your ID values consistently sets of 2 like your sample?
No they are 2,3,4,5
Assuming you have only two records for each ID and everything is sorted properly
UNTESTED CODE
data want;
set have;
by id;
prev_enddate=lag(enddate);
if last.id then delta=startdate-prev_enddate;
run;
Well then, I would also go with the same assumption 🙂
data have;
infile cards truncover;
input obs startYear ID Duplicates (startDate endDate) (:date9.);
format startDate endDate date9.;
cards;
1 2012 1 2 04Sep2014 09Mar2015
2 2018 1 . 17May2017 26Nov2017
3 2012 2 2 03Jan2014 15Jun2014
4 2018 2 . 26May2017 08Nov2017
5 2016 3 2 03Jan2015 16Jun2015
6 2017 3 . 02May2017
;
proc sql;
create table want as
select *, max(startDate) - min(endDate)+Duplicates-Duplicates as delta
from have
group by id
order by id, startYear;
quit;
This program calculates differences for the entire sequence of records, but sets the difference to missing for the first record of each id:
data have;
infile cards truncover;
input obs startYear ID Duplicates (startDate endDate) (:date9.);
format startDate endDate date9.;
cards;
1 2012 1 2 04Sep2014 09Mar2015
2 2018 1 . 17May2017 26Nov2017
3 2012 2 2 03Jan2014 15Jun2014
4 2018 2 . 26May2017 08Nov2017
5 2016 3 2 03Jan2015 16Jun2015
6 2017 3 . 02May2017
;
data want;
set have;
by id;
difference=startdate-lag(enddate);
if first.id then difference=.;
run;
This works also! Thank you!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.