I would like sas to subtract the second start date minus the first end date based on ID. For example there is an ID with the number 1 with two different start dates and end dates. I would like for ID 1 to subtract the SECOND start date (17May2017) from the FIRST end date (09MAR2015). This was from an excel file. Please see file below. I would prefer SAS code as opposed to sql. I am new to sas and I do not know where to begin. Also the original data has about 4K IDs. I have SAS EG 7.15. Any help would be greatly appreciated. Thanks.
startYear | ID | Duplicates | startDate | endDate | 2nd Start Date-First End Date | |
1 | 2012 | 1 | 2 | 04Sep2014 | 09Mar2015 | D2-E1 |
2 | 2018 | 1 | 17May2017 | 26Nov2017 | ||
3 | 2012 | 2 | 2 | 03Jan2014 | 15Jun2014 | D5-E4 |
4 | 2018 | 2 | 26May2017 | 08Nov2017 | ||
5 | 2016 | 3 | 2 | 03Jan2015 | 16Jun2015 | D7-E6 |
6 | 2017 | 3 | 02May2017 | 14Oct2017 |
Assuming you have only two records for each ID and everything is sorted properly
UNTESTED CODE
data want;
set have;
by id;
prev_enddate=lag(enddate);
if last.id then delta=startdate-prev_enddate;
run;
Are your ID values consistently sets of 2 like your sample?
No they are 2,3,4,5
Assuming you have only two records for each ID and everything is sorted properly
UNTESTED CODE
data want;
set have;
by id;
prev_enddate=lag(enddate);
if last.id then delta=startdate-prev_enddate;
run;
Well then, I would also go with the same assumption 🙂
data have;
infile cards truncover;
input obs startYear ID Duplicates (startDate endDate) (:date9.);
format startDate endDate date9.;
cards;
1 2012 1 2 04Sep2014 09Mar2015
2 2018 1 . 17May2017 26Nov2017
3 2012 2 2 03Jan2014 15Jun2014
4 2018 2 . 26May2017 08Nov2017
5 2016 3 2 03Jan2015 16Jun2015
6 2017 3 . 02May2017
;
proc sql;
create table want as
select *, max(startDate) - min(endDate)+Duplicates-Duplicates as delta
from have
group by id
order by id, startYear;
quit;
This program calculates differences for the entire sequence of records, but sets the difference to missing for the first record of each id:
data have;
infile cards truncover;
input obs startYear ID Duplicates (startDate endDate) (:date9.);
format startDate endDate date9.;
cards;
1 2012 1 2 04Sep2014 09Mar2015
2 2018 1 . 17May2017 26Nov2017
3 2012 2 2 03Jan2014 15Jun2014
4 2018 2 . 26May2017 08Nov2017
5 2016 3 2 03Jan2015 16Jun2015
6 2017 3 . 02May2017
;
data want;
set have;
by id;
difference=startdate-lag(enddate);
if first.id then difference=.;
run;
This works also! Thank you!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.