I have a dataset with a unique ID, a start date, and an end date. I'm trying to calculate the number of days within a given year that the dates fall within.
Year 1 = 01JAN2015 to 31Dec2015
Year 2 = 01JAN2016 to 31DEC2016
I have the dataset:
ID Start END
1 23AUG2014 13DEC2017
2 01JAN2015 31NOV2015
3 01JAN2015 01FEB2016
And I would like to create:
ID Start END year1_count year2_count
1 23AUG2014 13DEC2017 365 365
2 01JAN2015 31NOV2015 335 0
3 01JAN2015 01FEB2016 365 30
DATA test;
input ID startdate date9. enddate date9.;
format startdate enddate date9.;
datalines;
1 23AUG2014 13DEC2017
2 01JAN2015 30NOV2015
3 01JAN2015 01FEB2016
;
RUN;
Play around with min() and max() functions. Note that there never is a 31st of November, and 2016 was a leap year. And the 2nd of February is always the 32nd day of a year.
%let year1=2015;
%let year2=2016;
data have;
input id start :date9. end :date9.;
format start end date9.;
cards;
1 23AUG2014 13DEC2017
2 01JAN2015 30NOV2015
3 01JAN2015 01FEB2016
;
run;
data want;
set have;
year1_count = max(min(end,"31dec&year1"d) - max(start,"01jan&year1"d) + 1,0);
year2_count = max(min(end,"31dec&year2"d) - max(start,"01jan&year2"d) + 1,0);
run;
proc print data=want noobs;
run;
Result:
year1_ year2_ id start end count count 1 23AUG2014 13DEC2017 365 366 2 01JAN2015 30NOV2015 334 0 3 01JAN2015 01FEB2016 365 32
Not tested (post test data in the form of a datastep!):
data inter; set have; do i=year(start) to year(end); days=365; output; end; tmp=ymd(year(end),month(start),day(start)); i+1; days=intnx('days',tmp,end); output; run; proc transpose data=inter out=want; by id start end; var days; id i; idlabel i; run;
So effectively output a row for year diffs, then one row for day diff, then transpose.
Play around with min() and max() functions. Note that there never is a 31st of November, and 2016 was a leap year. And the 2nd of February is always the 32nd day of a year.
%let year1=2015;
%let year2=2016;
data have;
input id start :date9. end :date9.;
format start end date9.;
cards;
1 23AUG2014 13DEC2017
2 01JAN2015 30NOV2015
3 01JAN2015 01FEB2016
;
run;
data want;
set have;
year1_count = max(min(end,"31dec&year1"d) - max(start,"01jan&year1"d) + 1,0);
year2_count = max(min(end,"31dec&year2"d) - max(start,"01jan&year2"d) + 1,0);
run;
proc print data=want noobs;
run;
Result:
year1_ year2_ id start end count count 1 23AUG2014 13DEC2017 365 366 2 01JAN2015 30NOV2015 334 0 3 01JAN2015 01FEB2016 365 32
A better solution is of course this (building from my previous example data):
data want;
set have;
do year = year(start) to year(end);
count = max(min(end,mdy(12,31,year)) - max(start,mdy(1,1,year)) + 1,0);
output;
end;
run;
proc print data=want noobs;
run;
Result:
id start end year count 1 23AUG2014 13DEC2017 2014 131 1 23AUG2014 13DEC2017 2015 365 1 23AUG2014 13DEC2017 2016 366 1 23AUG2014 13DEC2017 2017 347 2 01JAN2015 30NOV2015 2015 334 3 01JAN2015 01FEB2016 2015 365 3 01JAN2015 01FEB2016 2016 32
Now you have data (years) as data, and not in structure (column names).
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.