I have a dataset with a unique ID, a start date, and an end date. I'm trying to calculate the number of days within a given year that the dates fall within.
Year 1 = 01JAN2015 to 31Dec2015
Year 2 = 01JAN2016 to 31DEC2016
I have the dataset:
ID Start END
1 23AUG2014 13DEC2017
2 01JAN2015 31NOV2015
3 01JAN2015 01FEB2016
And I would like to create:
ID Start END year1_count year2_count
1 23AUG2014 13DEC2017 365 365
2 01JAN2015 31NOV2015 335 0
3 01JAN2015 01FEB2016 365 30
DATA test;
input ID startdate date9. enddate date9.;
format startdate enddate date9.;
datalines;
1 23AUG2014 13DEC2017
2 01JAN2015 30NOV2015
3 01JAN2015 01FEB2016
;
RUN;
Play around with min() and max() functions. Note that there never is a 31st of November, and 2016 was a leap year. And the 2nd of February is always the 32nd day of a year.
%let year1=2015;
%let year2=2016;
data have;
input id start :date9. end :date9.;
format start end date9.;
cards;
1 23AUG2014 13DEC2017
2 01JAN2015 30NOV2015
3 01JAN2015 01FEB2016
;
run;
data want;
set have;
year1_count = max(min(end,"31dec&year1"d) - max(start,"01jan&year1"d) + 1,0);
year2_count = max(min(end,"31dec&year2"d) - max(start,"01jan&year2"d) + 1,0);
run;
proc print data=want noobs;
run;
Result:
year1_ year2_ id start end count count 1 23AUG2014 13DEC2017 365 366 2 01JAN2015 30NOV2015 334 0 3 01JAN2015 01FEB2016 365 32
Not tested (post test data in the form of a datastep!):
data inter;
set have;
do i=year(start) to year(end);
days=365;
output;
end;
tmp=ymd(year(end),month(start),day(start));
i+1;
days=intnx('days',tmp,end);
output;
run;
proc transpose data=inter out=want;
by id start end;
var days;
id i;
idlabel i;
run;
So effectively output a row for year diffs, then one row for day diff, then transpose.
Play around with min() and max() functions. Note that there never is a 31st of November, and 2016 was a leap year. And the 2nd of February is always the 32nd day of a year.
%let year1=2015;
%let year2=2016;
data have;
input id start :date9. end :date9.;
format start end date9.;
cards;
1 23AUG2014 13DEC2017
2 01JAN2015 30NOV2015
3 01JAN2015 01FEB2016
;
run;
data want;
set have;
year1_count = max(min(end,"31dec&year1"d) - max(start,"01jan&year1"d) + 1,0);
year2_count = max(min(end,"31dec&year2"d) - max(start,"01jan&year2"d) + 1,0);
run;
proc print data=want noobs;
run;
Result:
year1_ year2_ id start end count count 1 23AUG2014 13DEC2017 365 366 2 01JAN2015 30NOV2015 334 0 3 01JAN2015 01FEB2016 365 32
A better solution is of course this (building from my previous example data):
data want;
set have;
do year = year(start) to year(end);
count = max(min(end,mdy(12,31,year)) - max(start,mdy(1,1,year)) + 1,0);
output;
end;
run;
proc print data=want noobs;
run;
Result:
id start end year count 1 23AUG2014 13DEC2017 2014 131 1 23AUG2014 13DEC2017 2015 365 1 23AUG2014 13DEC2017 2016 366 1 23AUG2014 13DEC2017 2017 347 2 01JAN2015 30NOV2015 2015 334 3 01JAN2015 01FEB2016 2015 365 3 01JAN2015 01FEB2016 2016 32
Now you have data (years) as data, and not in structure (column names).
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.