I have a dataset with a unique ID, a start date, and an end date. I'm trying to calculate the number of days within a given year that the dates fall within.
Year 1 = 01JAN2015 to 31Dec2015
Year 2 = 01JAN2016 to 31DEC2016
I have the dataset:
ID Start END
1 23AUG2014 13DEC2017
2 01JAN2015 31NOV2015
3 01JAN2015 01FEB2016
And I would like to create:
ID Start END year1_count year2_count
1 23AUG2014 13DEC2017 365 365
2 01JAN2015 31NOV2015 335 0
3 01JAN2015 01FEB2016 365 30
DATA test;
input ID startdate date9. enddate date9.;
format startdate enddate date9.;
datalines;
1 23AUG2014 13DEC2017
2 01JAN2015 30NOV2015
3 01JAN2015 01FEB2016
;
RUN;
Play around with min() and max() functions. Note that there never is a 31st of November, and 2016 was a leap year. And the 2nd of February is always the 32nd day of a year.
%let year1=2015;
%let year2=2016;
data have;
input id start :date9. end :date9.;
format start end date9.;
cards;
1 23AUG2014 13DEC2017
2 01JAN2015 30NOV2015
3 01JAN2015 01FEB2016
;
run;
data want;
set have;
year1_count = max(min(end,"31dec&year1"d) - max(start,"01jan&year1"d) + 1,0);
year2_count = max(min(end,"31dec&year2"d) - max(start,"01jan&year2"d) + 1,0);
run;
proc print data=want noobs;
run;
Result:
year1_ year2_ id start end count count 1 23AUG2014 13DEC2017 365 366 2 01JAN2015 30NOV2015 334 0 3 01JAN2015 01FEB2016 365 32
Not tested (post test data in the form of a datastep!):
data inter; set have; do i=year(start) to year(end); days=365; output; end; tmp=ymd(year(end),month(start),day(start)); i+1; days=intnx('days',tmp,end); output; run; proc transpose data=inter out=want; by id start end; var days; id i; idlabel i; run;
So effectively output a row for year diffs, then one row for day diff, then transpose.
Play around with min() and max() functions. Note that there never is a 31st of November, and 2016 was a leap year. And the 2nd of February is always the 32nd day of a year.
%let year1=2015;
%let year2=2016;
data have;
input id start :date9. end :date9.;
format start end date9.;
cards;
1 23AUG2014 13DEC2017
2 01JAN2015 30NOV2015
3 01JAN2015 01FEB2016
;
run;
data want;
set have;
year1_count = max(min(end,"31dec&year1"d) - max(start,"01jan&year1"d) + 1,0);
year2_count = max(min(end,"31dec&year2"d) - max(start,"01jan&year2"d) + 1,0);
run;
proc print data=want noobs;
run;
Result:
year1_ year2_ id start end count count 1 23AUG2014 13DEC2017 365 366 2 01JAN2015 30NOV2015 334 0 3 01JAN2015 01FEB2016 365 32
A better solution is of course this (building from my previous example data):
data want;
set have;
do year = year(start) to year(end);
count = max(min(end,mdy(12,31,year)) - max(start,mdy(1,1,year)) + 1,0);
output;
end;
run;
proc print data=want noobs;
run;
Result:
id start end year count 1 23AUG2014 13DEC2017 2014 131 1 23AUG2014 13DEC2017 2015 365 1 23AUG2014 13DEC2017 2016 366 1 23AUG2014 13DEC2017 2017 347 2 01JAN2015 30NOV2015 2015 334 3 01JAN2015 01FEB2016 2015 365 3 01JAN2015 01FEB2016 2016 32
Now you have data (years) as data, and not in structure (column names).
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.