Hello,
Example of data:
ID TOTAL_VALUE START_DATE END_DATE
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
(all dates are in SAS date)
I would like to get pro rata transaction value for each year involved. For example, for ID1, the transaction starts in Nov 2015 and ends in Sep 2017. It has 2 months in 2015, 12 months in 2016, and 9 months in 2017. So for 2015, the assigned value = 2/23 * 1000, for 2016, 12/23 * 1000, for 2017, 9/23 * 1000.
My desired output:
ID YEAR VALUE
1 2015 86.96
1 2016 521.73
1 2017 391.30
2 2016 230.77
2 2017 2769.23
3 2015 2500.00
3 2016 2500.00
Calculation in terms of number of days, or round down to number of months as I have shown in the example are both acceptable.
I'm very new to SAS, and I can only think of extracting years from start_date and end_date and then create an array of years out of it. But I can't figure out how to do the prorata part. Any thoughts would help. Thank you!
Unless I misunderstand the OP request, that is way too much code. I believe a single loop will do:
data have;
input ID TOTAL_VALUE (START_DATE END_DATE) (:yymmdd10.);
format START_DATE END_DATE yymmdd10.;
cards;
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
;
data want (drop=_:);
set have;
_ndays=1+end_date-start_date;
do year=year(start_date) to year(end_date);
value= total_value*(1+ min(end_date,mdy(12,31,year))-max(start_date,mdy(1,1,year)) )/_ndays;
output;
end;
run;
The above is in days. If you want months (where each month in a range counts as a whole month):
data want2 (drop=_:);
set have;
_nmonths=1+intck('month',start_date,end_date);
do year=year(start_date) to year(end_date);
value= total_value
*(1+intck('month',max(start_date,mdy(1,1,year)),min(end_date,mdy(12,31,year))))
/_nmonths;
output;
end;
run;
Edited additional note: this problem fits nicely in sas base programming forum, where it would get the widest audience.
I'm afraid my result doesn;t seem to match your ID2,
Can you correct the logic and see where i am wrong
For ID2, my logic computes
2016 2/14*3000
2017 12/14*3000
data have;
input ID TOTAL_VALUE (START_DATE END_DATE) (:yymmdd10.);
format START_DATE END_DATE yymmdd10.;
cards;
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
;
data want;
if _n_=1 then do;
dcl hash H (ordered: "A", multidata:'y') ;
h.definekey ("id") ;
h.definedata ('k','month') ;
h.definedone () ;
end;
set have;
_month=0;
do k=year(start_date) to year(end_date);
if k ne year(end_date) then do;
month=intck('month',start_date,intnx('year',start_date,0,'e'))+1;
h.add();
_month+month;
start_date=intnx('year',start_date,1,'b');
end;
else if k=year(end_date) then do; month=intck('month',mdy(1,1,k),end_date)+1;
h.add();
_month+month;
end;
end;
do while(h.do_over(key:id) eq 0);
value=month/_month*TOTAL_VALUE;
output;
end;
keep id k value;
run;
SAS Output
1 | 2015 | 86.96 |
1 | 2016 | 521.74 |
1 | 2017 | 391.30 |
2 | 2016 | 428.57 |
2 | 2017 | 2571.43 |
3 | 2015 | 2500.00 |
3 | 2016 | 2500.00 |
Unless I misunderstand the OP request, that is way too much code. I believe a single loop will do:
data have;
input ID TOTAL_VALUE (START_DATE END_DATE) (:yymmdd10.);
format START_DATE END_DATE yymmdd10.;
cards;
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
;
data want (drop=_:);
set have;
_ndays=1+end_date-start_date;
do year=year(start_date) to year(end_date);
value= total_value*(1+ min(end_date,mdy(12,31,year))-max(start_date,mdy(1,1,year)) )/_ndays;
output;
end;
run;
The above is in days. If you want months (where each month in a range counts as a whole month):
data want2 (drop=_:);
set have;
_nmonths=1+intck('month',start_date,end_date);
do year=year(start_date) to year(end_date);
value= total_value
*(1+intck('month',max(start_date,mdy(1,1,year)),min(end_date,mdy(12,31,year))))
/_nmonths;
output;
end;
run;
Edited additional note: this problem fits nicely in sas base programming forum, where it would get the widest audience.
@mkeintz Thank you sir for helping me learn so much and your points have always been amazing, not just this thread. Paying attention to your points, I cleaned my code a little bit, though yours is classy
data have;
input ID TOTAL_VALUE (START_DATE END_DATE) (:yymmdd10.);
format START_DATE END_DATE yymmdd10.;
cards;
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
;
data want;
set have;
_month=intck('month',start_date,end_Date)+1;
do year=year(start_date) to year(end_date);
if year=year(start_date) then months=intck('month',start_date,intnx('year',start_date,0,'e'))+1;
else if year=year(end_date) then months=intck('month',intnx('year',end_date,0,'b'),end_date)+1;
else months=12;
value=months/_month*TOTAL_VALUE;
output;
end;
keep id year value;
run;
or
data want;
set have;
_month=intck('month',start_date,end_Date)+1;
do year=year(start_date) to year(end_date);
if year=year(start_date) then months=12-month(start_date)+1;
else if year=year(end_date) then months=month(end_date);
else months=12;
value=months/_month*TOTAL_VALUE;
output;
end;
keep id year value;
run;
Your code makes the logic clearer to the viewer than my dense single line.
But it assumes that all intervals are truly multi-year. If the interval starts and ends in the same year and is less than 12 months, you'll need to make some changes. Otherwise, the title of this discussion better be 100% correct.
data have;
input ID TOTAL_VALUE (START_DATE END_DATE) (:yymmdd10.);
format START_DATE END_DATE yymmdd10.;
cards;
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
;
data temp;
set have;
range=end_date-start_date;
do date=start_date to end_date;
year=year(date);
output;
end;
format date yymmdd10.;
run;
proc summary data=temp;
by id total_value range year;
output out=temp1;
run;
data want;
set temp1;
value=total_value*_freq_/range;
keep id year value;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.