Hello,
Example of data:
ID TOTAL_VALUE START_DATE END_DATE
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
(all dates are in SAS date)
I would like to get pro rata transaction value for each year involved. For example, for ID1, the transaction starts in Nov 2015 and ends in Sep 2017. It has 2 months in 2015, 12 months in 2016, and 9 months in 2017. So for 2015, the assigned value = 2/23 * 1000, for 2016, 12/23 * 1000, for 2017, 9/23 * 1000.
My desired output:
ID YEAR VALUE
1 2015 86.96
1 2016 521.73
1 2017 391.30
2 2016 230.77
2 2017 2769.23
3 2015 2500.00
3 2016 2500.00
Calculation in terms of number of days, or round down to number of months as I have shown in the example are both acceptable.
I'm very new to SAS, and I can only think of extracting years from start_date and end_date and then create an array of years out of it. But I can't figure out how to do the prorata part. Any thoughts would help. Thank you!
Unless I misunderstand the OP request, that is way too much code. I believe a single loop will do:
data have;
input ID TOTAL_VALUE (START_DATE END_DATE) (:yymmdd10.);
format START_DATE END_DATE yymmdd10.;
cards;
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
;
data want (drop=_:);
set have;
_ndays=1+end_date-start_date;
do year=year(start_date) to year(end_date);
value= total_value*(1+ min(end_date,mdy(12,31,year))-max(start_date,mdy(1,1,year)) )/_ndays;
output;
end;
run;
The above is in days. If you want months (where each month in a range counts as a whole month):
data want2 (drop=_:);
set have;
_nmonths=1+intck('month',start_date,end_date);
do year=year(start_date) to year(end_date);
value= total_value
*(1+intck('month',max(start_date,mdy(1,1,year)),min(end_date,mdy(12,31,year))))
/_nmonths;
output;
end;
run;
Edited additional note: this problem fits nicely in sas base programming forum, where it would get the widest audience.
I'm afraid my result doesn;t seem to match your ID2,
Can you correct the logic and see where i am wrong
For ID2, my logic computes
2016 2/14*3000
2017 12/14*3000
data have;
input ID TOTAL_VALUE (START_DATE END_DATE) (:yymmdd10.);
format START_DATE END_DATE yymmdd10.;
cards;
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
;
data want;
if _n_=1 then do;
dcl hash H (ordered: "A", multidata:'y') ;
h.definekey ("id") ;
h.definedata ('k','month') ;
h.definedone () ;
end;
set have;
_month=0;
do k=year(start_date) to year(end_date);
if k ne year(end_date) then do;
month=intck('month',start_date,intnx('year',start_date,0,'e'))+1;
h.add();
_month+month;
start_date=intnx('year',start_date,1,'b');
end;
else if k=year(end_date) then do; month=intck('month',mdy(1,1,k),end_date)+1;
h.add();
_month+month;
end;
end;
do while(h.do_over(key:id) eq 0);
value=month/_month*TOTAL_VALUE;
output;
end;
keep id k value;
run;
SAS Output
1 | 2015 | 86.96 |
1 | 2016 | 521.74 |
1 | 2017 | 391.30 |
2 | 2016 | 428.57 |
2 | 2017 | 2571.43 |
3 | 2015 | 2500.00 |
3 | 2016 | 2500.00 |
Unless I misunderstand the OP request, that is way too much code. I believe a single loop will do:
data have;
input ID TOTAL_VALUE (START_DATE END_DATE) (:yymmdd10.);
format START_DATE END_DATE yymmdd10.;
cards;
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
;
data want (drop=_:);
set have;
_ndays=1+end_date-start_date;
do year=year(start_date) to year(end_date);
value= total_value*(1+ min(end_date,mdy(12,31,year))-max(start_date,mdy(1,1,year)) )/_ndays;
output;
end;
run;
The above is in days. If you want months (where each month in a range counts as a whole month):
data want2 (drop=_:);
set have;
_nmonths=1+intck('month',start_date,end_date);
do year=year(start_date) to year(end_date);
value= total_value
*(1+intck('month',max(start_date,mdy(1,1,year)),min(end_date,mdy(12,31,year))))
/_nmonths;
output;
end;
run;
Edited additional note: this problem fits nicely in sas base programming forum, where it would get the widest audience.
@mkeintz Thank you sir for helping me learn so much and your points have always been amazing, not just this thread. Paying attention to your points, I cleaned my code a little bit, though yours is classy
data have;
input ID TOTAL_VALUE (START_DATE END_DATE) (:yymmdd10.);
format START_DATE END_DATE yymmdd10.;
cards;
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
;
data want;
set have;
_month=intck('month',start_date,end_Date)+1;
do year=year(start_date) to year(end_date);
if year=year(start_date) then months=intck('month',start_date,intnx('year',start_date,0,'e'))+1;
else if year=year(end_date) then months=intck('month',intnx('year',end_date,0,'b'),end_date)+1;
else months=12;
value=months/_month*TOTAL_VALUE;
output;
end;
keep id year value;
run;
or
data want;
set have;
_month=intck('month',start_date,end_Date)+1;
do year=year(start_date) to year(end_date);
if year=year(start_date) then months=12-month(start_date)+1;
else if year=year(end_date) then months=month(end_date);
else months=12;
value=months/_month*TOTAL_VALUE;
output;
end;
keep id year value;
run;
Your code makes the logic clearer to the viewer than my dense single line.
But it assumes that all intervals are truly multi-year. If the interval starts and ends in the same year and is less than 12 months, you'll need to make some changes. Otherwise, the title of this discussion better be 100% correct.
data have;
input ID TOTAL_VALUE (START_DATE END_DATE) (:yymmdd10.);
format START_DATE END_DATE yymmdd10.;
cards;
1 1000 15-11-04 17-09-30
2 3000 16-11-29 17-12-15
3 5000 15-12-08 16-01-11
;
data temp;
set have;
range=end_date-start_date;
do date=start_date to end_date;
year=year(date);
output;
end;
format date yymmdd10.;
run;
proc summary data=temp;
by id total_value range year;
output out=temp1;
run;
data want;
set temp1;
value=total_value*_freq_/range;
keep id year value;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.