DATA Step, Macro, Functions and more

How to extract duration per year from a multi-year time period?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

How to extract duration per year from a multi-year time period?

Hello,

 

Example of data:

 

 

ID  TOTAL_VALUE   START_DATE   END_DATE

1   1000          15-11-04     17-09-30

2   3000          16-11-29     17-12-15

3   5000          15-12-08     16-01-11

 

(all dates are in SAS date)

 

I would like to get pro rata transaction value for each year involved. For example, for ID1, the transaction starts in Nov 2015 and ends in Sep 2017. It has 2 months in 2015, 12 months in 2016, and 9 months in 2017. So for 2015, the assigned value = 2/23 * 1000, for 2016, 12/23 * 1000, for 2017, 9/23 * 1000. 

 

My desired output:

 

ID  YEAR    VALUE   

1   2015    86.96

1   2016    521.73

1   2017    391.30

2   2016    230.77

2   2017    2769.23

3   2015    2500.00

3   2016    2500.00

 

Calculation in terms of number of days, or round down to number of months as I have shown in the example are both acceptable. 

 

I'm very new to SAS, and I can only think of extracting years from start_date and end_date and then create an array of years out of it. But I can't figure out how to do the prorata part. Any thoughts would help. Thank you!

 


Accepted Solutions
Solution
Monday
Trusted Advisor
Posts: 1,337

Re: How to extract duration per year from a multi-year time period?

[ Edited ]
Posted in reply to novinosrin

Unless I misunderstand the OP request, that is way too much code.  I believe a single loop will do:

 

data have;
  input ID  TOTAL_VALUE   (START_DATE   END_DATE) (:yymmdd10.);
  format START_DATE   END_DATE yymmdd10.;
cards;
1   1000          15-11-04     17-09-30
2   3000          16-11-29     17-12-15
3   5000          15-12-08     16-01-11
;
data want (drop=_:);
  set have;
  _ndays=1+end_date-start_date;
  do year=year(start_date) to year(end_date);
    value=  total_value*(1+ min(end_date,mdy(12,31,year))-max(start_date,mdy(1,1,year)) )/_ndays;
    output;
  end;
run;

 

The above is in days.  If you want months (where each month in a range counts as a whole month):

 

data want2 (drop=_:);
  set have;
  _nmonths=1+intck('month',start_date,end_date);

  do year=year(start_date) to year(end_date);
    value= total_value
         *(1+intck('month',max(start_date,mdy(1,1,year)),min(end_date,mdy(12,31,year))))
         /_nmonths;
    output;
  end;
run;

 

Edited additional note: this problem fits nicely in sas base programming forum, where it would get the widest audience.

View solution in original post


All Replies
PROC Star
Posts: 1,768

Re: How to extract duration per year from a multi-year time period?

I'm afraid my result doesn;t seem to match your ID2, 

Can you correct the logic and see where i am wrong

For ID2, my logic computes 

 

2016  2/14*3000

2017 12/14*3000

 

data have;
input ID  TOTAL_VALUE   (START_DATE   END_DATE) (:yymmdd10.);
format START_DATE   END_DATE yymmdd10.;
cards;
1   1000          15-11-04     17-09-30
2   3000          16-11-29     17-12-15
3   5000          15-12-08     16-01-11
;
data want;
if _n_=1 then do;
 dcl hash H (ordered: "A", multidata:'y') ;
   h.definekey  ("id") ;
   h.definedata ('k','month') ;
   h.definedone () ;
end;
set have;
_month=0;
do k=year(start_date) to year(end_date);
if k ne year(end_date) then do;
month=intck('month',start_date,intnx('year',start_date,0,'e'))+1;
h.add();
_month+month;
start_date=intnx('year',start_date,1,'b');
end;
else if k=year(end_date) then do; month=intck('month',mdy(1,1,k),end_date)+1;
h.add();
_month+month;
end;
end;
do while(h.do_over(key:id) eq 0);
value=month/_month*TOTAL_VALUE;
output;
end;
keep id k value;
run;

 

SAS Output

Obs ID k value1234567
1201586.96
12016521.74
12017391.30
22016428.57
220172571.43
320152500.00
320162500.00
Solution
Monday
Trusted Advisor
Posts: 1,337

Re: How to extract duration per year from a multi-year time period?

[ Edited ]
Posted in reply to novinosrin

Unless I misunderstand the OP request, that is way too much code.  I believe a single loop will do:

 

data have;
  input ID  TOTAL_VALUE   (START_DATE   END_DATE) (:yymmdd10.);
  format START_DATE   END_DATE yymmdd10.;
cards;
1   1000          15-11-04     17-09-30
2   3000          16-11-29     17-12-15
3   5000          15-12-08     16-01-11
;
data want (drop=_:);
  set have;
  _ndays=1+end_date-start_date;
  do year=year(start_date) to year(end_date);
    value=  total_value*(1+ min(end_date,mdy(12,31,year))-max(start_date,mdy(1,1,year)) )/_ndays;
    output;
  end;
run;

 

The above is in days.  If you want months (where each month in a range counts as a whole month):

 

data want2 (drop=_:);
  set have;
  _nmonths=1+intck('month',start_date,end_date);

  do year=year(start_date) to year(end_date);
    value= total_value
         *(1+intck('month',max(start_date,mdy(1,1,year)),min(end_date,mdy(12,31,year))))
         /_nmonths;
    output;
  end;
run;

 

Edited additional note: this problem fits nicely in sas base programming forum, where it would get the widest audience.

Occasional Contributor
Posts: 7

Re: How to extract duration per year from a multi-year time period?

Thank you! Your solution is elegant and works like a charm, especially in handling single-year situation.
I'll also take a look at sas base programming forum. Thanks!
PROC Star
Posts: 1,768

Re: How to extract duration per year from a multi-year time period?

[ Edited ]

@mkeintz Thank you sir for helping me learn so much and your points have always been amazing, not just this thread. Paying attention to your points, I cleaned my code a little bit, though yours is classy

 

data have;
input ID  TOTAL_VALUE   (START_DATE   END_DATE) (:yymmdd10.);
format START_DATE   END_DATE yymmdd10.;
cards;
1   1000          15-11-04     17-09-30
2   3000          16-11-29     17-12-15
3   5000          15-12-08     16-01-11
;
data want;
set have;
_month=intck('month',start_date,end_Date)+1; 
do year=year(start_date) to year(end_date);
if year=year(start_date) then months=intck('month',start_date,intnx('year',start_date,0,'e'))+1; 
else if year=year(end_date) then months=intck('month',intnx('year',end_date,0,'b'),end_date)+1; 
else months=12;
value=months/_month*TOTAL_VALUE;
output;
end;
keep id year value;
run;

 or

 

data want;
set have;
_month=intck('month',start_date,end_Date)+1; 
do year=year(start_date) to year(end_date);
if year=year(start_date) then months=12-month(start_date)+1; 
else if year=year(end_date) then months=month(end_date); 
else months=12;
value=months/_month*TOTAL_VALUE;
output;
end;
keep id year value;
run;

 

Trusted Advisor
Posts: 1,337

Re: How to extract duration per year from a multi-year time period?

Posted in reply to novinosrin

@novinosrin

 

Your code makes the logic clearer to the viewer than my dense single line.

 

But it assumes that all intervals are truly multi-year.  If the interval starts and ends in the same year and is less than 12 months, you'll need to make some changes.  Otherwise, the title of this discussion better be 100% correct.

 

Occasional Contributor
Posts: 7

Re: How to extract duration per year from a multi-year time period?

Posted in reply to novinosrin
Hi! Your solution works for multi-year situation, but doesn't work for duration less than one year. Sorry that I didn't make it clear in description.
I took a second thought on number of months - for example, if end date is Oct 1, the entire Oct would be taken into account in your code, but I excluded Oct when I eyeballed the data. That's why our results differ. Your logic would work perfectly if we both counted number of days.
Thank you for the answer!
Super User
Posts: 10,761

Re: How to extract duration per year from a multi-year time period?

data have;
input ID  TOTAL_VALUE   (START_DATE   END_DATE) (:yymmdd10.);
format START_DATE   END_DATE yymmdd10.;
cards;
1   1000          15-11-04     17-09-30
2   3000          16-11-29     17-12-15
3   5000          15-12-08     16-01-11
;
data temp;
 set have;
 range=end_date-start_date;
 do date=start_date to end_date;
  year=year(date);
  output;
 end;
 format date yymmdd10.;
run;
proc summary data=temp;
by id total_value range year;
output out=temp1;
run;
data want;
 set temp1;
 value=total_value*_freq_/range;
 keep id year value;
run;
Occasional Contributor
Posts: 7

Re: How to extract duration per year from a multi-year time period?

Hi! I think your solution works too, but mkeintz answered earlier and his/her solution runs faster so I accepted that one. Thank you for answering!
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 143 views
  • 1 like
  • 4 in conversation