BookmarkSubscribeRSS Feed
Ronein
Onyx | Level 15

Hello
Let's say I have a data set in long structure. for each customer ID +loan need to calculate max offer in last 7 days prior to loan date
What is the way to do it?
For example for customer 111 there us one loan and max 7 days before is 22000( offer in 01jul is not relevant because more than 7 days).
For customer 222 loan 13000 max offer is 40000 and for loan 35000 max is 28000 ( Offer 40000 is not taking into account because more than 7 days prior loan)

In desire table will have 3 rows because have 3 loans.

What is way yo do it??
 

Data have;
Format date ddmmyy10.;
Input custid date :date9. loansAmnt offerAmnt;
Cards;
111 01JUL2025 . 50000
111 03jul2025 . 22000
111 08jul2025 . 19000
111 09jul2025 5000 .
222 01jul2025 . 40000
222 03jul2025 . 28000
222 04jul2025 13000 .
222 08jul2025 . 27000
223 09jul2025 35000 .
;
Run:
11 REPLIES 11
Kurt_Bremser
Super User

Use a date-indexed array to keep the offer amounts:

data want;
set have;
by custid;
array ams [1:30000] _temporary_;
if first.custid
then do _date = 1 to 30000;
  ams[_date] = 0;
end;
if offeramnt ne . then ams[date] = offeramnt;
if loansamnt ne .
then do _date = date - 7 to date;
  put _date ams[_date];
  maxamnt = max(maxamnt,ams[_date]);
end;
if loansamnt;
drop _date offeramnt;
run;
Ronein
Onyx | Level 15
Why array have 30000 arguments? How dud you know to choose this number ?
Kurt_Bremser
Super User

As I said, the array is date-indexed. Now, with your knowledge about SAS dates, think which dates a range of 1 to 30000 covers.

The array needs to span the range of dates present in your data, or more.

mkeintz
PROC Star

@Ronein wrote:
Why array have 30000 arguments? How dud you know to choose this number ?

If you are using an array from 0 to 30000, it covers Jan 1, 1960 through 19feb2024.  You can customize the date range per below (say for 01jan2009 through 31dec2020):

 

%let begdate=01jan2009 ;
%let enddate=31dec2020 ;

data;
  * other sas code ;
  array history {%sysevalf("&begdate"d):%sysevalf("&enddate"d)} _temporary_ ;
  * other sas code ;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Kurt_Bremser
Super User

Your calculation is off by quite some years. 30000 days is equivalent to ~ 82 years (30000 / 365.25 = 82.1355), so the end date must be in the 2040's, not in 2024.

mkeintz
PROC Star

Here's a data step that effectively maintains a moving window of offer amounts for the prior 7 days.
  With each obs, first weed out stale (over 7 days prior) data.

  Then just take the maximum of that small array.

  Then update the arrays with the new obs

 

data have;
  format date ddmmyy10.;
  input custid date :date9. loansAmnt offerAmnt;
cards;
111 01JUL2025 .     50000
111 03jul2025 .     22000
111 08jul2025 .     19000
111 09jul2025 5000  .
222 01jul2025 .     40000
222 03jul2025 .     28000
222 04jul2025 13000 .
222 08jul2025 .     27000
223 09jul2025 35000 .
run;
data want (drop=_:);
  set have;
  by custid;

  array tempdat{0:6} _temporary_; /*Observed dates within prior 7 days*/
  array tempval{0:6} _temporary_; /*Observed offers within prior 7 days*/  

  /* Weed stale data first */
  if first.custid=1 then call missing(of tempval{*},of tempdat{*});
  else do while (.< min(of tempdat{*}) < date-7);    
    _d=mod(min(of tempdat{*}),7);
    call missing(tempval{_d},tempdat{_d});
  end;

  /* Get the maximum */
  if n(of tempdat{*})>0 then max_prior_7days=max(of tempval{*});

  /*Update the arrays with current obs */
  _d=mod(date,7);   
  tempdat{_d}=date;
  tempval{_d}=offeramnt;
run;

 

Ignore the previously sumitted code below, which didn't weed before taking the maximum:

 

data want (drop=_:);

  set have; by custid;

  array tempdat{0:6} _temporary_; /*Observed dates within prior 7 days*/

  array tempval{0:6} _temporary_; /*Observed offers within prior 7 days*/

 

  /* Get the maximum, when there is a 7-day window available */

  if first.custid=0 and dif(date)<=7 then max_prior_7days=max(of tempval{*});

  else call missing(of tempval{*},of tempdat{*});

 

  /*Update the arrays with current obs */

  _d=mod(date,7); tempdat{_d}=date;

  tempval{_d}=offeramnt;

 

  /*Weed out stale data */

  do while (min(of tempdat{*})<date-7);

    _d=mod(min(of tempdat{*}),7);

    call missing(tempval{_d},tempdat{_d});

  end;

 

run;

Two points:

  1. The temporary arrays are indexed from 0 to 6, effectively representing a day-of-week index.  So the data in the arrays are not ordered strictly chronologically, but order doesn't matter for getting the maximum.  The task is to be sure to eliminate stale values.
  2. The DIF(x) function is   x-lag(x).
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ronein
Onyx | Level 15
Why array have 6 arguments?
PaigeMiller
Diamond | Level 26

Two simple steps

 

  1. Find date of loan and find those offers within 7 days of loan (much simpler data step than those shown by others)
  2. Find max of offers within last 7 days (PROC SUMMARY);
/* step 1 */
data intermediate;
    merge have(where=(loansamnt=.))  have(where=(loansamnt>.) drop=offeramnt rename=(date=loandate));
    by custid;
    delta_date=loandate-date;
run;

/* step 2 */
proc summary data=intermediate nway;
    where delta_date between 0 and 7 or missing(delta_date);
    class custid;
    var offeramnt;
    output out=max_offer max=max_offer;
run;
--
Paige Miller
dxiao2017
Lapis Lazuli | Level 10

Hi @Ronein , I think the major or first problem for solve this question is, create one more date column for your loan date. You cannot put the date for loans and date for offer in only one date column, this is not good, because both dates will be needed for calculating and further analysis. Put two different kind of date in one column will make date calculation difficult and cause trouble in future data analysis, regardless it is business or health data. Therefore, I suggest that create a loandate and an offerdate  column for your dataset first, and the problem will become much easier by using whatever procs with an calcualtion such as offerdate-loandate<=7 days; statement.

Ronein
Onyx | Level 15
Can you show full code how to create desired output using 2 dare columns?

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 346 views
  • 6 likes
  • 5 in conversation