Hello
Let's say I have a data set in long structure. for each customer ID +loan need to calculate max offer in last 7 days prior to loan date
What is the way to do it?
For example for customer 111 there us one loan and max 7 days before is 22000( offer in 01jul is not relevant because more than 7 days).
For customer 222 loan 13000 max offer is 40000 and for loan 35000 max is 28000 ( Offer 40000 is not taking into account because more than 7 days prior loan)
In desire table will have 3 rows because have 3 loans.
What is way yo do it??
Data have;
Format date ddmmyy10.;
Input custid date :date9. loansAmnt offerAmnt;
Cards;
111 01JUL2025 . 50000
111 03jul2025 . 22000
111 08jul2025 . 19000
111 09jul2025 5000 .
222 01jul2025 . 40000
222 03jul2025 . 28000
222 04jul2025 13000 .
222 08jul2025 . 27000
223 09jul2025 35000 .
;
Run:
Use a date-indexed array to keep the offer amounts:
data want;
set have;
by custid;
array ams [1:30000] _temporary_;
if first.custid
then do _date = 1 to 30000;
ams[_date] = 0;
end;
if offeramnt ne . then ams[date] = offeramnt;
if loansamnt ne .
then do _date = date - 7 to date;
put _date ams[_date];
maxamnt = max(maxamnt,ams[_date]);
end;
if loansamnt;
drop _date offeramnt;
run;
As I said, the array is date-indexed. Now, with your knowledge about SAS dates, think which dates a range of 1 to 30000 covers.
The array needs to span the range of dates present in your data, or more.
@Ronein wrote:
Why array have 30000 arguments? How dud you know to choose this number ?
If you are using an array from 0 to 30000, it covers Jan 1, 1960 through 19feb2024. You can customize the date range per below (say for 01jan2009 through 31dec2020):
%let begdate=01jan2009 ;
%let enddate=31dec2020 ;
data;
* other sas code ;
array history {%sysevalf("&begdate"d):%sysevalf("&enddate"d)} _temporary_ ;
* other sas code ;
run;
Your calculation is off by quite some years. 30000 days is equivalent to ~ 82 years (30000 / 365.25 = 82.1355), so the end date must be in the 2040's, not in 2024.
PS if you want a non-zero value for the last observation, custid should be 222 there, not 223.
Here's a data step that effectively maintains a moving window of offer amounts for the prior 7 days.
With each obs, first weed out stale (over 7 days prior) data.
Then just take the maximum of that small array.
Then update the arrays with the new obs
data have;
format date ddmmyy10.;
input custid date :date9. loansAmnt offerAmnt;
cards;
111 01JUL2025 . 50000
111 03jul2025 . 22000
111 08jul2025 . 19000
111 09jul2025 5000 .
222 01jul2025 . 40000
222 03jul2025 . 28000
222 04jul2025 13000 .
222 08jul2025 . 27000
223 09jul2025 35000 .
run;
data want (drop=_:);
set have;
by custid;
array tempdat{0:6} _temporary_; /*Observed dates within prior 7 days*/
array tempval{0:6} _temporary_; /*Observed offers within prior 7 days*/
/* Weed stale data first */
if first.custid=1 then call missing(of tempval{*},of tempdat{*});
else do while (.< min(of tempdat{*}) < date-7);
_d=mod(min(of tempdat{*}),7);
call missing(tempval{_d},tempdat{_d});
end;
/* Get the maximum */
if n(of tempdat{*})>0 then max_prior_7days=max(of tempval{*});
/*Update the arrays with current obs */
_d=mod(date,7);
tempdat{_d}=date;
tempval{_d}=offeramnt;
run;
Ignore the previously sumitted code below, which didn't weed before taking the maximum:
data want (drop=_:);
set have; by custid;
array tempdat{0:6} _temporary_; /*Observed dates within prior 7 days*/
array tempval{0:6} _temporary_; /*Observed offers within prior 7 days*/
/* Get the maximum, when there is a 7-day window available */
if first.custid=0 and dif(date)<=7 then max_prior_7days=max(of tempval{*});
else call missing(of tempval{*},of tempdat{*});
/*Update the arrays with current obs */
_d=mod(date,7); tempdat{_d}=date;
tempval{_d}=offeramnt;
/*Weed out stale data */
do while (min(of tempdat{*})<date-7);
_d=mod(min(of tempdat{*}),7);
call missing(tempval{_d},tempdat{_d});
end;
run;
Two points:
Two simple steps
/* step 1 */
data intermediate;
merge have(where=(loansamnt=.)) have(where=(loansamnt>.) drop=offeramnt rename=(date=loandate));
by custid;
delta_date=loandate-date;
run;
/* step 2 */
proc summary data=intermediate nway;
where delta_date between 0 and 7 or missing(delta_date);
class custid;
var offeramnt;
output out=max_offer max=max_offer;
run;
Hi @Ronein , I think the major or first problem for solve this question is, create one more date column for your loan date. You cannot put the date for loans and date for offer in only one date column, this is not good, because both dates will be needed for calculating and further analysis. Put two different kind of date in one column will make date calculation difficult and cause trouble in future data analysis, regardless it is business or health data. Therefore, I suggest that create a loandate and an offerdate column for your dataset first, and the problem will become much easier by using whatever procs with an calcualtion such as offerdate-loandate<=7 days; statement.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.