I am trying to identify "valid" defaulted borrowers from the default listing span from 2012 to 2019. If a borrower re-default within the default's tracking period (win_end_date) then that particular default observation need to be removed. On contrary, If a borrower redefault after expiry of tracking period then that default event will be recognised as "valid" again.
Here is my sample and expected results:
Raw Data | |||
Cust_NO | DEF_DATE | Win_End_Date | Remark |
10106212 | 31-Dec-12 | 31-Dec-14 | 1st Default (tracked performance up to 24 momths) |
10106212 | 31-Dec-13 | 31-Dec-15 | Overlapped with 1st Default tracking period (Re-Defaulted within tracking periods of 1st Default) |
10106212 | 31-Dec-14 | 31-Dec-16 | Overlapped with 1st Default tracking period (Re-Defaulted within tracking periods of 1st Default) |
10106212 | 31-Dec-15 | 31-Dec-17 | 2nd Default (happen after expiry of tracking period of 1st Default) |
10106212 | 31-Dec-17 | 31-Dec-19 | Overlapped with 2nd Default tracking period (Re-Defaulted within tracking periods of 2nd Default) |
10106212 | 31-Dec-18 | 31-Dec-20 | 3rd Default |
10378803 | 30-Jun-14 | 30-Jun-16 | 1st Default |
10378803 | 31-Dec-15 | 31-Dec-17 | Overlapped with 1st Default tracking period (Re-Defaulted within tracking periods of 1st Default) |
Expected Output | |||
Cust_NO | DEF_DATE | Win_End_Date | |
10106212 | 31-Dec-12 | 31-Dec-14 | |
10106212 | 31-Dec-15 | 31-Dec-17 | |
10106212 | 31-Dec-18 | 31-Dec-20 | |
10378803 | 30-Jun-14 | 30-Jun-16 |
data have;
format cust_no 8. def_date win_end_date date9.;
input cust_no :8. def_date :date11. win_end_date :date11.;
datalines;
10106212 31-Dec-12 31-Dec-14
10106212 31-Dec-13 31-Dec-15
10106212 31-Dec-14 31-Dec-16
10106212 31-Dec-15 31-Dec-17
10106212 31-Dec-17 31-Dec-19
10106212 31-Dec-18 31-Dec-20
10378803 30-Jun-14 30-Jun-16
10378803 31-Dec-15 31-Dec-17
;
run;
data want;
set have;
by cust_no;
retain curr_win_end_date;
if first.cust_no or def_date gt curr_win_end_date then curr_win_end_date=win_end_date;
else if not first.cust_no and def_date le curr_win_end_date then delete;
drop curr_win_end_date;
run;
cust_no def_date win_end_date 10106212 31DEC2012 31DEC2014 10106212 31DEC2015 31DEC2017 10106212 31DEC2018 31DEC2020 10378803 30JUN2014 30JUN2016
data have;
format cust_no 8. def_date win_end_date date9.;
input cust_no :8. def_date :date11. win_end_date :date11.;
datalines;
10106212 31-Dec-12 31-Dec-14
10106212 31-Dec-13 31-Dec-15
10106212 31-Dec-14 31-Dec-16
10106212 31-Dec-15 31-Dec-17
10106212 31-Dec-17 31-Dec-19
10106212 31-Dec-18 31-Dec-20
10378803 30-Jun-14 30-Jun-16
10378803 31-Dec-15 31-Dec-17
;
run;
data want;
set have;
by cust_no;
retain curr_win_end_date;
if first.cust_no or def_date gt curr_win_end_date then curr_win_end_date=win_end_date;
else if not first.cust_no and def_date le curr_win_end_date then delete;
drop curr_win_end_date;
run;
cust_no def_date win_end_date 10106212 31DEC2012 31DEC2014 10106212 31DEC2015 31DEC2017 10106212 31DEC2018 31DEC2020 10378803 30JUN2014 30JUN2016
I think this can be done in a straightforward way by using a date cutpoint updated each time the CUST_NO changes or a qualifying record is output.
data have;
format cust_no 8. def_date win_end_date date9.;
input cust_no :8. def_date :date11. win_end_date :date11.;
datalines;
10106212 31-Dec-12 31-Dec-14
10106212 31-Dec-13 31-Dec-15
10106212 31-Dec-14 31-Dec-16
10106212 31-Dec-15 31-Dec-17
10106212 31-Dec-17 31-Dec-19
10106212 31-Dec-18 31-Dec-20
10378803 30-Jun-14 30-Jun-16
10378803 31-Dec-15 31-Dec-17
run;
data want (drop=cut_point);
set have;
by cust_no;
retain cut_point;
if first.cust_no=1 then cut_point=.;
if def_date > cut_point;
cut_point=win_end_date;
run;
This takes advantage of the fact that when cut_point=., it will be considered lower than all valid date values.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.