Backfilling dates

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 121
Accepted Solution

Backfilling dates

My data currently looks as follows. What I want to do is stated in the narrative below; any help would be extremely appreciated. Many thanks in advance.

 

LoanID    Cycle_Date    Default_Date  Payoff_Date

34565     12/31/2008          .

34565     12/31/2007       5/8/2008          .

34565     12/31/2008          .                    .

34565     12/31/2009          .                    .

34565     12/31/2010       6/1/2011          .

34565     12/31/2011          .                    .

12345     12/31/2006          .                    .

12345     12/31/2007          .                    .

12345     12/31/2008       4/5/2009          .

12345     12/31/2009          .                    .

12345     12/31/2010          .                    .

12345     12/31/2011          .                    .

12345     12/31/2012                              5/3/2013

 

In the case of LoanID 34565, I have two default dates, so I am treating the loan as having become active again between the first and second default dates. I am looking to backfill the Payoff_Date in the records between the two dates with date 1/1/9999 and to also delete any outstanding records that come after the second default date.

 

In the case of LoanID 12345, I have a payoff date, so I am treating the loan as having become active again between the default date and the payoff date. I am looking to backfill the Payoff_Date in the records between the default date and the Payoff_Date with date 1/1/9999. 

 

My final dataset would, ideally, look like the following.

 

LoanID    Cycle_Date    Default_Date  Payoff_Date

34565     12/31/2008          .

34565     12/31/2007       5/8/2008          .

34565     12/31/2008          .                  1/1/9999

34565     12/31/2009          .                  1/1/9999 .

34565     12/31/2010       6/1/2011          .

12345     12/31/2006          .                  

12345     12/31/2007          .                    .

12345     12/31/2008       4/5/2009         

12345     12/31/2009          .                   1/1/9999

12345     12/31/2010          .                   1/1/9999

12345     12/31/2011          .                   1/1/9999

12345     12/31/2012                              5/3/2013


Accepted Solutions
Solution
‎09-25-2015 06:23 AM
Super User
Posts: 7,447

Re: Backfilling dates

[ Edited ]

So you already solved the problem of expanding your initial dataset Smiley Wink

 

data have;
infile cards;
input
  loanid
  cycle_date:mmddyy10.
  default_date:mmddyy10.
  payoff_date:mmddyy10.
;
format cycle_date default_date payoff_date mmddyy10.;
cards;
34565 12/31/2006 . .
34565 12/31/2007 5/8/2008 .
34565 12/31/2008 . .
34565 12/31/2009 . .
34565 12/31/2010 6/1/2011 .
34565 12/31/2011 . .
12345 12/31/2006 . .
12345 12/31/2007 . .
12345 12/31/2008 4/5/2009 .
12345 12/31/2009 . .
12345 12/31/2010 . .
12345 12/31/2011 . .
12345 12/31/2012 . 5/3/2013
;
run;

data int (drop=def_flag);
set have;
by loanid notsorted;
retain def_flag;
if first.loanid then def_flag = 0;
if def_flag = 1 and payoff_date = . and default_date = .
then payoff_date = mdy(1,1,9999);
if default_date ne . then def_flag = 1;
run;

proc sort data=int;
by loanid descending cycle_date;
run;

/* now get rid of records after default_date */
data want (drop=rem_flag);
set int;
by loanid;
if first.loanid
then do;
  if payoff_date ne mdy(1,1,9999)
  then rem_flag = 0;
  else rem_flag = 1;
end;
if default_date ne . then rem_flag = 0;
if rem_flag = 1 then delete;
run;

proc sort data=want;
by loanid cycle_date;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Solution
‎09-25-2015 06:23 AM
Super User
Posts: 7,447

Re: Backfilling dates

[ Edited ]

So you already solved the problem of expanding your initial dataset Smiley Wink

 

data have;
infile cards;
input
  loanid
  cycle_date:mmddyy10.
  default_date:mmddyy10.
  payoff_date:mmddyy10.
;
format cycle_date default_date payoff_date mmddyy10.;
cards;
34565 12/31/2006 . .
34565 12/31/2007 5/8/2008 .
34565 12/31/2008 . .
34565 12/31/2009 . .
34565 12/31/2010 6/1/2011 .
34565 12/31/2011 . .
12345 12/31/2006 . .
12345 12/31/2007 . .
12345 12/31/2008 4/5/2009 .
12345 12/31/2009 . .
12345 12/31/2010 . .
12345 12/31/2011 . .
12345 12/31/2012 . 5/3/2013
;
run;

data int (drop=def_flag);
set have;
by loanid notsorted;
retain def_flag;
if first.loanid then def_flag = 0;
if def_flag = 1 and payoff_date = . and default_date = .
then payoff_date = mdy(1,1,9999);
if default_date ne . then def_flag = 1;
run;

proc sort data=int;
by loanid descending cycle_date;
run;

/* now get rid of records after default_date */
data want (drop=rem_flag);
set int;
by loanid;
if first.loanid
then do;
  if payoff_date ne mdy(1,1,9999)
  then rem_flag = 0;
  else rem_flag = 1;
end;
if default_date ne . then rem_flag = 0;
if rem_flag = 1 then delete;
run;

proc sort data=want;
by loanid cycle_date;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Valued Guide
Posts: 858

Re: Backfilling dates

I've had projects like this before and have run into more nuances than you have here.  So with a large population of multiple defaulters you might have to tweak this but it works for your data:

 


data have;
infile cards;
informat loanid $5. cycle_date default_date payoff_date mmddyy10.;
format loanid $5. cycle_date default_date payoff_date mmddyy10.;
input LoanID$    Cycle_Date    Default_Date  Payoff_Date;
cards;
34565     12/31/2008          .              .
34565     12/31/2007       5/8/2008          .
34565     12/31/2008          .              .
34565     12/31/2009          .              .
34565     12/31/2010       6/1/2011          .
34565     12/31/2011          .              .
12345     12/31/2006          .              .
12345     12/31/2007          .              .
12345     12/31/2008       4/5/2009          .
12345     12/31/2009          .              .
12345     12/31/2010          .              .
12345     12/31/2011          .              .
12345     12/31/2012          .              5/3/2013
;

proc sort data=have;by loanid cycle_date default_date;

data prep;
set have;
by loanid cycle_date notsorted;
if not missing(default_date) then _dcount +1;
if first.loanid then _dcount=1;
if first.loanid and missing(default_date) then _dcount = 0;

if not missing(payoff_date) then _pcount +1;
if first.loanid then _pcount=1;
if first.loanid and missing(payoff_date) then _pcount = 0;
run;

proc sql;
create table counts as
select *,min(_dcount) as _min_d,max(_dcount) as _max_d,min(_pcount) as _min_p,max(_pcount) as _max_p
from prep
group by loanid
order by loanid,cycle_date;

data want;
set counts;
by loanid;
if _min_d < _max_d and _max_d > 1 then do;
    if _dcount > 0 and missing(default_date) and missing(payoff_date) then payoff_date = '01JAN9999'd;
    if _dcount = _max_d and missing(default_date) and _max_p = 0 then delete;
end;
if _dcount = _max_d and _max_p > 0 and missing(default_date) and missing(payoff_date) then payoff_date = '01JAN9999'd;
drop _:;
run;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 253 views
  • 0 likes
  • 3 in conversation