Desktop productivity for business analysts and programmers

Credit Risk Model - Data Preparation using SAS Code

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 15
Accepted Solution

Credit Risk Model - Data Preparation using SAS Code

Kindly note that I have a query which has 3 parts. I am in look for three different SAS code to deal with them.
 
1) For each observation date corresponding to a given acct_id, I am expected to check in the next 12 months including the given observation month  for BQR , if it gets a value 9 then declare that acct_id at the given date to have a default_flag = 1 else 0
 
2) As soon as an acct_id gets a default_flag = 1, I am expected to check further if in the next 60 months, BQR gets a value "W". If it gets a value "W", then I have to declare that acct_id at that given month to have write_off_flag = 1 else 0
 
3) I am expected to create variables BQR_1 to BQR_12 and populate values from the variable BQR for the next 12 months
 
Please find the attached spreadsheet for sample query.
 
Also note:
 
1) The start date of each acct_id need not be the same
2) There is no record available for the acct_id as soon as it gets BQR status as "W"

Accepted Solutions
Solution
3 weeks ago
Respected Advisor
Posts: 3,831

Re: Credit Risk Model - Data Preparation using SAS Code

[ Edited ]

@Skb19121985

Below some code which should return the result you're after.

Looking into your Excel I don't understand why the write_off_flg gets only so late set to '1'. That's certainly later than 60 months ahead of the date where BQR gets a value of 'W'.

 

data have;
  input date:anydtdte. acct_id:$16. BQR:$1.;
  format date date9.;
datalines;
1-Jan-90 123 4
1-Feb-90 123 5
1-Mar-90 123 6
1-Apr-90 123 7
1-May-90 123 8
1-Jun-90 123 4
1-Jul-90 123 4
1-Aug-90 123 5
1-Sep-90 123 6
1-Oct-90 123 7
1-Nov-90 123 8
1-Dec-90 123 3
1-Jan-91 123 1
1-Feb-91 123 2
1-Mar-91 123 3
1-Apr-91 123 4
1-May-91 123 5
1-Jun-91 123 4
1-Jul-91 123 5
1-Aug-91 123 7
1-Sep-91 123 9
1-Oct-91 123 9
1-Nov-91 123 9
1-Dec-91 123 9
1-Jan-92 123 9
1-Feb-92 123 9
1-Mar-92 123 9
1-Apr-92 123 9
1-May-92 123 9
1-Jun-92 123 9
1-Jul-92 123 9
1-Aug-92 123 9
1-Sep-92 123 9
1-Oct-92 123 9
1-Nov-92 123 W
1-Sep-90 133 1
1-Oct-90 133 1
1-Nov-90 133 1
1-Dec-90 133 1
1-Jan-91 133 2
1-Feb-91 133 3
1-Mar-91 133 4
1-Apr-91 133 5
1-May-91 133 4
1-Jun-91 133 3
1-Jul-91 133 5
1-Aug-91 133 6
1-Sep-91 133 7
1-Oct-91 133 8
1-Nov-91 133 7
1-Dec-91 133 6
1-Jan-92 133 5
1-Feb-92 133 4
1-Mar-92 133 5
1-Apr-92 133 6
1-May-92 133 7
1-Jun-92 133 9
1-Jul-92 133 9
1-Aug-92 133 9
1-Sep-92 133 9
1-Oct-92 133 9
1-Nov-92 133 9
1-Dec-92 133 9
1-Jan-93 133 8
1-Feb-93 133 8
1-Mar-93 133 8
1-Apr-93 133 8
1-May-93 133 8
1-Jun-93 133 8
1-Jul-93 133 9
1-Aug-93 133 W
;
run;

%let write_of_months=60;
data answers(drop=_:);
  set have;
  by acct_id date;

  retain default_flg '0' write_off_flag '0';

  if first.acct_id=1 then
    do;
      dcl hash h1(dataset:cats('have (where=(acct_id="',acct_id,'"))'),ordered:'y');
      h1.defineKey('acct_id','date');
      h1.defineData('bqr');
      h1.defineDone();
      declare hiter hh1 ('h1');

      default_flg='0';
      write_off_flag='0';
    end;
  
  /* populate bqr1 to bqr12 and default_flg */
  array _arr_bqr(*) $1 bqr1-bqr12;
  call missing(of _arr_bqr(*));
  _rc = hh1.setcur(key:acct_id,key:date);
  do _i=1 to 12 while (_rc = 0);
    _arr_bqr[_i]=bqr; 
    if bqr='9' then default_flg='1';
    _rc = hh1.next();
  end;

  /* populate write_off_flag */
  if write_off_flag='0' and default_flg='1' then
    do;
      _rc = hh1.setcur(key:acct_id,key:date);
      do _i=1 to &write_of_months while (_rc = 0);
        if bqr='W' then 
          do;
            write_off_flag='1';
            leave;
          end; 
        _rc = hh1.next();
      end;
    end;

  _rc = hh1.setcur(key:acct_id,key:date);

  if last.acct_id then h1.delete();
run;

Should you be dealing with large data volumes then tweaking for performance would be possible for the posted code - but it would complicate the syntax so it's only something I'd be doing if really necessary.

 

View solution in original post


All Replies
Grand Advisor
Posts: 17,332

Re: Credit Risk Model - Data Preparation using SAS Code

So what is your question for us? This is your work, asking us to do it isn't a fair expectation and you haven't clearly illustrated what you need help with or what you've tried so far. 

Occasional Contributor
Posts: 15

Re: Credit Risk Model - Data Preparation using SAS Code

Hi Reeza

 

Thanks for replying back.

Actually, I have 3 questions as listed out.

To reteriate , I wanted to flag an acct_id as default at a given observation date, if it is discovered to have defaulted with BQR=9 in the next 12 months.

Secondly, once, the defaulted acct_id at a given observation date is identified, I wanted to flag it as written-off if the BQR="W" in the next 12 months

Thirdly, I wanted to create variables BQR_1 to BQR_12 against each record (row) and populate the values of BQR as observed in next 12 months from the date of observation.

 

Hope the query is clear by now.

 

I am not able to think of the way to attempt at them so I have reached out directly on the forum.

 

Regards,

 

Sandeep

 

Grand Advisor
Posts: 10,210

Re: Credit Risk Model - Data Preparation using SAS Code


Skb19121985 wrote:

Hi Reeza

 

Thanks for replying back.

Actually, I have 3 questions as listed out.

 

Secondly, once, the defaulted acct_id at a given observation date is identified, I wanted to flag it as written-off if the BQR="W" in the next 12 months

 


"Next 12 months" from when, the period that you may be looking at or 12 months from the date of default.

 

I hope you provide examples of the input and output. I'm not going to look in the Excel for the reasons in my other response.

Grand Advisor
Posts: 17,332

Re: Credit Risk Model - Data Preparation using SAS Code

Solve Q3, then 1 and 2 become trivial. 

 

Q3 -> LAG<n> function

Q2/Q1 -> WHICHC function. 

 

Untested code below

 

Data answers;
Set have;
By ID;

Array _bqr(*) $ bqr1-bqr12;

If first.id then count=1;
Else count + 1;

Bqr1 = lag(bqr);
Bqr2 = lag2(bqr2);
...
Bqr12 = lag12(bqr);

If count < 12 then do;
Do I=12 to count by -1;
_bqr(I) =.;
End;
End;

X=whichc('W', of _bqr(*));
Y=whichc('9', of _bqr(*));

Run;
Occasional Contributor
Posts: 15

Re: Credit Risk Model - Data Preparation using SAS Code

Hi Reeza

 

Thanks for the reply, but this is not what I wanted.

 

Let me clarify further. Let me focus on 1st query. I will take up remaining 2 queries in next thread one-by-one.

 

Query1:

 

I want to create default_flag for row1 - acct_id = 123 BQR=4 and Snapshot_dte = 01JAN1990

This default flag is created in such a way that in next 12 month , that is SAS would read all records related to this acct_id from 01JAN1990 to 01DEC1990 and check if BQR=9 ever in this 12 month time and flag the variable default_flag = 1 , if found else 0 if not found against row1

 

Similarly, when row2 - acct_id = 123 BQR=5 and Snapshot_dte = 01FEB1990 is read, the default flag is created in such a way that in next 12 month , that is SAS would read all records related to this acct_id from 01FEB1990 to 01JAN1991 and check if BQR=9 ever in this 12 month time and flag the variable default_flag = 1 , if found else 0 if not found against row2

 

This process is to be executed for each row forward looking into 12 months data and creating default_flag

 

Hope the query is clear.

 

Let me know if it is not clear

 

That means, the SAS code should be such that internally it loop for next 12 months per row and create default flag

 

Grand Advisor
Posts: 17,332

Re: Credit Risk Model - Data Preparation using SAS Code

@Skb19121985 That is how I understood your question. I guess you don't understand my response. 

Occasional Contributor
Posts: 15

Re: Credit Risk Model - Data Preparation using SAS Code

Hi Reeza

 

I am almost near to what I need.

 

Please have a look at the given SAS Code:

%macro zest();
Data answers;
Set credit_data;
By acct_id;
Array _bqr(*) $ BQR1-BQR12;
If first.acct_id then count=1;
Else count + 1;

%do i = 1 %to 12 ;
if eof&i. = 0 then
set credit_data (firstobs=&i. keep=BQR rename=(BQR=BQR&i.)) end=eof&i. ;
if last.acct_id then BQR&i. = "";
%end ;
write_off_flag=whichc('W', of _bqr(*));
default_flag=whichc('9', of _bqr(*));
count_max = max(count);
Run ;
%mend;

%zest

 

However you will notice that in the dataset that gets created, row number 25 to 34 , all the values below the diagonal W are picked up from the acct_id 133 data, when they should all be blank

 

Similarly, for row number 61 to 71, all the values below the diagonal W are W, when they should all be blank

 

Can u please rectify the code ?

Grand Advisor
Posts: 10,210

Re: Credit Risk Model - Data Preparation using SAS Code

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

Occasional Contributor
Posts: 15

Re: Credit Risk Model - Data Preparation using SAS Code

Please find the SAS dataset code here:

 

data credit_data ;
input acct_id $3. snapshot_date 6. BQR $8. ;
cards;
123 9001 4
123 9002 5
123 9003 6
123 9004 7
123 9005 8
123 9006 4
123 9007 4
123 9008 5
123 9009 6
123 9010 7
123 9011 8
123 9012 3
123 9101 1
123 9102 2
123 9103 3
123 9104 4
123 9105 5
123 9106 4
123 9107 5
123 9108 7
123 9109 9
123 9110 9
123 9111 9
123 9112 9
123 9201 9
123 9202 9
123 9203 9
123 9204 9
123 9205 9
123 9206 9
123 9207 9
123 9208 9
123 9209 9
123 9210 9
123 9211 W
133 9101 1
133 9102 1
133 9103 1
133 9104 1
133 9105 2
133 9106 3
133 9107 4
133 9108 5
133 9109 4
133 9110 3
133 9111 5
133 9112 6
133 9201 7
133 9202 8
133 9203 7
133 9204 6
133 9205 5
133 9206 4
133 9207 5
133 9208 6
133 9209 7
133 9210 9
133 9211 9
133 9212 9
133 9301 9
133 9302 9
133 9303 9
133 9304 9
133 9305 8
133 9306 8
133 9307 8
133 9308 8
133 9309 8
133 9310 8
133 9311 9
133 9312 W
;
Run ;

Data credit_data (keep=snapshot_dte acct_id BQR BQR_1-BQR_12 default_flag write_off_flag) ;
set credit_data  ;
snapshot_dte = mdy(mod(snapshot_date,100),1,int(snapshot_date/100));
format snapshot_dte date9. ;
default_flag = . ;
write_off_flag = . ;
array BQRD(12) BQR_1-BQR_12 ;
Run ;

 

Hope this helps in proceeding further.

Respected Advisor
Posts: 3,063

Re: Credit Risk Model - Data Preparation using SAS Code

I work with this type of data a lot and find that SQL is a pretty good way of handling it. I would start by deriving default and writeoff dates for each account and go from there:

 

proc sql;
  create table want as 
  select  H1.acct_id
         ,min(H1.snapshot_date) as Start_Date format = date9.
         ,H2.Default_Date format = date9.
         ,H3.Writeoff_Date format = date9.
  from credit_data as H1

  left join
  (select acct_id 
         ,min(snapshot_date) as Default_Date
   from credit_data
   where BQR = '9'
   group by acct_id
  )  as H2
  on H1.acct_id = H2.acct_id

  left join
  (select acct_id 
         ,min(snapshot_date) as Writeoff_Date
   from credit_data
   where BQR = 'W'
   group by acct_id
  )  as H3
  on H1.acct_id = H3.acct_id

  group by H1.acct_id
  ;
quit;
Occasional Contributor
Posts: 15

Re: Credit Risk Model - Data Preparation using SAS Code

Hi

 

Thanks for the reply.

It is great to see that the default_date and write_off_date could be figured out for each acct_id through this code.

 

Please note that my query is a bit different.

1) For each observation date corresponding to a given acct_id (consider this as first date of observation and then look forward into next 12 months data), I am expected to check in the next 12 months including the given observation month  for BQR , if it gets a value 9 then declare that acct_id at the given date to have a default_flag = 1 else 0 against that record
 
2) As soon as an acct_id gets a default_flag = 1, I am expected to check further if in the next 60 months, BQR gets a value "W". If it gets a value "W", then I have to declare that acct_id at that given month to have write_off_flag = 1 else 0
 
3) I am also expected to create forward looking variables BQR_1 to BQR_12 and populate values from the variable BQR for the next 12 months
 
I have tried to create a macro, however there is an issue.
 

%macro zest();
Data answers;
Set credit_data;
By acct_id;
Array _bqr(*) $ BQR1-BQR12;
If first.acct_id then count=1;
Else count + 1;

%do i = 1 %to 12 ;
if eof&i. = 0 then
set credit_data (firstobs=&i. keep=BQR rename=(BQR=BQR&i.)) end=eof&i. ;
if last.acct_id then BQR&i. = "";
%end ;
write_off_flag=whichc('W', of _bqr(*));
default_flag=whichc('9', of _bqr(*));
count_max = max(count);
Run ;
%mend;

%zest

 

Whereever the difference between the date of observance and the last date for each acct_id is less than 12 months, the code is trying to overstep and evaluate the data from the next acct_id.
 
I want this code to be suitably modified such that forward looking variables BQR1-BQR12 is correctly populated. To further clarify, if the difference between the date of observance (nth record) & last date of acct_id is less than 12, say 11 then in this case BQR12 should be populated as BQR12 = . against nth record.
 
Hope my query is clear.
 
If you run the above macro on the dataset SAS code given in the chain message, and check for the dataset named "answers" and observe BQR1-BQR12 aginst each nth record in the dataset, you will get to know what is exactly happening.
 
 
Solution
3 weeks ago
Respected Advisor
Posts: 3,831

Re: Credit Risk Model - Data Preparation using SAS Code

[ Edited ]

@Skb19121985

Below some code which should return the result you're after.

Looking into your Excel I don't understand why the write_off_flg gets only so late set to '1'. That's certainly later than 60 months ahead of the date where BQR gets a value of 'W'.

 

data have;
  input date:anydtdte. acct_id:$16. BQR:$1.;
  format date date9.;
datalines;
1-Jan-90 123 4
1-Feb-90 123 5
1-Mar-90 123 6
1-Apr-90 123 7
1-May-90 123 8
1-Jun-90 123 4
1-Jul-90 123 4
1-Aug-90 123 5
1-Sep-90 123 6
1-Oct-90 123 7
1-Nov-90 123 8
1-Dec-90 123 3
1-Jan-91 123 1
1-Feb-91 123 2
1-Mar-91 123 3
1-Apr-91 123 4
1-May-91 123 5
1-Jun-91 123 4
1-Jul-91 123 5
1-Aug-91 123 7
1-Sep-91 123 9
1-Oct-91 123 9
1-Nov-91 123 9
1-Dec-91 123 9
1-Jan-92 123 9
1-Feb-92 123 9
1-Mar-92 123 9
1-Apr-92 123 9
1-May-92 123 9
1-Jun-92 123 9
1-Jul-92 123 9
1-Aug-92 123 9
1-Sep-92 123 9
1-Oct-92 123 9
1-Nov-92 123 W
1-Sep-90 133 1
1-Oct-90 133 1
1-Nov-90 133 1
1-Dec-90 133 1
1-Jan-91 133 2
1-Feb-91 133 3
1-Mar-91 133 4
1-Apr-91 133 5
1-May-91 133 4
1-Jun-91 133 3
1-Jul-91 133 5
1-Aug-91 133 6
1-Sep-91 133 7
1-Oct-91 133 8
1-Nov-91 133 7
1-Dec-91 133 6
1-Jan-92 133 5
1-Feb-92 133 4
1-Mar-92 133 5
1-Apr-92 133 6
1-May-92 133 7
1-Jun-92 133 9
1-Jul-92 133 9
1-Aug-92 133 9
1-Sep-92 133 9
1-Oct-92 133 9
1-Nov-92 133 9
1-Dec-92 133 9
1-Jan-93 133 8
1-Feb-93 133 8
1-Mar-93 133 8
1-Apr-93 133 8
1-May-93 133 8
1-Jun-93 133 8
1-Jul-93 133 9
1-Aug-93 133 W
;
run;

%let write_of_months=60;
data answers(drop=_:);
  set have;
  by acct_id date;

  retain default_flg '0' write_off_flag '0';

  if first.acct_id=1 then
    do;
      dcl hash h1(dataset:cats('have (where=(acct_id="',acct_id,'"))'),ordered:'y');
      h1.defineKey('acct_id','date');
      h1.defineData('bqr');
      h1.defineDone();
      declare hiter hh1 ('h1');

      default_flg='0';
      write_off_flag='0';
    end;
  
  /* populate bqr1 to bqr12 and default_flg */
  array _arr_bqr(*) $1 bqr1-bqr12;
  call missing(of _arr_bqr(*));
  _rc = hh1.setcur(key:acct_id,key:date);
  do _i=1 to 12 while (_rc = 0);
    _arr_bqr[_i]=bqr; 
    if bqr='9' then default_flg='1';
    _rc = hh1.next();
  end;

  /* populate write_off_flag */
  if write_off_flag='0' and default_flg='1' then
    do;
      _rc = hh1.setcur(key:acct_id,key:date);
      do _i=1 to &write_of_months while (_rc = 0);
        if bqr='W' then 
          do;
            write_off_flag='1';
            leave;
          end; 
        _rc = hh1.next();
      end;
    end;

  _rc = hh1.setcur(key:acct_id,key:date);

  if last.acct_id then h1.delete();
run;

Should you be dealing with large data volumes then tweaking for performance would be possible for the posted code - but it would complicate the syntax so it's only something I'd be doing if really necessary.

 

Occasional Contributor
Posts: 15

Re: Credit Risk Model - Data Preparation using SAS Code

Hi Patrick

 

You are incredible !

Thanks for the help.

This is what is required.

Actually, to determine if an acct_id  that has already been declared defaulted (basis BQR getting "9" in 12 months performance window) is written-off at a given observation date, BQR is observed for "W" in the next 12 months beginning the observation date with default_flag = 1in our practice instead of 60 as stated in my query.

 

I am thinking of twigging the SAS statement such that the following is included in the program:

 

%let write_of_months=12;

 

The SAS code is working as required.

 

Yes, I will be using this SAS code on 300 GB dataset. Will it be time-taking ? If Yes, kindly help me with complicated syntax that you have stated in your reply, if possible.

 

Also, guide me with the study material that would help me become an expert SAS programmer just like you. As of now, I am trying to get hold of key words that you have used in the SAS program that you have shared on my post :-)

 

Regards.

Respected Advisor
Posts: 3,063

Re: Credit Risk Model - Data Preparation using SAS Code

IMO I don't see the need to create variables BQR_1 to BQR_12 for the next 12 months when a simple default date and writeoff date will give you the same information but take up less disk space. If you are required to do this because of a data spec that's fine - just pointing out there are other simpler ways to approach this.

 

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 17 replies
  • 251 views
  • 3 likes
  • 5 in conversation