Solved: Re: How could I identify and remove accounts whose arrears bands remai...

Mirisage · Posted 10-26-2012 05:20 PM

Hi SAS Community,

Thank every one of you for all supports.

I am asking a help again.

In this data set, I have 8 distinct accounts in several banks (colored for clarity).

data have;

informat current_date date9.;

input Bank_number Account_number $ 5-7 Current_date Arrears_Band $22-28;

format current_date date9.;

cards;

10 111 30SEP2010 NPNA

10 111 31OCT2010 Current

10 111 30NOV2010 NPNA

10 111 30JUN2011 NPNA

10 111 01JAN2012 writoff

70 111 31DEC2011 NPNA

70 111 31JAN2012 NPNA

40 333 30NOV2010 Current

40 333 31DEC2010 NPNA

50 333 31JAN2011 NPNA

50 333 28FEB2011 writoff

20 333 31MAR2011 NPNA

20 333 30APR2011 NPNA

20 333 31MAY2011 NPNA

20 333 31AUG2011 NPNA

100 111 30APR2011 90 +

100 111 31MAY2011 90 +

100 111 31JUL2011 NPNA

100 111 31AUG2011 NPNA

100 111 04JAN2012 NPNA

80 111 31DEC2011 NPNA

40 555 08FEB2010 30 - 60

40 555 31MAR2010 Current

40 555 30APR2010 60 - 90

40 555 31MAY2010 Current

40 666 03JAN2012 1 - 30

;

run;

Question 1:

First I need to identify what are the accounts whose arrears_band = ‘NPNA’ throughout their life.

/*Answer to question 1*/

Lets call this table as Table_for_Q1

70 111 31DEC2011 NPNA

70 111 31JAN2012 NPNA

20 333 31MAR2011 NPNA

20 333 30APR2011 NPNA

20 333 31MAY2011 NPNA

20 333 31AUG2011 NPNA

80 111 31DEC2011 NPNA

Question 2:

Then I need to get a table having removed all of those account_numbers identified under Question 1 from original "have" table.

/*Answer to question 2*/

Lets call this table as Table_for_Q2

10 111 30SEP2010 NPNA

10 111 31OCT2010 Current

10 111 30NOV2010 NPNA

10 111 30JUN2011 NPNA

10 111 01JAN2012 writoff

40 333 30NOV2010 Current

40 333 31DEC2010 NPNA

50 333 31JAN2011 NPNA

50 333 28FEB2011 writoff

100 111 30APR2011 90 +

100 111 31MAY2011 90 +

100 111 31JUL2011 NPNA

100 111 31AUG2011 NPNA

100 111 04JAN2012 NPNA

40 555 08FEB2010 30 - 60

40 555 31MAR2010 Current

40 555 30APR2010 60 - 90

40 555 31MAY2010 Current

40 666 03JAN2012 1 - 30

What I have tried

/*Sorting by 3 variables that makes a unique record*/

proc sort data=have out=have_sorted;

by bank_number account_number current_date;

run;

/*flagging*/

data want;

set have_sorted;

if arrears_band ='NPNA' then flag=1;

else flag=0;

run;

Actually I cannot imagine where to go from here……

Just I only know I can do the following merging step to get the answer to question 2 which is Table_for_Q2, if I can find Table_for_Q1.

/*Merging*/

Proc sort data = have;

by bank_number account_number current_date;

run;

Proc sort data = Table_for_Q1;

by bank_number account_number current_date;

run;

Data Table_for_Q2;

Merge have (in=a)

Table_for_Q1 (in=b)

;

By bank_number account_number current_date;

If a and not Table_for_Q1

;

Run;

I would really appreciate if anyone of you could help me.

Thanks

Mirisage

Haikuo · Posted 10-26-2012 05:38 PM

Seems to me it can be done using straightforward SQL:

data have;

informat current_date date9.;

input Bank_number Account_number $ 5-7 Current_date Arrears_Band $22-28;

format current_date date9.;

cards;

10 111 30SEP2010 NPNA

10 111 31OCT2010 Current

10 111 30NOV2010 NPNA

10 111 30JUN2011 NPNA

10 111 01JAN2012 writoff

70 111 31DEC2011 NPNA

70 111 31JAN2012 NPNA

40 333 30NOV2010 Current

40 333 31DEC2010 NPNA

50 333 31JAN2011 NPNA

50 333 28FEB2011 writoff

20 333 31MAR2011 NPNA

20 333 30APR2011 NPNA

20 333 31MAY2011 NPNA

20 333 31AUG2011 NPNA

100 111 30APR2011 90 +

100 111 31MAY2011 90 +

100 111 31JUL2011 NPNA

100 111 31AUG2011 NPNA

100 111 04JAN2012 NPNA

80 111 31DEC2011 NPNA

40 555 08FEB2010 30 - 60

40 555 31MAR2010 Current

40 555 30APR2010 60 - 90

40 555 31MAY2010 Current

40 666 03JAN2012 1 - 30

;

/*Q1*/

proc sql;

select * from have group by Bank_number having sum(Arrears_Band ne 'NPNA')=0;quit;

/*Q2*/

proc sql;

select * from have

except

(select * from have group by Bank_number having sum(Arrears_Band ne 'NPNA')=0)

order by Bank_number;

quit;

Haikuo

View solution in original post

Haikuo · Posted 10-26-2012 05:38 PM

Seems to me it can be done using straightforward SQL:

data have;

informat current_date date9.;

input Bank_number Account_number $ 5-7 Current_date Arrears_Band $22-28;

format current_date date9.;

cards;

10 111 30SEP2010 NPNA

10 111 31OCT2010 Current

10 111 30NOV2010 NPNA

10 111 30JUN2011 NPNA

10 111 01JAN2012 writoff

70 111 31DEC2011 NPNA

70 111 31JAN2012 NPNA

40 333 30NOV2010 Current

40 333 31DEC2010 NPNA

50 333 31JAN2011 NPNA

50 333 28FEB2011 writoff

20 333 31MAR2011 NPNA

20 333 30APR2011 NPNA

20 333 31MAY2011 NPNA

20 333 31AUG2011 NPNA

100 111 30APR2011 90 +

100 111 31MAY2011 90 +

100 111 31JUL2011 NPNA

100 111 31AUG2011 NPNA

100 111 04JAN2012 NPNA

80 111 31DEC2011 NPNA

40 555 08FEB2010 30 - 60

40 555 31MAR2010 Current

40 555 30APR2010 60 - 90

40 555 31MAY2010 Current

40 666 03JAN2012 1 - 30

;

/*Q1*/

proc sql;

select * from have group by Bank_number having sum(Arrears_Band ne 'NPNA')=0;quit;

/*Q2*/

proc sql;

select * from have

except

(select * from have group by Bank_number having sum(Arrears_Band ne 'NPNA')=0)

order by Bank_number;

quit;

Haikuo

Linlin · Posted 10-26-2012 06:06 PM

Haikuo,

How are you doing? I like your code. I added a group variable "account_number" to your code.

/*Q1*/

proc sql;

create table q1 as

select * from have group by Bank_number,account_number

having sum(Arrears_Band ne 'NPNA')=0;quit;

/*Q2*/

proc sql;

create table q2 as

select * from have

except

(select * from have group by Bank_number,account_number having sum(Arrears_Band ne 'NPNA')=0)

order by Bank_number;

quit;

Mirisage · Posted 10-29-2012 10:43 AM

Hi Haikuo and Linlin,

I have applied the two codes to the follwing test data set named 'one' but it is strange that follwoing 3 accoutns are not identified as continuing NPNA accounts.

/*

Is there any reason?

10 28Feb2010 2222 9476 Personal OD NPNA

10 31Mar2010 2222 9635 Personal OD NPNA

10 30Apr2010 2222 9791 Personal OD NPNA*/

data one;

informat Current_date date9.;

input Bank_number Current_date Account_number $ 14 -17 Balance Product $ 24 -36 Arrears_Band $ 38-44;

format Current_date date9.;

cards;

10 28Feb2010 1111 2216 Personal Loan NPNA

10 31Mar2010 1111 2216 Personal Loan NPNA

10 30Apr2010 1111 2216 Personal Loan NPNA

10 31May2010 1111 2216 Personal Loan NPNA

10 30Jun2010 1111 2216 Personal Loan NPNA

10 06Jul2010 1111 2216 Personal Loan writoff

10 28Feb2010 2222 9476 Personal OD NPNA

10 31Mar2010 2222 9635 Personal OD NPNA

10 30Apr2010 2222 9791 Personal OD NPNA

30 28Feb2010 1111 3420 Res. Mortgage NPNA

30 31Mar2010 1111 3308 Res. Mortgage NPNA

30 30Jun2011 1111 2614 Res. Mortgage NPNA

30 31Jul2011 1111 2599 Res. Mortgage 1 - 30

30 31Aug2011 1111 2599 Res. Mortgage Current

30 30Sep2011 1111 2599 Res. Mortgage Current

30 31Oct2011 1111 2599 Res. Mortgage 1 - 30

30 30Nov2011 1111 2591 Res. Mortgage Current

30 31Dec2011 1111 2584 Res. Mortgage Current

30 31Jan2012 1111 2584 Res. Mortgage Current

40 28Feb2010 1111 1492 Personal Loan NPNA

40 31Mar2010 1111 1487 Personal Loan NPNA

40 31Aug2010 1111 1425 Personal Loan NPNA

40 30Sep2010 1111 7226 Personal Loan writoff

40 28Feb2011 1111 3510 Personal Loan NPNA

40 31Mar2011 1111 3510 Personal Loan NPNA

40 30Apr2011 1111 3510 Personal Loan NPNA

40 31May2011 1111 3510 Personal Loan NPNA

;

run;

/*identification of continuing NPNA accounts*/

proc sql;

create table q1 as

select * from one group by Bank_number,account_number

having sum(Arrears_Band ne 'NPNA')=0;quit;

/*Removing of continuing NPNA accounts*/

proc sql;

create table temp_1 as

select * from one

except

(select * from one group by Bank_number,account_number having sum(Arrears_Band ne 'NPNA')=0)

order by Bank_number;

quit;

Thanks

Mirisage

Linlin · Posted 10-29-2012 11:55 AM

Hi Mirisage,

I ran the code. Below are the output and log file:

Current_ Bank_ Account_ Arrears_

Obs date number number Balance Product Band

1 28FEB2010 10 2222 9476 Personal OD NPNA

2 30APR2010 10 2222 9791 Personal OD NPNA

3 31MAR2010 10 2222 9635 Personal OD NPNA

1 data one;

2 informat Current_date date9.;

3 input Bank_number Current_date Account_number $ 14 -17 Balance

3 ! Product $ 24 -36 Arrears_Band $ 38-44;

4 format Current_date date9.;

5 cards;

NOTE: The data set WORK.ONE has 27 observations and 6 variables.

NOTE: DATA statement used (Total process time):

real time 0.02 seconds

cpu time 0.01 seconds

33 ;

34 run;

35

36 /*identification of continuing NPNA accounts*/

37 proc sql;

38 create table q1 as

39 select * from one group by Bank_number,account_number

40 having sum(Arrears_Band ne 'NPNA')=0;

NOTE: The query requires remerging summary statistics back with the original data.

NOTE: Table WORK.Q1 created, with 3 rows and 6 columns.

40 ! quit;

NOTE: PROCEDURE SQL used (Total process time):

real time 0.04 seconds

cpu time 0.04 seconds

41

42 proc print data=q1;run;

NOTE: There were 3 observations read from the data set WORK.Q1.

NOTE: PROCEDURE PRINT used (Total process time):

real time 0.19 seconds

cpu time 0.04 seconds

Mirisage · Posted 10-29-2012 09:41 PM

Hi Linlin and Haikuo,

My bad. I apologize. Just after you (linlin) posted your results of running the code on my new data set, I have tried it. It worked like anything.

I really do not know what happened to me in my previous occasion.

Many thanks to both of you for this great code help.

Regards

Mirisage

How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Catch up on SAS Innovate 2026

How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Re: How could I identify and remove accounts whose arrears bands remain as NPNA?

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away