Help using Base SAS procedures

How to subset a dataset when multiple records exist per subject

Reply
Super Contributor
Posts: 338

How to subset a dataset when multiple records exist per subject

Hi SAS Community,

I need to separate the account numbers of the attached file with this condition.

if Account_number ^= "0" and transit="Retail" and product in ("Personal Loan", "Personal OD", "Res. Mortgage");

The complexity is some accounts satisfy this condition only in certain months of the account's life. For e.g. account_number  No. 1111111111 satisfies the condition only in months of 31OCT2010 and 31JAN2012.

But I need only accounts those satisfy our condition across their entire life cycle.

Only account 44444444 satisfies this condition. So, a code is required to separate only the four records belong to account 44444444.

I have tried the “subsetting if” like below but obviously it is not clever enough.

data want;

     set have;

     if Account_number ^= "0" and transit="Retail" and product in ("Personal Loan", "Personal OD", "Res. Mortgage");

run;

Could anyone help me.

Mirisage

Thank you

Mirisage

Attachment
Super User
Posts: 10,046

Re: How to subset a dataset when multiple records exist per subject

It looks like you are trying to filter obs and also want to filter groups. SQL is a good tool.

data have;
informat current_date date9.;
input current_date account_number $ 13-22 transit $ 25-32 product $ 37-49  Arrears_Band $ 53-59;
format current_date date9.;
datalines;
28FEB2010   0           Retail      Personal Loan   Current
31MAR2010   0           Retail      Personal Loan   Current
31OCT2010   1111111111  Retail      Personal Loan   1 - 30 
30NOV2010   1111111111  Small Bu    Personal Loan   Current
31DEC2010   1111111111  Small Bu    Personal Loan   Current
31JAN2011   1111111111  Small Bu    Personal Loan   Current
28FEB2011   1111111111  Small Bu    Personal Loan   Current
31MAR2011   1111111111  Small Bu    Personal Loan   Current
30APR2011   1111111111  Small Bu    Personal Loan   Current
31MAY2011   1111111111  Wealth      Personal Loan   Current
30JUN2011   1111111111  Small Bu    Personal Loan   Current
31JUL2011   1111111111  Small Bu    Personal Loan   1 - 30
31AUG2011   1111111111  Small Bu    Personal Loan   Current
30SEP2011   1111111111  Small Bu    Personal Loan   Current
31OCT2011   1111111111  Small Bu    Personal Loan   Current
30NOV2011   1111111111  Small Bu    Personal Loan   Current
31DEC2011   1111111111  Retail      Corporate       Current
31JAN2012   1111111111  Retail      Personal Loan   Current
28FEB2010   2222222     Wealth      Personal OD     Current
31MAR2010   2222222     Wealth      Personal OD     Current
30APR2010   2222222     Wealth      Personal OD         Current
31MAY2010   2222222     Wealth      Personal OD         Current
30JUN2010   2222222     Wealth      Personal OD         Current
31JUL2010   2222222     Wealth      Personal OD         Current
31AUG2010   2222222     Wealth      Personal OD         Current
30SEP2010   2222222     Wealth      Personal OD         Current
30NOV2010   2222222     Retail      Personal OD         Current
31DEC2010   2222222     Retail      Personal OD         Current
31JAN2011   2222222     Retail      Personal OD         Current
28FEB2011   2222222     Wealth      Personal OD         Current
31MAR2011   2222222     Wealth      Personal OD         Current
30APR2011   2222222     Retail      Personal OD         Current
31MAY2011   2222222     Retail      Personal OD         Current
30JUN2011   2222222     Retail      Personal OD         Current
31JUL2011   2222222     Retail      Personal OD         Current
31AUG2011   2222222     Retail      Personal OD         Current
30SEP2011   2222222     Small Bu    Personal OD     Current
28FEB2010   3333333333  Wealth      Corporate       Current
31MAR2010   3333333333  Wealth      Corporate       Current
30APR2010   3333333333  Wealth      Corporate       Current
31MAY2010   3333333333  Retail      Res. Mortgage   Current
30JUN2010   3333333333  Retail      Res. Mortgage   Current
31JUL2010   3333333333  Retail      Res. Mortgage   Current
31AUG2010   3333333333  Retail      Res. Mortgage   Current
30SEP2010   3333333333  Retail      Res. Mortgage   Current
31OCT2010   3333333333  Retail      Res. Mortgage   Current
30NOV2010   3333333333  Retail      Res. Mortgage   Current
31DEC2010   3333333333  Retail      Res. Mortgage   Current
31JAN2011   3333333333  Retail      Res. Mortgage   Current
28FEB2011   3333333333  Retail      Res. Mortgage   Current
31MAR2011   3333333333  Retail      Res. Mortgage   Current
30APR2011   3333333333  Retail      Res. Mortgage   Current
31MAY2011   3333333333  Retail      Res. Mortgage   Current
30JUN2011   3333333333  Retail      Res. Mortgage   Current
31JUL2011   3333333333  Retail      Res. Mortgage   Current
31AUG2011   3333333333  Retail      Res. Mortgage   Current
30SEP2011   3333333333  Retail      Res. Mortgage   Current
31OCT2011   4444444444  Retail      Personal Loan   Current
30NOV2011   4444444444  Retail      Res. Mortgage   Current
31DEC2011   4444444444  Retail      Personal OD     Current
31JAN2012   4444444444  Retail      Personal OD     Current
;
run;

proc sql;
create table want as
 select *
  from have 
   where Account_number ^= "0"
    group by  Account_number
     having count(*) eq  sum(transit="Retail" and product in ("Personal Loan", "Personal OD", "Res. Mortgage")) ;
quit;







Ksharp

Super Contributor
Posts: 338

Re: How to subset a dataset when multiple records exist per subject

Hi Ksharp,

Thank you very much for this help.

Your code anabled me to identify shortcomings of "subsetting if" approach in this example.

Thanks again!

Mirisage


Ask a Question
Discussion stats
  • 2 replies
  • 191 views
  • 0 likes
  • 2 in conversation