BookmarkSubscribeRSS Feed
Ronein
Onyx | Level 15

Hello

I have a data set that show the ownership of bank accounts.

There are 2 columns: 

Account- Key number of the bank account

ID - Identification  number of the owner of the bank account

My task is to classify each bank account into one of the following groups:

1- Bank account has only one ownership and the ID doesn't appear in other bank accounts

2- Bank account has only one ownership and the ID  appear in other bank accounts

3-Bank account has more than one ID ownership and all ID's don't appear in other bank accounts.

4-Bank account has more than one ID ownership and at least one  ID  appear in other bank accounts.

May anyone show a code to do it please?

Data rawtbl;
input account ID $ ;
cards;
111 X
111 Y
222 X
333 Y
444 Y
444 R
789 R
555 Q
666 V
666 T
777 A
888 A
999 A
999 B
123 B
;
Run;


2-Bank account has more than one ownership and the ID's doesn't appear in other bank accounts.

 

 

5 REPLIES 5
Kurt_Bremser
Super User

PLEASE do us a big, big favor and name your example datasets "have", like we all do; it is really painful to realize that you name it "rawtbl" when the code crashes on the first test run.

First, create an intermediate table that contains all ID's and a flag for multiple usage:

proc sql;
create table id as
  select
    id,
    case
      when count(*) gt 1
      then 1
      else 0
    end as id_mult
  from have
  group by id
;
quit;

Next, sort the have dataset by account:

proc sort data=have;
by account;
run;

so we can use BY in the final data step:

data want;
set have;
by account;
if _n_ = 1
then do;
  declare hash idm (dataset:"id");
  idm.definekey("id");
  idm.definedata("id_mult");
  idm.definedone();
  call missing(id_mult);
end;
retain mult_id;
if first.account then mult_id = 0;
rc = idm.find();
mult_id = max(mult_id,id_mult);
if last.account;
mult_acct = not first.account;
keep account mult_acct mult_id;
run;

The intermediate table is loaded into a hash object for lookup.

sbxkoenk
SAS Super FREQ

Hello,

PROC FREQ gives all the info you are looking for.

Can you go on from there?

Data rawtbl;
input account ID $ ;
cards;
111 X
111 Y
222 X
333 Y
444 Y
444 R
789 R
555 Q
666 V
666 T
777 A
888 A
999 A
999 B
123 B
;
run;

/*
1- Bank account has only one ownership and the ID doesn't appear in other bank accounts
2- Bank account has only one ownership and the ID  appear in other bank accounts
3- Bank account has more than one ID ownership and all ID's don't appear in other bank accounts.
4- Bank account has more than one ID ownership and at least one  ID  appear in other bank accounts.
*/
ods trace off;
ods output CrossTabFreqs=work.CrossTabFreqs;
PROC FREQ data=rawtbl noprint;
 tables account * ID / missing out=count_account_ID;
run;
/* end of program */

Good luck,

Koen

Patrick
Opal | Level 21

@sbxkoenk I feel you need to explain your line of thought a bit more. The OP basically asks for 4 flag variables. What you propose is the result of a proc freq that has the same number of rows than the source table. Not sure how that helps.

sbxkoenk
SAS Super FREQ

Hello @Patrick ,

 

You should build upon the ODS output dataset.

ods output CrossTabFreqs=work.CrossTabFreqs;

If not solved this evening (after the work day) I will complete my program.

I haven't thought it through for the full 100%, so I maybe wrong that all 4 flag-variables can be derived from CrossTabFreqs.

 

Cheers,

Koen

andreas_lds
Jade | Level 19

Not sure what do you expect as result exactly, so i created a variable group having the values 1 to 4 to identify the groups you have defined.

And i assumed, that the data is grouped by account.

/* Get number of accounts per ID */
proc summary data= work.rawtbl nway;
   class id;
   output out= work.Ids(drop= _type_ rename= (_freq_ = count));
run;

/* Using work.Ids as hash-object */
data want;
   if 0 then set work.Ids;

   set work.rawtbl;
   by account notsorted;

   length group 8 multiOwner multiAccount 8;
   retain multiAccount;

   if _n_ = 1 then do;
      declare hash owner(dataset: 'work.Ids');
      owner.defineKey('id');
      owner.defineData('count');
      owner.defineDone();
   end;

   if first.account then do;
      multiAccount = 1;
   end;

   multiOwner = not (first.account and last.account);
   rc = owner.find();
   multiAccount = multiAccount and (count > 1);

   if last.account then do;
      group = input(cats(multiOwner, multiAccount), binary2.) + 1;
      output;
   end;
   
   drop count rc id;
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1389 views
  • 1 like
  • 5 in conversation