BookmarkSubscribeRSS Feed
Ronein
Onyx | Level 15

Hello

I have a data set that show the ownership of bank accounts.

There are 2 columns: 

Account- Key number of the bank account

ID - Identification  number of the owner of the bank account

My task is to classify each bank account into one of the following groups:

1- Bank account has only one ownership and the ID doesn't appear in other bank accounts

2- Bank account has only one ownership and the ID  appear in other bank accounts

3-Bank account has more than one ID ownership and all ID's don't appear in other bank accounts.

4-Bank account has more than one ID ownership and at least one  ID  appear in other bank accounts.

May anyone show a code to do it please?

Data rawtbl;
input account ID $ ;
cards;
111 X
111 Y
222 X
333 Y
444 Y
444 R
789 R
555 Q
666 V
666 T
777 A
888 A
999 A
999 B
123 B
;
Run;


2-Bank account has more than one ownership and the ID's doesn't appear in other bank accounts.

 

 

5 REPLIES 5
Kurt_Bremser
Super User

PLEASE do us a big, big favor and name your example datasets "have", like we all do; it is really painful to realize that you name it "rawtbl" when the code crashes on the first test run.

First, create an intermediate table that contains all ID's and a flag for multiple usage:

proc sql;
create table id as
  select
    id,
    case
      when count(*) gt 1
      then 1
      else 0
    end as id_mult
  from have
  group by id
;
quit;

Next, sort the have dataset by account:

proc sort data=have;
by account;
run;

so we can use BY in the final data step:

data want;
set have;
by account;
if _n_ = 1
then do;
  declare hash idm (dataset:"id");
  idm.definekey("id");
  idm.definedata("id_mult");
  idm.definedone();
  call missing(id_mult);
end;
retain mult_id;
if first.account then mult_id = 0;
rc = idm.find();
mult_id = max(mult_id,id_mult);
if last.account;
mult_acct = not first.account;
keep account mult_acct mult_id;
run;

The intermediate table is loaded into a hash object for lookup.

sbxkoenk
SAS Super FREQ

Hello,

PROC FREQ gives all the info you are looking for.

Can you go on from there?

Data rawtbl;
input account ID $ ;
cards;
111 X
111 Y
222 X
333 Y
444 Y
444 R
789 R
555 Q
666 V
666 T
777 A
888 A
999 A
999 B
123 B
;
run;

/*
1- Bank account has only one ownership and the ID doesn't appear in other bank accounts
2- Bank account has only one ownership and the ID  appear in other bank accounts
3- Bank account has more than one ID ownership and all ID's don't appear in other bank accounts.
4- Bank account has more than one ID ownership and at least one  ID  appear in other bank accounts.
*/
ods trace off;
ods output CrossTabFreqs=work.CrossTabFreqs;
PROC FREQ data=rawtbl noprint;
 tables account * ID / missing out=count_account_ID;
run;
/* end of program */

Good luck,

Koen

Patrick
Opal | Level 21

@sbxkoenk I feel you need to explain your line of thought a bit more. The OP basically asks for 4 flag variables. What you propose is the result of a proc freq that has the same number of rows than the source table. Not sure how that helps.

sbxkoenk
SAS Super FREQ

Hello @Patrick ,

 

You should build upon the ODS output dataset.

ods output CrossTabFreqs=work.CrossTabFreqs;

If not solved this evening (after the work day) I will complete my program.

I haven't thought it through for the full 100%, so I maybe wrong that all 4 flag-variables can be derived from CrossTabFreqs.

 

Cheers,

Koen

andreas_lds
Jade | Level 19

Not sure what do you expect as result exactly, so i created a variable group having the values 1 to 4 to identify the groups you have defined.

And i assumed, that the data is grouped by account.

/* Get number of accounts per ID */
proc summary data= work.rawtbl nway;
   class id;
   output out= work.Ids(drop= _type_ rename= (_freq_ = count));
run;

/* Using work.Ids as hash-object */
data want;
   if 0 then set work.Ids;

   set work.rawtbl;
   by account notsorted;

   length group 8 multiOwner multiAccount 8;
   retain multiAccount;

   if _n_ = 1 then do;
      declare hash owner(dataset: 'work.Ids');
      owner.defineKey('id');
      owner.defineData('count');
      owner.defineDone();
   end;

   if first.account then do;
      multiAccount = 1;
   end;

   multiOwner = not (first.account and last.account);
   rc = owner.find();
   multiAccount = multiAccount and (count > 1);

   if last.account then do;
      group = input(cats(multiOwner, multiAccount), binary2.) + 1;
      output;
   end;
   
   drop count rc id;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1280 views
  • 1 like
  • 5 in conversation