Find Relative customers with different entity and similar ownership ID...

Ronein · Posted 01-17-2024 04:32 AM

Hello

I have a data set with list of customers.

For each customer there are 3 columns:

Customer_ID

entity_ID

Person_IDs (It is concatenation of people ID's who are owner )

As you can see specific entity can have multiple Customer_IDs

My target-

For each row (Customer_ID) check if there is any other Customer_ID with similar Person_IDs(but not same!!) that belong to different entity_ID . (It should be binary Var get value 1/0).

When I say similar Person_IDs I mean that at least one ID is same but not all of them.

I also want to add another new field that for each customer_ID concatenate the customer_IDs that were found as similar (similar personIDs and different entity)

For example:

For Customer_ID=1(Id's 999,888) we can see that there is another customer_ID(6) that have ownership with Id's 888 and can see that Person_IDs are not equal exactly but at least one of them is equal . so here new var will get value 1

For Customer_ID=2 same like for For Customer_ID=1 (new var will get value 1)

For Customer_ID=3 new var will get value 1 because another customer (7) have similar Person_IDs and different entity

For customer_ID=8 new var will get value 0

For customer_ID=9 new var will get value 0

Please note that in real world my data base contain 2 million rows.

I thought to do cartersain merge but i think it is not good because too many rows!!

Data have;
Input Customer_ID  entity_ID Person_IDs $;
cards;
1 1 999,888
2 1 999,888
3 2 777
4 3 666
5 4 222,111
6 5 888
7 7 777,333
8 0 444,555
9 0 444,555
;
Run;

Kurt_Bremser · Posted 01-17-2024 07:31 AM

As a first step, expand those combined person_ids to multiple observations with a single person_id. From there, it is easy to formulate the join condition.

proc sql;
create table mult as
  select distinct customer_id
  from have a, have b
  where
    a.customer_id = b.customer_id and
    a.person_id = b.person_id and
    a.entity_id ne b.entity_id
;
quit;

data want;
merge
  have (in=h)
  mult (in=m)
;
by customer_id;
value = m;
if last.customer_id;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Ksharp · Posted 01-18-2024 02:04 AM

Data have;
Input Customer_ID  entity_ID Person_IDs $;
cards;
1 1 999,888
2 1 999,888
3 2 777
4 3 666
5 4 222,111
6 5 888
7 7 777,333
8 0 444,555
9 0 444,555
;
Run;
data temp;
set have;
do i=1 to countw(Person_IDs,',');
 k=scan(Person_IDs,i,',');output;
end;
keep k entity_ID;
run;
proc sql;
create table k as
select distinct k,entity_ID as _entity_ID from temp;
quit;
data want;
 if _n_=1 then do;
  if 0 then set k;
  declare hash h(dataset:'k',multidata:'y',hashexp:20);
  h.definekey('k');
  h.definedata('_entity_ID');
  h.definedone();
 end;
set have;
want=0;
do i=1 to countw(Person_IDs,',');
 k=scan(Person_IDs,i,',');
 rc=h.find();
 do while(rc=0);
   if entity_ID ne _entity_ID then do;want=1;leave;end;
   rc=h.find_next();
 end;
 if want then leave;
end;
drop i k rc _entity_ID;
run;

Find Relative customers with different entity and similar ownership ID's (but not same)

Re: Find Relative customers with different entity and similar ownership ID's (but not same)

Re: Find Relative customers with different entity and similar ownership ID's (but not same)

Find Relative customers with different entity and similar ownership ID's (but not same)

Re: Find Relative customers with different entity and similar ownership ID's (but not same)

Re: Find Relative customers with different entity and similar ownership ID's (but not same)

Click image to register for webinar

Classroom Training Available!