BookmarkSubscribeRSS Feed
Kailash29
Fluorite | Level 6

I have a table where I have to find the values from each other columns and create a master key

My table :

ID1ID2
1111
1112
1113
1215
1216
1517
1818
1819
2021
2223
2224

 

 

Expected result : 

 

ID1ID2Key
111111
111211
111311
121511
121611
151711
181818
181918
202120
222322
222422

 

Key column needs to be generated as shown in column, how do I achieve using SAS EG.

14 REPLIES 14
LinusH
Tourmaline | Level 20

Sorry, I don't understand the rule how to set "Key".

Can you please elaborate?

Data never sleeps
Kailash29
Fluorite | Level 6
ID1 is the master key and values associated with it in ID2 will be linked values.

ex : ID1 = 15 and its key is 11 because

15 in ID1 traces back to 12 in ID1 through 15 in ID2 and 12 in ID1 traces back to 11 in ID1

Hence master key for 15 in ID1 will be 11
Kurt_Bremser
Super User
data have;
infile datalines dlm="09"x dsd;
input id1 id2;
datalines;
11	11
11	12
11	13
12	15
12	16
15	17
18	18
18	19
20	21
22	23
22	24
;

data want;
set have;
if _n_ = 1
then do;
  length key 8;
  declare hash keys();
  keys.definekey("id1");
  keys.definedata("key");
  keys.definedone();
end;
if keys.find() = 0
then do;
  output;
  id1 = id2;
  rc = keys.add();
end;
else do;
  key = id1;
  output;
  rc = keys.add();
end;
drop rc;
run;

proc print data=want noobs;
run;

Result:

id1	id2	key
11	11	11
11	12	11
11	13	11
12	15	11
12	16	11
15	17	11
18	18	18
18	19	18
20	21	20
22	23	22
22	24	22
Ksharp
Super User

What if you have the following data. What you are going to do ?

 

D1	ID2
11	11
11	12
11	13
12	15
12	16
16     16
16      11
15	17
18	18
18	19
20	21
22	23
22	24
Kailash29
Fluorite | Level 6
16 16 then key will be 11
16 11 scenario will not occur because ID1 value cannot be linked to its lower value in ID2
Tom
Super User Tom
Super User

@Kailash29 wrote:
16 16 then key will be 11
16 11 scenario will not occur because ID1 value cannot be linked to its lower value in ID2

So if the data is sorted by ID1 and the values are integers you could do it with a simple temporary array.

data have;
  input ID1	ID2;
cards;
11	11
11	12
11	13
12	15
12	16
15	17
18	18
18	19
20	21
22	23
22	24
;

data want;
  array keys [11:24] _temporary_;
  set have ;
  by id1 ;
  key = keys[id1];
  if missing(key) then key=id1;
  keys[id2]=key;
run;

proc print;
run;

To generalize either make the array larger than the largest possible value:

  array keys [1000000] _temporary_;

Or to mimic the way I did it in the example pre-calculate the min and max actual values.

proc sql noprint;
select min(min(id1,id2)),max(max(id2,id2))
  into :lbound,:ubound
from have
;
quit;
....
array keys [&lbound,&ubound] _temporary_;

 

Kailash29
Fluorite | Level 6
Hi Ksharp.. Thanks for your quick response..

When I saw deep into the data, I did find the scenario which you mention:

if 16 16 occurs I need the key as 11
and 16 11 occurs as well the key needs to be 11
Kailash29
Fluorite | Level 6
Because 16 is already connect to 12 and 12 back tracts towards 11
Ksharp
Super User
Then what you gonna to do ?
If you have the follwing data ?

D1 ID2
11 11
11 12
11 13
12 15
12 16
12 11 <-----
16 16
16 11
Kailash29
Fluorite | Level 6
If 12 11 occurs then it would be 11 again because 12 is already connected to 11 earlier
Ksharp
Super User
/*
So you want the first value for ancestor ?
*/
data have;
infile cards expandtabs truncover;
input from $  to $ ;
cards;
11 11
11 12
11 13
12 15
12 16
16  11
15 17
18 18
18 19
20 21
22 23
22 24
;
run;
data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node; output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 16);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;

data final_want;
 if _n_=1 then do;
   if 0 then set want;
   declare hash h(dataset:'want');
   h.definekey('node');
   h.definedata('household');
   h.definedone();
 end;
set have;
call missing(household);
rc=h.find(key:from);
drop rc node;
run;
proc sort data=final_want;by household;run;
data final_want;
 set final_want;
 by household;
 retain key;
 if first.household then key=from;
run;
Patrick
Opal | Level 21

@Kailash29 I believe you've now already received code that resolved your problem so just fyi here an old discussion with several solutions to a related problem.

Ksharp
Super User

OK. Try this one :

 

data have;
infile cards expandtabs truncover;
/*from is parent, to is child*/
input from $  to $ ;
cards;
11	11
11	12
11	13
12	15
12	16
15	17
18	18
18	19
20	21
22	23
22	24
;
run;
data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node; output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 16);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;



proc sql;
create table temp as
select distinct * from have where from ne to;

create table ancestor as
select distinct from
 from temp
  where from not in (select to from temp);

create table ancestor2 as
select * from want where node in (select from from ancestor);

create table ancestor3 as
select a.*,b.node as key 
 from want as a left join ancestor2 as b
  on a.household=b.household;

create table final_want as
select a.*,b.key 
 from have as a left join ancestor3 as b
  on a.from=b.node;
quit;

Ksharp_0-1677324489805.png

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 4027 views
  • 4 likes
  • 6 in conversation