I have a SAS Datasets with multiple equalities: i.e. A=C, B=C etc. and I would like to make concatenate it into one line to A=B=C.
For example IDs in 1 and 2 represent equality,
ID1, ID2;
1,3;
2,3;
I would like to make these into
ID1, ID2, ID3;
1, 2, 3;
Would there be a good algorithm to do this?
The hash table is the best suited I reckon.
Assuming data set sorted and organised so that ID1 is smaller than ID2.
data HAVE;
input ID1 ID2 ;
cards;
1 3
2 3
5 9
6 9
;
run;
data _null_;
retain GRP;
if _N_=1 then do;
ID=.; GROUP=.;
dcl hash H(ordered:'a');
H.definekey('ID');
H.definedata('ID','GROUP');
H.definedone();
end;
set HAVE end=LASTOBS;;
RC1=H.find(key:ID1);
if RC1=0 then GRP1=GROUP;
RC2=H.find(key:ID2);
if RC2=0 then GRP2=GROUP;
if RC1=0 and RC2=0 then return;
else if RC1=0 then do;
H.add(KEY:ID2, data:ID2, data:GRP1);
end;
else if RC2=0 then do;
H.add(KEY:ID1, data:ID1, data:GRP2);
end;
else do;
GRP+1;
H.add(key:ID1, data:ID1, data:GRP);
H.add(key:ID2, data:ID2, data:GRP);
end;
if LASTOBS then H.output(dataset:'WANT');
run;
proc print data=WANT;
run;
Obs | ID | GROUP |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 3 | 1 |
4 | 5 | 2 |
5 | 6 | 2 |
6 | 9 | 2 |
The hash table is the best suited I reckon.
Assuming data set sorted and organised so that ID1 is smaller than ID2.
data HAVE;
input ID1 ID2 ;
cards;
1 3
2 3
5 9
6 9
;
run;
data _null_;
retain GRP;
if _N_=1 then do;
ID=.; GROUP=.;
dcl hash H(ordered:'a');
H.definekey('ID');
H.definedata('ID','GROUP');
H.definedone();
end;
set HAVE end=LASTOBS;;
RC1=H.find(key:ID1);
if RC1=0 then GRP1=GROUP;
RC2=H.find(key:ID2);
if RC2=0 then GRP2=GROUP;
if RC1=0 and RC2=0 then return;
else if RC1=0 then do;
H.add(KEY:ID2, data:ID2, data:GRP1);
end;
else if RC2=0 then do;
H.add(KEY:ID1, data:ID1, data:GRP2);
end;
else do;
GRP+1;
H.add(key:ID1, data:ID1, data:GRP);
H.add(key:ID2, data:ID2, data:GRP);
end;
if LASTOBS then H.output(dataset:'WANT');
run;
proc print data=WANT;
run;
Obs | ID | GROUP |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 3 | 1 |
4 | 5 | 2 |
5 | 6 | 2 |
6 | 9 | 2 |
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.