I have a SAS dataset containing multiple equalities (e.g. A=C, B=C), and I would like to concatenate them into a single line: A=B=C.
For example, the IDs in columns ID1 and ID2 of each row are equal:
ID1, ID2;
1,3;
2,3;
I would like to turn these into
ID1, ID2, ID3;
1, 2, 3;
Is there a good algorithm for this?
I reckon a hash table is best suited to this.
This assumes the data set is sorted and organised so that ID1 is smaller than ID2.
data HAVE;
input ID1 ID2 ;
cards;
1 3
2 3
5 9
6 9
;
run;
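data _null_;
retain GRP;
if _N_=1 then do;
/* Define the host variables, then a hash keyed on ID that stores ID and GROUP */
ID=.; GROUP=.;
dcl hash H(ordered:'a');
H.definekey('ID');
H.definedata('ID','GROUP');
H.definedone();
end;
set HAVE end=LASTOBS;
/* Look up both IDs and remember the group of any ID already stored */
RC1=H.find(key:ID1);
if RC1=0 then GRP1=GROUP;
RC2=H.find(key:ID2);
if RC2=0 then GRP2=GROUP;
/* Both IDs already stored: nothing to add. No RETURN here, so the LASTOBS */
/* output below still runs. Two existing groups are not merged by this step. */
if RC1=0 and RC2=0 then do;
end;
/* Only ID1 is known: ID2 joins the group of ID1 */
else if RC1=0 then do;
H.add(key:ID2, data:ID2, data:GRP1);
end;
/* Only ID2 is known: ID1 joins the group of ID2 */
else if RC2=0 then do;
H.add(key:ID1, data:ID1, data:GRP2);
end;
/* Neither ID is known: start a new group for both */
else do;
GRP+1;
H.add(key:ID1, data:ID1, data:GRP);
H.add(key:ID2, data:ID2, data:GRP);
end;
/* Write the whole hash table out once the last row has been processed */
if LASTOBS then H.output(dataset:'WANT');
run;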
proc print data=WANT;
run;
Obs | ID | GROUP |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 3 | 1 |
4 | 5 | 2 |
5 | 6 | 2 |
6 | 9 | 2 |
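The question asks for one row per set of equal IDs (ID1, ID2, ID3), while WANT holds one row per ID. A minimal sketch of the reshaping step follows, assuming WANT has the ID and GROUP columns printed above. The data set names WANT_SORTED and WIDE are made up for illustration, and the first step only recreates WANT from the printed output so that the example runs on its own.

```sas
/* Recreate WANT from the printed output; skip this step if WANT already exists */
data WANT;
  input ID GROUP;
  cards;
1 1
2 1
3 1
5 2
6 2
9 2
;
run;

/* PROC TRANSPOSE with a BY statement needs the data ordered by GROUP */
proc sort data=WANT out=WANT_SORTED;
  by GROUP ID;
run;

/* One output row per GROUP; the member IDs become columns ID1, ID2, ... */
proc transpose data=WANT_SORTED out=WIDE(drop=_NAME_) prefix=ID;
  by GROUP;
  var ID;
run;

proc print data=WIDE noobs;
run;
```

With the sample data this should give one row for group 1 holding 1, 2, 3 and one row for group 2 holding 5, 6, 9. Groups with fewer members than the largest group get missing values in the trailing ID columns.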
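One case the step above does not cover is a pair whose two IDs already sit in different groups, for example the rows 1 2, 3 4, 2 3, where the third row should join the first two groups. The sketch below is one possible way to handle that, swapping in a union-find (disjoint-set) approach on top of the same hash object. It is an illustration rather than part of the original answer: HAVE2, WANT2, PARENT, ROOT and MEMBER are hypothetical names, and no path compression is done, so long chains may be slow on large inputs.

```sas
/* Hypothetical sample in which the third row links two existing groups */
data HAVE2;
  input ID1 ID2;
  cards;
1 2
3 4
2 3
5 9
6 9
;
run;

data WANT2(keep=ID ROOT);
  if _N_=1 then do;
    ID=.; PARENT=.;
    /* Each stored ID points to a parent; a root points to itself */
    dcl hash P(ordered:'a');
    P.definekey('ID');
    P.definedata('ID','PARENT');
    P.definedone();
    dcl hiter PI('P');
  end;
  set HAVE2 end=LASTOBS;

  /* Find the root of ID1, adding ID1 as its own root if it is new */
  ROOT1=ID1;
  if P.find(key:ROOT1) ne 0 then do;
    PARENT=ROOT1;
    P.add(key:ROOT1, data:ROOT1, data:PARENT);
  end;
  do while (PARENT ne ROOT1);
    ROOT1=PARENT;
    RC=P.find(key:ROOT1);
  end;

  /* Same for ID2 */
  ROOT2=ID2;
  if P.find(key:ROOT2) ne 0 then do;
    PARENT=ROOT2;
    P.add(key:ROOT2, data:ROOT2, data:PARENT);
  end;
  do while (PARENT ne ROOT2);
    ROOT2=PARENT;
    RC=P.find(key:ROOT2);
  end;

  /* Union: hang the larger root under the smaller one */
  if ROOT1 ne ROOT2 then
    P.replace(key:max(ROOT1,ROOT2), data:max(ROOT1,ROOT2), data:min(ROOT1,ROOT2));

  /* After the last row, resolve every ID to its final root and write it out */
  if LASTOBS then do;
    RC=PI.first();
    do while (RC=0);
      MEMBER=ID;
      ROOT=ID;
      do while (PARENT ne ROOT);
        ROOT=PARENT;
        RC2=P.find(key:ROOT);
      end;
      ID=MEMBER;
      output;
      RC=PI.next();
    end;
  end;
run;
```

On this sample, IDs 1 to 4 should end up under root 1 and IDs 5, 6 and 9 under root 5. ROOT plays the role of GROUP here; it is the smallest ID in each set rather than a running counter.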