I have a SAS Datasets with multiple equalities: i.e. A=C, B=C etc. and I would like to make concatenate it into one line to A=B=C.
For example IDs in 1 and 2 represent equality,
ID1, ID2;
1,3;
2,3;
I would like to make these into
ID1, ID2, ID3;
1, 2, 3;
Would there be a good algorithm to do this?
The hash table is the best suited I reckon.
Assuming data set sorted and organised so that ID1 is smaller than ID2.
data HAVE;
input ID1 ID2 ;
cards;
1 3
2 3
5 9
6 9
;
run;
data _null_;
retain GRP;
if _N_=1 then do;
ID=.; GROUP=.;
dcl hash H(ordered:'a');
H.definekey('ID');
H.definedata('ID','GROUP');
H.definedone();
end;
set HAVE end=LASTOBS;;
RC1=H.find(key:ID1);
if RC1=0 then GRP1=GROUP;
RC2=H.find(key:ID2);
if RC2=0 then GRP2=GROUP;
if RC1=0 and RC2=0 then return;
else if RC1=0 then do;
H.add(KEY:ID2, data:ID2, data:GRP1);
end;
else if RC2=0 then do;
H.add(KEY:ID1, data:ID1, data:GRP2);
end;
else do;
GRP+1;
H.add(key:ID1, data:ID1, data:GRP);
H.add(key:ID2, data:ID2, data:GRP);
end;
if LASTOBS then H.output(dataset:'WANT');
run;
proc print data=WANT;
run;
Obs | ID | GROUP |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 3 | 1 |
4 | 5 | 2 |
5 | 6 | 2 |
6 | 9 | 2 |
The hash table is the best suited I reckon.
Assuming data set sorted and organised so that ID1 is smaller than ID2.
data HAVE;
input ID1 ID2 ;
cards;
1 3
2 3
5 9
6 9
;
run;
data _null_;
retain GRP;
if _N_=1 then do;
ID=.; GROUP=.;
dcl hash H(ordered:'a');
H.definekey('ID');
H.definedata('ID','GROUP');
H.definedone();
end;
set HAVE end=LASTOBS;;
RC1=H.find(key:ID1);
if RC1=0 then GRP1=GROUP;
RC2=H.find(key:ID2);
if RC2=0 then GRP2=GROUP;
if RC1=0 and RC2=0 then return;
else if RC1=0 then do;
H.add(KEY:ID2, data:ID2, data:GRP1);
end;
else if RC2=0 then do;
H.add(KEY:ID1, data:ID1, data:GRP2);
end;
else do;
GRP+1;
H.add(key:ID1, data:ID1, data:GRP);
H.add(key:ID2, data:ID2, data:GRP);
end;
if LASTOBS then H.output(dataset:'WANT');
run;
proc print data=WANT;
run;
Obs | ID | GROUP |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 3 | 1 |
4 | 5 | 2 |
5 | 6 | 2 |
6 | 9 | 2 |
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.