I have a data set where I want to find all related records, and provide an ID to group them together. I think an array, or some kind of loop, but I'm not 100% certain the way to go. Below is an example on what I am looking to do.
ID Corresponding_Match
1 X
2 1
3 Y
4 Z
5 2
6 3
In this example, record 1,2, and 5 would be my first related group; record 3 and 6 would also group together. Ideally, each grouping would get a unique ID corresponding to them. For example 1,2,5 would be UNIQUE1 and 3,6 would be UNIQUE2. I'm operating on SAS EG with no additional add-ons or licenses other than the base package. Any help is appreciated. Thank you!
These are all very similar to your question. Do any of them work for you?
https://communities.sas.com/t5/SAS-Data-Management/Recursive-Query/m-p/466861
https://communities.sas.com/t5/SAS-Data-Management/recursive-joins-with-PROC-SQL/m-p/411824
https://communities.sas.com/t5/SAS-Programming/Recursive-lookup-for-ID-s/m-p/259635
@adornodj wrote:
I have a data set where I want to find all related records, and provide an ID to group them together. I think an array, or some kind of loop, but I'm not 100% certain the way to go. Below is an example on what I am looking to do.
ID Corresponding_Match
1 X
2 1
3 Y
4 Z
5 2
6 3
In this example, record 1,2, and 5 would be my first related group; record 3 and 6 would also group together. Ideally, each grouping would get a unique ID corresponding to them. For example 1,2,5 would be UNIQUE1 and 3,6 would be UNIQUE2. I'm operating on SAS EG with no additional add-ons or licenses other than the base package. Any help is appreciated. Thank you!
These are all very similar to your question. Do any of them work for you?
https://communities.sas.com/t5/SAS-Data-Management/Recursive-Query/m-p/466861
https://communities.sas.com/t5/SAS-Data-Management/recursive-joins-with-PROC-SQL/m-p/411824
https://communities.sas.com/t5/SAS-Programming/Recursive-lookup-for-ID-s/m-p/259635
@adornodj wrote:
I have a data set where I want to find all related records, and provide an ID to group them together. I think an array, or some kind of loop, but I'm not 100% certain the way to go. Below is an example on what I am looking to do.
ID Corresponding_Match
1 X
2 1
3 Y
4 Z
5 2
6 3
In this example, record 1,2, and 5 would be my first related group; record 3 and 6 would also group together. Ideally, each grouping would get a unique ID corresponding to them. For example 1,2,5 would be UNIQUE1 and 3,6 would be UNIQUE2. I'm operating on SAS EG with no additional add-ons or licenses other than the base package. Any help is appreciated. Thank you!
I'm testing some code someone created from the 3rd link you provided me. if that does not work, I'll post a follow up. Thank you! Now that I know it's a "recursive match" that helps greatly.
Hi All - the 3rd link has code that worked to solve my problem. On my full dataset, I had records where the ID2 was blank (see below). I wound up setting those values equal to ID1, and that created clusters which were either 1 record (no matches), 2 matches (natural pair) or 3+ records within the cluster (which are outliers in my data).
Thanks to all who provided assistance.
HAVE
ID1 ID2
1
2 Y
Y Z
HAVE_UPDATED
ID1 ID2
1 1
2 Y
Y Z
data have;
infile cards ;
input from $ to $ ;
cards;
1 X
2 1
3 Y
4 Z
5 2
6 3
;
run;
data full;
set have end=last;
if _n_ eq 1 then do;
declare hash h();
h.definekey('node');
h.definedata('node');
h.definedone();
end;
output;
node=from; h.replace();
from=to; to=node;
output;
node=from; h.replace();
if last then h.output(dataset:'node');
drop node;
run;
data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();
if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
from_to.definekey('from');
from_to.definedata('to');
from_to.definedone();
if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
no.definekey('node');
no.definedata('node');
no.definedone();
do while(hi_no.next()=0);
household+1; output;
count=1;
key=node;_ha.add();
last=node;ha.add();
rc=hi.first();
do while(rc=0);
from=last;rx=from_to.find();
do while(rx=0);
key=to;ry=_ha.check();
if ry ne 0 then do;
node=to;output;rr=no.remove(key:node);
key=to;_ha.add();
count+1;
last=to;ha.add();
end;
rx=from_to.find_next();
end;
rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;
If you have SAS/OR ,try this one:
/* Same code as SAS/OR */
proc optnet data_links=have out_nodes=want GRAPH_DIRECTION=UNDIRECTED;
data_links_var from=from to=to;
concomp;
run;
Hello thank you for the response. I am testing out the logics today. Very grateful to the code you all have sent me. Truly.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.