Hello, I've been having issues with the following programming task.
The data has two columns: one holds the "From" node and the other holds the "To" node.
Example:
A > B
B > C
B > D
C > E
C > B
I need to find all of the distinct children (direct and indirect) of each parent along its routes.
In this case:
A > B C D E
B > C D E
C > B D E
I can transpose the original dataset to get the direct children of each parent, but I can't figure out a way to get the children of the children.
Of course, this is just a simple example. The real data could have many more nodes and more complicated links.
Thanks in advance. I have been trying to figure this out for a few days now, still with no luck.
The question is straightforward, while the answer is not.
data have;
  input from $ to $;
  cards;
A B
B C
B D
M A
C E
C B
;
/* one row per distinct start node */
proc sql;
  create table have1 as
  select distinct from from have;
quit;
data want;
  if _n_=1 then do;
    /* define _from and _to in the PDV without reading any data */
    if 0 then set have (rename=(from=_from to=_to));
    /* edge lookup: all rows of HAVE keyed by _from, duplicates allowed */
    declare hash h(dataset:'have (rename=(from=_from to=_to))', multidata:'y');
    h.definekey('_from');
    h.definedata(all:'y');
    h.definedone();
  end;
  /* nodes reached from the current start node, in ascending key order */
  declare hash h1(ordered:'a');
  h1.definekey('new');
  h1.definedata('new');
  h1.definedone();
  declare hiter hit('h1');
  retain new ' ';
  set have1;
  length newvar $50;
  /* seed h1 with the direct children of FROM */
  do rc=h.find(key:from) by 0 while (rc=0);
    if _to ne from then h1.replace(key:_to, data:_to);
    rc=h.find_next(key:from);
  end;
  /* walk h1 with the iterator, adding the children of each entry along the way */
  rc=hit.first();
  do rc=0 by 0 while (rc=0);
    rc=h.find(key:new);
    do rc=0 by 0 while (rc=0);
      if _to ne from then h1.replace(key:_to, data:_to);
      rc=h.find_next(key:new);
    end;
    rc=hit.next();
  end;
  /* concatenate the collected nodes */
  do rc=hit.first() by 0 while (rc=0);
    newvar=catx(' ',newvar,new);
    rc=hit.next();
  end;
  keep from newvar;
run;
Good Luck,
Haikuo
Thanks, it works great. However, there are some examples where it doesn't seem to work:
A C
A B
B C
B A
C A
C B
D A
E D
F G
I think it's related to this part of the code: declare hash h1(ordered:'a');
The new example works if I choose "d" instead of "a" for the order.
Is there a way to make the result independent of the order?
I don't know enough about hash objects to figure it out.
Thanks a bunch.
You are right that my previous code is buggy, and the current one could still be, so let me know how it suits your needs. FYI, the direction of the hash ordering was not the cause.
data have;
  input from $ to $;
  cards;
A C
A B
B C
B A
C A
C B
D A
E D
F G
;
/* one row per distinct start node */
proc sql;
  create table have1 as
  select distinct from from have;
quit;
data want;
  if _n_=1 then do;
    /* define _from and _to in the PDV without reading any data */
    if 0 then set have (rename=(from=_from to=_to));
    /* edge lookup: all rows of HAVE keyed by _from, duplicates allowed */
    declare hash h(dataset:'have (rename=(from=_from to=_to))', multidata:'y');
    h.definekey('_from');
    h.definedata(all:'y');
    h.definedone();
  end;
  /* nodes reached from the current start node, in ascending key order */
  declare hash h1(ordered:'a');
  h1.definekey('new');
  h1.definedata('new');
  h1.definedone();
  declare hiter hit('h1');
  retain new ' ';
  set have1;
  length newvar $50;
  /* seed h1 with the direct children of FROM */
  do rc=h.find(key:from) by 0 while (rc=0);
    if _to ne from then h1.replace(key:_to, data:_to);
    rc=h.find_next(key:from);
  end;
  /* scan h1; whenever a new node turns up, add it and restart the scan from the first entry */
  do rc=hit.first() by 0 while (rc=0);
    rc=h.find(key:new);
    do rc=0 by 0 while (rc=0);
      if _to ne from and h1.find(key:_to) ne 0 then do;
        h1.replace(key:_to, data:_to);
        rc=hit.first();
        go to outer;
      end;
      else rc=h.find_next(key:new);
    end;
    rc=hit.next();
  outer: end;
  /* concatenate the collected nodes */
  do rc=hit.first() by 0 while (rc=0);
    newvar=catx(' ',newvar,new);
    rc=hit.next();
  end;
  keep from newvar;
run;
Haikuo
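On the earlier question of making the result independent of the hash order: a likely reason the order mattered is that the first version walked the ordered hash once with an iterator while inserting into it, so keys that sort before the iterator's current position were never visited. One way to sidestep this is to keep an explicit queue of discovered nodes instead. The following is only a minimal sketch of that idea, not the method posted above. The names starts, edges, queue, seen, pos, node, head and tail are made up for illustration, and the lengths $8 and $50 are assumptions carried over from the posted code.

```sas
/* Sketch only: reachability per start node using an explicit queue.   */
/* Dataset STARTS and all hash/variable names here are hypothetical.   */
data have;
  input from $ to $;
  cards;
A C
A B
B C
B A
C A
C B
D A
E D
F G
;
run;

/* one row per distinct start node */
proc sort data=have(keep=from) out=starts nodupkey;
  by from;
run;

data want(keep=from newvar);
  length _from _to node $8 newvar $50 pos 8;
  if _n_ = 1 then do;
    /* all edges, several rows allowed per _from value */
    declare hash edges(dataset: 'have(rename=(from=_from to=_to))', multidata: 'y');
    edges.definekey('_from');
    edges.definedata('_to');
    edges.definedone();
    /* queue of discovered nodes: position -> node */
    declare hash queue();
    queue.definekey('pos');
    queue.definedata('node');
    queue.definedone();
    /* nodes already discovered for the current start node */
    declare hash seen();
    seen.definekey('node');
    seen.definedata('node');
    seen.definedone();
    call missing(_from, _to, node, pos);
  end;
  set starts;
  rc = seen.clear();
  rc = queue.clear();
  newvar = ' ';
  /* seed with the start node so it is never listed as its own child */
  rc = seen.add(key: from, data: from);
  head = 1;
  tail = 1;
  rc = queue.add(key: tail, data: from);
  do while (head <= tail);
    rc = queue.find(key: head);       /* loads NODE */
    _from = node;
    rc = edges.find();                /* first edge leaving NODE, loads _TO */
    do while (rc = 0);
      if seen.check(key: _to) ne 0 then do;
        rc2 = seen.add(key: _to, data: _to);
        tail = tail + 1;
        rc2 = queue.add(key: tail, data: _to);
        newvar = catx(' ', newvar, _to);
      end;
      rc = edges.find_next();         /* next edge with the same key */
    end;
    head = head + 1;
  end;
run;
```

Because every discovered node is appended to the end of the queue and processed exactly once, nothing depends on how a hash sorts its keys. Note that newvar then lists the nodes in order of discovery rather than in sorted order.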
Thanks a lot. I have tested many complicated examples, and so far it's working great.
One more thing, which makes it more complicated: I don't suppose you can keep the nodes in route order?
That is, a to b to c. The updated version seems to find all of them, but I believe it returns them in sorted order.
I am not saying it is impossible, but it is getting strenuous, and honestly, I don't see the value in it. For example:
A C
A F
A B
D E
C D
What kind of rule are you expecting?
1) C-D-E-F-B: follow each chain until it is exhausted, then move to the next one and repeat the process.
2) C-F-B-D-E: list the values in the order in which they first appear.
There could be more possible rules, depending on the structure of your data.
For 1), you get the meaningful chain order only until it is interrupted by the next observation, and you don't know where that happens, so you will have trouble identifying the rank of each transfer.
For 2), it may be simple to do, but what does it add to the understanding of your data? It is only the order in which these values show up from the top down.
Just my 2 cents,
Haikuo
I see. It should follow rule 1). So is it not possible to do it for this rule? Thanks.
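For what it's worth, rule 1) reads like a depth-first walk that visits the children in input order. Below is a rough sketch of how that might be done with an explicit stack. It is an illustration under stated assumptions, not a tested solution: the names have_rev, starts, edges, stack, seen, pos, node and top are made up, and it assumes that a multidata hash returns the items of one key in the order they were loaded. The edges are loaded in reverse row order so that the first child in the input ends up on top of the stack.

```sas
/* Sketch only: depth-first order (rule 1) using an explicit stack.    */
/* HAVE_REV, STARTS and all hash/variable names are hypothetical.      */
data have;
  input from $ to $;
  cards;
A C
A F
A B
D E
C D
;
run;

/* edges in reverse input order, so children get pushed last-to-first */
data have_rev;
  set have;
  row = _n_;
run;
proc sort data=have_rev;
  by descending row;
run;

/* one row per distinct start node */
proc sort data=have(keep=from) out=starts nodupkey;
  by from;
run;

data want(keep=from newvar);
  length _from _to node $8 newvar $50 pos 8;
  if _n_ = 1 then do;
    declare hash edges(dataset: 'have_rev(rename=(from=_from to=_to))', multidata: 'y');
    edges.definekey('_from');
    edges.definedata('_to');
    edges.definedone();
    /* stack: position -> node, TOP is the current height */
    declare hash stack();
    stack.definekey('pos');
    stack.definedata('node');
    stack.definedone();
    /* nodes already visited for the current start node */
    declare hash seen();
    seen.definekey('node');
    seen.definedata('node');
    seen.definedone();
    call missing(_from, _to, node, pos);
  end;
  set starts;
  rc = seen.clear();
  rc = stack.clear();
  newvar = ' ';
  top = 1;
  rc = stack.replace(key: top, data: from);   /* push the start node */
  do while (top > 0);
    rc = stack.find(key: top);                /* pop: loads NODE */
    top = top - 1;
    if seen.check(key: node) ne 0 then do;    /* first visit of NODE */
      rc = seen.add(key: node, data: node);
      if node ne from then newvar = catx(' ', newvar, node);
      _from = node;
      rc = edges.find();                      /* children, last input row first */
      do while (rc = 0);
        if seen.check(key: _to) ne 0 then do;
          top = top + 1;
          rc2 = stack.replace(key: top, data: _to);
        end;
        rc = edges.find_next();
      end;
    end;
  end;
run;
```

With the five rows above, the intent is that the row for A lists its descendants in the rule 1) order, C D E F B, provided the load-order assumption holds. Cycles are handled by the seen check at pop time, so a node is only listed the first time the walk reaches it.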
If you have SAS/OR licensed, you may also want to explore PROC BOM.
Unfortunately, I don't have that.