BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ravimegharaj
Fluorite | Level 6

Hello,

 

I have 2 variables in a dataset viz child code and parent code. I want to filter for all the childrens within a parent code.

Eg:

Child codeMother code
21
32
42
52
63
76
87

 

From above sample dataset, If i select 2 I want all the childrens and grand childrens under 2 ie 3, 4, 5, 6, 7, 8 and if i select 3 as a parent code then 6, 7, 8 etc..

 

Kindly help..

 

Thanks,

Ravi

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Here is a simple solution using an index.

 

data have(index=(child / unique));
input Child Mother;
datalines;
2	1
3	2
4	2
5	2
6	3
7	6
8	7
;

data want;
set have;
ancestor = mother;
descendent = child;
output;
do i = 1 to 100 until(_error_); /* catch infinite loops */
    child = mother;
    set have key=child / unique;
    if not _error_ then do;
        ancestor = mother;
        output;
        end;
    end;
_error_ = 0;
keep ancestor descendent;
run;

proc sort data=want; by ancestor descendent; run;

proc print; run;
PG

View solution in original post

14 REPLIES 14
Reeza
Super User

How are you specifying the 2/3? Can you specify multiples ie 2 & 3?

 

What do you want the output to look like?

ravimegharaj
Fluorite | Level 6

Thanks for your reply Reeza.

 

Yes, there should be an ability to specify 2 and 3.

 

Output will look like:

Mother codeChild code
23
24
25
26
27
36
37
38
78

 

Hope this helps.

 

Thanks,

Ravi

PGStats
Opal | Level 21

Is there a limit to the number of generations between mother and descendant that you want? 

PG
ravimegharaj
Fluorite | Level 6

Thanks for your reply PG Stats.

 

There is no set limit as such for the generation of rows. The generations will be until the last child code is available in a relationship.

 

Thanks,

Ravi Megharaj

Reeza
Super User

This is a traverse the graph problem. 

 

One way to accomplish this is to identify all clusters and then you can filter what you need from final table. 

 

This code is @PGStats solution  for this type of problem. If it's not, he'll correct me Woman Wink

 

https://gist.github.com/statgeek/14e3aa2a9f718f551cd98134e9ceed30

PGStats
Opal | Level 21

Unfortunately, the subGraph macro cannot solve this problem because it is meant for undirected graphs. But as we all know the parent-child relationship is directed Man Happy. The subgraph macro would thus link a person to all members of his/her family.

PG
PGStats
Opal | Level 21

Here is a simple solution using an index.

 

data have(index=(child / unique));
input Child Mother;
datalines;
2	1
3	2
4	2
5	2
6	3
7	6
8	7
;

data want;
set have;
ancestor = mother;
descendent = child;
output;
do i = 1 to 100 until(_error_); /* catch infinite loops */
    child = mother;
    set have key=child / unique;
    if not _error_ then do;
        ancestor = mother;
        output;
        end;
    end;
_error_ = 0;
keep ancestor descendent;
run;

proc sort data=want; by ancestor descendent; run;

proc print; run;
PG
Ksharp
Super User
PG,
What if there is dead loop ? Like:

2 1
3 2
1 3

PGStats
Opal | Level 21

Looping will stop when i reaches 100. A check could (should) be added for this condition

 

if i > 99 then ERROR "Invalid data involving descendant " descendent;

else _ERROR_ = 0;

 

(untested)

PG
Ksharp
Super User
PG,

What if there are actually more than 100 strat/subject , i.e. 120 different subject ?
PGStats
Opal | Level 21

The loop travels the tree from child to its most ancient ancestor. So you would need data about more than 100 generations (not subjects) to reach the limit. That limit could be increased of course, but 100 seemed reasonable to me.

PG
ravimegharaj
Fluorite | Level 6

Thanks a lot PGStats. Your solution helps and resolves my issue.

RahulG
Barite | Level 11

I tried to implement FIFO (first in first out stack) concept to find all the child. User need to provide mother id in macro variable

search_val

 

The final desired result would come in work.output_list.

 

 

%let search_val=2;

data input;
input C_cd	M_cd;
datalines;
2 1
3 2
4 2
5 2
6 3
7 6
8 7
;



proc sql;
create table  parent_child as
select a.c_cd as ID,
a.M_cd as mother, b.C_cd as child,
put(a.M_cd,8.) || ", "||put(a.c_cd,8.)|| ", "||put(b.C_cd ,8.)as list
from input a left join input b
on a.C_cd = b.M_cd;
quit;
%let cnt_val =1;


proc sql;
drop table fifo_list ;
drop table output_list ;
drop table   new_list;
quit;

data fifo_list;
child=&search_val;
run;


%macro fifo;

%start: %put &cnt_val ;
data _null_;
set fifo_list   (obs=1);
call symputx('val',child);
run;


proc sql;
delete from fifo_list   where child=&val;

create table new_list as
select child from parent_child 
where ID=&val and child <> . ;
quit;

proc append base=fifo_list   data=new_list;run;

proc append base=output_list data=new_list;run;

proc sql;
select count(*) into :cnt_val 
from fifo_list   where child <> . ;  ;
quit;
%if &cnt_val > 0 %then %goto start;

%mend;

%fifo;

 

Ksharp
Super User
Here is mine. Check LOG after running code.



data have;
input _end  _start ;
cards;
2 1
3 2
4 2
5 2
6 3
7 6
8 7
;
run;
 
%let start=2;
data want(keep=path);
if _n_ eq 1 then do;
length path _path  $ 700 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();

declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();

end;


set have(where=(_start=&start));
count=1;n=1;_n=1;
path=catx('|',_start,_end);
putlog 'WARNING:Found  ' _end;
   
pa.add();output;
do while(hi_path.next()=0);
 if n ne 1 then pa.remove(key:_n);_n=n;
 _path=path;   
 _start=scan(path,-1,'|');
 rc=ha.find();
 do while(rc=0);
  if not findw(path,strip(_end),'|') then do;
   if length(path)+length(_end)+1 gt lengthc(path) then do;
    putlog 'ERROR: The length of path and _path are set too short';
    stop;
   end;
   
   putlog 'WARNING:Found  ' _end;
   count+1;n=count;
   path=catx('|',path,_end);
   pa.add(); output;
   path=_path;
 end;
  rc=ha.find_next();
end;
end;
pa.clear();

run;

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 6405 views
  • 6 likes
  • 5 in conversation