data HIERARCHY;
input Parent Child;
datalines;
. 1
1 2
1 3
1 8
1 9
2 4
3 5
5 6
6 7
8 10
10 11
10 13;
run;
for the above dataset i want to find the ancestor lineage for each of the records. Can you please help.
Output would look as below.
datalines;
parent child ancestor_lineage
1 2 1
1 3 1
1 8 1
1 9 1
2 4 2,1
3 5 3.1
5 6 5,3,1
6 7 6,5,3,1
8 10 8
10 11 10, 8
10 13; 10,8
Using an index, inspired by @PGStats's solution here.
A hash table could be used in a similar manner if @hashman is inspired...
data WANT;
set HAVE;
if PARENT;
LINEAGE=cat(PARENT);
KEY=PARENT;
do I = 1 to 100 ; %* catch infinite loops ;
set HAVE(rename=(PARENT=P CHILD=KEY)) key=KEY / unique;
if _error_ or missing(P) then leave;
LINEAGE=catx(',',LINEAGE,P);
KEY=P;
end;
output;
_error_ = 0;
run;
PARENT | CHILD | LINEAGE |
1 | 2 | 1 |
1 | 3 | 1 |
1 | 8 | 1 |
1 | 9 | 1 |
2 | 4 | 2,1 |
3 | 5 | 3,1 |
5 | 6 | 5,3,1 |
6 | 7 | 6,5,3,1 |
8 | 10 | 8,1 |
10 | 11 | 10,8,1 |
10 | 13 | 10,8,1 |
Using an index, inspired by @PGStats's solution here.
A hash table could be used in a similar manner if @hashman is inspired...
data WANT;
set HAVE;
if PARENT;
LINEAGE=cat(PARENT);
KEY=PARENT;
do I = 1 to 100 ; %* catch infinite loops ;
set HAVE(rename=(PARENT=P CHILD=KEY)) key=KEY / unique;
if _error_ or missing(P) then leave;
LINEAGE=catx(',',LINEAGE,P);
KEY=P;
end;
output;
_error_ = 0;
run;
PARENT | CHILD | LINEAGE |
1 | 2 | 1 |
1 | 3 | 1 |
1 | 8 | 1 |
1 | 9 | 1 |
2 | 4 | 2,1 |
3 | 5 | 3,1 |
5 | 6 | 5,3,1 |
6 | 7 | 6,5,3,1 |
8 | 10 | 8,1 |
10 | 11 | 10,8,1 |
10 | 13 | 10,8,1 |
Thank you. I am able to do the same thing through another means using point and last in the data step
@anandbillava I'd like to see that code. Please post!
Here it is
data test;
input child parent;
datalines;
23 11
99 88
77 66
72 68
88 77
66 1
68 55
69 55
70 55
55 56
03 99
;
data lineage(keep=hist1-hist10 flag);
set test;
array hist(10);
count=1;
hist(count)=child;
do i=1 to last while (count<dim(hist)-1);
set test(rename=(child=child1 parent=parent1)) nobs=last point=i;
if parent=child1 then do;
count+1;
hist(count)=child1;
child=child1;
parent=parent1;
i=1;
end;
end;
count+1;
hist(count)=parent;
flag = parent;
run;
But of course:
data have ;
input parent child ;
cards ;
. 1
1 2
1 3
1 8
1 9
2 4
3 5
5 6
6 7
8 10
10 11
10 13
run ;
data want ;
if _n_ = 1 then do ;
dcl hash h (dataset: "have (where = (N (parent))") ;
h.defineKey ("child") ;
h.defineData ("parent") ;
h.defineDone () ;
end ;
set have (where = (N (parent))) ;
lineage = put (parent, 32.-L) ;
do while (not h.find(key:parent)) ;
lineage = catx (",", lineage, parent) ;
end ;
run ;
Best,
Paul D.
@hashman Almost! You are overwriting the parent 🙂
A good eye. Quite forgot to resurrect the killed parents:
data want ; if _n_ = 1 then do ; dcl hash h (dataset:"have(where=(N(parent))") ; h.defineKey ("child") ; h.defineData ("parent") ; h.defineDone () ; end ; set have (where=(N(parent))) ; _n_ = parent ; lineage = put (parent, 32.-L) ; do while (not h.find(key:parent)) ; lineage = catx (",", lineage, parent) ; end ; parent = _n_ ; run ;
Best,
Paul D.
What can't you do with the power of SAS 😉
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.