data HIERARCHY;
input Parent Child;
datalines;
. 1
1 2
1 3
1 8
1 9
2 4
3 5
5 6
6 7
8 10
10 11
10 13;
run;
for the above dataset i want to find the ancestor lineage for each of the records. Can you please help.
Output would look as below.
datalines;
parent child ancestor_lineage
1 2 1
1 3 1
1 8 1
1 9 1
2 4 2,1
3 5 3.1
5 6 5,3,1
6 7 6,5,3,1
8 10 8
10 11 10, 8
10 13; 10,8
Using an index, inspired by @PGStats's solution here.
A hash table could be used in a similar manner if @hashman is inspired...
data WANT;
set HAVE;
if PARENT;
LINEAGE=cat(PARENT);
KEY=PARENT;
do I = 1 to 100 ; %* catch infinite loops ;
set HAVE(rename=(PARENT=P CHILD=KEY)) key=KEY / unique;
if _error_ or missing(P) then leave;
LINEAGE=catx(',',LINEAGE,P);
KEY=P;
end;
output;
_error_ = 0;
run;
PARENT | CHILD | LINEAGE |
1 | 2 | 1 |
1 | 3 | 1 |
1 | 8 | 1 |
1 | 9 | 1 |
2 | 4 | 2,1 |
3 | 5 | 3,1 |
5 | 6 | 5,3,1 |
6 | 7 | 6,5,3,1 |
8 | 10 | 8,1 |
10 | 11 | 10,8,1 |
10 | 13 | 10,8,1 |
Using an index, inspired by @PGStats's solution here.
A hash table could be used in a similar manner if @hashman is inspired...
data WANT;
set HAVE;
if PARENT;
LINEAGE=cat(PARENT);
KEY=PARENT;
do I = 1 to 100 ; %* catch infinite loops ;
set HAVE(rename=(PARENT=P CHILD=KEY)) key=KEY / unique;
if _error_ or missing(P) then leave;
LINEAGE=catx(',',LINEAGE,P);
KEY=P;
end;
output;
_error_ = 0;
run;
PARENT | CHILD | LINEAGE |
1 | 2 | 1 |
1 | 3 | 1 |
1 | 8 | 1 |
1 | 9 | 1 |
2 | 4 | 2,1 |
3 | 5 | 3,1 |
5 | 6 | 5,3,1 |
6 | 7 | 6,5,3,1 |
8 | 10 | 8,1 |
10 | 11 | 10,8,1 |
10 | 13 | 10,8,1 |
Thank you. I am able to do the same thing through another means using point and last in the data step
@anandbillava I'd like to see that code. Please post!
Here it is
data test;
input child parent;
datalines;
23 11
99 88
77 66
72 68
88 77
66 1
68 55
69 55
70 55
55 56
03 99
;
data lineage(keep=hist1-hist10 flag);
set test;
array hist(10);
count=1;
hist(count)=child;
do i=1 to last while (count<dim(hist)-1);
set test(rename=(child=child1 parent=parent1)) nobs=last point=i;
if parent=child1 then do;
count+1;
hist(count)=child1;
child=child1;
parent=parent1;
i=1;
end;
end;
count+1;
hist(count)=parent;
flag = parent;
run;
But of course:
data have ;
input parent child ;
cards ;
. 1
1 2
1 3
1 8
1 9
2 4
3 5
5 6
6 7
8 10
10 11
10 13
run ;
data want ;
if _n_ = 1 then do ;
dcl hash h (dataset: "have (where = (N (parent))") ;
h.defineKey ("child") ;
h.defineData ("parent") ;
h.defineDone () ;
end ;
set have (where = (N (parent))) ;
lineage = put (parent, 32.-L) ;
do while (not h.find(key:parent)) ;
lineage = catx (",", lineage, parent) ;
end ;
run ;
Best,
Paul D.
@hashman Almost! You are overwriting the parent 🙂
A good eye. Quite forgot to resurrect the killed parents:
data want ; if _n_ = 1 then do ; dcl hash h (dataset:"have(where=(N(parent))") ; h.defineKey ("child") ; h.defineData ("parent") ; h.defineDone () ; end ; set have (where=(N(parent))) ; _n_ = parent ; lineage = put (parent, 32.-L) ; do while (not h.find(key:parent)) ; lineage = catx (",", lineage, parent) ; end ; parent = _n_ ; run ;
Best,
Paul D.
What can't you do with the power of SAS 😉
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.