I need to determine the parent of a value. I have a classification structure that is hierarchical in nature. The logic: for each group, the parent of the class is the first value from an undetermined row above that has a smaller length than the current length.
data have;
input group $ class $ class_length;
datalines;
68 1 1
68 11 2
68 1101 4
68 1102 4
68 1103 4
68 12 2
68 1201 4
68 1202 4
27 1 1
27 1101 4
27 2 1
27 2201 4
27 2202 4
;
The end result should look something like this:
data want;
input group $ class $ class_length parent $;
datalines;
68 1 1 .
68 11 2 1
68 1101 4 11
68 1102 4 11
68 1103 4 11
68 12 2 1
68 1201 4 12
68 1202 4 12
27 1 1 .
27 1101 4 1
27 2 1 .
27 2201 4 2
27 2202 4 2
;
I have tried using lag and first but I always run into trouble with the siblings (i.e. [11,12], [1101,1102,1103] etc.).
I would super appreciate any help someone could provide. Thanks in advance.
data have; input group $ class $ class_length; datalines; 68 1 1 68 11 2 68 1101 4 68 1102 4 68 1103 4 68 12 2 68 1201 4 68 1202 4 27 1 1 27 1101 4 27 2 1 27 2201 4 27 2202 4 ; data want; if _n_=1 then do; if 0 then set have; length parent par $ 80; par=' '; length n len 8; n=.;len=.; declare hash h(); h.definekey('n'); h.definedata('len','par'); h.definedone(); end; h.clear(); do i=1 by 1 until(last.group); set have ; by group notsorted; h.add(key:i,data:class_length,data:class); call missing(parent); do n=i-1 to 1 by -1; if h.find()=0 then do; if len < class_length then do;parent=par; leave; end;end; end; output; end; drop n len i par; run;
I think you may want to expand on your actual rule(s) for assigning the Parent values.
Which values for class_length are possible?
data have; input group $ class $ class_length; datalines; 68 1 1 68 11 2 68 1101 4 68 1102 4 68 1103 4 68 12 2 68 1201 4 68 1202 4 27 1 1 27 1101 4 27 2 1 27 2201 4 27 2202 4 ; data want; if _n_=1 then do; if 0 then set have; length parent par $ 80; par=' '; length n len 8; n=.;len=.; declare hash h(); h.definekey('n'); h.definedata('len','par'); h.definedone(); end; h.clear(); do i=1 by 1 until(last.group); set have ; by group notsorted; h.add(key:i,data:class_length,data:class); call missing(parent); do n=i-1 to 1 by -1; if h.find()=0 then do; if len < class_length then do;parent=par; leave; end;end; end; output; end; drop n len i par; run;
@Ksharp I'm hoping you can help me once again. I've been testing the code on various reports to ensure the results are what is expected and have hit a scenario which requires an additional condition. I have one report that should not have any parents because there are no aggregates in it and another report that has only one or two parents (aggregates).
The condition would be that it needs to move up until length is less than current length and also that the parent is a substring of the current class. Here is what I'm looking for with the new group 31 added.
data want;
input group $ class $ class_length parent $;
datalines;
68 1 1 .
68 11 2 1
68 1101 4 11
68 1102 4 11
68 1103 4 11
68 12 2 1
68 1201 4 12
68 1202 4 12
27 1 1 .
27 1101 4 1
27 2 1 .
27 2201 4 2
27 2202 4 2
31 1 1 .
31 2101 4 .
31 2102 4 .
31 220201 6 .
31 220202 6 .
31 22020201 8 220202
;
So with the original logic for group 31, class 220201 would end up with the parent 2102 but because 2102 is not a substring of 220201 then the parent should be blank. No matter how far up you move, 220201 has no parent. I hope this makes sense.
Please let me know if you know of a way to add this condition to the original code you supplied. Thanks in advance!
I guess you want CLASS start with PARENT , not a substring ,otherwise why
31 2101 4 .
doesn't have parent 1 , since 1 is a substring within 2101 ?
data have;
input group $ class $ class_length;
datalines;
68 1 1
68 11 2
68 1101 4
68 1102 4
68 1103 4
68 12 2
68 1201 4
68 1202 4
27 1 1
27 1101 4
27 2 1
27 2201 4
27 2202 4
31 1 1 .
31 2101 4 .
31 2102 4 .
31 220201 6 .
31 220202 6 .
31 22020201 8
;
data want;
if _n_=1 then do;
if 0 then set have;
length parent par $ 80; par=' ';
length n len 8; n=.;len=.;
declare hash h();
h.definekey('n');
h.definedata('len','par');
h.definedone();
end;
h.clear();
do i=1 by 1 until(last.group);
set have ;
by group notsorted;
h.add(key:i,data:class_length,data:class);
call missing(parent);
do n=i-1 to 1 by -1;
if h.find()=0 then do; if len < class_length and class =: strip(par) then do;parent=par; leave; end;end;
end;
output;
end;
drop n len i par;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.