BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sas-inquirer
Quartz | Level 8

I need to determine the parent of a value. I have a classification structure that is hierarchical in nature. The logic: for each group, the parent of the class is the first value from an undetermined row above that has a smaller length than the current length. 

 

data have;
input group $ class $ class_length;
datalines;
68 1 1
68 11 2
68 1101 4
68 1102 4
68 1103 4
68 12 2
68 1201 4
68 1202 4
27 1 1
27 1101 4 27 2 1 27 2201 4 27 2202 4 ;

The end result should look something like this:

data want;
input group $ class $ class_length parent $;
datalines;
68 1 1 .
68 11 2 1
68 1101 4 11
68 1102 4 11
68 1103 4 11
68 12 2 1
68 1201 4 12
68 1202 4 12
27 1 1 .
27 1101 4 1 27 2 1 . 27 2201 4 2 27 2202 4 2 ;

I have tried using lag and first but I always run into trouble with the siblings (i.e. [11,12], [1101,1102,1103] etc.).

 

I would super appreciate any help someone could provide. Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input group $ class $ class_length;
datalines;
68 1 1
68 11 2
68 1101 4
68 1102 4
68 1103 4
68 12 2
68 1201 4
68 1202 4
27 1 1
27 1101 4
27 2 1
27 2201 4
27 2202 4
;
data want;
 if _n_=1 then do;
  if 0 then set have;
  length parent par $ 80; par=' ';
  length n len 8; n=.;len=.;
  declare hash h();
  h.definekey('n');
  h.definedata('len','par');
  h.definedone();
 end;
h.clear();
do i=1 by 1 until(last.group);
 set have ;
 by group notsorted;
 h.add(key:i,data:class_length,data:class);
 call missing(parent);
 do n=i-1 to 1 by -1;
   if h.find()=0 then do; if len < class_length then do;parent=par; leave; end;end;
 end;
 output;
end;
drop n len i par;
run;

View solution in original post

11 REPLIES 11
ballardw
Super User

I think you may want to expand on your actual rule(s) for assigning the Parent values.

sas-inquirer
Quartz | Level 8
Say you take class value 1103 from group 68 and you have to determine which class is the parent of it.
1) Look at the length of 1103 which is 4.
2) Move up the list until you reach a length that is less than 4 (so skip 1102 and 1101).
3) The first instance of a length less than 4 above 1103 is 11 with a length of 2.
4) Identify the parent of 1103 which is 11 because it is the first instance above 1103 with a length less than 4.

I hope this example helps.
sas-inquirer
Quartz | Level 8
@Kurt_Bremser the max class_length possible is 15.
Ksharp
Super User
data have;
input group $ class $ class_length;
datalines;
68 1 1
68 11 2
68 1101 4
68 1102 4
68 1103 4
68 12 2
68 1201 4
68 1202 4
27 1 1
27 1101 4
27 2 1
27 2201 4
27 2202 4
;
data want;
 if _n_=1 then do;
  if 0 then set have;
  length parent par $ 80; par=' ';
  length n len 8; n=.;len=.;
  declare hash h();
  h.definekey('n');
  h.definedata('len','par');
  h.definedone();
 end;
h.clear();
do i=1 by 1 until(last.group);
 set have ;
 by group notsorted;
 h.add(key:i,data:class_length,data:class);
 call missing(parent);
 do n=i-1 to 1 by -1;
   if h.find()=0 then do; if len < class_length then do;parent=par; leave; end;end;
 end;
 output;
end;
drop n len i par;
run;
sas-inquirer
Quartz | Level 8
@Ksharp!!!! You did it! Wow! It works exactly as I had hoped. I am so impressed. This isn't the first time you saved the day: You solved a really complicated problem for me many years ago. Thank you again. You are a rock star!
Ksharp
Super User
But your question is really weird . Does your data really look like this and the order of data is really as you showed ?
sas-inquirer
Quartz | Level 8
This is obviously mock data and the original data can have some 60K lines but for the most part, yes the data really looks like this. I'm working with a bunch of reports where each row represents a data point in a report with higher level classes (parent) being aggregates of the children below. Some reports take all of the class levels and other reports focus on varying levels of aggregates. The mock data I supplied here showcased the different scenarios I encountered. Thanks again. 🙂
sas-inquirer
Quartz | Level 8

@Ksharp  I'm hoping you can help me once again. I've been testing the code on various reports to ensure the results are what is expected and have hit a scenario which requires an additional condition. I have one report that should not have any parents because there are no aggregates in it and another report that has only one or two parents (aggregates).

The condition would be that it needs to move up until length is less than current length and also that the parent is a substring of the current class. Here is what I'm looking for with the new group 31 added.

data want;
input group $ class $ class_length parent $;
datalines;
68 1 1 .
68 11 2 1
68 1101 4 11
68 1102 4 11
68 1103 4 11
68 12 2 1
68 1201 4 12
68 1202 4 12
27 1 1 .
27 1101 4 1
27 2 1 .
27 2201 4 2
27 2202 4 2
31 1 1 .
31 2101 4 .
31 2102 4 .
31 220201 6 .
31 220202 6 .
31 22020201 8 220202
;

So with the original logic for group 31, class 220201 would end up with the parent 2102 but because 2102 is not a substring of 220201 then the parent should be blank. No matter how far up you move, 220201 has no parent. I hope this makes sense. 

 

Please let me know if you know of a way to add this condition to the original code you supplied. Thanks in advance!

Ksharp
Super User

I guess you want CLASS start with PARENT , not a substring ,otherwise why 

31 2101 4 . 

 doesn't have parent 1 , since 1 is a substring within 2101 ?

 

data have;
input group $ class $ class_length;
datalines;
68 1 1
68 11 2
68 1101 4
68 1102 4
68 1103 4
68 12 2
68 1201 4
68 1202 4
27 1 1
27 1101 4
27 2 1
27 2201 4
27 2202 4
31 1 1 .
31 2101 4 .
31 2102 4 .
31 220201 6 .
31 220202 6 .
31 22020201 8 
;
data want;
 if _n_=1 then do;
  if 0 then set have;
  length parent par $ 80; par=' ';
  length n len 8; n=.;len=.;
  declare hash h();
  h.definekey('n');
  h.definedata('len','par');
  h.definedone();
 end;
h.clear();
do i=1 by 1 until(last.group);
 set have ;
 by group notsorted;
 h.add(key:i,data:class_length,data:class);
 call missing(parent);
 do n=i-1 to 1 by -1;
   if h.find()=0 then do; if len < class_length and class =: strip(par) then do;parent=par; leave; end;end;
 end;
 output;
end;
drop n len i par;
run;
sas-inquirer
Quartz | Level 8
@Ksharp beautiful! That makes more sense than substring. I can't thank you enough. I have spent more time than I'd like to admit trying to figure this out.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 718 views
  • 5 likes
  • 4 in conversation