BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nrk1787db1
Obsidian | Level 7

How to implement Nested Set Model for traversing a hierarchy structure in Base SAS with respect to deriving Organizational Chart. ?

 

Looks like Nested set model will have one record per person and it has all nested sets flattened onto a number line.

 

The model in use has each person’s complete chain (up and down traversal organizational reporting chart details) to the highest determined person (CEO) as many records per person in database.

 

Nested model for a university example is given below, I would like to know if we can program to get the left and right nodes for each variable in dataset dynamically ?

System_1.pngSystem_2.png

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

Assuming you have data like this:

data have;
  infile cards truncover; 
  input id parent_id name$40.;
cards;
1 . System
2 1 Campus A
3 1 Campus B
4 1 Campus C
5 2 Division A
6 2 Division B
7 6 College A
8 6 College B
9 6 College C
10 9 Department A
11 9 Department B
12 9 Department C
;run;

Converting it to a nested set models seems a good way of having easy access to all the children or all the parents of a node.

In a lot of SQL implementations, though, you would rather answer such questions by recursive queries, (like in this example from SQL Server). Although SAS SQL has not yet implemented recursive queries, you can easily do something similar with a datastep.

 

First, put indexes on your data if you do not already have them: 

proc sql;
  create unique index id on have(id);
  create index parent_id on have(parent_id);
quit;

Then, you can easily find all the children of a node in a couple of datasteps, e.g.:

data Division_B;
  set have;
  where id=6;
  level=1;
run;

data Division_B;
  modify Division_B;
  parent_id=id;
  level=level+1;
  do until(0);
    set have key=parent_id;
    if _iorc_ then leave;
    output;
    end;
  _error_=0;
run;

Here, the recursion is accomplished by the OUTPUT statement in the second datastep, which appends the children to the end of the table, from which they are then read by the SET statement. A similar construction can be used to find the parents of a node.

 

If you want the nested set for other reasons, here is a way. First, create the complete list of parents for all ids (saved in the IDS array):

%let nlevels=10; /* max number of levels in org. tree */
data tree;
  set have;
  where parent_id is null;
  retain id1-id&nlevels . level 1;
  id1=id;
run;

data tree;
  modify tree;
  parent_id=id;
  level=level+1;
  array ids id1-id&nlevels;
  do until(0);
    set have key=parent_id;
    if _iorc_ then leave;
    ids(level)=id;
    output;
    end;
  _error_=0;
run;

proc sort data=tree;
  by id1-id&nlevels id;
run;

The NLEVELS macro variable does not have to be exactly the number of levels, any larger number is OK (although increasing the number will make the program run slightly slower).

 

Then the nested set can be constructed like this:

data nested_set;
  set have;
  retain left right .;
run;

proc sql;
  create index id on nested_set(id);
run;

data nested_set;
  set tree;
  by id1-id&nlevel;
  array ids id1-id&nlevels;
  array lasts last.id1-last.id&nlevels;
  if lasts(level) then do;
    link mod;
    counter+1;
    left=counter;
    counter+1;
    right=counter;
    replace;
    do level=level-1 to 1 by -1 while(lasts(level));
      id=ids(level);
      link mod;
      counter+1;
      right=counter;
      replace;
      end;
    end;
  else do;
    link mod;
    counter+1;
    left=counter;
    replace;
    end;
if 0 then do; /* as the modify statement is used in several places, it must be called in a LINK stmt. */
  mod:
    modify nested_set key=id/unique;
    return;
  end;
run;

But every time you add or delete a node in the tree, the whole nested set has to be reconstructed. So it may be easier to use a datastep like the one I demonstrated for Division B above to get at the child nodes.

 

View solution in original post

5 REPLIES 5
ballardw
Super User

Do you have access to SAS/OR procedures? If you aren't sure then run this code:

 

Proc Setinit;

Run;

 

and that will show the modules you have installed. OR has several procedures related to tasks like this.

You may have to provide some example of your existing data.

nrk1787db1
Obsidian | Level 7

Thank you for your response, unfortunately we dont have SAS/OR installed in our systems.

Ksharp
Super User

You may have to provide some example of your existing data( Better is data step).

and output you want .

s_lassen
Meteorite | Level 14

Assuming you have data like this:

data have;
  infile cards truncover; 
  input id parent_id name$40.;
cards;
1 . System
2 1 Campus A
3 1 Campus B
4 1 Campus C
5 2 Division A
6 2 Division B
7 6 College A
8 6 College B
9 6 College C
10 9 Department A
11 9 Department B
12 9 Department C
;run;

Converting it to a nested set models seems a good way of having easy access to all the children or all the parents of a node.

In a lot of SQL implementations, though, you would rather answer such questions by recursive queries, (like in this example from SQL Server). Although SAS SQL has not yet implemented recursive queries, you can easily do something similar with a datastep.

 

First, put indexes on your data if you do not already have them: 

proc sql;
  create unique index id on have(id);
  create index parent_id on have(parent_id);
quit;

Then, you can easily find all the children of a node in a couple of datasteps, e.g.:

data Division_B;
  set have;
  where id=6;
  level=1;
run;

data Division_B;
  modify Division_B;
  parent_id=id;
  level=level+1;
  do until(0);
    set have key=parent_id;
    if _iorc_ then leave;
    output;
    end;
  _error_=0;
run;

Here, the recursion is accomplished by the OUTPUT statement in the second datastep, which appends the children to the end of the table, from which they are then read by the SET statement. A similar construction can be used to find the parents of a node.

 

If you want the nested set for other reasons, here is a way. First, create the complete list of parents for all ids (saved in the IDS array):

%let nlevels=10; /* max number of levels in org. tree */
data tree;
  set have;
  where parent_id is null;
  retain id1-id&nlevels . level 1;
  id1=id;
run;

data tree;
  modify tree;
  parent_id=id;
  level=level+1;
  array ids id1-id&nlevels;
  do until(0);
    set have key=parent_id;
    if _iorc_ then leave;
    ids(level)=id;
    output;
    end;
  _error_=0;
run;

proc sort data=tree;
  by id1-id&nlevels id;
run;

The NLEVELS macro variable does not have to be exactly the number of levels, any larger number is OK (although increasing the number will make the program run slightly slower).

 

Then the nested set can be constructed like this:

data nested_set;
  set have;
  retain left right .;
run;

proc sql;
  create index id on nested_set(id);
run;

data nested_set;
  set tree;
  by id1-id&nlevel;
  array ids id1-id&nlevels;
  array lasts last.id1-last.id&nlevels;
  if lasts(level) then do;
    link mod;
    counter+1;
    left=counter;
    counter+1;
    right=counter;
    replace;
    do level=level-1 to 1 by -1 while(lasts(level));
      id=ids(level);
      link mod;
      counter+1;
      right=counter;
      replace;
      end;
    end;
  else do;
    link mod;
    counter+1;
    left=counter;
    replace;
    end;
if 0 then do; /* as the modify statement is used in several places, it must be called in a LINK stmt. */
  mod:
    modify nested_set key=id/unique;
    return;
  end;
run;

But every time you add or delete a node in the tree, the whole nested set has to be reconstructed. So it may be easier to use a datastep like the one I demonstrated for Division B above to get at the child nodes.

 

nrk1787db1
Obsidian | Level 7

A big thank you s_lassen, this is what I wanted.Smiley Happy

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1739 views
  • 2 likes
  • 4 in conversation