DATA Step, Macro, Functions and more

Nested Set Model

Accepted Solution Solved
Reply
Contributor
Posts: 25
Accepted Solution

Nested Set Model

How to implement Nested Set Model for traversing a hierarchy structure in Base SAS with respect to deriving Organizational Chart. ?

 

Looks like Nested set model will have one record per person and it has all nested sets flattened onto a number line.

 

The model in use has each person’s complete chain (up and down traversal organizational reporting chart details) to the highest determined person (CEO) as many records per person in database.

 

Nested model for a university example is given below, I would like to know if we can program to get the left and right nodes for each variable in dataset dynamically ?

System_1.pngSystem_2.png


Accepted Solutions
Solution
‎05-22-2017 07:46 AM
PROC Star
Posts: 63

Re: Nested Set Model

Assuming you have data like this:

data have;
  infile cards truncover; 
  input id parent_id name$40.;
cards;
1 . System
2 1 Campus A
3 1 Campus B
4 1 Campus C
5 2 Division A
6 2 Division B
7 6 College A
8 6 College B
9 6 College C
10 9 Department A
11 9 Department B
12 9 Department C
;run;

Converting it to a nested set models seems a good way of having easy access to all the children or all the parents of a node.

In a lot of SQL implementations, though, you would rather answer such questions by recursive queries, (like in this example from SQL Server). Although SAS SQL has not yet implemented recursive queries, you can easily do something similar with a datastep.

 

First, put indexes on your data if you do not already have them: 

proc sql;
  create unique index id on have(id);
  create index parent_id on have(parent_id);
quit;

Then, you can easily find all the children of a node in a couple of datasteps, e.g.:

data Division_B;
  set have;
  where id=6;
  level=1;
run;

data Division_B;
  modify Division_B;
  parent_id=id;
  level=level+1;
  do until(0);
    set have key=parent_id;
    if _iorc_ then leave;
    output;
    end;
  _error_=0;
run;

Here, the recursion is accomplished by the OUTPUT statement in the second datastep, which appends the children to the end of the table, from which they are then read by the SET statement. A similar construction can be used to find the parents of a node.

 

If you want the nested set for other reasons, here is a way. First, create the complete list of parents for all ids (saved in the IDS array):

%let nlevels=10; /* max number of levels in org. tree */
data tree;
  set have;
  where parent_id is null;
  retain id1-id&nlevels . level 1;
  id1=id;
run;

data tree;
  modify tree;
  parent_id=id;
  level=level+1;
  array ids id1-id&nlevels;
  do until(0);
    set have key=parent_id;
    if _iorc_ then leave;
    ids(level)=id;
    output;
    end;
  _error_=0;
run;

proc sort data=tree;
  by id1-id&nlevels id;
run;

The NLEVELS macro variable does not have to be exactly the number of levels, any larger number is OK (although increasing the number will make the program run slightly slower).

 

Then the nested set can be constructed like this:

data nested_set;
  set have;
  retain left right .;
run;

proc sql;
  create index id on nested_set(id);
run;

data nested_set;
  set tree;
  by id1-id&nlevel;
  array ids id1-id&nlevels;
  array lasts last.id1-last.id&nlevels;
  if lasts(level) then do;
    link mod;
    counter+1;
    left=counter;
    counter+1;
    right=counter;
    replace;
    do level=level-1 to 1 by -1 while(lasts(level));
      id=ids(level);
      link mod;
      counter+1;
      right=counter;
      replace;
      end;
    end;
  else do;
    link mod;
    counter+1;
    left=counter;
    replace;
    end;
if 0 then do; /* as the modify statement is used in several places, it must be called in a LINK stmt. */
  mod:
    modify nested_set key=id/unique;
    return;
  end;
run;

But every time you add or delete a node in the tree, the whole nested set has to be reconstructed. So it may be easier to use a datastep like the one I demonstrated for Division B above to get at the child nodes.

 

View solution in original post


All Replies
Super User
Posts: 10,500

Re: Nested Set Model

Do you have access to SAS/OR procedures? If you aren't sure then run this code:

 

Proc Setinit;

Run;

 

and that will show the modules you have installed. OR has several procedures related to tasks like this.

You may have to provide some example of your existing data.

Contributor
Posts: 25

Re: Nested Set Model

Thank you for your response, unfortunately we dont have SAS/OR installed in our systems.

Super User
Posts: 9,681

Re: Nested Set Model

You may have to provide some example of your existing data( Better is data step).

and output you want .

Solution
‎05-22-2017 07:46 AM
PROC Star
Posts: 63

Re: Nested Set Model

Assuming you have data like this:

data have;
  infile cards truncover; 
  input id parent_id name$40.;
cards;
1 . System
2 1 Campus A
3 1 Campus B
4 1 Campus C
5 2 Division A
6 2 Division B
7 6 College A
8 6 College B
9 6 College C
10 9 Department A
11 9 Department B
12 9 Department C
;run;

Converting it to a nested set models seems a good way of having easy access to all the children or all the parents of a node.

In a lot of SQL implementations, though, you would rather answer such questions by recursive queries, (like in this example from SQL Server). Although SAS SQL has not yet implemented recursive queries, you can easily do something similar with a datastep.

 

First, put indexes on your data if you do not already have them: 

proc sql;
  create unique index id on have(id);
  create index parent_id on have(parent_id);
quit;

Then, you can easily find all the children of a node in a couple of datasteps, e.g.:

data Division_B;
  set have;
  where id=6;
  level=1;
run;

data Division_B;
  modify Division_B;
  parent_id=id;
  level=level+1;
  do until(0);
    set have key=parent_id;
    if _iorc_ then leave;
    output;
    end;
  _error_=0;
run;

Here, the recursion is accomplished by the OUTPUT statement in the second datastep, which appends the children to the end of the table, from which they are then read by the SET statement. A similar construction can be used to find the parents of a node.

 

If you want the nested set for other reasons, here is a way. First, create the complete list of parents for all ids (saved in the IDS array):

%let nlevels=10; /* max number of levels in org. tree */
data tree;
  set have;
  where parent_id is null;
  retain id1-id&nlevels . level 1;
  id1=id;
run;

data tree;
  modify tree;
  parent_id=id;
  level=level+1;
  array ids id1-id&nlevels;
  do until(0);
    set have key=parent_id;
    if _iorc_ then leave;
    ids(level)=id;
    output;
    end;
  _error_=0;
run;

proc sort data=tree;
  by id1-id&nlevels id;
run;

The NLEVELS macro variable does not have to be exactly the number of levels, any larger number is OK (although increasing the number will make the program run slightly slower).

 

Then the nested set can be constructed like this:

data nested_set;
  set have;
  retain left right .;
run;

proc sql;
  create index id on nested_set(id);
run;

data nested_set;
  set tree;
  by id1-id&nlevel;
  array ids id1-id&nlevels;
  array lasts last.id1-last.id&nlevels;
  if lasts(level) then do;
    link mod;
    counter+1;
    left=counter;
    counter+1;
    right=counter;
    replace;
    do level=level-1 to 1 by -1 while(lasts(level));
      id=ids(level);
      link mod;
      counter+1;
      right=counter;
      replace;
      end;
    end;
  else do;
    link mod;
    counter+1;
    left=counter;
    replace;
    end;
if 0 then do; /* as the modify statement is used in several places, it must be called in a LINK stmt. */
  mod:
    modify nested_set key=id/unique;
    return;
  end;
run;

But every time you add or delete a node in the tree, the whole nested set has to be reconstructed. So it may be easier to use a datastep like the one I demonstrated for Division B above to get at the child nodes.

 

Contributor
Posts: 25

Re: Nested Set Model

A big thank you s_lassen, this is what I wanted.Smiley Happy

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 152 views
  • 2 likes
  • 4 in conversation