DATA Step, Macro, Functions and more

Need help on a hierarchical data

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 13
Accepted Solution

Need help on a hierarchical data

Hi There,

Faced with a problem where I need to convert a dataset to another form of dataset.

Current dataset structure is below.

Child    parent   Level
xxx                    1

yyy        xxx       2

zzz        yyy       3

...

...

...

max upto six levels

What I need is a structure like

Child(where level =1)   parent1  parent2  parent3 .... parent6

xxx                             yyy       zzz

Regards,

Suvi


Accepted Solutions
Solution
‎09-04-2013 06:24 AM
Occasional Contributor
Posts: 13

Re: Need help on a hierarchical data

Converted it to a more sensible dynamic Macro.

%Macro Tstruct_Flat();

/*****Check what is the lowest level LCD present in the structure*****/

proc sql;
  Select max(level) into :max_level from work.lcd_raw ;
quit;


/*****Collect the highest level LCD's in the table****/
proc sql;
  Create table Tstruct_temp as select Child,level from work.lcd_raw where level=1 ;
quit;

%do i  = 2 %to &max_level;

  %let j = %eval(&i. - 1); /*Macro do loop variables to be used inside the loop*/
  %let k = %eval(&i. - 2); /*Macro do loop variables to be used inside the loop*/
  %let l = %eval(&i. ); /*Macro do loop variables to be used inside the loop*/

  %if %eval(&i.*1) = 2 %then %do;
   proc sql;
   Create table Tstruct_temp as select a.*, b.child as parent&j. , b.level as level&l. from Tstruct_temp as a left join work.lcd_raw as b on a.child = b.parent and b.level = &i. ;
   quit;
  %end;

  %if %eval(&i.*1) gt 2 %then %do;
   proc sql;
   Create table Tstruct_temp as select a.*, b.child as parent&j. , b.level as level&l. from Tstruct_temp as a left join work.lcd_raw as b on a.parent&k. = b.parent and b.level = &i. ;
   quit;
  %end;
%end;
%mend;

Thanks,

Suvi

View solution in original post


All Replies
Super Contributor
Posts: 297

Re: Need help on a hierarchical data

Is this what you are looking to do?

DATA HAVE;

INFILE DATALINES DLM=",";

INPUT CHILD $ PARENT $ LEVEL;

DATALINES;

XXX,YYY,1

XXX,ZZZ,2

YYY,XXX,2

ZZZ,YYY,3

;

RUN;

PROC TRANSPOSE DATA=HAVE OUT=WANT PREFIX=PARENT;

ID LEVEL;

BY CHILD;

VAR PARENT;

RUN;

Super Contributor
Posts: 339

Re: Need help on a hierarchical data

Posted in reply to Scott_Mitchell

I don't think the data set has a record for each level of parent for any given child. You would have to somehow preprocess the data to create a record with a level1 child and his level 3/4/5/6 parents to appropriately use a transpose/by combination. At least so it felt with the data sample

Super User
Posts: 5,431

Re: Need help on a hierarchical data

Posted in reply to Vince28_Statcan

Not necessary if there is no need for values on each level, just switch to left joins in the SQL example.

Data never sleeps
Super User
Posts: 5,431

Re: Need help on a hierarchical data

This is a quite typical task: extracting generic hierarchical data and transform it to a dimensional table.

Wouldn't it be nice if SAS could make a PROC and/or a DI Studio transformation that could deal with this task in a generic way?

Data never sleeps
Super Contributor
Posts: 339

Re: Need help on a hierarchical data

Best trick is to ask your hierachical data provider to send you directly the xml files and do a custom maping yourself. At least given the fact that most hierachical data is at one point or another stored as xml.

It is very easy to self learn the basics of xml mapping

Super User
Posts: 5,431

Re: Need help on a hierarchical data

Posted in reply to Vince28_Statcan

XML?

No... It's typical in many enterprise data models to store hierarchical data in generic parent-child tables.

The benefit is that the data model does not need to define the maximum no of levels that your hierarchies can consist.

The drawback, at least when using SAS, it generates quite complex code to unwind it to "flat" dimensional tables, used in a star schema for instance.

So in my perspective, there are no "hierarchical data providers", it's a part of the DW architecture. 

Data never sleeps
Super User
Super User
Posts: 7,060

Re: Need help on a hierarchical data

If you have SAS/OR licensed you might want to look at PROC NETDRAW.  Might not help with this but I have used in the past for processing this type of linked network data.

Super User
Posts: 5,431

Re: Need help on a hierarchical data

I have access to OR via SAS Alliance license, but OR is not a big hit in most DW sites... ;-)

Took a glance at it, and it looks like it does a lot of graphical stuff. If will look into it deeper when I have the time.

Thanx for the tip anyway...

Data never sleeps
Super Contributor
Posts: 339

Re: Need help on a hierarchical data

Not necessarily the simplest approach but here's how I'd go about it.

data want;

     if _n_=1 then do;

     declare hash hl2(dataset: "have(where=(level=2) rename=(child=nextchild))");

     hl2.defineKey('parent');

     hl2.defineData('nextchild');

     hl2.defineDone();

     declare hash hl3(dataset: "have(where=(level=3) rename=(child=nextchild))");

     hl3.defineKey('parent');

     hl3.defineData('nextchild');

     hl3.defineDone();

     declare hash hl4(dataset: "have(where=(level=4) rename=(child=nextchild))");

     hl4.defineKey('parent');

     hl4.defineData('nextchild');

     hl4.defineDone();

     declare hash hl5(dataset: "have(where=(level=5) rename=(child=nextchild))");

     hl5.defineKey('parent');

     hl5.defineData('nextchild');

     hl5.defineDone();

     declare hash hl6(dataset: "have(where=(level=6) rename=(child=nextchild))");

     hl6.defineKey('parent');

     hl6.defineData('nextchild');

     hl6.defineDone();

     end;

     set have(where=(level=1));

     rc=hl2.find(key: child);

     if rc=0 then do;

          parent1=nextchild;

          rc=hl3.find(key: nextchild);

          if rc=0 then do;

               parent2=nextchild;

               rc=hl4.find(key: nextchild);

               if rc=0 then do;

                    parent3=nextchild;

                    rc=hl5.find(key: nextchild);

                    if rc=0 then do;

                         parent4=nextchild;

                         rc=hl6.find(key: nextchild);

                         if rc=0 then parent5=nextchild;

                    end;

                end;

          end;

     end;

     drop nextchild parent level rc;

run;

Based on your data example, I assumed there were 6 levels but since level1 doesn't have a parent, that there should only be parent1-parent5 and not parent-parent6 variables to fill. If there are 7 levels or if level1 can have both a parent and a child it can be worked around.

The issue with the above approach is that if any "key" is present multiple times (like if xxx has parents yyy and zzz) then the hash table, defined as is won't support it and will retain only one of the two (I can't remember if it retains first key or replaces). It can be worked around using multidata option but then the looping becomes a tad bit hectic.

Point being, conceptually this does a merge whilst renaming variables appropriately of the following subsets of your original data:

have(where(level=1))

have(where(level=2))

...

have(where(level=6))

Using this successive approach with the above data subsets and renaming the variable should also allow you to provide the appropriate results without messing with multidata hash tables like below:

proc sql;

     select t1.child, t2.parent1, t3.parent2, t4.parent3, t5.parent4, t6.parent5

     from

     have(where=(level=1) drop=parent) as t1

     left join

     have(where=(level=2) rename=(parent=child child=parent1))

     on t1.child=t2.child

     left join

     have(where=(level=3) rename=(parent=parent1 child=parent2)) as t3

     on t2.parent1=t3.parent1

     left join

     have(where=(level=4) rename=(parent=parent2 child=parent3)) as t4

     on t3.parent2=t4.parent2

     left join

     have(where=(level=5) rename=(parent=parent3 child=parent4)) as t5

     on t4.parent3=t5.parent3

     left join

     have(where=(level=6) rename=(parent=parent4 child=parent5)) as t6

     on t5.parent4=t6.parent4

;

quit;

Note this is all untested its more of a logical guideline on how you can tackle the problem.

Vince

*Edit indeed left join seemed more appropriate for the OPs objectives so I edited the sql example

Super Contributor
Posts: 578

Re: Need help on a hierarchical data

I might point out semantically that your example is backwards.  XXX is the parent of YYY, not the child.

But..to finish the question...Here's a more direct, brute force method:

proc sql;

create table work.want (

     child char(3),

     parent1 char(3),

     parent2 char(3),

     parent3 char(3),

     parent4 char(3),

     parent5 char(3),

     parent6 char(3));

insert into work.want (child)

select child from have where level=1;

update work.want t1

set parent1=(select parent from have where child=t1.child);

update work.want t1

set parent2=(select parent from have where child=t1.parent1);

update work.want t1

set parent3=(select parent from have where child=t1.parent2);

update work.want t1

set parent4=(select parent from have where child=t1.parent3);

update work.want t1

set parent5=(select parent from have where child=t1.parent4);

update work.want t1

set parent6=(select parent from have where child=t1.parent5);

quit;

Super User
Posts: 5,513

Re: Need help on a hierarchical data

This should get you most of the way there.  You will want to rename the variables, however.

data have;

   set have;

   if level=1 then grouping + 1;

run;

proc transpose data=have out=want prefix=Level_;

   by grouping;

   var child;

   id level;

run;

It does assume that the proper order exists in your hierarchical file.

Good luck.

Occasional Contributor
Posts: 13

Re: Need help on a hierarchical data

Thanks everybody for help.

I did manage to find the solution. Its not an optimzed way but it works.

proc sql;

create table work.want (

);


insert into work.want (child)
select child from have where level=1;

Create table want as select a.*, b.child as parent1 from want as a left join have as b on a.child = b.parent and b.level = 2 ;
Create table want as select a.*, b.child as parent2 from want as a left join have as b on a.parent1 = b.parent and b.level = 3;
Create table want as select a.*, b.child as parent3 from want as a left join have as b on a.parent2 = b.parent and b.level = 4;
Create table want as select a.*, b.child as parent4 from want as a left join have as b on a.parent3 = b.parent and b.level = 5;
Create table want as select a.*, b.child as parent5 from want as a left join have as b on a.parent4 = b.parent and b.level = 6;

quit;

Regards,

Suvi

Solution
‎09-04-2013 06:24 AM
Occasional Contributor
Posts: 13

Re: Need help on a hierarchical data

Converted it to a more sensible dynamic Macro.

%Macro Tstruct_Flat();

/*****Check what is the lowest level LCD present in the structure*****/

proc sql;
  Select max(level) into :max_level from work.lcd_raw ;
quit;


/*****Collect the highest level LCD's in the table****/
proc sql;
  Create table Tstruct_temp as select Child,level from work.lcd_raw where level=1 ;
quit;

%do i  = 2 %to &max_level;

  %let j = %eval(&i. - 1); /*Macro do loop variables to be used inside the loop*/
  %let k = %eval(&i. - 2); /*Macro do loop variables to be used inside the loop*/
  %let l = %eval(&i. ); /*Macro do loop variables to be used inside the loop*/

  %if %eval(&i.*1) = 2 %then %do;
   proc sql;
   Create table Tstruct_temp as select a.*, b.child as parent&j. , b.level as level&l. from Tstruct_temp as a left join work.lcd_raw as b on a.child = b.parent and b.level = &i. ;
   quit;
  %end;

  %if %eval(&i.*1) gt 2 %then %do;
   proc sql;
   Create table Tstruct_temp as select a.*, b.child as parent&j. , b.level as level&l. from Tstruct_temp as a left join work.lcd_raw as b on a.parent&k. = b.parent and b.level = &i. ;
   quit;
  %end;
%end;
%mend;

Thanks,

Suvi

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 1738 views
  • 8 likes
  • 7 in conversation