BookmarkSubscribeRSS Feed
anandbillava
Fluorite | Level 6

Hi There,

 

I have a data like this 

data a;
a=1; b=5.1; c=5.5; output;
a=1; b=5.2; c=5.5; output;
a=1; b=5.0; c=5.2; output;

a=1; b=4.7; c=5.0; output;
a=1; b=4.6; c=5.2; output;

a=1; b=.; c=4.6; output;

run;

 

I am trying to create a output in which for a given parent nodie (variable c) it should get all the childs/nodes (variable b) which should not have any child/nodes. for e.g in the above example for parent node with value 5.5 it should output 5.1, 4.7 and 4.6 which do not have any child.

 

Here is how the output is picked.

 5.5 has 5.1 and 5.2 child, then 5.2 has child 5.0, 5.0 has child 4.7 and 5.2 has child 4.6.  4.6, 4.7 and 5.1 do not have any childs.

 

Please let me know if you need more expalnation.

 

Thanks in advance for the help.

Anand

 

  .

 

15 REPLIES 15
PGStats
Opal | Level 21

Are we to understand that a child node has a number < parent node number? Why not code the child node number as missing when it is greater than the parent node?

PG
PeterClemmensen
Tourmaline | Level 20

Is your desired result a data set? Can you show us what your desired result looks like?

anandbillava
Fluorite | Level 6

I this case the output will be 

 

new variable

5.1

4.7

4.6 

All this do not have any child nodes or children for the ancestor or parent 5.5.

 

Basically when user asks variable c with value 5.5 then output should be look like above. 

Steps are 

1. The program reads c variable where value is 5.5 and then looks for the children which is variable b and the values are 5.1 and 5.2

2. For each of b value which is 5.1 and 5.2 the program looks for children and their children are for 5.1 no children so it will appear in the report and 5.2 has children 5.0... and it goes on for all the childrens...

Ksharp
Super User
data have(keep=_start _end);
a=1; b='5.1'; c='5.5'; output;
a=1; b='5.2'; c='5.5'; output;
a=1; b='5.0'; c='5.2'; output;

a=1; b='4.7'; c='5.0'; output;
a=1; b='4.6'; c='5.2'; output;

a=1; b=' '; c='4.6'; output;
rename c=_start b=_end;
run;
proc sql;
create table _have as
 select *
  from have where _start not in (select _end from have);
quit;


data want(keep=path want);
if _n_ eq 1 then do;
length path _path  $ 800 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();

declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();

end;


set _have;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
   
pa.add();
do while(hi_path.next()=0);
 if n ne 1 then pa.remove(key:_n);_n=n;
 _path=path;   
 _start=scan(path,-1,'|');
 rc=ha.find();
 if rc ne 0 then do; want=_start;output;end;
 do while(rc=0);
  if not findw(path,strip(_end),'|') then do;
   if length(path)+length(_end)+1 gt lengthc(path) then do;
    putlog 'ERROR: The length of path and _path are set too short';
    stop;
   end;
   
   count+1;n=count;
   path=catx('|',path,_end);
   pa.add(); 
   path=_path;
 end;
  rc=ha.find_next();
end;
end;
pa.clear();

run;
anandbillava
Fluorite | Level 6

Thank you very much. Its kind of working. But I need to output this based on value a and c. In this case the variable value for a is 1 and c is 5.5.

 

data a;
a=1; b=5.1; c=5.5; output;
a=1; b=5.2; c=5.5; output;
a=1; b=5.0; c=5.2; output;
a=1; b=4.7; c=5.0; output;
a=1; b=4.6; c=5.2; output;
a=1; b=.; c=4.6; output;

 

a=2; b=5.4; c=6.1; output;
a=2; b=5.3; c=6.1; output;
a=2; b=5.0; c=5.3; output;
a=2; b=4.7; c=5.0; output;
a=2; b=4.5; c=4.7; output;
run;

 

For the second set of record where a=2 and lets say value c is 6.1 the children output must be as below

5.4

4.5 

Only these two do not have children.

 

I am sorry for not mentioning this case earlier.

 


run;

Ksharp
Super User

You could split this table into two tables and running my code separately .

If you have many a, make a macro to go through all of a . That is not too difficulty .

 

 

Ksharp
Super User

You could split this table into two tables and running my code separately .

If you have many a, make a macro to go through all of a . That is not too difficulty .

Ksharp
Super User

OK. Since I have some time. Check the following code .

 


data have;
a=1; b=5.1; c=5.5; output;
a=1; b=5.2; c=5.5; output;
a=1; b=5.0; c=5.2; output;
a=1; b=4.7; c=5.0; output;
a=1; b=4.6; c=5.2; output;
a=1; b=.; c=4.6; output;

 

a=2; b=5.4; c=6.1; output;
a=2; b=5.3; c=6.1; output;
a=2; b=5.0; c=5.3; output;
a=2; b=4.7; c=5.0; output;
a=2; b=4.5; c=4.7; output;

rename c=_start b=_end;
run;



proc sql;
create table _have as
 select distinct a,_start
  from have where catx(' ',a,_start) not in (select catx(' ',a,_end) from have);
quit;


data want(keep=a _start  want);
if _n_ eq 1 then do;
if 0 then set have;
length key n path 8;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('a','_start');
ha.definedata('_end');
ha.definedone();

declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();

declare hash k();
k.definekey('key');
k.definedone();
end;


set _have;
count=1;n=1;_n=1;
path=_start;pa.add();
key=_start;k.add();
do while(hi_path.next()=0);
 if n ne 1 then pa.remove(key:_n);_n=n;
 rc=ha.find(key:a,key:path);
 if rc ne 0 then do; want=path;output;end;
 do while(rc=0);
  if k.check(key:_end) ne 0 then do;
   count+1;n=count;
   path=_end;
   pa.add();   
   key=_end;
   k.add();
 end;
  rc=ha.find_next();
end;
end;
pa.clear();	k.clear();

run;
anandbillava
Fluorite | Level 6

Thank you.. I am able to solve this issue with another approach. Thanks for all your help

anandbillava
Fluorite | Level 6

 

Can you help me in solving problem below. I am providing here the appropriate ancestor for each of the record.  I need to find this ancestor for each of the record by user. Can you please help.

 

data a;

      user=1; base=4.1; cur=4.2; output; /* 4.1 to 4.2 - Ancestor here is 4.4 */

      user=1; base=4.2; cur=4.4; output; /* 4.2 to 4.4 - Ancestor here is 4.4 */

      user=1; base=4.6; cur=5.5; output; /* 4.6 to 5.5 - Ancestor here is 6.1 */

     user=1; base=5.5; cur=6.1; output; /* 5.5 to 6.1 - Ancestor here is 6.1 */

     

      user=2; base=4.2; cur=4.6; output; /* 4.2 to 4.6 - Ancestor here is 6.1 */

      user=2; base=4.6; cur=5.5; output; /* 4.6 to 5.5 - Ancestor here is 6.1 */

      user=2; base=5.5; cur=6.1; output;  /* 5.0 to 6.1 - Ancestor here is 6.1 */

 

      user=3; base=4.0; cur=4.2; output; /* 4.0 to 4.2 - Ancestor here is 5.5 */

      user=3; base=4.2; cur=4.6; output; /* 4.2 to 4.6 - Ancestor here is 5.5 */

      user=3; base=4.6; cur=5.5; output; /* 4.6 to 6.1 - Ancestor here is 5.5 */

run;

 

 

I am able to achieve this if there is only one user in the dataset using code below. But its not working if there are more than one user.

 

 

data _tmp1(rename=(base=base1 cur=cur1));

      set a;

run;

 

**Repeat the training module for each of the delta version between base and current module;

data _tmp2;

      retain base cur;

      set _tmp1; by u;

 

      cur = cur1;

      base = base1;

      output;

 

            drop base1 cur1 i;

run;                               

 

**Get all the base version for the current modue which has training then need to check the

      corresponding base version full access;

data _lineage;

      set _tmp2;  by u;   

      array hist(6);

     

      count=1;                   

      hist(count)=base;                  

      do i=1 to last while (count<dim(hist)-1);                                                                                         

      set _tmp2(rename=(base=base1 cur=cur1)) nobs=last point=i;

      if cur=base1 then do;

        count+1;                   

        hist(count)=base1;

        base=base1;

        cur=cur1;                    

        i=1;

      end;

      end;                

      count+1;

      hist(count)=cur;

      ancest = cur;

run;

 

 

 

Ksharp
Super User
You can split the table into three tables and run your code separately ? data user1; set a; if user=1; run; ...........
anandbillava
Fluorite | Level 6

Hi ,

I have lot of records in the dataset not just 3. It will take lot of time to process if i write a macro to subset these. Is there a way i can do it in one step.

 

Thank you,

andreas_lds
Jade | Level 19

@anandbillava wrote:

Hi ,

I have lot of records in the dataset not just 3. It will take lot of time to process if i write a macro to subset these. Is there a way i can do it in one step.

 

Thank you,


Creating subsets won't decrease the time required to create the final result. Do you get any errors while processing the complete dataset.

anandbillava
Fluorite | Level 6

I will have to create more than 4000 datasets which is am already doing and its taking lot of time to process.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 2066 views
  • 0 likes
  • 6 in conversation