@LinusH wrote
Only a real world test will tell, but I would be REALLY surprised if a loop of SQL joins would beat a hash table look-up, especially if the depth > 3 levels
Code below generates a test data set with a million observations. The depth depends on the probabilities used in the SELECT statement; with the numbers below I consistently get 6 levels.
Using this as input data, the SQL loop runs in a few seconds. I can't give a precise benchmark as I am running on SAS On Demand.
data have;
  parent=1; component=2; output;
  do i=1 to 1000000;
    c=rand("uniform");
    select;
      when (c<0.05) do; parent=component+1; component=parent+1; end; * new family;
      when (c<0.8)  do; parent=parent+1; component=component+1; end; * new generation, same family;
      otherwise component=component+1;                               * new child, same parent;
    end;
    output;
  end;
run;
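For readers without a SAS session handy, the same generator can be sketched in Python. This is a minimal illustration, not the DATA step itself; the function name and the seed are mine, and the probability thresholds mirror the SELECT statement above (0.05 new family, 0.75 new generation, 0.20 new child):

```python
import random

def make_test_data(n, p_new_family=0.05, p_new_generation=0.75, seed=1):
    # Mirrors the DATA step: each draw either starts a new family,
    # steps down a generation, or adds a sibling under the same parent.
    rng = random.Random(seed)
    parent, component = 1, 2
    rows = [(parent, component)]
    for _ in range(n):
        c = rng.random()
        if c < p_new_family:
            parent = component + 1
            component = parent + 1
        elif c < p_new_family + p_new_generation:
            parent, component = parent + 1, component + 1
        else:
            component += 1
        rows.append((parent, component))
    return rows
```

Raising `p_new_generation` relative to the other two probabilities deepens the chains, which is what drives the number of iterations the SQL loop needs.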
Thanks @LinusH for rooting for me.
The best choice is using SAS/OR if you have it; @RobPratt might give you some code.
Or check the SAS/OR documentation; I remember there is already an example about connected components.
data have;
  infile cards;
  input from $ to $;
cards;
1 2
1 3
4 5
5 2
9 4
6 7
8 7
;
run;
data full;
  set have end=last;
  if _n_ eq 1 then do;
    declare hash h();          /* collects the distinct node list */
    h.definekey('node');
    h.definedata('node');
    h.definedone();
  end;
  /* write each edge in both directions so the graph is undirected */
  output;
  node=from; h.replace();
  from=to; to=node;
  output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;
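The step above doubles each edge (adding the reverse direction) and writes out the distinct node list. For readers less familiar with SAS hash objects, the same preprocessing can be sketched in Python; names here are illustrative, not from the SAS code:

```python
# Build an undirected adjacency list and a distinct node set from
# directed (from, to) edges -- the same preprocessing the DATA step does.
def symmetrize(edges):
    adjacency = {}   # node -> list of neighbours (both directions)
    nodes = set()    # distinct nodes, like the 'node' hash object
    for frm, to in edges:
        adjacency.setdefault(frm, []).append(to)
        adjacency.setdefault(to, []).append(frm)
        nodes.update((frm, to))
    return adjacency, nodes

edges = [("1","2"), ("1","3"), ("4","5"), ("5","2"), ("9","4"), ("6","7"), ("8","7")]
adj, nodes = symmetrize(edges)
```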
data want(keep=node household);
  /* ha: nodes of the current household, in insertion order */
  declare hash ha(ordered:'a');
  declare hiter hi('ha');
  ha.definekey('count');
  ha.definedata('last');
  ha.definedone();
  /* _ha: fast "already visited" check */
  declare hash _ha(hashexp: 20);
  _ha.definekey('key');
  _ha.definedone();
  if 0 then set full;
  /* from_to: adjacency list, with duplicate keys allowed */
  declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
  from_to.definekey('from');
  from_to.definedata('to');
  from_to.definedone();
  if 0 then set node;
  /* no: distinct nodes still waiting for a household id */
  declare hash no(dataset:'node');
  declare hiter hi_no('no');
  no.definekey('node');
  no.definedata('node');
  no.definedone();

  do while(hi_no.next()=0);
    /* seed a new household with the next unassigned node */
    household+1; output;
    count=1;
    key=node; _ha.add();
    last=node; ha.add();
    /* sweep the growing household: follow edges from every collected node */
    rc=hi.first();
    do while(rc=0);
      from=last; rx=from_to.find();
      do while(rx=0);
        key=to; ry=_ha.check();
        if ry ne 0 then do;   /* neighbour not seen yet */
          node=to; output; rr=no.remove(key:node);
          key=to; _ha.add();
          count+1;
          last=to; ha.add();
        end;
        rx=from_to.find_next();
      end;
      rc=hi.next();
    end;
    ha.clear(); _ha.clear();
  end;
  stop;
run;
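Stripped of the hash-object mechanics, the step above is a breadth-first traversal that stamps every node in a connected component with the same household id. A hedged Python equivalent (function and variable names are mine, not from the SAS code):

```python
from collections import deque

def households(edges):
    # Undirected adjacency from directed edges, as in the 'full' data set.
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    label, seen, hh = {}, set(), 0
    for seed in adj:                 # like iterating the 'node' hash
        if seed in seen:
            continue
        hh += 1                      # like household+1
        queue = deque([seed])
        seen.add(seed)
        while queue:                 # like walking 'ha' with the hi iterator
            cur = queue.popleft()
            label[cur] = hh
            for nxt in adj[cur]:
                if nxt not in seen:  # like the _ha.check() test
                    seen.add(nxt)
                    queue.append(nxt)
    return label

lab = households([("1","2"),("1","3"),("4","5"),("5","2"),("9","4"),("6","7"),("8","7")])
```

On the seven sample edges this yields two households: {1,2,3,4,5,9} and {6,7,8}.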
Hi @Ksharp. It may just be when I run it, but the hash table solution doesn't match the output described in the original post with @Minutemaid23's data.
This works. The question will be how it performs over your 1.2 million records, and whether there is a more efficient way.
data have;
  input parent component;
datalines;
1 2
1 3
1 4
2 5
2 6
6 7
;
%macro loop;
  %do %until (&count=0);
    /* replace each parent with its own parent where one exists */
    proc sql;
      create table have as
        select coalesce(b.parent,a.parent) as parent,
               coalesce(a.component,b.component) as component
        from have a left join have b on b.component=a.parent;
    quit;
    /* if no component still appears as a parent, we are done */
    proc sql noprint;
      select count(*) into :count
        from have a join have b on b.component=a.parent;
    quit;
  %end;
%mend;
%loop;
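Each pass of the macro replaces a row's parent with that parent's own parent, so the loop converges to the fixed point where every component points at its ultimate root. A small Python sketch of the same fixed point, computed by climbing the parent links directly (names are illustrative, not from the SQL):

```python
def climb_to_root(pairs):
    # parent-of map built from (parent, component) pairs
    parent_of = {comp: par for par, comp in pairs}
    roots = {}
    for comp in parent_of:
        top = parent_of[comp]
        # follow parent links until the top has no parent itself --
        # the state the SQL loop's repeated self-join converges to
        while top in parent_of:
            top = parent_of[top]
        roots[comp] = top
    return roots

roots = climb_to_root([(1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (6, 7)])
```

With the six sample rows every component resolves to root 1, matching what the macro leaves in HAVE once the join count hits zero.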
Well, after a lengthy battle with my IT department I discovered that we do not have a SAS Metadata Server that I can connect to with JMP. Is there any way to run macros in JMP? I haven't found a way in my research so far, and I have been looking everywhere.