Re: Is there an easier way to do my do-loops? (it's within a macro)

kthenaj · Posted 01-03-2013 02:32 PM

Hello all,

I am working on a macro in which I perform calculations on subsets of variables from a dataset. Below is the code and as you can see I've manually duplicated the subset calculations, I've done pairs, groups of 3, 4, 5. What I want to do is make my code a little easier so it handles any number of subsets. It works just fine the way it is, I just think there must be an easier/shorter/cleaner way. I am limited in that the output needs to have the same notation y_##, because as you can see in the final lines a dataset is created just containing variables with the prefix "y_" because they are the only ones of interest in the end. Let me know if further explanation is needed (a sample data set is also supplied below). Thanks!

data pfisher;

set fisher;

/* pairs of two */

%do i=1 %to &k;

%do j=&i+1 %to &k;

py&i&j = b&i + b&j;

pymin&i&j = min&i + min&j;

pymax&i&j = max&i + max&j;

y_&i&j = (py&i&j-pymin&i&j)/(pymax&i&j-pymin&i&j);

%end; %end;

/* groups of 3 */

%do i=1 %to &k;

%do j=&i+1 %to &k;

%do l=&j+1 %to &k;

py&i&j&l = b&i + b&j + b&l;

pymin&i&j&l = min&i + min&j + min&l;

pymax&i&j&l = max&i + max&j + max&l;

y_&i&j&l = (py&i&j&l-pymin&i&j&l)/(pymax&i&j&l-pymin&i&j&l);

%end; %end; %end;

/* groups of 4 */

%do i=1 %to &k;

%do j=&i+1 %to &k;

%do l=&j+1 %to &k;

%do m=&l+1 %to &k;

py&i&j&l&m = b&i + b&j + b&l + b&m;

pymin&i&j&l&m = min&i + min&j + min&l + min&m;

pymax&i&j&l&m = max&i + max&j + max&l + max&m;

y_&i&j&l&m = (py&i&j&l&m-pymin&i&j&l&m)/(pymax&i&j&l&m-pymin&i&j&l&m);

%end; %end; %end; %end;

/* groups of 5 */

%do i=1 %to &k;

%do j=&i+1 %to &k;

%do l=&j+1 %to &k;

%do m=&l+1 %to &k;

%do n=&m+1 %to &k;

py&i&j&l&m&n = b&i + b&j + b&l + b&m +b&n;

pymin&i&j&l&m&n = min&i + min&j + min&l + min&m + min&n;

pymax&i&j&l&m&n = max&i + max&j + max&l + max&m + max&n;

y_&i&j&l&m&n = (py&i&j&l&m&n-pymin&i&j&l&m&n)/(pymax&i&j&l&m&n-pymin&i&j&l&m&n);

%end; %end; %end; %end; %end;

/**** need to extend to any possible subset size *****/

data pydefs;

set pfisher;

keep y_:;

run;

Here is an example of a data set that the macro would be used on where k = 4 (and thus the group of 5 wouldn't be applicable):

data fisher;

do i = 1 to 10;

call streaminit(123);

u1 = rand("Uniform");

b1 = ceil(5*u1);

min1 = ceil(3*u1);

max1 = ceil(2*u1);

call streaminit(456);

u2 = rand("Uniform");

b2 = ceil(5*u2);

min2 = ceil(3*u2);

max2 = ceil(2*u2);

call streaminit(789);

u3 = rand("Uniform");

b3 = ceil(5*u3);

min3 = ceil(3*u3);

max3 = ceil(2*u3);

call streaminit(235);

u4 = rand("Uniform");

b4 = ceil(5*u4);

min4 = ceil(3*u4);

max4 = ceil(2*u4);

output;

end;

drop u1 u2 u3 u4 i;

run;

art297 · Posted 01-03-2013 03:10 PM

Given your example data you end up with some cases where py..max equals py..min equals 0 and then is used as a divisor, resulting in a division by 0 warning and a missing value being assigned.

Is that the correct result or am I missing something?

kthenaj · Posted 01-03-2013 03:35 PM

Sorry, I didn't run that particular example data set before I posted. This should fix that problem:

data fisher;

do i = 1 to 10;

call streaminit(123);

u1 = rand("Uniform");

b1 = ceil(5*u1);

min1 = ceil(3*u1);

max1 = ceil(2*u1);

call streaminit(456);

u2 = rand("Uniform");

b2 = ceil(5*u2);

min2 = ceil(3*u2);

max2 = ceil(2*u2);

call streaminit(789);

u3 = rand("Uniform");

b3 = ceil(5*u3);

min3 = ceil(3*u3);

max3 = ceil(2*u3);

call streaminit(235);

u4 = rand("Uniform");

b4 = ceil(5*u4);

min4 = ceil(3*u4);

max4 = ceil(2*u4);

output;

end;

drop u1 u2 u3 u4 i;

run;

%macro test(k);

data pfisher;

set fisher;

/* pairs of two */

%do i=1 %to &k;

%do j=&i+1 %to &k;

py&i&j = b&i + b&j;

pymin&i&j = min&i + min&j;

pymax&i&j = max&i + max&j;

y_&i&j = (py&i&j-pymin&i&j)/(pymax&i&j-pymin&i&j+0.0001);

%end; %end;

/* groups of 3 */

%do i=1 %to &k;

%do j=&i+1 %to &k;

%do l=&j+1 %to &k;

py&i&j&l = b&i + b&j + b&l;

pymin&i&j&l = min&i + min&j + min&l;

pymax&i&j&l = max&i + max&j + max&l;

y_&i&j&l = (py&i&j&l-pymin&i&j&l)/(pymax&i&j&l-pymin&i&j&l+0.0001);

%end; %end; %end;

/* groups of 4 */

%do i=1 %to &k;

%do j=&i+1 %to &k;

%do l=&j+1 %to &k;

%do m=&l+1 %to &k;

py&i&j&l&m = b&i + b&j + b&l + b&m;

pymin&i&j&l&m = min&i + min&j + min&l + min&m;

pymax&i&j&l&m = max&i + max&j + max&l + max&m;

y_&i&j&l&m = (py&i&j&l&m-pymin&i&j&l&m)/(pymax&i&j&l&m-pymin&i&j&l&m+0.0001);

%end; %end; %end; %end;

/* groups of 5 */

%do i=1 %to &k;

%do j=&i+1 %to &k;

%do l=&j+1 %to &k;

%do m=&l+1 %to &k;

%do n=&m+1 %to &k;

py&i&j&l&m&n = b&i + b&j + b&l + b&m +b&n;

pymin&i&j&l&m&n = min&i + min&j + min&l + min&m + min&n;

pymax&i&j&l&m&n = max&i + max&j + max&l + max&m + max&n;

y_&i&j&l&m&n = (py&i&j&l&m&n-pymin&i&j&l&m&n)/(pymax&i&j&l&m&n-pymin&i&j&l&m&n+0.0001);

%end; %end; %end; %end; %end;

/**** need to extend to any possible subset size *****/

run;

data pydefs;

set pfisher;

keep y_:;

run;

%mend;

%test(k=4);

art297 · Posted 01-03-2013 05:51 PM

: You said that you would like it to work for any number of variables. Is there a max number of variables that it should be able to work for? I ask because, if there isn't, the code would be a little more complicated to develop. If there is, what would that number be?

kthenaj · Posted 01-03-2013 05:57 PM

I think k = 10 would be a reasonable maximum.

art297 · Posted 01-03-2013 06:44 PM

I only took it out to six iterations, but you could easily expand it to 10.

The only things I really did was reduce the number of loops involved, used a proc sql call to identify the maximum number of variables, kept the %do variables to a range beginning with i, and limited whether code would run (and variables created) based on the maximum number of variables.

If you wanted to generalize the code, you could write a fairly simple data _null_ step that wrote the code as an include file.

/*determine the value of k*/

proc sql noprint;

create table temp as

select name

from dictionary.columns

where libname="WORK" and

memname="FISHER" and

upcase(name) like "MAX%"

;

quit;

%let maxvars=&sqlobs.;

%macro newtest;

data pfisher;

set fisher;

%if &maxvars. ge 2 %then %do;

%do i=1 %to &maxvars.;

%do j=&i+1 %to &maxvars.;

py&i&j = b&i + b&j;

pymin&i&j = min&i + min&j;

pymax&i&j = max&i + max&j;

y_&i&j = (py&i&j-pymin&i&j)/(pymax&i&j-pymin&i&j+0.0001);

%if &maxvars. ge 3 %then %do k=&j+1 %to &maxvars.;

py&i&j&k = b&i + b&j + b&k;

pymin&i&j&k = min&i + min&j + min&k;

pymax&i&j&k = max&i + max&j + max&k;

y_&i&j&k = (py&i&j&k-pymin&i&j&k)/(pymax&i&j&k-pymin&i&j&k+0.0001);

%if &maxvars. ge 4 %then %do l=&k+1 %to &maxvars.;

py&i&j&k&l = b&i + b&j + b&k + b&l;

pymin&i&j&k&l = min&i + min&j + min&k + min&l;

pymax&i&j&k&l = max&i + max&j + max&k + max&l;

y_&i&j&k&l = (py&i&j&k&l-pymin&i&j&k&l)/(pymax&i&j&k&l-pymin&i&j&k&l+0.0001);

%if &maxvars. ge 5 %then %do m=&l+1 %to &maxvars.;

py&i&j&k&l&m = b&i + b&j + b&k + b&l +b&m;

pymin&i&j&k&l&m = min&i + min&j + min&k + min&l + min&m;

pymax&i&j&k&l&m = max&i + max&j + max&k + max&l + max&m;

y_&i&j&k&l&m = (py&i&j&k&l&m-pymin&i&j&k&l&m)/(pymax&i&j&k&l&m-pymin&i&j&k&l&m+0.0001);

%if &maxvars. ge 6 %then %do n=&m+1 %to &maxvars.;

py&i&j&k&l&m&n = b&i + b&j + b&k + b&l +b&m;

pymin&i&j&k&l&m&n = min&i + min&j + min&k + min&l + min&m + min&n;

pymax&i&j&k&l&m&n = max&i + max&j + max&k + max&l + max&m + max&n;

y_&i&j&k&l&m&n = (py&i&j&k&l&m&n-pymin&i&j&k&l&m&n)/(pymax&i&j&k&l&m&n-pymin&i&j&k&l&m&n+0.0001);

%end;

run;

data pydefs;

set pfisher;

keep y_:;

run;

%mend newtest;

%newtest

art297 · Posted 01-04-2013 08:11 PM

: Here is some code that will cover all possibilities. You, of course, are responsible for insuring that it runs correctly. It ran for me and appears to produce the same results as your original code:

/*specify input filename*/

%let fname=fisher;

/*create some sample data*/

data &fname.;

do i = 1 to 10;

call streaminit(123);

u1 = rand("Uniform");

b1 = ceil(5*u1);

min1 = ceil(3*u1);

max1 = ceil(2*u1);

call streaminit(456);

u2 = rand("Uniform");

b2 = ceil(5*u2);

min2 = ceil(3*u2);

max2 = ceil(2*u2);

call streaminit(789);

u3 = rand("Uniform");

b3 = ceil(5*u3);

min3 = ceil(3*u3);

max3 = ceil(2*u3);

call streaminit(235);

u4 = rand("Uniform");

b4 = ceil(5*u4);

min4 = ceil(3*u4);

max4 = ceil(2*u4);

output;

end;

drop u1 u2 u3 u4 i;

run;

/*determine the value of maxvars*/

proc sql noprint;

create table temp as

select name

from dictionary.columns

where libname="WORK" and

memname="%sysfunc(upcase(&fname.))" and

upcase(name) like "MAX%"

;

quit;

%let maxvars=&sqlobs.;

/*create the code*/

filename sascode temp;

data _null_;

length code stem part2 part3 part4 part5 end $255;

file sascode;

do i=105 to %eval(104+&maxvars.);

part1=byte(i);

if i eq 105 then do;

part2='=1 %to &maxvars.;';

stem=catt('&',byte(i));

part3=catt('=b&',byte(i));

part4=catt('=min&',byte(i));

part5=catt('=max&',byte(i));

end='%end;';

code='%macro create_code;';

put code;

code='data p&fname.;';

put code;

code='set &fname.;';

put code;

end;

else do;

part2=catt('=&',byte(i-1),'+1 %to &maxvars.;');

stem=catt(stem,'&',byte(i));

part3=catt(part3,'+b&',byte(i));

part4=catt(part4,'+min&',byte(i));

part5=catt(part5,'+max&',byte(i));

end=catt(end,'%end;');

end;

code='%do '||part1||part2;

put code;

if i gt 105 then do;

code=catt('py',stem,part3,';');

put code;

code=catt('pymin',stem,part4,';');

put code;

code=catt('pymax',stem,part5,';');

put code;

code=catt('y_',stem,'=(py',stem,'-pymin',stem,

')/(pymax',stem,'-pymin',stem,'+0.0001);');

put code;

end;

if i eq %eval(104+&maxvars.) then do;

put end;

code='run;';

put code;

code='%mend create_code;';

put code;

end;

run;

/*include and run the created code*/

%include sascode /source2;

%create_code

data pydefs;

set p&fname.;;

keep y_:;

run;

Classroom Training Available!