Grid Search Macro Program Problem

New Contributor
Posts: 2


The assignment I'm working on is to code, from scratch, a method that uses commingling analysis to maximize a likelihood equation for a biallelic SNP. I am given only the phenotypes and am attempting a grid search to find the parameters (mu1, mu2, mu3, q, and std. dev.) as well as the log likelihood, so that I can run an LR test. When I look at 20-50 observations of the output, it seems that my program is not stepping through any of the variables other than mu1.
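
For reference, a sketch of the log likelihood the macro below is meant to evaluate. It assumes Hardy-Weinberg genotype proportions (which the weights (1-q)**2, 2*q*(1-q) and q**2 in the code imply) and a common standard deviation; the symbols y_i (phenotypes), n (number of phenotypes) and phi (normal density) are notation introduced here for illustration:

```latex
\ln L(q,\mu_1,\mu_2,\mu_3,\sigma)
  = \sum_{i=1}^{n} \ln\!\Big[(1-q)^2\,\phi(y_i;\mu_1,\sigma)
      + 2q(1-q)\,\phi(y_i;\mu_2,\sigma)
      + q^2\,\phi(y_i;\mu_3,\sigma)\Big],
\qquad
\phi(y;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}
      \exp\!\Big(-\tfrac{1}{2}\Big(\frac{y-\mu}{\sigma}\Big)^{2}\Big)
```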

 

Does anyone have a suggestion on how to fix it? I've been playing around with different options and can't seem to find a solution.

 

Thanks!

 


%Macro likelihood(qstep,mu1step,mu2step,mu3step);

 

data likely;
set onet; *transposed data set for horizontal array;


array aQTp (200) COL1-COL200;
array MLE (200) MLE1-MLE200;

 

do q=0.0 to 0.5 by &qstep;
do mu1=&trtmin to &trtmax by &mu1step;
do mu2=&trtmin to &trtmax by &mu2step;
do mu3=&trtmin to &trtmax by &mu3step;
do sd=&sdmin to &sdmax by &sdstep;
do i=1 to 200;
*Normal density constant is 1/(sd*sqrt(2*pi)) because sd is a standard deviation;
MLE(i)=log(((1-q)**2)*(1/(sd*sqrt(2*CONSTANT('PI'))))*(EXP((-0.5)*((aQTp(i)-mu1)/sd)**2))+(2*q*(1-q))*(1/(sd*sqrt(2*CONSTANT('PI'))))*(EXP((-0.5)*((aQTp(i)-mu2)/sd)**2))+(q**2)*(1/(sd*sqrt(2*CONSTANT('PI'))))*(EXP((-0.5)*((aQTp(i)-mu3)/sd)**2)));
If MLE(i) =. then MLE(i)=0;


end; *End of QTp Loop;
logMLEtot=0;
LogMLEtot=sum(of MLE1-MLE200);


output;

 

end; *End of SD Loop;
end; *End of Mu3 Loop;
end; *End of Mu2 Loop;
end; *End of Mu1 Loop;
end; *End of q Loop;


Run;

 


[... Deleted Code to Check Work and Create Macro Variable Below]

 

%Let MLEtotmax=&MLEtotmax;

Proc sort data=likely;
by DESCENDING logMLEtot;
run;


Data likely; set likely;
keep mu1 mu2 mu3 sd q logMLEtot;
run;

Proc print data=likely (obs=10);
run;

%MEND likelihood;

 

%likelihood(0.01,0.01,0.01,0.01)
Run;

 

Super User
Posts: 5,372

Re: Grid Search Macro Program Problem

Notice that your innermost loop goes from 1 to 200.  So the first 200 observations (before sorting) must use the initial values of q, mu1, mu2, mu3, and sd.  So when you say you are searching 20 to 50 observations, that may not be enough.

New Contributor
Posts: 2

Re: Grid Search Macro Program Problem

Does the i loop correspond to the arrays from the beginning?

When I sort the data set to find the largest LogMLE, I get q=0 and mu2=mu3=trtmin.
Should the i loop refer to the number of iterations I'd like?

Super User
Posts: 5,372

Re: Grid Search Macro Program Problem

The innermost loops change first.  To begin, the order would be:

 

For the initial values of q, mu1, mu2, mu3, and sd, let i loop from 1 to 200.

 

Increment sd.  For the initial values of q, mu1, mu2, mu3, and the second value for sd, let i loop from 1 to 200.

 

I can't see the starting and stopping points for the loops, but that's how nested loops work ... innermost increments first.
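
A minimal sketch of that ordering, using a toy grid rather than the real one. The loop limits here are made up for illustration; only the nesting mirrors the posted program:

```sas
/* Toy grid (hypothetical limits): 2 values of q, 2 of sd, 3 of i */
data _null_;
   do q = 0 to 0.5 by 0.5;        /* outermost loop: changes last    */
      do sd = 1 to 2;             /* middle loop                     */
         do i = 1 to 3;           /* innermost loop: changes first   */
            put q= sd= i=;        /* write the current values to the log */
         end;
      end;
   end;
run;
```

In the log, i should run through 1, 2, 3 before sd moves from 1 to 2, and sd should run through both of its values before q changes.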

Super User
Posts: 11,144

Re: Grid Search Macro Program Problem

Also, without the values of your TRTMIN, TRTMAX, SDMIN, SDMAX and SDSTEP macro variables, it is hard to pin down specific problems with the loop counts.

Trusted Advisor
Posts: 1,116

Re: Grid Search Macro Program Problem

The innermost DO loop (i=1 to 200) does not contribute to the number of observations in dataset LIKELY. It populates the array MLE. The number of observations is a product of six integers (hence probably quite large): The number of observations in ONET and the numbers of iterations of the five "outer" DO loops. From your initial post, we know only one of these six factors (51) and that three of them are equal.

 

So, a useful check (in addition to looking at a couple of observations) would be to compare the number of observations in LIKELY to the product of the number of observations in ONET and the expected numbers of iterations of the five "outer" DO loops (derived from the start, end and step size values). Because of the PROC SORT step, the values of the index variables in the first 50 observations of dataset LIKELY are hard to predict (at least for me) and most likely not representative of the dataset.
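
One way to get the expected counts is to run the same DO loops empty and count the iterations. This is only a sketch: the %LET values are placeholders I made up (the real TRTMIN, TRTMAX, SDMIN, SDMAX and SDSTEP were not posted), and MUSTEP and QSTEP are hypothetical names standing in for the step sizes passed to the macro:

```sas
/* Placeholder values for illustration only; substitute the real ones */
%let trtmin = 0;
%let trtmax = 1;
%let sdmin  = 1;
%let sdmax  = 2;
%let sdstep = 0.5;
%let mustep = 0.01;   /* step used for mu1, mu2 and mu3 */
%let qstep  = 0.01;   /* step used for q */

data _null_;
   /* Run each loop empty so the count matches what the real DATA step does */
   n_q  = 0;  do v = 0.0 to 0.5 by &qstep;            n_q  = n_q  + 1;  end;
   n_mu = 0;  do v = &trtmin to &trtmax by &mustep;   n_mu = n_mu + 1;  end;
   n_sd = 0;  do v = &sdmin to &sdmax by &sdstep;     n_sd = n_sd + 1;  end;

   /* Observations written to LIKELY for each observation read from ONET */
   per_row = n_q * n_mu**3 * n_sd;
   put n_q= n_mu= n_sd= per_row=;
run;
```

Multiply per_row by the number of observations in ONET and compare the result with the observation count the log reports for WORK.LIKELY. With step sizes this small the product grows very quickly, so it is worth running this check before the full grid.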
