The assignment I'm working on is to code from scratch a method that uses commingling analysis to maximize a likelihood equation for a biallelic SNP. I am given only the phenotypes and am attempting a grid search to find the parameters (mu1, mu2, mu3, q, and sd) as well as the log likelihood, so that I can run an LR test. When I look at 20-50 observations, it seems that my program is not stepping through any of the variables other than mu1.
Does anyone have a suggestion on how to fix it? I've been playing around with different options and can't seem to find a solution.
%Macro likelihood(qstep,mu1step,mu2step,mu3step);
data likely;
  set onet; *transposed data set for horizontal array;
  array aQTp (200) COL1-COL200;
  array MLE (200) MLE1-MLE200;
  do q=0.0 to 0.5 by &qstep;
    do mu1=&trtmin to &trtmax by &mu1step;
      do mu2=&trtmin to &trtmax by &mu2step;
        do mu3=&trtmin to &trtmax by &mu3step;
          do sd=&sdmin to &sdmax by &sdstep;
            do i=1 to 200;
              if MLE(i)=. then MLE(i)=0;
            end; *End of QTp Loop;
            LogMLEtot=sum(of MLE1-MLE200);
          end; *End of SD Loop;
        end; *End of Mu3 Loop;
      end; *End of Mu2 Loop;
    end; *End of Mu1 Loop;
  end; *End of q Loop;
[... Deleted Code to Check Work and Create Macro Variable Below]
%Let MLEtotmax=&MLEtotmax;
Proc sort data=likely;
Data likely; set likely;
  keep mu1 mu2 mu3 sd q logMLEtot;
Proc print data=likely (obs=10);
run; *explicit step boundary so the PROC PRINT executes inside the macro;
%MEND likelihood;
Notice that your innermost loop goes from 1 to 200. So the first 200 observations (before sorting) must use the initial values of q, mu1, mu2, mu3, and sd. So when you say you are searching 20 to 50 observations, that may not be enough.
The innermost loops change first. To begin, the order would be:
For the initial values of q, mu1, mu2, mu3, and sd, let i loop from 1 to 200.
Increment sd. For the initial values of q, mu1, mu2, mu3, and the second value for sd, let i loop from 1 to 200.
I can't see the starting and stopping points for the loops, but that's how nested loops work ... innermost increments first.
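If it helps to see that order directly, here is a minimal sketch that writes the loop indexes to the log. The ranges are made up (two values each for q and sd, three for i) because the real start, stop, and step values aren't shown, and the other loops are left out to keep the log short.

```sas
/* Sketch only: ranges are invented for illustration, not taken from the post */
data _null_;
  do q = 0 to 0.1 by 0.1;   /* outermost loop: changes last                */
    do sd = 1 to 2;         /* changes only after the i loop has finished  */
      do i = 1 to 3;        /* innermost loop: changes first               */
        put q= sd= i=;      /* write the current index values to the log   */
      end;
    end;
  end;
run;
```

In the log, i should run through 1, 2, 3 before sd moves to its second value, and q should stay at its first value until sd has finished its whole range.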
Also, without the values of your TRTMIN, TRTMAX, SDMIN, and SDMAX macro variables, it is hard to pinpoint specific problems with the loop counts.
The innermost DO loop (i=1 to 200) does not contribute to the number of observations in dataset LIKELY. It populates the array MLE. The number of observations is a product of six integers (hence probably quite large): The number of observations in ONET and the numbers of iterations of the five "outer" DO loops. From your initial post, we know only one of these six factors (51) and that three of them are equal.
So, a useful check (in addition to looking at a couple of observations) would be to compare the number of observations in LIKELY to the product of the number of observations in ONET and the expected numbers of iterations of the five "outer" DO loops (derived from the start, end, and step size values). Because of the PROC SORT step, the values of the index variables in the first 50 observations of dataset LIKELY are hard to predict (at least for me) and most likely not representative of the dataset.
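A rough sketch of that check is below. The macro name check_counts and the variables it creates (n_onet, n_likely, nq, nmu1, and so on) are hypothetical. It assumes that TRTMIN, TRTMAX, SDMIN, SDMAX, and SDSTEP exist as global macro variables, as they appear to in the posted macro, and that the deleted code writes one observation per pass of the SD loop.

```sas
%macro check_counts(qstep, mu1step, mu2step, mu3step);  /* hypothetical helper */
  /* Actual row counts of the input and output data sets */
  proc sql noprint;
    select count(*) into :n_onet   trimmed from onet;
    select count(*) into :n_likely trimmed from likely;
  quit;

  data _null_;
    /* Count the iterations of each outer loop on its own (cheap to run) */
    do q   = 0.0 to 0.5 by &qstep;            nq + 1;   end;
    do mu1 = &trtmin to &trtmax by &mu1step;  nmu1 + 1; end;
    do mu2 = &trtmin to &trtmax by &mu2step;  nmu2 + 1; end;
    do mu3 = &trtmin to &trtmax by &mu3step;  nmu3 + 1; end;
    do sd  = &sdmin to &sdmax by &sdstep;     nsd + 1;  end;

    /* Assumes one output observation per pass of the SD loop */
    expected = &n_onet * nq * nmu1 * nmu2 * nmu3 * nsd;
    actual   = &n_likely;
    put expected= actual=;
  run;
%mend check_counts;
```

Call it after the likelihood macro with the same four step sizes. If expected and actual differ, the per-loop counts (nq, nmu1, and so on) can be added to the PUT statement to see which loop is not iterating as intended.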
