I am trying to use call NLPNRA to generate a maximum likelihood estimate. IML gives an ERROR message saying I am taking log of zero. But I checked the arguments in all the log commands, none of them is zero. Below is the output of the program and error message. Thanks for help.
11108 %macro infere(rep1,rep2,mcno);
11109
11110 proc datasets library=JM;
11111 delete parajm&rep1;
11112 run;
11113
11114 proc iml;
11115 u=J(1,2,.);
11116 uu=J(1,2,.);
11117 zero=J(1,2,0);
11118 var=I(2);
11119 call randseed(1);
11120 do h1=1 to &mcno;
11121 norma=randnormal(1,zero,var);
11122 do g1=1 to 2;
11123 u[g1]=cdf("NORMAL",norma[g1]);
11124 end;
11125 uu=uu//u;
11126 end;
11127
11128 uu1=uu[2:nrow(uu),];
11129 create unifor from uu1;
11130 append from uu1;
11131 quit;
11132
11133 %do i=&rep1 %to &rep2;
11134
11135 data enable2fact;
11136 set JM.finaldata1;
11137 if rep=&i;
11138 run;
11139
11140 proc sort data=enable2fact;
11141 by ID;
11142 run;
11143
11144 proc means data=enable2fact;
11145 by id;
11146 var tij;
11147 output out=no1;
11148 run;
11149
11150 data no2;
11151 set no1;
11152 rename tij=N;
11153 if _STAT_='N';
11154 keep tij;
11155 run;
11156
11157 proc iml;
11158 *******************************************define log likelihood;
11159 start F_GLOBAL(x);
11160 p=x[1];
11161 lambda1t=x[2];lambda1=x[3];
11162 phi1=x[4];phi2=x[5];phi3=x[6];phi=x[7];
11163 beta=x[,8:11];
11164
11165 use no2;
11166 read all into n;
11167 use enable2fact;
11168 read all var {tij_b} into tijb;
11169 read all var {tij} into tij;
11170 read all var {trt} into trt;
11171 read all var {factpal} into y;
11172 read all var {intercept} into intcep;
11173 read all var {duration_month} into s;
11174 read all var {death} into D;
11175 use unifor;
11176 read all var {col1 col2} into uu;
11177
11178 m=0;lyd=0;
11179
11180 do i=1 to nrow(n);
11181 m=m+n;
11182 si=s
11183 **********************************log likelihood of mixed model part;
11184 rando=j(n,n,1);
11185 epsi=I(n);
11186 Ri=j(n,n,1);
11187 tib=tijb[m-n+1:m];
11188 ti=tij[m-n+1:m];
11189 *******************************generate Ri;
11190 do j=1 to n;
11191 do k=j to n;
11192 Ri[k,j]=exp(-phi*((abs(tib
11193 Ri[j,k]=exp(-phi*((abs(tib
11194 end;
11195 end;
11196
11197 Vi=phi1*rando+phi2*epsi+phi3*Ri;
11198 yi=y[m-n+1:m];
11199 incepti=intcep[m-n+1:m];
11200 trti=trt[m-n+1:m];
11201 one=J(n,1,1);
11202
11203 xib=incepti||trti||tib;
11204 beta1=J(3,1,1);
11205 beta1[1]=beta[1];beta1[2]=trt
11206 if D
11207
11208 ly=-0.5*(log(det(Vi))+(t(yi-xib*beta1))*inv(Vi)*(yi-xib*beta1));
11209 *************************************************************************log likelihood
11209! for observed events;
11210 if trt
11211 if trt
11212 lyd=lyd+ly+ld;
11213 end;
11214 *************************************************************************log likelihood for
11214! censored events;
11215 if D
11216 ff=0;
11217 do l=1 to nrow(uu);
11218 if trt
11219 if trt
11220 xbeta=(beta[1]+beta[2]*trt
11220! rt
11221 fyd=exp(-0.5*(t(yi-xbeta))*inv(Vi)*(yi-xbeta));
11222 if surv<si then fyd=0;
11223 ff=ff+fyd;
11224 end;
11225
11226 lyd=lyd+log(ff/nrow(uu))-0.5*log(det(Vi));
11227 end;
11228 end;
11229
11230 f=lyd;
11231
11232 return(f);
11233 finish F_GLOBAL;
11234
11235 ******************Making upper and lower bounds for the parameters;
11236 con=J(3,13,.);
11237 boundint=500;
11238 boundbeta=500;
11239 positive=10**(-10);
11240 varianlower=3;
11241 varianupper=500;
11242 boundphi=10;
11243 boundlam=1;
11244 do i=1 to 2;
11245 do j=2 to 11;
11246 if i=1 then do;
11247 if j<=3 then con[i,j]=positive;
11248 if j>=4 & j<=6 then con[i,j]=varianlower;
11249 if j=7 then con[i,j]=positive;
11250 if j=8 then con[i,j]=positive;
11251 if j>=9 then con[i,j]=-boundbeta;
11252 end;
11253 if i=2 then do;
11254 if j<=3 then con[i,j]=boundlam;
11255 if j>=4 & j<=6 then con[i,j]=varianupper;
11256 if j=7 then con[i,j]=boundphi;
11257 if j=8 then con[i,j]=boundint;
11258 if j>=9 then con[i,j]=boundbeta;
11259 end;
11260 end;
11261 end;
11262
11263 con[3,1]=1;con[3,12]=0;con[3,13]=2;
11264
11265 print con;
11266
11267 ********initial value;
11268 x={2 0.04954 0.07649 270.560 118.793 151.389 0.0284 108.702 12.560 1 -0.5};
11269
11270 ********threshold parameters;
11271 opt=J(1,9,.);
11272 opt[1]=1;
11273 opt[2]=5;
11274 tc=J(1,11,.);
11275 ********maximization;
11276 call nlpnra(rc,xres,"F_GLOBAL",x,opt,con,tc);
11277 create solution from xres;
11278 append from xres;
11279 quit;
11280
11281 proc datasets;
11282 append base=JM.parajm&rep1 data=solution;
11283 run;
11284
11285 %end;
11286
11287 %mend;
11288
11289 %infere(rep1=2,rep2=2,mcno=10000);
NOTE: PROCEDURE DATASETS used (Total process time):
real time 1:50.81
cpu time 7.31 seconds
NOTE: Writing HTML Body file: sashtml132.htm
Directory
Libref JM
Engine V9
Physical Name D:\lzg\jointmodel
Filename D:\lzg\jointmodel
Member
# Name Type File Size Last Modified
1 CICTR DATA 5120 15Jul11:10:02:23
2 CIDIFF DATA 5120 15Jul11:10:02:23
3 CITRT DATA 5120 15Jul11:10:02:23
4 DEATHIMPUT DATA 263168 04May11:11:19:36
5 DURATION DATA 5120 11Mar11:14:29:15
6 ENABLE050311 DATA 263168 03May11:09:21:53
7 ENABLE2 DATA 967680 04May11:11:24:16
8 ENABLE2FACT DATA 689152 01Mar11:13:53:37
9 ENABLEIIFORLI12012010 DATA 246784 08Dec10:09:59:54
10 FINALDATA1 DATA 74752 23Sep11:21:14:31
11 NO2 DATA 5120 01Mar11:13:53:37
12 PARAJM1 DATA 9216 26Sep11:09:29:30
13 PARAJM2 DATA 9216 26Sep11:10:47:40
14 PARAJM3 DATA 9216 26Sep11:09:44:16
15 PARAJMD1 DATA 9216 23Sep11:20:56:28
16 QALYDIFF1 DATA 16384 14Jul11:19:33:19
17 QALYDIFF188 DATA 16384 14Jul11:18:45:05
18 QALYDIFF188N DATA 16384 09Jul11:10:30:34
19 QALYDIFF1N DATA 24576 09Jul11:11:23:09
20 QALYDIFF251 DATA 24576 05Jul11:23:12:57
21 QALYDIFF376 DATA 16384 14Jul11:19:07:25
22 QALYDIFF376N DATA 16384 09Jul11:10:35:50
23 QALYDIFF501 DATA 24576 06Jul11:02:31:57
24 QALYDIFF563 DATA 16384 14Jul11:19:25:50
25 QALYDIFF563N DATA 16384 09Jul11:10:58:49
26 QALYDIFF751 DATA 16384 14Jul11:06:56:01
27 QALYDIFF751N DATA 24576 08Jul11:21:12:26
28 QALYDIFF876 DATA 16384 14Jul11:06:46:01
29 QALYDIFF876N DATA 16384 08Jul11:21:50:09
30 QALYDIFFERENCE DATA 5120 11Jul11:21:33:27
31 RESAMP DATA 145179648 01Jul11:09:57:14
32 SAMPLESIZELLAR1 DATA 16384 29Jul11:16:44:20
33 SAMPLESIZELLCS DATA 16384 29Jul11:16:42:38
34 SOLUTION DATA 17408 24Jun11:12:55:40
35 UNIFORM DATA 21504 01Mar11:13:53:37
36 VISITALL DATA 4293632 04May11:11:24:16
37 WITHDRAW DATA 9216 03May11:09:05:58
NOTE: Deleting JM.PARAJM2 (memtype=DATA).
NOTE: PROCEDURE DATASETS used (Total process time):
real time 0.06 seconds
cpu time 0.06 seconds
NOTE: IML Ready
NOTE: Module RANDNORMAL loaded from the storage SASHELP.IMLMLIB.
NOTE: Module ROWVEC loaded from the storage SASHELP.IMLMLIB.
NOTE: Exiting IML.
NOTE: The data set WORK.UNIFOR has 10000 observations and 2 variables.
NOTE: 7746 workspace compresses.
NOTE: PROCEDURE IML used (Total process time):
real time 0.22 seconds
cpu time 0.21 seconds
NOTE: There were 467 observations read from the data set JM.FINALDATA1.
NOTE: The data set WORK.ENABLE2FACT has 111 observations and 18 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
NOTE: There were 111 observations read from the data set WORK.ENABLE2FACT.
NOTE: The data set WORK.ENABLE2FACT has 111 observations and 18 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
NOTE: Writing HTML Body file: sashtml133.htm
NOTE: The above message was for the following BY group:
id=1
NOTE: There were 111 observations read from the data set WORK.ENABLE2FACT.
NOTE: The data set WORK.NO1 has 100 observations and 5 variables.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.07 seconds
cpu time 0.07 seconds
NOTE: There were 100 observations read from the data set WORK.NO1.
NOTE: The data set WORK.NO2 has 20 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
NOTE: IML Ready
NOTE: Module F_GLOBAL defined.
NOTE: Writing HTML Body file: sashtml134.htm
ERROR: (execution) Invalid argument to function.
operation : LOG at line 11289 column 1
operands : _TEM1002
_TEM1002 1 row 1 col (numeric)
0
statement : ASSIGN at line 11289 column 1
traceback : module F_GLOBAL at line 11289 column 1
ERROR: Execution error as noted previously. (rc=100)
operation : NLPNRA at line 11289 column 1
operands : *LIT1116, x, opt, con, tc
*LIT1116 1 row 1 col (character, size 😎
F_GLOBAL
x 1 row 11 cols (numeric)
opt 1 row 9 cols (numeric)
1 5 . . . . . . .
con 3 rows 13 cols (numeric)
tc 1 row 11 cols (numeric)
statement : CALL at line 11289 column 1
ERROR: Matrix xres has not been set to a value.
statement : CREATE at line 11289 column 1
ERROR: No data set is currently open for output.
statement : APPEND at line 11289 column 1
NOTE: Exiting IML.
NOTE: 194882 workspace compresses.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IML used (Total process time):
real time 1:36.72
cpu time 1:36.40
NOTE: Writing HTML Body file: sashtml135.htm
Directory
Libref WORK
Engine V9
Physical Name C:\Users\Zhigang\AppData\Local\Temp\SAS Temporary Files\_TD2780
Filename C:\Users\Zhigang\AppData\Local\Temp\SAS Temporary Files\_TD2780
Member File
# Name Type Size Last Modified
1 AAA DATA 13312 26Sep11:10:12:46
2 ENABLE2FACT DATA 25600 26Sep11:10:49:31
3 FF1 DATA 5120 26Sep11:10:12:46
4 FF11 DATA 5120 26Sep11:10:36:27
5 FYD DATA 46080 26Sep11:09:56:52
6 FYD11 DATA 46080 26Sep11:09:57:17
7 NO1 DATA 9216 26Sep11:10:49:31
8 NO2 DATA 5120 26Sep11:10:49:31
9 SASMACR CATALOG 5120 26Sep11:09:15:51
10 SOLUTION DATA 9216 26Sep11:09:44:16
11 UNIFOR DATA 164864 26Sep11:10:49:31
12 _FMTDESC DATA 9216 26Sep11:10:01:38
13 _QWFFMT DATA 9216 26Sep11:10:01:38
NOTE: Appending WORK.SOLUTION to JM.PARAJM2.
NOTE: BASE data set does not exist. DATA file is being copied to BASE file.
NOTE: There were 1 observations read from the data set WORK.SOLUTION.
NOTE: The data set JM.PARAJM2 has 1 observations and 11 variables.
Because of the macro call, the ERROR message doesn't give any useful information about which LOG call is failing.Two suggestions:
1) For debugging, define %LET i=2; %LET mcno = 10000; and get rid of the other macro statements.
2) You have many LOG statements, but we don't know which temporary argument is going negative. Go through the code and assign real variables to the arguments:
temp1 = ff/nrow(uu);
temp2 = det(Vi);
etc.
Then run the program. When the program fails you'll get a real line number that points to the error, and you'll get the actual name of a variable that is negative. For example, if temp1 is negative, you'll get the error message:
ERROR: (execution) Invalid argument to function.
operation : LOG at line XXXX column CCC
operands : TEMP1
TEMP1 1 row 1 col (numeric)
0
Rick
SAS/IML blog: http://blogs.sas.com/content/iml
PS. Unless I'm mistaken, your first call to PROC IML just fills an (&mcno x 2) matrix with uniform random values. You can just use
uu1=J(&mcno,2,.);
call randgen(uu1, "Uniform");
Hi Rick,
Thanks a lot for the prompt advice. I tried what you suggested and found that the error message refered to temp1=ff/nrow(uu). To find out which step it is equal to zero, I printed out all the values of temp1. The strange thing is that none of them is zero. And the objective function F_GLOBAL(x) can be calculated the at the initial value. It is equal to -431.3269953. Do you know why IML give this error message? Thanks a lot. Below has the new output of the error message.
12092 proc iml;
NOTE: IML Ready
12093 *******************************************define log likelihood;
12094 start F_GLOBAL(x);
12095 p=x[1];
12096 lambda1t=x[2];
12096! lambda1=x[3];
12097 phi1=x[4];
12097! phi2=x[5];
12097! phi3=x[6];
12097! phi=x[7];
12098 beta=x[,8:11];
12099
12100 use no2;
12101 read all into n;
12102 use enable2fact;
12103 read all var {tij_b} into tijb;
12104 read all var {tij} into tij;
12105 read all var {trt} into trt;
12106 read all var {factpal} into y;
12107 read all var {intercept} into intcep;
12108 read all var {duration_month} into s;
12109 read all var {death} into D;
12110 use unifor;
12111 read all var {col1 col2} into uu;
12112
12113 m=0;
12113! lyd=0;
12114
12115 do i=1 to nrow(n);
12116 m=m+n;
12117 si=s
12118 **********************************log likelihood of mixed model part;
12119 rando=j(n,n,1);
12120 epsi=I(n);
12121 Ri=j(n,n,1);
12122 tib=tijb[m-n+1:m];
12123 ti=tij[m-n+1:m];
12124 *******************************generate Ri;
12125 do j=1 to n;
12126 do k=j to n;
12127 Ri[k,j]=exp(-phi*((abs(tib
12128 Ri[j,k]=exp(-phi*((abs(tib
12129 end;
12130 end;
12131
12132 Vi=phi1*rando+phi2*epsi+phi3*Ri;
12133 yi=y[m-n+1:m];
12134 incepti=intcep[m-n+1:m];
12135 trti=trt[m-n+1:m];
12136 one=J(n,1,1);
12137
12138 xib=incepti||trti||tib;
12139 beta1=J(3,1,1);
12140 beta1[1]=beta[1];
12140! beta1[2]=trt
12140! beta1[3]=beta[3]+trt
12141 if D
12142
12143 ly=-0.5*(log(det(Vi))+(t(yi-xib*beta1))*inv(Vi)*(yi-xib*beta1));
12144 *************************************************************************log likelihood
12144! for observed events;
12145 if trt
12146 if trt
12147 lyd=lyd+ly+ld;
12148 end;
12149 *************************************************************************log likelihood for
12149! censored events;
12150 if D
12151 ff=0;
12152 do l=1 to nrow(uu);
12153 if trt
12154 if trt
12155 xbeta=(beta[1]+beta[2]*trt
12155! rt
12156 fyd=exp(-0.5*(t(yi-xbeta))*inv(Vi)*(yi-xbeta));
12157 if surv<si then fyd=0;
12158 ff=ff+fyd;
12159 end;
12160 temp1=ff/nrow(uu);
12160! temp2=det(Vi);
12161 lyd=lyd+log(temp1)-0.5*log(temp2);
12162 print temp1;
12163
12164 end;
12165 end;
12166
12167 f=lyd;
12168
12169 return(f);
12170 finish F_GLOBAL;
NOTE: Module F_GLOBAL defined.
12171
12172 ******************Making upper and lower bounds for the parameters;
12173 con=J(3,13,.);
12174 boundint=500;
12175 boundbeta=500;
12176 positive=10**(-10);
12177 varianlower=3;
12178 varianupper=500;
12179 boundphi=10;
12180 boundlam=1;
12181 do i=1 to 2;
12182 do j=2 to 11;
12183 if i=1 then do;
12184 if j<=3 then con[i,j]=positive;
12185 if j>=4 & j<=6 then con[i,j]=varianlower;
12186 if j=7 then con[i,j]=positive;
12187 if j=8 then con[i,j]=positive;
12188 if j>=9 then con[i,j]=-boundbeta;
12189 end;
12190 if i=2 then do;
12191 if j<=3 then con[i,j]=boundlam;
12192 if j>=4 & j<=6 then con[i,j]=varianupper;
12193 if j=7 then con[i,j]=boundphi;
12194 if j=8 then con[i,j]=boundint;
12195 if j>=9 then con[i,j]=boundbeta;
12196 end;
12197 end;
12198 end;
12199
12200 con[3,1]=1;
12200! con[3,12]=0;
12200! con[3,13]=2;
12201
12202 print con;
NOTE: Writing HTML Body file: sashtml147.htm
12203
12204 ********initial value;
12205 *x={2 0.04954 0.07649 270.560 118.793 151.389 0.0284 108.702 12.560 1 -0.5};
12206 x={2 0.0584913446 0.0810734988 270.5599998 118.79300677 151.38885874 0.0378699563
12206! 108.70197893 12.56070818 0.9993222093 -0.500002042};
12207 xxx=F_GLOBAL(x);
12208 print xxx;
12209 ********threshold parameters;
12210 opt=J(1,9,.);
12211 opt[1]=1;
12212 opt[2]=5;
12213 tc=J(1,11,.);
12214 ********maximization;
12215 call nlpnra(rc,xres,"F_GLOBAL",x,opt,con,tc);
ERROR: (execution) Invalid argument to function.
operation : LOG at line 12161 column 16
operands : temp1
temp1 1 row 1 col (numeric)
0
statement : ASSIGN at line 12161 column 5
traceback : module F_GLOBAL at line 12161 column 5
ERROR: Execution error as noted previously. (rc=100)
operation : NLPNRA at line 12215 column 1
operands : *LIT1116, x, opt, con, tc
*LIT1116 1 row 1 col (character, size 😎
F_GLOBAL
x 1 row 11 cols (numeric)
opt 1 row 9 cols (numeric)
1 5 . . . . . . .
con 3 rows 13 cols (numeric)
tc 1 row 11 cols (numeric)
statement : CALL at line 12215 column 1
12216 create solution from xres;
ERROR: Matrix xres has not been set to a value.
statement : CREATE at line 12216 column 1
12217 append from xres;
ERROR: No data set is currently open for output.
statement : APPEND at line 12217 column 1
12218 quit;
NOTE: Exiting IML.
NOTE: 190554 workspace compresses.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IML used (Total process time):
real time 1:37.33
cpu time 1:35.27
I think you have to print temp1 BEFORE you call the LOG function. You are only seeing the result of previous iterations.
I think I might see what is happening. Your loop is essentially as follows:
ff=0;
do l=1 to nrow(uu);
surv= ...;
if surv<si then fyd=0;
ff=ff+fyd;
end;
I think si is large compared to surv so that surv < si is always true.
Of course, the real question is what to do about it, and hopefully you can figure that out. If the value of Duration_Month (which determines si) is bad, you can fix it. Otherwise, it could be that lambda1t and lambda1 are getting small during the optimization. This would case surv to get large. If this is the problem, you might want to make sure they remain bounded away from zero.
By the way, if you haven't read this paper by Charlie Hallahan on MLE estimation, I recommend it: http://www2.sas.com/proceedings/sugi22/STATS/PAPER293.PDF
Hi Rick,
Thank you so much! This is really helpful. I got the problem resolved. I was just wondering if IML or "call NLPNRA" also offer the calculation of Fisher information matrix which is a very natural thing associated with maximum likelihood estiamtors (MLEs). Thanks again.
Glad you resolved the problem.
I am not very familiar with the Fisher information matrix, so I'll let someone else offer advice on that. As far as I know, there are no "built in" routines for computing it, since the NLP functions are designed for general optimization, not for the specific case of MLE.
Hi Rick,
Do you know if IML has a built in function or command to calculate the second order derivatives of a function? Thanks a lot.
The NLPFDD function uses computes numerical first and second derivatives:
http://blogs.sas.com/content/iml/2011/10/14/hints-for-derivatives/
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.