frank021227
Calcite | Level 5

Dear all,

I am extremely new to SAS, and I am using it to work on a labor economics problem.

The task is to generate a fake sample of observations and to code the likelihood function associated with the problem.

 

The MATLAB code for the likelihood function is the following:

 

function [loglikelihood] = ps1q1_llk(beta, D, X)
pr = normcdf(X*beta);
loglikelihood = -sum(D.*log(pr) + (1-D).*log(1-pr));

 

My SAS code is the following:

proc iml;
start loglike(param) global (x);   /* param is beta in the MATLAB code; x holds both D and X */
x1 = x[,{1 2}];                    /* x1 is X; x3 (next line) is D */
x3 = x[,3];
pr = cdf("normal",x1*t(param));
pr2 = 1 - pr;
f = -(t(x3)*log(pr) + t(1-x3)*log(pr2));
return ( f );
finish;

 

I receive the error "Invocation of unresolved module LOGLIK", so I think I messed up the likelihood function. Could someone please help me with it? If you need more information, I am happy to provide details.

 

Yours,

Frank

Rick_SAS
SAS Super FREQ

Welcome to SAS.

 

It doesn't look like you messed up the function, just the call. The function name is "loglike" but the error that you posted says "loglik" (without an 'e') was not found. If I'm wrong, please also include the code that calls your function.

 

Here are some other ideas.

1) You can put multiple variables in the GLOBAL statement, so maybe rewrite your function like this:

start loglike(beta) global (x, D);     
   pr = cdf("normal",x*t(beta));
   pr2 = 1 - pr;
   f = -(t(D)*log(pr) + t(1-D)*log(pr2));
   return ( f );
finish;

2) If you want the function to look more like your MATLAB prototype, you can use '#' to do elementwise multiplication.

 

f = -sum(D#log(pr) + (1-D)#log(pr2));

   Personally I like the way you wrote it, but the choice is yours.

 

3)  You say that you are eventually trying to simulate data.  You might search for topics that interest you on The DO Loop blog, which has a lot of simulation topics.  Click the "Simulation" link in the word cloud in the right-hand sidebar.

4) If you are going to need to do a lot of simulation, you might consider the book Simulating Data with SAS (2013).

5) There is a very short MATLAB to SAS/IML Cheat Sheet. You might have already figured out a lot of what's there. See also these ten tips for learning SAS/IML.
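
To illustrate point 2: the elementwise form and the inner-product form compute the same number, because summing the elementwise products D#log(pr) is exactly the inner product t(D)*log(pr). Here is a minimal sketch in Python (standard library only, used as a neutral stand-in for MATLAB and SAS/IML). The toy values of d and pr and all function names are made up for illustration.

```python
import math

# Hypothetical toy data: binary outcomes and fitted probabilities.
d = [1, 0, 1, 1]
pr = [0.9, 0.2, 0.6, 0.7]

def nll_elementwise(d, pr):
    # Mirrors -sum(D#log(pr) + (1-D)#log(1-pr)).
    return -sum(di * math.log(p) + (1 - di) * math.log(1 - p)
                for di, p in zip(d, pr))

def dot(u, v):
    # Inner product of two equal-length lists.
    return sum(a * b for a, b in zip(u, v))

def nll_inner_product(d, pr):
    # Mirrors -(t(D)*log(pr) + t(1-D)*log(1-pr)).
    log_p = [math.log(p) for p in pr]
    log_q = [math.log(1 - p) for p in pr]
    one_minus_d = [1 - di for di in d]
    return -(dot(d, log_p) + dot(one_minus_d, log_q))
```

Both functions return the same value for the same inputs, so the choice between the two forms is purely a matter of style.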

frank021227
Calcite | Level 5

Hello Doc Wicklin,

I was just reading your article http://blogs.sas.com/content/iml/2011/10/12/maximum-likelihood-estimation-in-sasiml.html, and it has been extremely helpful. This is my data-generating code.

 

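data q1;
call streaminit(1024);            /* set the seed for RAND */
do i = 1 to 1000;
   z = rand("Normal", 0, 2);      /* third argument is the standard deviation */
   v = rand("Normal", 0, 1);
   if 2-z > v then d = 1;         /* D = 1[a0 + a1*Z > V] with a0 = 2, a1 = -1 */
   else d = 0;
   a = 1;                         /* constant column for the intercept */
   output;
end;
run;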

I fixed the typo and the same error remains.

proc iml;
start loglik(param) global (x);
x1 = x[,{1 2}];
x3 = x[,3];
pr = cdf("normal",x1*t(param));
pr2 = 1 - pr;
f = -sum(t(x3)*log(pr)+t(1-x3)*log(pr2));
return ( f );
finish;

 

The calling code is the following:

proc iml;
use q1;
read all var {a z d} into x;
close q1;
con = { .   .,  .   .}; 
p = {-10 -10};
opt = {1,4};   
call nlpnra(rc, result, "loglik", p, opt, con);

Again, thank you very much for your help. Since I am extremely new to SAS, I might be making very stupid mistakes.

 

Yours,

Frank

 

Rick_SAS
SAS Super FREQ

1) Delete the second PROC IML statement.  You are quitting one session (in which you defined the loglik function) and starting a second session (for which the function is not defined).

2) If your linear predictor X*beta has an element that is very negative, the CDF function can underflow to exactly 0 in floating-point arithmetic. The LOG function is not going to like that. Similarly, for large positive values the CDF rounds to 1, so 1 - pr is 0 and its log fails too.
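
The failure described in point 2 can be reproduced outside SAS. Here is a minimal sketch in Python (standard library only). The helper names norm_cdf and safe_log are made up for this example, and the exact cutoffs inside SAS may differ. It shows that in double precision the normal CDF evaluates to exactly 0 or exactly 1 for extreme arguments, so one of the two logs fails.

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the complementary error function.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

low = norm_cdf(-40.0)        # underflows to exactly 0.0
high = 1.0 - norm_cdf(10.0)  # the CDF rounds to 1.0, so this is exactly 0.0

def safe_log(p):
    # math.log raises ValueError at 0; report that case as None instead.
    try:
        return math.log(p)
    except ValueError:
        return None
```

With a starting value such as {-10 -10} and Z spread over several units, linear predictors this extreme are easy to hit.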

 

Would you like to share what you are trying to accomplish?

frank021227
Calcite | Level 5

Question 1. Consider the following discrete choice model:
D = 1[ a0 + a1 Z > V ]:
Assume the following parameterization of the model: Z ~ N(0; 2), V ~ N(0; 1), a0 = 2, and a1 = -1.
(a) Using this parameterization, generate a fake sample of 1,000 observations.
(b) Code the likelihood function associated with this problem.

So I use the given a0 and a1 to generate a fake sample of z, v, and d, and then use that sample to estimate a0 and a1. I do notice that log(0) might be a problem, but I ran a test with fixed values for param and x, and the likelihood function returned a value for f.
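
For reference, the likelihood for part (b) can be written out explicitly. This is a sketch, assuming V is standard normal and independent of Z, with \Phi denoting the standard normal CDF:

```latex
\Pr(D_i = 1 \mid Z_i) = \Pr(V_i < a_0 + a_1 Z_i) = \Phi(a_0 + a_1 Z_i),
\qquad
\ell(a_0, a_1) = \sum_{i=1}^{n} \Bigl[ D_i \log \Phi(a_0 + a_1 Z_i)
  + (1 - D_i) \log\bigl(1 - \Phi(a_0 + a_1 Z_i)\bigr) \Bigr].
```

The MATLAB and SAS functions earlier in the thread return the negative of this sum, so they are written for a minimizer rather than a maximizer.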

frank021227
Calcite | Level 5

quick update. 

call nlpnra(rc, result, "loglik", p, opt, con);
ERROR: (execution) Invalid argument to function.
It seems that sometimes 1-pr is too close to 0.
Rick_SAS
SAS Super FREQ

I guess you'll have to truncate the smallest and largest values of the probabilities so that they are firmly within the interior of (0, 1):

 

   pr  = pr <> 1e-8;   /* bound probability away from 0 and 1 */
   pr2 = pr2 <> 1e-8; 

The symbol "<>" is the elementwise maximum operator.

 

 

I think another problem you are having is that you are maximizing the function in NLPNRA, but you've put a negative sign in the loglik function.  Thus your optimization will go off to infinity.  It also appears that the log-likelihood is pretty flat for those simulated data. You might try constraining the parameters to a large region  near the origin, like this:

start loglik(param) global (x);
   x1 = x[,{1 2}];
   d = x[,3];
   y = x1*t(param);
   pr = cdf("normal", y);
   pr2 = 1 - pr;
   pr  = pr <> 1e-8;   /* bound probability away from 0 and 1 */
   pr2 = pr2 <> 1e-8; 
   f = -sum(d#log(pr)+(1-d)#log(pr2));
   return ( f );
finish;

con = { 0   -6 ,  
        10   1 }; 
p = {1 0.2};  
opt = {0,2};   /* loglik returns -LL, so minimize (opt[1]=0) instead of maximize */
call nlpnra(rc, result, "loglik", p, opt, con);
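
As a cross-check outside SAS, the same clamped negative log-likelihood can be sketched in Python using only the standard library. The function and variable names are made up for illustration, the seed is arbitrary, and Z is drawn with standard deviation 2 to match the DATA step earlier in the thread. No optimizer is included. The sketch only evaluates the function at the true parameters (2, -1) and at the extreme starting point {-10 -10}.

```python
import math
import random

def norm_cdf(x):
    # Standard normal CDF via the complementary error function.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def simulate(n, a0=2.0, a1=-1.0, seed=1024):
    # D = 1[a0 + a1*Z > V] with Z ~ Normal(sd=2) and V ~ Normal(sd=1).
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.gauss(0.0, 2.0)
        v = rng.gauss(0.0, 1.0)
        data.append((z, 1 if a0 + a1 * z > v else 0))
    return data

def neg_loglik(beta, data, eps=1e-8):
    # Negative probit log-likelihood. Both probabilities are floored at eps,
    # mirroring pr <> 1e-8 and pr2 <> 1e-8 in the SAS/IML code.
    b0, b1 = beta
    total = 0.0
    for z, d in data:
        pr = norm_cdf(b0 + b1 * z)
        pr1 = max(pr, eps)
        pr2 = max(1.0 - pr, eps)
        total += d * math.log(pr1) + (1 - d) * math.log(pr2)
    return -total

sample = simulate(1000)
nll_true = neg_loglik((2.0, -1.0), sample)
nll_start = neg_loglik((-10.0, -10.0), sample)
```

With the floor in place the function stays finite even at {-10 -10}, which is the property the optimizer needs. The value at the true parameters should be far smaller than the value at that starting point.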
Rick_SAS
SAS Super FREQ

I just realized what you are doing. I believe this is a probit regression model for the response variable D. The following call to PROC PROBIT gives the same answers if you switch the negative sign on the loglikelihood function:

 

proc probit data=q1;
model d=z;
run;
