
10-05-2015 04:00 PM

Dear all,

I am extremely new to SAS, and I am working on a labor economics problem with it.

The question is to generate a fake sample of observations and code the likelihood function associated with the problem.

The MATLAB code of the likelihood function is the following:

function [loglikelihood] = ps1q1_llk(beta, D, X)

pr = normcdf(X*beta);

loglikelihood = -sum(D.*log(pr) + (1-D).*log(1-pr));

My SAS code is the following:

proc iml;

start loglike(param) global (x); /* param is beta in the MATLAB code; x holds both X and D */

x1 = x[,{1 2}]; /* x1 is X, and x3 (next line) is D */

x3 = x[,3];

pr = cdf("normal",x1*t(param));

pr2 = 1 - pr;

f = -(t(x3)*log(pr) + t(1-x3)*log(pr2));

return ( f );

finish;

I receive the error "Invocation of unresolved module LOGLIK", so I think I messed up the likelihood function. Could someone please help me with it? If you need more info, I would be glad to provide details.

Yours,

Frank


10-05-2015 04:31 PM

Welcome to SAS.

It doesn't look like you messed up the function, just the call. The function name is "loglike" but the error that you posted says "loglik" (without an 'e') was not found. If I'm wrong, please also include the code that calls your function.

Here are some other ideas.

1) You can put multiple variables in the GLOBAL statement, so maybe rewrite your function like this:

```
start loglike(beta) global (x, D);
pr = cdf("normal",x*t(beta));
pr2 = 1 - pr;
f = -(t(D)*log(pr) + t(1-D)*log(pr2));
return ( f );
finish;
```

2) If you want the function to look more like your MATLAB prototype, you can use '#' to do elementwise multiplication.

```
f = -sum(D#log(pr) + (1-D)#log(pr2));
```

Personally I like the way you wrote it, but the choice is yours.

3) You say that you are eventually trying to simulate data. You might search for topics that interest you on The DO Loop blog, which has a lot of simulation topics. Click the "Simulation" link in the word cloud in the right-hand sidebar.

4) If you are going to need to do a lot of simulation, you might consider the book *Simulating Data with SAS* (2013).

5) There is a very short MATLAB to SAS/IML Cheat Sheet. You might have already figured out a lot of what's there. See also these ten tips for learning SAS/IML.
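To see that the inner-product form from point 1 and the elementwise form from point 2 give the same number, here is a minimal sketch. The values of `D` and `pr` are made up purely for illustration and are not from the thread's data.

```sas
proc iml;
/* toy values, invented only for this illustration */
D  = {1, 0, 1, 1};             /* binary outcomes (column vector) */
pr = {0.9, 0.2, 0.6, 0.7};     /* hypothetical probabilities */

f1 = -( t(D)*log(pr) + t(1-D)*log(1-pr) );   /* inner-product form */
f2 = -sum( D#log(pr) + (1-D)#log(1-pr) );    /* elementwise form, as in the MATLAB code */
print f1 f2;                                  /* the two values should agree */
quit;
```

The inner product `t(D)*log(pr)` already sums over observations, which is why the first form needs no call to SUM.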


10-05-2015 04:46 PM

Hello Doc Wicklin,

I was just reading your article http://blogs.sas.com/content/iml/2011/10/12/maximum-likelihood-estimation-in-sasiml.html, and it has been extremely helpful. This is my data generating code.

data q1; call streaminit(1024); do i = 1 to 1000; z = rand("Normal", 0, 2); v = rand("Normal", 0, 1); if 2-z> v then d = 1; else d = 0; a = 1; output; end; run;

I fixed the typo and the same error remains.

proc iml; start loglik(param) global (x); x1 = x[,{1 2}]; x3 = x[,3]; pr = cdf("normal",x1*t(param)); pr2 = 1 - pr; f = -sum(t(x3)*log(pr)+t(1-x3)*log(pr2)); return ( f ); finish;

the call code is the following:

proc iml; use q1; read all var {a z d} into x; close q1; con = { . ., . .}; p = {-10 -10}; opt = {1,4}; call nlpnra(rc, result, "loglik", p, opt, con);

Again, thank you very much for your help. Since I am extremely new to SAS, I might be making very basic mistakes.

Yours,

Frank


10-05-2015 05:08 PM

1) Delete the second PROC IML statement. You are quitting one session (in which you defined the loglik function) and starting a second session (for which the function is not defined).

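2) If your linear predictor X*beta has an element that is very negative, the CDF function returns a tiny value (already about 3e-7 at -5) and eventually underflows to exactly 0. The LOG function is not going to like that. The same happens with large positive values, where 1 - pr rounds to exactly 0 (in double precision, once the argument exceeds roughly 8).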

Would you like to share what you are trying to accomplish?
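A minimal sketch of the single-session layout described in point 1: the module is defined and then called inside the same PROC IML step. It assumes the data set `q1` from the DATA step already exists. The test point `{1 0.2}` is a hypothetical value chosen only to check that the module resolves, not an estimate.

```sas
proc iml;                              /* one session from start to finish */

/* define the module first */
start loglik(param) global(x);
   x1 = x[, {1 2}];                    /* columns a (intercept) and z */
   d  = x[, 3];                        /* binary outcome */
   pr = cdf("Normal", x1*t(param));
   f  = -sum(d#log(pr) + (1-d)#log(1-pr));
   return ( f );
finish;

/* still in the same session: read the data and call the module */
use q1;                                /* assumes q1 was created by the DATA step */
read all var {a z d} into x;
close q1;

f0 = loglik({1 0.2});                  /* hypothetical test point */
print f0;
quit;
```

If this prints a value, the module is visible to the session, and the NLPNRA call can then be placed where the test evaluation is.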


10-05-2015 05:51 PM

Question 1. Consider the following discrete choice model:

D = 1[a0 + a1 Z > V].

Assume the following parameterization of the model: Z ~ N(0, 2), V ~ N(0, 1), a0 = 2, and a1 = -1.

(a) Using this parameterization, generate a fake sample of 1,000 observations.

(b) Code the likelihood function associated with this problem.

So I use the given a0 and a1 to generate a fake sample of z, v, and d, then use that sample to estimate a0 and a1. I do notice that log(0) might be a problem, but I ran a test that plugged in values for param and x, and I could calculate f with the likelihood function.
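For reference, the likelihood for part (b) can be written out as follows. This is a sketch that assumes V is standard normal and independent of Z. The symbol Φ (the standard normal CDF) and the observation index i are notation introduced here, not taken from the problem statement.

```latex
\Pr(D_i = 1 \mid Z_i) = \Pr(V_i < a_0 + a_1 Z_i) = \Phi(a_0 + a_1 Z_i)

\ell(a_0, a_1) = \sum_{i=1}^{n} \Big[ D_i \log \Phi(a_0 + a_1 Z_i)
               + (1 - D_i) \log\big(1 - \Phi(a_0 + a_1 Z_i)\big) \Big]
```

The MATLAB function and the IML module both return the negative of this sum, so minimizing their output corresponds to maximizing the log-likelihood.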


10-05-2015 07:57 PM

Quick update.

call nlpnra(rc, result, "loglik", p, opt, con);

ERROR: (execution) Invalid argument to function.

It seems that sometimes 1-pr is too close to 0.


10-06-2015 12:43 PM

I guess you'll have to truncate the smallest and largest values of the probabilities so that they are firmly within the interior of (0, 1):

```
pr = pr <> 1e-8; /* bound probability away from 0 and 1 */
pr2 = pr2 <> 1e-8;
```

The symbol "<>" is the elementwise maximum operator.

I think another problem you are having is that you are maximizing the function in NLPNRA, but you've put a negative sign in the loglik function. Thus your optimization will go off to infinity. It also appears that the log-likelihood is pretty flat for those simulated data. You might try constraining the parameters to a large region near the origin, like this:

```
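start loglik(param) global (x);
   x1 = x[,{1 2}];               /* design matrix: intercept column a and z */
   d = x[,3];                    /* binary response */
   y = x1*t(param);              /* linear predictor */
   pr = cdf("normal", y);
   pr2 = 1 - pr;
   pr = pr <> 1e-8;              /* bound probability away from 0 and 1 */
   pr2 = pr2 <> 1e-8;
   f = -sum(d#log(pr)+(1-d)#log(pr2));   /* negative log-likelihood */
   return ( f );
finish;
/* row 1 = lower bounds, row 2 = upper bounds for the two parameters */
con = { 0 -6 ,
       10  1 };
p = {1 0.2};                     /* initial guess */
opt = {0,2};                     /* loglik returns -LL, so minimize (opt[1]=0) instead of maximize */
call nlpnra(rc, result, "loglik", p, opt, con);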
```
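A possible refinement, offered only as a sketch: the upper-tail probability 1 - pr can also be computed through the symmetry of the standard normal distribution, 1 - Φ(y) = Φ(-y), which avoids the subtraction that rounds to exactly 0. The `y` values below are hypothetical and chosen only to show the tails.

```sas
proc iml;
/* hypothetical linear-predictor values, chosen only to illustrate the tails */
y = {-9, -5, 0, 5, 9};
naive = 1 - cdf("Normal", y);    /* upper tail by subtraction; can round to exactly 0 */
sym   = cdf("Normal", -y);       /* same tail via symmetry: 1 - Phi(y) = Phi(-y) */
print y naive sym;
quit;
```

Even with the symmetric form, the probability can still underflow for extremely large arguments, so keeping the `<> 1e-8` bound as a guard seems reasonable.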


10-08-2015 09:08 AM - edited 10-08-2015 09:28 AM

I just realized what you are doing. I believe this is a probit regression model for the response variable D. The following call to PROC PROBIT gives the same answers if you switch the negative sign on the loglikelihood function:

```
proc probit data=q1;
model d=z;
run;
```