proc nlmixed with categorical variable
06-21-2015 04:24 AM

Hi everybody,

I'm a little confused about how to define categorical variables in PROC NLMIXED. As far as I know, the CLASS statement can't be used in this procedure, so how do I handle them?

Here is my syntax. MIG, WORK, EDUC, LIVE, MARR, KB, and WI are categorical variables.

Please help me fix this syntax so that the 7 variables are turned into dummy variables.

proc nlmixed data=CLEAN tech=newrap cov;
    parameters b0=0 b1=0 b2=0 b3=0 b4=0 b5=0 b6=0 b7=0 b8=0 w=0;
    bpart=b0+b1*MIG+b2*WORK+b3*EDUC+b4*LIVE+b5*MARR+b6*KB+b7*WI+b8*UKP;
    lambda=exp(bpart);
    phi=1+(w*lambda);
    omega=1+(w*y);
    teta=(lambda*omega)/phi;
    ll=y*(log(lambda)-log(phi))+(y-1)*log(omega)-lgamma(y+1)-teta;
    model y~general(ll);
    predict _ll out=LL_2;
run;

06-22-2015 11:39 AM

You have to create dummy variables, either in a data step or within the PROC. This can be tedious. For instance, if WORK has five categories, you need four dummy variables.

xw1=0; xw2=0; xw3=0; xw4=0;
if (work eq 1) then xw1=1;
if (work eq 2) then xw2=1;
if (work eq 3) then xw3=1;
if (work eq 4) then xw4=1;

One then needs a parameter for each of these dummy covariates.

... bw1*xw1 + bw2*xw2 + bw3*xw3 + bw4*xw4 + ........ ;

So, with so many categorical variables, you might have a LOT of parameters. This can get tricky to interpret.
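The same dummy coding can be written more compactly in a data step with an array. This is only a sketch: it assumes WORK is numeric and coded 1 through 5 with level 5 as the reference, and the output data set name `clean_dummies` is made up for illustration.

```sas
/* Sketch: assumes WORK is numeric, coded 1-5, with level 5 as reference. */
/* clean_dummies is a hypothetical output data set name.                  */
data clean_dummies;
    set CLEAN;
    array xw{4} xw1-xw4;
    do i = 1 to 4;
        /* A SAS comparison evaluates to 1 (true) or 0 (false). */
        xw{i} = (work = i);
    end;
    drop i;
run;
```

Note that a missing value of WORK would get 0 in every dummy here, which makes it look like the reference level, so such rows may need to be excluded or handled separately.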

06-22-2015 01:36 PM

And a new reply to a different post reminded me that you can use PROC GLMMOD. With this PROC, you store the generated data set with the dummy variables and then use this in NLMIXED. You still need to write out the model with all the parameters.

06-22-2015 01:38 PM

Also, consider using PROC GLMMOD to generate a design matrix of 1s and 0s. It uses the standard overparameterized coding, so columns that are all zeroes need to be eliminated, and the column names aren't tied to the original variable names, but those are things that can be worked around.

Steve Denham
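A GLMMOD call along these lines might look like the sketch below. The data set names `design` and `parms` are placeholders, and the model is assumed to use the same response and predictors as the original NLMIXED code.

```sas
/* Sketch: design and parms are hypothetical output data set names. */
proc glmmod data=CLEAN outdesign=design outparm=parms noprint;
    class MIG WORK EDUC LIVE MARR KB WI;
    model y = MIG WORK EDUC LIVE MARR KB WI UKP;
run;
```

The OUTDESIGN= data set should hold the response plus generically named columns (Col1, Col2, ...), and the OUTPARM= data set can be used to look up which effect and level each column corresponds to. Because every level of each CLASS variable gets its own column, one column per factor still has to be left out of the NLMIXED model.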

06-22-2015 02:30 PM

It can be done a little bit shorter. For example:

bpart = b0+b1*(MIG=1)+b2*(MIG=2)+ ...

Using copy and paste to add similar terms, this isn't too bad as long as there aren't a large number of levels to deal with. If so, the best way to produce a coded set of design variables is (odd as it may seem) in PROC LOGISTIC via the OUTDESIGN= and OUTDESIGNONLY options. You can then use reference (or other full-rank) coding to just produce k-1 design variables for a predictor with k levels. And the created variables are named using the variable name and level. See this note that describes it.
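As a rough sketch of the PROC LOGISTIC route, assuming reference coding and a made-up output data set name `design`:

```sas
/* Sketch: OUTDESIGNONLY writes the design data set without fitting a model. */
/* design is a hypothetical output data set name.                            */
proc logistic data=CLEAN outdesign=design outdesignonly;
    class MIG WORK EDUC LIVE MARR KB WI / param=ref;
    model y = MIG WORK EDUC LIVE MARR KB WI UKP;
run;
```

With PARAM=REF, each k-level predictor should contribute k-1 columns, with the last level as the default reference. The resulting data set can then be passed to NLMIXED with one parameter per design column.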

06-22-2015 09:42 PM

After that, a warning like this shows up. How do I solve this?

"WARNING: The final Hessian matrix is not positive definite, and therefore the estimated covariance matrix is not full rank and may be unreliable. The variance of some parameter estimates is zero or some parameters are linearly related to other parameters."

06-23-2015 08:55 AM

If you used GLMMOD, then you have to eliminate any columns that are all zeroes. I suspect that the same applies if you use the (MIG=1), (MIG=2), etc. approach. If you have 5 levels of MIG, I think you can only include 4 if you have an intercept in the model. Otherwise, the model would be overparameterized, and you end up with the warning about the Hessian. Also, you may have more random effects than can be fit with the existing data. If you could share all of your NLMIXED code, we might be able to figure out which is happening.

Steve Denham
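One quick way to see how many levels each factor really has in the data, and whether any level is empty or very sparse, is a set of one-way frequency tables. This is only a diagnostic sketch, not a fix:

```sas
/* List the observed levels and counts for each categorical predictor. */
/* The MISSING option shows missing values as a level of their own.    */
proc freq data=CLEAN;
    tables MIG WORK EDUC LIVE MARR KB WI / missing;
run;
```

A level that never occurs in the data would give a dummy column of all zeroes, and a factor with k observed levels supports at most k-1 dummy parameters alongside the intercept.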

06-23-2015 10:40 AM

You can have this problem because of several issues, even with a small number of parameters. But the most likely culprit is the large number of parameters in your model (as suggested by Steve). Remember, if there are five levels of a factor, you can only have four parameters (when you have an intercept). Moreover, with many factors and an observational study (?), your remaining dummy variables may be linearly related (even when you drop the last level).
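To make the k-1 rule concrete, here is a sketch of the original model rewritten with full-rank coding for one factor. The level counts are pure assumptions for illustration: MIG is taken to have three levels coded 1, 2, 3 (with 3 as the reference), and the other six factors are taken to be coded 0/1. The parameter names `bm1` and `bm2` are made up.

```sas
/* Sketch only: assumes MIG has levels 1, 2, 3 (3 = reference) and that */
/* WORK, EDUC, LIVE, MARR, KB, WI are each coded 0/1.                   */
proc nlmixed data=CLEAN tech=newrap cov;
    parms b0=0 bm1=0 bm2=0 b2=0 b3=0 b4=0 b5=0 b6=0 b7=0 b8=0 w=0;
    /* Two indicator terms for the three-level factor, none for level 3. */
    bpart = b0 + bm1*(MIG=1) + bm2*(MIG=2)
          + b2*WORK + b3*EDUC + b4*LIVE + b5*MARR + b6*KB + b7*WI + b8*UKP;
    lambda = exp(bpart);
    phi    = 1 + w*lambda;
    omega  = 1 + w*y;
    teta   = (lambda*omega)/phi;
    ll = y*(log(lambda) - log(phi)) + (y-1)*log(omega) - lgamma(y+1) - teta;
    model y ~ general(ll);
run;
```

Adding a third term such as `bm3*(MIG=3)` would make the MIG indicators sum to the intercept column, which is the kind of linear dependence that can produce the Hessian warning.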