proc nlmixed with categorical variable

New Contributor
Posts: 2

Hi everybody,

I'm a little confused about how to define categorical variables in PROC NLMIXED. As we know, the CLASS statement can't be used in this procedure. So how do I handle this? Please help me.

This is the syntax.

MIG, WORK, EDUC, LIVE, MARR, KB, and WI are categorical variables.

Please help me fix this syntax so that these 7 variables are turned into dummy variables.

proc nlmixed data=CLEAN tech=newrap cov;
   parameters b0=0 b1=0 b2=0 b3=0 b4=0 b5=0 b6=0 b7=0 b8=0 w=0;
   bpart=b0+b1*MIG+b2*WORK+b3*EDUC+b4*LIVE+b5*MARR+b6*KB+b7*WI+b8*UKP;
   lambda=exp(bpart);
   phi=1+(w*lambda);
   omega=1+(w*y);
   teta=(lambda*omega)/phi;
   ll=y*(log(lambda)-log(phi))+(y-1)*log(omega)-lgamma(y+1)-teta;
   model y~general(ll);
   predict _ll out=LL_2;
run;

Valued Guide
Posts: 679

Re: proc nlmixed with categorical variable

You have to create dummy variables, either in a data step or within the PROC. This can be tedious. For instance, if WORK has five categories, you need four dummy variables.

xw1=0; xw2=0; xw3=0; xw4=0;
if (work eq 1) then xw1=1;
if (work eq 2) then xw2=1;
if (work eq 3) then xw3=1;
if (work eq 4) then xw4=1;

One then needs a parameter for each of these dummy covariates.

... bw1*xw1 + bw2*xw2 + bw3*xw3 + bw4*xw4 + ........ ;

So, with so many categorical variables, you might have a LOT of parameters. This can get tricky to interpret.
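To make the data-step approach concrete, here is a minimal sketch for a single factor. It assumes WORK is coded 1 to 5 with level 5 as the reference level (the actual coding is not given in the thread). The data set name CLEAN_DUMMY and the parameter names bw1 to bw4 are made up for illustration, and only WORK and UKP are kept in the linear predictor to keep it short.

```sas
/* Sketch only: WORK is ASSUMED to be coded 1-5, with level 5 as the reference. */
/* CLEAN_DUMMY and bw1-bw4 are hypothetical names, not from the original post.  */
data CLEAN_DUMMY;
   set CLEAN;
   xw1 = (work eq 1);   /* a comparison evaluates to 1 (true) or 0 (false) */
   xw2 = (work eq 2);
   xw3 = (work eq 3);
   xw4 = (work eq 4);
run;

proc nlmixed data=CLEAN_DUMMY tech=newrap cov;
   parameters b0=0 bw1=0 bw2=0 bw3=0 bw4=0 b8=0 w=0;
   /* Only WORK and UKP are shown; each remaining factor needs its own dummies. */
   bpart  = b0 + bw1*xw1 + bw2*xw2 + bw3*xw3 + bw4*xw4 + b8*UKP;
   lambda = exp(bpart);
   phi    = 1 + (w*lambda);
   omega  = 1 + (w*y);
   teta   = (lambda*omega)/phi;
   ll     = y*(log(lambda)-log(phi)) + (y-1)*log(omega) - lgamma(y+1) - teta;
   model y ~ general(ll);
run;
```

Note that an observation with a missing WORK value gets zeros in all four dummies here and so would be treated as the reference level; such rows may need to be dropped or handled separately.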

Valued Guide
Posts: 679

Re: proc nlmixed with categorical variable

And a new reply to a different post reminded me that you can use PROC GLMMOD. With this PROC, you store the generated data set with the dummy variables and then use this in NLMIXED. You still need to write out the model with all the parameters.
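A possible PROC GLMMOD call for this data might look like the sketch below. The output data set names DESIGN and PARMS are placeholders chosen for illustration. The OUTDESIGN= data set holds the response plus design columns named Col1, Col2, and so on, and the OUTPARM= data set describes which effect and level each column corresponds to.

```sas
/* Sketch only: DESIGN and PARMS are hypothetical data set names. */
proc glmmod data=CLEAN outdesign=DESIGN outparm=PARMS noprint;
   class MIG WORK EDUC LIVE MARR KB WI;
   model y = MIG WORK EDUC LIVE MARR KB WI UKP;
run;

/* PARMS maps each design column number to its effect and level, */
/* which helps when writing out the parameters in NLMIXED.       */
proc print data=PARMS;
run;
```

Because GLMMOD produces one column per level of each factor, one column per factor (the chosen reference level) would still have to be left out of the NLMIXED model when an intercept is present.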

Respected Advisor
Posts: 2,655

Re: proc nlmixed with categorical variable

Also, consider using PROC GLMMOD to generate a design matrix of 1's and 0's. It uses the standard overparameterized (GLM) coding, so columns that are all zeroes need to be eliminated, and the generated column names aren't tied to the original variable names, but those are things that can be worked around.

Steve Denham

SAS Employee
Posts: 187

Re: proc nlmixed with categorical variable

It can be done a little bit shorter.  For example:

bpart = b0+b1*(MIG=1)+b2*(MIG=2)+ ...

Using copy and paste to add similar terms, this isn't too bad as long as there aren't a large number of levels to deal with.  If so, the best way to produce a coded set of design variables is (odd as it may seem) in PROC LOGISTIC via the OUTDESIGN= and OUTDESIGNONLY options.  You can then use reference (or other full-rank) coding to just produce k-1 design variables for a predictor with k levels.  And the created variables are named using the variable name and level.  See this note that describes it.
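As a rough sketch of the PROC LOGISTIC route, assuming the variable names from the original post: the data set name DESIGN_REF is made up for illustration. With OUTDESIGNONLY no model is fitted, so the procedure is used here purely to write out the coded design variables.

```sas
/* Sketch only: DESIGN_REF is a hypothetical data set name. */
proc logistic data=CLEAN outdesign=DESIGN_REF outdesignonly;
   /* PARAM=REF requests reference coding: k-1 design variables per factor */
   class MIG WORK EDUC LIVE MARR KB WI / param=ref;
   model y = MIG WORK EDUC LIVE MARR KB WI UKP;
run;

/* List the generated design variables so they can be used in NLMIXED */
proc contents data=DESIGN_REF;
run;
```

The names listed by PROC CONTENTS can then be plugged into the bpart expression, one parameter per design variable.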

New Contributor
Posts: 2

Re: proc nlmixed with categorical variable

After that, a warning like this shows up. How do I solve this?

"WARNING: The final Hessian matrix is not positive definite, and therefore the estimated covariance matrix is not full rank and may be unreliable.  The variance of some parameter estimates is zero or some parameters are linearly related to other parameters."

Respected Advisor
Posts: 2,655

Re: proc nlmixed with categorical variable

If you used GLMMOD, then you have to eliminate any columns that are all zeroes.  I suspect that the same applies if you use the (MIG=1), (MIG=2), etc. approach.  If you have 5 levels of MIG, I think you can only include 4 if you have an intercept in the model.  Otherwise, the model would be overparameterized, and you end up with the warning about the Hessian.  Also, you may have more random effects than can be fit with the existing data.  If you could share all of your NLMIXED code, we might be able to figure out which is happening.

Steve Denham

Valued Guide
Posts: 679

Re: proc nlmixed with categorical variable

You can have this problem because of several issues, even with a small number of parameters. But the most likely culprit is the large number of parameters in your model (as suggested by Steve). Remember, if there are five levels of a factor, you can only have four parameters (when you have an intercept). Moreover, with many factors and an observational study (?), your remaining dummy variables may be linearly related (even when you drop the last level).
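One way to start checking for these problems is to look at how many levels each factor has and whether some combinations of levels are empty or nearly empty. The sketch below uses only the variable names from the original post; the MIG*WORK cross-tabulation is just one example pair, and any pair of factors could be checked the same way.

```sas
/* Sketch only: count the levels of each factor, including missing values */
proc freq data=CLEAN;
   tables MIG WORK EDUC LIVE MARR KB WI / missing;
   /* Example cross-tabulation: empty or sparse cells hint at dummy  */
   /* variables that may be (nearly) linearly related to each other. */
   tables MIG*WORK / norow nocol nopercent;
run;
```

A factor with k levels should contribute at most k-1 parameters when the model has an intercept, and levels with very few observations are candidates for merging.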
