dwi_wahyudi
Calcite | Level 5

Hi everybody,

I'm a little confused about how to define categorical variables in "proc nlmixed". As we know, the CLASS statement can't be used in this procedure. So how do I solve this? Please help me.

This is the syntax.

MIG, WORK, EDUC, LIVE, MARR, KB, and WI are categorical variables.

Please help me fix this syntax so that the 7 variables are turned into dummy variables.

proc nlmixed data=CLEAN tech=newrap cov;

parameters b0=0 b1=0 b2=0 b3=0 b4=0 b5=0 b6=0 b7=0 b8=0 w=0;

bpart=b0+b1*MIG+b2*WORK+b3*EDUC+b4*LIVE+b5*MARR+b6*KB+b7*WI+b8*UKP;

lambda=exp(bpart);

phi=1+(w*lambda);

omega=1+(w*y);

teta=(lambda*omega)/phi;

ll=y*(log(lambda)-log(phi))+(y-1)*log(omega)-lgamma(y+1)-teta;

model y~general(ll);

predict _ll out=LL_2;

run;

7 REPLIES
lvm
Rhodochrosite | Level 12

You have to create dummy variables, either in a data step or within the PROC. This can be tedious. For instance, if WORK has five categories, you need four dummy variables.

xw1=0; xw2=0; xw3=0; xw4=0;

if (work eq 1) then xw1=1;

if (work eq 2) then xw2=1;

if (work eq 3) then xw3=1;

if (work eq 4) then xw4=1;

One then needs a parameter for each of these dummy covariates.

... bw1*xw1 + bw2*xw2 + bw3*xw3 + bw4*xw4 + ........ ;

So, with so many categorical variables, you might have a LOT of parameters. This can get tricky to interpret.
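The dummy-variable approach above can be sketched end to end as follows. This is only a sketch under assumptions: WORK is taken to be numeric and coded 1 to 5 with level 5 as the reference, the data set name CLEAN_DUMMY and the parameter names bw1-bw4 are made up for illustration, and only WORK and UKP are shown (the other factors would follow the same pattern).

```sas
/* Sketch: build 0/1 indicators for WORK in a DATA step.             */
/* Assumption: WORK is numeric, coded 1-5, level 5 is the reference. */
data CLEAN_DUMMY;                 /* hypothetical output data set name */
  set CLEAN;
  array xw{4} xw1-xw4;
  do k = 1 to 4;
    xw{k} = (work eq k);          /* 1 when WORK equals k, otherwise 0 */
  end;                            /* note: a missing WORK gives all zeros */
  drop k;
run;

proc nlmixed data=CLEAN_DUMMY tech=newrap cov;
  /* one parameter per dummy covariate; names are illustrative */
  parameters b0=0 bw1=0 bw2=0 bw3=0 bw4=0 b8=0 w=0;
  bpart  = b0 + bw1*xw1 + bw2*xw2 + bw3*xw3 + bw4*xw4 + b8*UKP;
  lambda = exp(bpart);
  phi    = 1 + (w*lambda);
  omega  = 1 + (w*y);
  teta   = (lambda*omega)/phi;
  ll = y*(log(lambda)-log(phi)) + (y-1)*log(omega) - lgamma(y+1) - teta;
  model y ~ general(ll);
run;
```

Each bw parameter is then read as the difference in the linear predictor between that level and the reference level.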

lvm
Rhodochrosite | Level 12

And a new reply to a different post reminded me that you can use PROC GLMMOD. With this PROC, you store the generated data set with the dummy variables and then use this in NLMIXED. You still need to write out the model with all the parameters.

SteveDenham
Jade | Level 19

Also, consider using PROC GLMMOD to generate a design matrix of 1's and 0's. It uses the standard overparameterized coding, so columns that are all zeroes need to be eliminated, and the generated column names don't carry the original variable names, but those are things that can be worked around.
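A minimal sketch of the GLMMOD route, using the variable names from the original post. The data set names DESIGN and PARMMAP are made up for illustration. The OUTPARM= data set is included because the design columns come out with generic names, so a lookup table helps when writing the NLMIXED model.

```sas
/* Sketch: let PROC GLMMOD build the design matrix.        */
/* DESIGN and PARMMAP are hypothetical data set names.     */
proc glmmod data=CLEAN outdesign=DESIGN outparm=PARMMAP noprint;
  class MIG WORK EDUC LIVE MARR KB WI;
  model y = MIG WORK EDUC LIVE MARR KB WI UKP;
run;

/* PARMMAP maps each generated column to its effect and level */
proc print data=PARMMAP;
run;

/* Inspect the generated columns before using them in NLMIXED */
proc contents data=DESIGN;
run;
```

DESIGN can then be passed to NLMIXED with one parameter per retained column. Because the coding is overparameterized, some columns have to be left out of the model, as discussed elsewhere in this thread.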

Steve Denham

StatDave
SAS Super FREQ

It can be done a little bit shorter.  For example:

bpart = b0+b1*(MIG=1)+b2*(MIG=2)+ ...

Using copy and paste to add similar terms, this isn't too bad as long as there aren't a large number of levels to deal with.  If so, the best way to produce a coded set of design variables is (odd as it may seem) in PROC LOGISTIC via the OUTDESIGN= and OUTDESIGNONLY options.  You can then use reference (or other full-rank) coding to just produce k-1 design variables for a predictor with k levels.  And the created variables are named using the variable name and level.  See this note that describes it.
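A hedged sketch of the PROC LOGISTIC route described above, using the variable names from the original post. The output data set name DESIGN is made up, and PARAM=REF is one choice of full-rank coding. With OUTDESIGNONLY the model is not fitted, so the assumption here is that y only serves as a placeholder response.

```sas
/* Sketch: use PROC LOGISTIC only to produce coded design variables. */
/* DESIGN is a hypothetical data set name.                           */
proc logistic data=CLEAN outdesign=DESIGN outdesignonly;
  class MIG WORK EDUC LIVE MARR KB WI / param=ref;  /* k-1 columns per factor */
  model y = MIG WORK EDUC LIVE MARR KB WI UKP;
run;

/* Check the names of the generated design variables */
proc contents data=DESIGN;
run;
```

The variable names listed by PROC CONTENTS are the ones to use in the bpart expression in NLMIXED, each with its own parameter.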

dwi_wahyudi
Calcite | Level 5

After that, a warning like this shows up. How do I solve this?

"WARNING: The final Hessian matrix is not positive definite, and therefore the estimated covariance matrix is not full rank and may be unreliable.  The variance of some parameter estimates is zero or some parameters are linearly related to other parameters."

SteveDenham
Jade | Level 19

If you used GLMMOD, then you have to eliminate any columns that are all zeroes.  I suspect that the same applies if you use the (MIG=1), (MIG=2), etc. approach.  If you have 5 levels of MIG, I think you can only include 4 if you have an intercept in the model.  Otherwise, the model would be overparameterized, and you end up with the warning about the Hessian.  Also, you may have more random effects than can be fit with the existing data.  If you could share all of your NLMIXED code, we might be able to figure out which is happening.
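To illustrate the k-1 rule with the (MIG=1), (MIG=2) style of coding, here is a sketch that assumes, purely for illustration, that MIG has three levels coded 1, 2, 3. Level 3 is left out and acts as the reference because the model has an intercept. The parameter names bm1 and bm2 are made up, and only MIG and UKP are shown.

```sas
/* Sketch: boolean expressions as dummy variables inside NLMIXED.      */
/* Assumption: MIG takes the values 1, 2, 3; level 3 is the reference. */
proc nlmixed data=CLEAN tech=newrap cov;
  parameters b0=0 bm1=0 bm2=0 b8=0 w=0;
  /* (MIG=1) evaluates to 1 when true and 0 otherwise;        */
  /* no term for MIG=3, otherwise it would be aliased with b0 */
  bpart  = b0 + bm1*(MIG=1) + bm2*(MIG=2) + b8*UKP;
  lambda = exp(bpart);
  phi    = 1 + (w*lambda);
  omega  = 1 + (w*y);
  teta   = (lambda*omega)/phi;
  ll = y*(log(lambda)-log(phi)) + (y-1)*log(omega) - lgamma(y+1) - teta;
  model y ~ general(ll);
run;
```

Adding a third term such as bm3*(MIG=3) would make the three indicators sum to the intercept column, which is one way to end up with the Hessian warning.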

Steve Denham

lvm
Rhodochrosite | Level 12

You can have this problem because of several issues, even with a small number of parameters. But the most likely culprit is the large number of parameters in your model (as suggested by Steve). Remember, if there are five levels of a factor, you can only have four parameters (when you have an intercept). Moreover, with many factors and an observational study (?), your remaining dummy variables may be linearly related (even when you drop the last level).
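One possible way to look for linearly related dummy variables before going back to NLMIXED is to fit an ordinary Poisson model with the same predictors in a procedure that does accept a CLASS statement. This is a swapped-in diagnostic, not something proposed in the thread: the sketch below uses PROC GENMOD, where parameters that are linear combinations of others are typically reported with zero degrees of freedom. The frequency tables are a simple check for empty or very sparse levels.

```sas
/* Sketch: diagnostic checks, not the generalized Poisson model itself. */

/* Look for levels with very few or no observations */
proc freq data=CLEAN;
  tables MIG WORK EDUC LIVE MARR KB WI;
run;

/* Ordinary Poisson fit with the same predictors; aliased parameters */
/* should show up in the parameter estimates table                   */
proc genmod data=CLEAN;
  class MIG WORK EDUC LIVE MARR KB WI;
  model y = MIG WORK EDUC LIVE MARR KB WI UKP / dist=poisson link=log;
run;
```

The Poisson estimates may also be usable as starting values in the PARAMETERS statement instead of zeros, though that is a suggestion rather than a guaranteed fix.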
