Unay13
Obsidian | Level 7

I am using PROC NLMIXED to fit Poisson and negative binomial models. The results change drastically when I supply initial values in the PARMS statement. On what basis should the initial values in the PARMS statement be chosen? The whole point of fitting the model is to estimate those coefficients, so how can we guess their values beforehand?

 

Any help would be greatly appreciated. 

Rick_SAS
SAS Super FREQ

Please post your code. There are many ways to choose initial values:

 

  • Sometimes we have domain knowledge or historical knowledge that helps us make an informed guess.
  • Sometimes we can solve a closely-related linear system (which does not require an initial guess) and use the solution to the linear system as an initial guess for the nonlinear system.
  • In low dimensions (three or fewer parameters), you can use graphical methods to make an initial guess.
  • You can use grid-based methods up to about 6 or 8 dimensions.
  • You can use the method of percentiles or the method of moments to obtain an initial guess when you are fitting the parameters of a probability distribution.
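
As a rough illustration of the method-of-moments idea for a negative binomial dispersion parameter, here is a minimal sketch. It assumes the NB2 parameterization, in which Var(Y) = mu + k*mu**2, and it ignores covariates, so the result is only a crude starting value. The data set name work.counts and the variable name y are hypothetical placeholders.

```sas
/* Hypothetical names: data set work.counts, count variable y */
proc means data=work.counts noprint;
   var y;
   output out=work.mom mean=ybar var=s2;   /* sample mean and variance */
run;

data work.k_start;
   set work.mom;
   /* NB2: Var(Y) = mu + k*mu**2, so k is roughly (s2 - ybar)/ybar**2 */
   if s2 > ybar then k0 = (s2 - ybar) / ybar**2;
   else k0 = 0.01;   /* no overdispersion in the sample: start near the Poisson limit */
   put "Moment-based start for k: " k0=;
run;
```

The value of k0 written to the log could then be used as the initial value for the dispersion parameter in a PARMS statement.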
Unay13
Obsidian | Level 7

This is the code:

 

proc NLMIXED data = work.spf;
mu=beta_0 +(beta_1/((1+ beta_2*(maj)**(-beta_3))*(1+ beta_4*(min)**(-beta_5))));
NBLL = (lgamma(tot+(1/k)) - lgamma(tot+1) - lgamma(1/k)+ tot*log(k*mu) - (tot+(1/k))*log(1+k*mu));
model tot ~ general (NBLL);
run;

 

When I defined mu as a different function with three parameters, the model converged. The first three suggestions do not apply here, because I have little knowledge of what the parameter values should be and there are 7 parameters. Also, I am not very familiar with the method of moments or grid-based methods.

Rick_SAS
SAS Super FREQ

Please specify a PARMS statement. Without it, we have no way of knowing which symbols are parameters and which are variables in the data. For example, I assume the betas are parameters. What about k, tot, min, and maj? Are there any notes or warnings in the SAS log? How many observations are in the data set?

Unay13
Obsidian | Level 7

Thank you for the response and I apologize for not being clear.

 

I have defined the PARMS statement below with the default values.

 

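proc NLMIXED data = work.spf;
/* Starting values: 1 for every parameter */
PARMS beta_0=1 beta_1=1 beta_2=1 beta_3=1 beta_4=1 beta_5=1 k=1;
/* Mean of the count response as a function of maj and min */
mu=beta_0 +(beta_1/((1+ beta_2*(maj)**(-beta_3))*(1+ beta_4*(min)**(-beta_5))));
/* Negative binomial log-likelihood with dispersion parameter k */
NBLL = (lgamma(tot+(1/k)) - lgamma(tot+1) - lgamma(1/k)+ tot*log(k*mu) - (tot+(1/k))*log(1+k*mu));
model tot ~ general (NBLL);
run;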

 

The betas and k are the parameters. There are 198 observations. tot is the dependent variable, and maj and min are the independent variables.
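
For reference, the NBLL expression in the code above appears to be the usual negative binomial log-likelihood contribution of one observation, written here with y_i for tot. The notation is introduced only for illustration:

```latex
\ell_i = \log\Gamma\!\left(y_i + \tfrac{1}{k}\right) - \log\Gamma(y_i + 1) - \log\Gamma\!\left(\tfrac{1}{k}\right)
         + y_i \log(k\mu_i) - \left(y_i + \tfrac{1}{k}\right)\log(1 + k\mu_i),
\qquad
\mu_i = \beta_0 + \frac{\beta_1}{\left(1 + \beta_2\,\mathrm{maj}_i^{-\beta_3}\right)\left(1 + \beta_4\,\mathrm{min}_i^{-\beta_5}\right)}
```

The logarithms are defined only when k > 0 and mu_i > 0, so starting values that make mu negative for some observation give a missing log-likelihood at that point.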

Any help regarding the initial parameters would be highly appreciated.

Rick_SAS
SAS Super FREQ

I'd try a grid of values, and also supply bounds for any parameters that I know are positive or otherwise constrained. Use domain knowledge to choose reasonable values for the parameter guesses. The example below indicates how to choose a grid of values. The numbers are just made up since I know nothing about your data or experiment.

 

PARMS beta_0=-1 0 1
      beta_1=0.1 1 10 100
      beta_2=-2 0 2
      beta_3=10 100
      beta_4=10 100
      beta_5=1 2
      k=1;
BOUNDS k > 0;

 

For more information about choosing a grid of values, see "Use a grid search to find initial parameter values for regression models in SAS."
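
Placed inside a complete step, the grid might look like the sketch below. It reuses the made-up grid values from this reply and the model statements from the earlier post, so the numbers carry no meaning for the actual data. The BEST= option on the PARMS statement is added here as an assumption about what output is wanted; it limits the "Parameters" table to the best few grid points.

```sas
proc nlmixed data=work.spf;
   /* Each parameter gets a list of candidate values. NLMIXED evaluates the
      objective at every combination (3*4*3*2*2*2*1 = 288 points here)
      and starts the optimization from the best one. */
   parms beta_0=-1 0 1
         beta_1=0.1 1 10 100
         beta_2=-2 0 2
         beta_3=10 100
         beta_4=10 100
         beta_5=1 2
         k=1
         / best=5;            /* display only the 5 best grid points */
   bounds k > 0;              /* the dispersion parameter must be positive */
   mu = beta_0 + (beta_1 / ((1 + beta_2*(maj)**(-beta_3)) * (1 + beta_4*(min)**(-beta_5))));
   NBLL = lgamma(tot + (1/k)) - lgamma(tot + 1) - lgamma(1/k)
          + tot*log(k*mu) - (tot + (1/k))*log(1 + k*mu);
   model tot ~ general(NBLL);
run;
```

The number of grid points grows multiplicatively with each list, so a coarse grid followed by a finer one around the best point is usually cheaper than one dense grid.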

Unay13
Obsidian | Level 7

Thank you for the help. I chose the initial values based on other, similar models. The model now converges, but the standard errors, t values, confidence limits, and p-values in the parameter estimates table are blank. What could be the reason for that? These are the notes and warning in the log:


NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: At least one element of the gradient is greater than 1e-3.
NOTE: Moore-Penrose inverse is used in covariance matrix.
WARNING: The final Hessian matrix is full rank but has at least one
negative eigenvalue. Second-order optimality condition violated.

Rick_SAS
SAS Super FREQ

There are several possible reasons, but the two most likely are:

1. The model does not fit the data.

2. You are using variables that are linearly dependent. For example, if you include a binary indicator variable for "Males" and also include an indicator variable for "Females," the two variables are not independent and the covariance matrix is singular.
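
One way to look into both possibilities is sketched below, using the data set and variable names from this thread. The starting values of 1 are simply the ones posted earlier, and the choice of diagnostics is an assumption about what is useful here, not a definitive procedure.

```sas
/* Check whether the two predictors are strongly correlated */
proc corr data=work.spf;
   var maj min;
run;

/* Refit and print the Hessian, plus the covariance and correlation
   matrices of the parameter estimates */
proc nlmixed data=work.spf hess cov corr;
   parms beta_0=1 beta_1=1 beta_2=1 beta_3=1 beta_4=1 beta_5=1 k=1;
   bounds k > 0;
   mu = beta_0 + (beta_1 / ((1 + beta_2*(maj)**(-beta_3)) * (1 + beta_4*(min)**(-beta_5))));
   NBLL = lgamma(tot + (1/k)) - lgamma(tot + 1) - lgamma(1/k)
          + tot*log(k*mu) - (tot + (1/k))*log(1 + k*mu);
   model tot ~ general(NBLL);
run;
```

Pairs of estimates with correlations close to 1 or -1 would suggest that the data cannot separate those parameters, which is consistent with a Hessian that is not positive definite.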

 
