CHELS
Obsidian | Level 7

I am trying to run the following PHREG procedure:

 

proc phreg data = Data;
   class A B C D E F G H I;
   model time*event(0) = A B C D E F G H I J K / ties = exact;
   strata L M;
run;

 

and I received the following warning:

 

Warning: Ridging has failed to improve the loglikelihood. You may want to increase the initial ridge value (RIDGEINIT= option), or use a different ridging technique (RIDGING = option), or switch to using linesearch to reduce the step size (RIDGING=NONE), or specify a new set of parameter estimates (INEST= option). 
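For reference, a minimal sketch of where the options named in the warning would go, assuming RIDGING= and RIDGEINIT= are MODEL statement options and INEST= belongs on the PROC PHREG statement; the specific values below are purely illustrative, so check the PHREG documentation before using them:

proc phreg data = Data;   /* inest=MyStartEstimates (a hypothetical data set of starting values) would also go here */
   class A B C D E F G H I;
   /* illustrative only: a larger initial ridge value with absolute ridging */
   model time*event(0) = A B C D E F G H I J K
         / ties = exact ridging = absolute ridgeinit = 0.01;
   strata L M;
run;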

 

I suspect that the problem is too many categorical variables, so that the data (n=220) become so sparse that the likelihood cannot be maximized. However, I have another data set (n=170) that produces no such warning with the exact same model. Is there a way that I can find out exactly where the problem lies?
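One simple way to check the sparsity suspicion is to cross-tabulate each categorical predictor against the event indicator and look for levels with zero or very few events. This is only a sketch, reusing the variable names from the model above:

proc freq data = Data;
   /* look for empty or near-empty cells within each classification variable */
   tables (A B C D E F G H I) * event / nopercent norow nocol;
run;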

 

THANK YOU!


3 REPLIES
Rick_SAS
SAS Super FREQ

There are other threads that address this same issue, such as this one and this one. In general, maximum likelihood estimation runs into these problems when the model does not fit the data, and the problems are worsened when the sample is small relative to the number of parameters in the model. Although your smaller data set might converge, that can be because that data set fits the model better than the current data.

 

 Is there a way that I can find out exactly where the problem lies?

 

The truth is that there is not an easy way. For low-dimensional problems (one or two parameters), you can draw graphs of the likelihood function and see where it is flat, but in high dimensions that is not feasible.

CHELS
Obsidian | Level 7

Thank you so much for the explanation!!!

 

I have one more question following that: if I use RIDGING=NONE or RIDGING=ABSOLUTE, am I merely suppressing the problem, or can I trust the estimates? Compared with re-examining the model and covariates, which is the better solution?

 

THANK YOU!

Rick_SAS
SAS Super FREQ

This is only my personal opinion, but I am not a big fan of ridging. It solves a nearby problem, but it does not give you the MLE for the actual data that you observed. I would rather re-examine the model than use ridging to find the parameters for a dubious model.

 

The exception to this general rule is when the model that you are fitting is well-established in the literature. If a certain model is a standard, then using that model might be preferable to creating a new model.

 

Regarding "can I trust the estimates": if the MLE converges, then you can trust the parameter estimates. However, that does not mean that the model fits the data. (Think about fitting a straight line to quadratic data.) To investigate the model fit, you should use diagnostic plots, goodness-of-fit tests, and similar analyses.
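As one concrete sketch of those diagnostics (reusing the variable names from the original model; some residuals or assessments may not be available with every TIES= method, so consult the PHREG documentation), you could output deviance residuals and the linear predictor, then plot them to look for poorly fitting observations:

proc phreg data = Data;
   class A B C D E F G H I;
   model time*event(0) = A B C D E F G H I J K / ties = exact;
   strata L M;
   output out = Diag resdev = dev xbeta = xb;   /* deviance residuals and linear predictor */
run;

proc sgplot data = Diag;
   scatter x = xb y = dev;   /* systematic patterns or extreme residuals suggest lack of fit */
   refline 0 / axis = y;
run;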
