"param = glm" gave a singular matrix warning while "param = ref" did n...

YKH · Posted 02-12-2025 04:30 PM

Hi everyone,

I'm fitting a multinomial logistic regression model using proc logistic in SAS, with around 3.6 million observations, an outcome with 5 levels, and dozens of categorical predictors. I had no issues running either univariate or multivariate models with param = ref.

However, once I tried param = glm, the multivariate models started giving the warning message "The information matrix is singular and thus the convergence is questionable. specifying a larger SINGULAR= value." After doing some research, I found that this message suggests a multicollinearity issue in the model. I then tried a model with only 2 predictors, and it still gave the warning, even though the correlation matrix showed no correlation between the two predictors.

As far as I know, the only difference between param = ref and param = glm is that param = glm uses less-than-full-rank reference coding, meaning that it creates k dummy variables for a categorical predictor with k levels (one of them redundant), whereas param = ref creates k-1. The two parameterizations should give the same log-likelihood and estimates given the same reference level. To confirm this, I also compared the results of the two models using only 2 predictors. Although param = glm threw a warning, its results were identical to those of param = ref (except for a bunch of zeros in the estimates for the reference level of each predictor under param = glm; is that the cause?).
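One way to see what each parameterization actually generates is to have PROC LOGISTIC write out its design matrix without fitting the model. The following is only a sketch: it uses the sashelp.cars sample data rather than the data from this thread, and the output data set names design_glm and design_ref are made up for illustration. With PARAM=GLM, the CLASS variable should produce one design column per level (so the columns are linearly dependent with the intercept), while PARAM=REF should produce one fewer.

```sas
/* Illustrative sketch on sashelp.cars, not the poster's data.      */
/* OUTDESIGNONLY writes the design matrix and skips model fitting.  */
proc logistic data=sashelp.cars outdesign=design_glm outdesignonly;
  class Type / param=glm;
  model DriveTrain = Type / link=glogit;
run;

proc logistic data=sashelp.cars outdesign=design_ref outdesignonly;
  class Type / param=ref;
  model DriveTrain = Type / link=glogit;
run;

/* Compare the design columns generated for Type in each data set */
proc contents data=design_glm varnum; run;
proc contents data=design_ref varnum; run;
```

The extra column under PARAM=GLM corresponds to the parameter whose estimate is reported as zero for the last level.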

My question is: why did the param = glm model throw a warning while param = ref did not? And more importantly, in this situation, should I trust the results of param = ref even though no warning was displayed?

I appreciate any advice and suggestions. Thank you in advance.

StatDave · Posted 02-12-2025 04:54 PM

Any change in the model, such as the parameterization of CLASS effects, changes the optimization, so unexpected differences like this can happen. To assess the PARAM=REF fit, you can add the ITPRINT option and examine the gradient vector. For proper convergence, the gradients should all be quite close to zero. Also examine the standard errors of the parameters: they should not be large (approaching 100 or more). If you want more assurance, you could use any of the other procedures that can fit the logistic model, such as GLIMMIX, GENMOD, HPGENSELECT, PROBIT, and others, which generally don't share identical algorithm code.
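As a rough sketch of the two checks described above, the code below adds ITPRINT to the PARAM=REF fit and then refits the same model in HPGENSELECT for comparison. The data set name mydata, the response y with reference level '1', and the predictors x1 and x2 are placeholders, not names from the original post.

```sas
/* Hypothetical names: mydata, y, x1, x2 stand in for the real data. */
proc logistic data=mydata;
  class x1 x2 / param=ref;
  /* ITPRINT prints the iteration history, including the gradient */
  model y(ref='1') = x1 x2 / link=glogit itprint;
run;

/* Cross-check with a different procedure. The estimates and the   */
/* log-likelihood should agree closely if both fits converged.     */
proc hpgenselect data=mydata;
  class x1 x2;
  model y(ref='1') = x1 x2 / dist=multinomial link=glogit;
run;
```

If the two procedures use different reference levels for the CLASS variables, the individual estimates will differ even when the fits agree, so the log-likelihood is the simpler quantity to compare.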

YKH · Posted 02-19-2025 11:25 AM

Thank you for the reply! I've used the ITPRINT option, and the gradients seemed to approach zero at the end. I'll try other options you suggested.

Ksharp · Posted 02-12-2025 09:13 PM

You could try different reference levels of the Y variable to check which one triggers this annoying WARNING message.

model Smoker_NXT(ref='No') = AgeStartCIGS Age1stIview Sex Race Hispanic Wave|Smoker|ENDSer SmkHistory
Start2SMK  /  noint link=glogit  Singular=1E-7  ;

model Smoker_NXT(ref='Yes') = AgeStartCIGS Age1stIview Sex Race Hispanic Wave|Smoker|ENDSer SmkHistory
Start2SMK  /  noint link=glogit  Singular=1E-7  ;

model Smoker_NXT(ref='None') = AgeStartCIGS Age1stIview Sex Race Hispanic Wave|Smoker|ENDSer SmkHistory
Start2SMK  /  noint link=glogit  Singular=1E-7  ;

......................
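The same idea can be scripted so that each response level is tried as the reference in turn. This is only a sketch under assumed names: mydata, y, x1, x2, and the level values 1 to 5 are placeholders, and the macro try_refs is hypothetical. It also assumes that each level value is a single token without spaces.

```sas
/* Hypothetical macro: refit the model once per reference level of y. */
%macro try_refs(levels=1 2 3 4 5);
  %local i lev;
  %do i = 1 %to %sysfunc(countw(&levels));
    %let lev = %scan(&levels, &i);
    title "Reference level of y: &lev";
    proc logistic data=mydata;
      class x1 x2 / param=glm;
      model y(ref="&lev") = x1 x2 / link=glogit;
    run;
  %end;
  title;  /* clear the title after the last fit */
%mend try_refs;

%try_refs()
```

Checking the log after each PROC LOGISTIC step shows which reference levels, if any, produce the singular information matrix warning.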

YKH · Posted 02-19-2025 11:26 AM

So the reference level also makes a difference? I will definitely try it. Thank you.

"param = glm" gave a singular matrix warning while "param = ref" did not

Re: "param = glm" gave a singular matrix warning while "param = ref" did not

Re: "param = glm" gave a singular matrix warning while "param = ref" did not

Re: "param = glm" gave a singular matrix warning while "param = ref" did not

Re: "param = glm" gave a singular matrix warning while "param = ref" did not

"param = glm" gave a singular matrix warning while "param = ref" did not

Re: "param = glm" gave a singular matrix warning while "param = ref" did not

Re: "param = glm" gave a singular matrix warning while "param = ref" did not

Re: "param = glm" gave a singular matrix warning while "param = ref" did not

Re: "param = glm" gave a singular matrix warning while "param = ref" did not

The 2025 SAS Hackathon has begun!