I tried to calculate the risk difference from a multivariable logistic model using the NLMeans macro.
The following variables were included:
group (binary)
sex (binary)
weight (continuous)
year (continuous)
treatment1 (binary)
treatment2 (binary)
treatment3 (binary)
town (5 categories)
However, when I run the code as shown below, it produces missing values.
proc logistic data = dat;
class group(ref="0") sex treatment1 treatment2 treatment3 town / param = glm;
model style(event="1") = group sex weight year treatment1 treatment2 treatment3 town / link = logit;
lsmeans group / e ilink;
ods output coef=coef1;
store clog1;
run;
%nlmeans(instore=clog1, coef=coef1, link=logit)
When I exclude the two continuous variables from the model, it works well.
proc logistic data = dat;
class group(ref="0") sex treatment1 treatment2 treatment3 town / param = glm;
model style(event="1") = group sex treatment1 treatment2 treatment3 town / link = logit;
lsmeans group / e ilink;
ods output coef=coef2;
store clog2;
run;
%nlmeans(instore=clog2, coef=coef2, link=logit)
Is there any reason why the model fails to work when the continuous variables are included?
The only message I got was the warning below about the Hessian matrix:
"WARNING: The final Hessian matrix is not positive definite, and therefore the estimated covariance matrix is not full rank and may be unreliable. The variance of some parameter estimates is zero or some parameters are linearly related to other parameters."
However, that does not seem to be the issue, since both models produced the same warning.
Also, I have to fit a conditional logistic model, which needs the STRATA statement in PROC LOGISTIC; PROC GENMOD does not provide one.
When I run proc logistic, it does not show any warnings or errors.
Also, when I run NLMeans, it shows only the differences and their confidence intervals, and these contain missing values.
Not all model fitting problems cause warnings or errors in the log. Again, please post the parameter estimates table from the model fit. The large standard errors in the LSMEANS results suggest a problem.
Thanks! I will post the parameter estimates table once I have access to the data!
I remember the standard errors were not that big, but some towns had a separation issue.
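A quick way to look for the separation mentioned above is to cross-tabulate each categorical predictor against the response. This is only a minimal sketch, assuming the data set and variable names from the first post (dat, style, town, and so on). It catches separation caused by a single categorical variable, not separation that arises from combinations of predictors.

```sas
/* Cross-tabulate each categorical predictor with the response (style). */
/* A zero cell in any table hints at complete or quasi-complete separation. */
proc freq data = dat;
  tables (group sex treatment1 treatment2 treatment3 town) * style
         / norow nocol nopercent;
run;
```

A town with no events (or only events) would show up here as a zero count in one column of the town by style table.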
Separation is exactly the problem I suspect. If that is the case, some of the model parameter estimates are infinite, making the model useless. Simplifying the model, as you did, can help, though even in the second run the standard errors of the estimates are very large, and the confidence intervals span essentially the entire valid probability range [0,1]. Another possible solution is to add the FIRTH option to the MODEL statement so that the model is fit by penalized likelihood.
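For reference, the FIRTH suggestion could look roughly like the following. It is a sketch that reuses the first model from the original post; the names coef_firth and clog_firth are made up for illustration, and whether the penalized fit actually resolves the missing values would have to be checked on the real data.

```sas
/* Same model as the first run, but fit with Firth's penalized likelihood. */
/* coef_firth and clog_firth are hypothetical names chosen for this sketch. */
proc logistic data = dat;
  class group(ref="0") sex treatment1 treatment2 treatment3 town / param = glm;
  model style(event="1") = group sex weight year treatment1 treatment2 treatment3 town
        / link = logit firth;
  lsmeans group / e ilink;
  ods output coef = coef_firth;
  store clog_firth;
run;

%nlmeans(instore=clog_firth, coef=coef_firth, link=logit)
```

After the run, the log should be checked again for any notes about separation before trusting the NLMeans output.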
Okay, I will try excluding the variables that cause the separation!
I was also wondering about something: I could obtain the odds ratio for the variable group, but not the risk difference.
Is there any reason for that?
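One possible reason can be sketched algebraically. Assume an ordinary (unconditional) logistic model with intercept \beta_0, group coefficient \beta_g, and the remaining covariates x with coefficients \gamma; this notation is illustrative and does not come from the thread. The odds ratio for group involves only \beta_g, while the risk difference involves every term in the linear predictor:

```latex
\begin{align*}
\mathrm{OR}_{\text{group}} &= \exp(\beta_g) \\
\mathrm{RD}_{\text{group}} &= \operatorname{expit}(\beta_0 + \beta_g + x^\top\gamma)
                            - \operatorname{expit}(\beta_0 + x^\top\gamma),
\qquad \operatorname{expit}(\eta) = \frac{1}{1 + e^{-\eta}}
\end{align*}
```

So if any coefficient entering the two predicted probabilities is infinite or very poorly estimated, for example a town level affected by separation, the risk difference and its standard error can be unusable even when the odds ratio for group looks reasonable.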
Although I have added additional independent variables as listed below, I am still encountering the same problem, even though there is no separation.
I also checked the VIF of the variables, and all eigenvalues were less than 2.5. However, I received the following warning when I ran nlmeans:
"WARNING: The final Hessian matrix is not positive definite, and
therefore the estimated covariance matrix is not full rank and
may be unreliable. The variance of some parameter estimates is
zero or some parameters are linearly related to other
parameters."
Below are the results after running PROC LOGISTIC and NLMeans.
I found that excluding either "year2" alone, or both "weight" and "year1", made the model work, but I need to include all of them.
Thanks!
The intercept estimate and its standard error are both very large for a logistic model, which still suggests a problem. Was this fit with the FIRTH option? If not, try using it. If so, you will need to simplify the model until it fits without evidence of problems; there should be no messages in the log about separation. Note that for a generalized linear model such as a logistic model, you cannot just run the model through PROC REG to get VIF values. See this note for assessing collinearity for generalized models. As noted in the NLMeans macro documentation, the Hessian warning occurs any time the GLM parameterization (PARAM=GLM) is used in the model read into the macro.
Sorry, what do you mean by other options?