Hi,
I want to model the probability of being obese from a set of risk factors. The data come from a longitudinal survey, so there are many observations per individual. I therefore use GEE with PROC GENMOD, which I think is the appropriate model for this kind of situation. However, I get this error:
NOTE: Class levels for some variables were not printed due to excessive size.
NOTE: PROC GENMOD is modeling the probability that obe1='1'.
NOTE: Algorithm converged.
ERROR: Error in computing the variance function.
ERROR: Error in parameter estimate covariance computation.
ERROR: Error in estimation routine.
I identified the problematic variable, cntry (country): the error still occurs in a model that contains only this variable. Here is the code for this simplified model:
proc genmod data=work.data1 descending ;
class smoking(ref='0') obesity(ref='0') ah(ref='0') edu3(ref='2') depression(ref='0') vig_pa(ref='0') cntry(ref='DE') noid sex;
model obe1=/*obesity sex age_num age_num*age_num edu3 smoking ah depression vig_pa*/ cntry /*cntry*sex dur*// dist=bin link=logit;
repeated subject=noid / type=exch;
weight pond;
run;
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

Parameter        Estimate  Standard Error  95% Confidence Limits       Z  Pr > |Z|
Intercept         -1.6629          .            .          .           .      .
cntry      AT      0.5749          .            .          .           .      .
cntry      BE      0.4967          .            .          .           .      .
cntry      CZ      0.9894          .            .          .           .      .
cntry      DK      0.7170          .            .          .           .      .
cntry      EE      0.8038          .            .          .           .      .
cntry      ES      0.4774          .            .          .           .      .
cntry      FR     13.7220          .            .          .           .      .
cntry      GR      0.2397          .            .          .           .      .
cntry      IT      0.7010          .            .          .           .      .
cntry      NT      0.0756          .            .          .           .      .
cntry      SE     1790526          .            .          .           .      .
cntry      SI      0.6285          .            .          .           .      .
cntry      DE      0.0000     0.0000       0.0000     0.0000           .      .
And this is the crosstab of obe1*cntry. I don't see any category that may be problematic.
proc freq data=work.data1;
table obe1*cntry /nocol norow nopercent;
weight pond;
run;
Table of obe1 by cntry
Rows (obe1): 0, 1, Total
Columns (cntry): AT, BE, CZ, DE, DK, EE, ES, FR, GR, IT, NT, SE, SI, Total
[cell frequencies missing]
Note 1: the model works when I remove the WEIGHT statement. The weights range from 0.006735 to 46.43589.
Note 2: the model works when I use another dependent variable (with the WEIGHT statement kept).
In any modeling procedure, you should not specify variables in the CLASS statement that are not used in other statements in the procedure. Doing so can cause additional observations to be ignored if they have missing values on these variables. In binary response models, reducing the number of observations used can easily make the data too sparse, which can cause results like the ones you show.
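As a rough sketch of that suggestion, the simplified model could list in CLASS only the variables it actually uses (here cntry and the subject variable noid). The data set and variable names are taken from the question; this has not been run against the actual data, so treat it as a starting point rather than a confirmed fix:

proc genmod data=work.data1 descending;
   /* Only variables used in MODEL or REPEATED appear in CLASS, so     */
   /* missing values in unused variables cannot drop observations.     */
   class cntry(ref='DE') noid;
   model obe1 = cntry / dist=bin link=logit;
   repeated subject=noid / type=exch;
   weight pond;
run;

Comparing the number of observations read and used that GENMOD reports for this version and for the original one should show whether the extra CLASS variables were excluding observations.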
Also, weights are typically not needed for logistic models like this. If the weights you are using are sampling weights, then you should be using PROC SURVEYLOGISTIC, not GENMOD. GENMOD does not have the variance estimators needed to do a proper survey data analysis.
They are sampling weights. Is it possible to use the GEE method with PROC SURVEYLOGISTIC? Some observations belong to the same individual in different years, so there is a correlation that needs to be accounted for.
I believe you can use the CLUSTER statement to deal with the clustering in your data.
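A minimal sketch of what that might look like, reusing the names from the question. It assumes the individual (noid) is the clustering unit and that there are no strata or higher-level sampling units; if the survey design has them, they would belong in a STRATA statement and in CLUSTER instead. Not tested on the actual data:

proc surveylogistic data=work.data1;
   class cntry(ref='DE') / param=ref;
   /* Assumption: repeated observations are clustered within individuals */
   cluster noid;
   /* event='1' models the probability that obe1=1 */
   model obe1(event='1') = cntry / link=logit;
   /* Sampling weight from the question */
   weight pond;
run;

Note that this does not fit a GEE with a working correlation structure. The clustering is handled through the variance estimation, so the point estimates match a weighted logistic regression while the standard errors account for the repeated observations per individual.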