Solved: PROC GENMOD. Error with GEE

Demographer · Posted 01-21-2019 11:12 AM

Hi,

I want to model the probability of being obese with a set of risk factors. The data are from a longitudinal survey (so there are many observations for a single individual). I therefore use GEE with PROC GENMOD, which I think is the appropriate model for this kind of situation. However, I get this error:

NOTE: Class levels for some variables were not printed due to excessive size.

NOTE: PROC GENMOD is modeling the probability that obe1='1'.

NOTE: Algorithm converged.

ERROR: Error in computing the variance function.

ERROR: Error in parameter estimate covariance computation.

ERROR: Error in estimation routine.

I identified the problematic variable, cntry (country): the error is still there when it is the only predictor. Here is the code for this simplified model:

proc genmod data=work.data1 descending;
  class smoking(ref='0') obesity(ref='0') ah(ref='0') edu3(ref='2') depression(ref='0') vig_pa(ref='0') cntry(ref='DE') noid sex;
  model obe1 = /*obesity sex age_num age_num*age_num edu3 smoking ah depression vig_pa*/ cntry /*cntry*sex dur*/ / dist=bin link=logit;
  repeated subject=noid / type=exch;
  weight pond;
run;

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Parameter		Estimate	Standard Error	95% Confidence Limits		Z	Pr > |Z|
Intercept		-1.6629	.	.	.	.	.
cntry	AT	0.5749	.	.	.	.	.
cntry	BE	0.4967	.	.	.	.	.
cntry	CZ	0.9894	.	.	.	.	.
cntry	DK	0.7170	.	.	.	.	.
cntry	EE	0.8038	.	.	.	.	.
cntry	ES	0.4774	.	.	.	.	.
cntry	FR	13.7220	.	.	.	.	.
cntry	GR	0.2397	.	.	.	.	.
cntry	IT	0.7010	.	.	.	.	.
cntry	NT	0.0756	.	.	.	.	.
cntry	SE	1790526	.	.	.	.	.
cntry	SI	0.6285	.	.	.	.	.
cntry	DE	0.0000	0.0000	0.0000	0.0000	.	.

And this is the crosstab of obe1*cntry. I don't see any category that might be problematic.

proc freq data=work.data1;
table obe1*cntry /nocol norow nopercent;
weight pond;
run;

Table of obe1 by cntry (weighted frequencies)

obe1	AT	BE	CZ	DE	DK	EE	ES	FR	GR	IT	NT	SE	SI	Total
0	719.568	836.852	508.02	2711.3	275.363	97.5724	3317.06	3159.86	609.726	3309.08	739.526	351.252	139.237	16774.4
1	180.909	197.059	220.64	744.693	41.8123	41.3259	913.276	736.899	147.012	717.024	146.01	62.5714	49.4657	4198.7
Total	900.477	1033.91	728.66	3456	317.176	138.898	4230.34	3896.76	756.738	4026.1	885.536	413.823	188.702	20973.1

Note 1: the model works when I remove the weight statement. The range of the weight is from 0.006735 to 46.43589.

Note 2: the model works when I use another dependent variable (and using the weight statement).

StatDave · Posted 01-24-2019 09:41 AM

In any modeling procedure, you should not specify variables in the CLASS statement that are not used in other statements in the procedure. Doing so can result in additional observations being ignored if some are missing on these variables. In binary response models, decreasing the number of observations used can easily cause the data to become too sparse which can cause results like what you show.
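
As a rough sketch of the first point, assuming the simplified model keeps only cntry as a predictor, the CLASS statement would list just cntry and the subject variable noid (which REPEATED SUBJECT= requires to be a CLASS variable). The data set and variable names are taken from the original post; whether this alone removes the error is not confirmed.

```sas
/* Simplified GEE model with only the variables actually used.       */
/* Unused CLASS variables (smoking, obesity, ah, edu3, depression,   */
/* vig_pa, sex) are dropped so that missing values on them no longer */
/* exclude observations from the analysis.                           */
proc genmod data=work.data1 descending;
  class cntry(ref='DE') noid;
  model obe1 = cntry / dist=bin link=logit;
  repeated subject=noid / type=exch;
  weight pond;
run;
```

Comparing the "Number of Observations Used" in the output of this run with the original run shows how many rows the extra CLASS variables were excluding.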

Also, weights are typically not needed for logistic models like this. If the weights you are using are sampling weights, then you should be using PROC SURVEYLOGISTIC, not GENMOD. GENMOD does not have the variance estimators needed to do a proper survey data analysis.

Demographer · Posted 01-24-2019 09:49 AM

They are sampling weights. Is it possible to use the GEE method with PROC SURVEYLOGISTIC? Some observations are for the same individual in different years, so there is a correlation that needs to be accounted for.

StatDave · Posted 01-24-2019 09:57 AM

I believe you can use the CLUSTER statement to deal with the clustering in your data.
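
A minimal sketch of that suggestion, reusing the data set and variable names from the original post. It assumes the individual identifier noid is the appropriate cluster and that the survey has no further design variables (strata or primary sampling units); if it does, they would belong in STRATA and CLUSTER statements instead. PROC SURVEYLOGISTIC does not fit a GEE working-correlation structure. Its default Taylor-series variance estimator accounts for correlation among observations within the same cluster.

```sas
/* Weighted logistic model with repeated observations per individual */
/* treated as clusters. PARAM=REF requests reference-cell coding so  */
/* the cntry estimates are contrasts against DE, as in the GENMOD    */
/* output above.                                                     */
proc surveylogistic data=work.data1;
  class cntry(ref='DE') / param=ref;
  cluster noid;                      /* assumed cluster: one per individual */
  weight pond;                       /* sampling weight                     */
  model obe1(event='1') = cntry;     /* models the probability that obe1=1  */
run;
```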
