Statistical Procedures

katy-barry · Posted 04-12-2020 02:07 AM

Hello Everyone,

I am trying to fit a GEE model (PROC GENMOD with a REPEATED statement) on data that I transposed. I get results, but I think something is wrong.

/* Explanation of variables:

I have data from 2009, 2011, 2014, and 2018 measuring whether a person is unemployed. earlystart has three categories (0, 1, 2). First, I recoded each of the unemployment variables so that they are all coded the same way. */

data chomage;
  set cannabis;
  keep earlystart NTT j09_EVEN12M10 J11_CHOMAGE J14_CHOMAGE J18_CHOMAGE;
run;

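data chomage2;
  set chomage;
  /* 2009 and 2011: recode 2 to 0 and keep 1 as 1 (1 = unemployed) */
  if j09_EVEN12M10 = 2 then chomj09 = 0;
  else if j09_EVEN12M10 = 1 then chomj09 = 1;
  if j11_CHOMAGE = 2 then chomj11 = 0;
  else if j11_CHOMAGE = 1 then chomj11 = 1;
  /* 2014 and 2018 are copied as-is (assumed to be coded 0/1 already) */
  chomj14 = j14_CHOMAGE;
  chomj18 = J18_CHOMAGE;
  drop j09_EVEN12M10 j11_CHOMAGE j14_CHOMAGE J18_CHOMAGE;
run;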
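Since the 2014 and 2018 variables are copied without recoding, a quick frequency check can confirm that all four recoded variables really contain only 0, 1, or missing before they are stacked. This is only a sketch; it assumes the data set chomage2 and the variable names created in the step above.

```sas
/* Sanity check (sketch): each recoded variable should show only 0, 1, or missing */
proc freq data=chomage2;
  tables chomj09 chomj11 chomj14 chomj18 / missing;
run;
```

Any other value appearing in these tables (for example a 2 in chomj14) would mean that wave still needs the same recoding as 2009 and 2011.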

/* Then I transposed my data. NTT is my ID variable. */

proc transpose data=chomage2 out=transpose3;
  by ntt earlystart; /* earlystart does not change across waves */
  var chomj09 chomj11 chomj14 chomj18;
run;

/* Renamed the column */

data chomagetranspose;
  set transpose3 (rename=(col1=chomj));
  drop _name_;
run;

/* I do not know if this is necessary, but I removed the observations where my independent or dependent variable was missing. */

data finaldeleted;
  set chomagetranspose;
  if earlystart = . or chomj = . then delete;
run;
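The transpose, rename, and missing-value steps above could also be sketched as a single DATA step that keeps track of the survey year, which the code above discards together with _name_. The data set name chomagelong and the variable wave are hypothetical names introduced only for this illustration.

```sas
/* Sketch: wide-to-long reshape in one DATA step.
   chomagelong and wave are hypothetical names, not part of the original code. */
data chomagelong;
  set chomage2;
  array chom{4} chomj09 chomj11 chomj14 chomj18;
  array yr{4} _temporary_ (2009 2011 2014 2018);
  do i = 1 to 4;
    wave  = yr{i};     /* survey year of this measurement */
    chomj = chom{i};   /* unemployment indicator for that year */
    /* keep only rows where both the outcome and the predictor are present */
    if not missing(chomj) and not missing(earlystart) then output;
  end;
  keep ntt earlystart wave chomj;
run;
```

Keeping wave is optional for the model below, but it makes it possible to see which year each row belongs to once some of a person's measurements have been dropped as missing.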

/* Running the final model */

proc genmod data=finaldeleted descending;
  class ntt earlystart;
  model chomj = earlystart / dist=bin link=logit;
  repeated subject=ntt / type=exch covb corrw;
run;

My results are below:

Model Information
Data Set	WORK.FINALDELETED
Distribution	Binomial
Link Function	Logit
Dependent Variable	chomj

Number of Observations Read	3483
Number of Observations Used	3483
Number of Events	1999
Number of Trials	3483

Class Level Information
Class	Levels	Values
NTT	1476	142 149 162 165 169 170 177 178 179 180 182 188 192 200 207 215 217 218 224 228 238 246 258 264 270 274 284 285 286 289 292 295 296 297 298 303 306 307 310 322 326 333 343 349 351 355 356 358 360 361 362 363 364 373 374 387 393 394 398 409 413 414 415 ...
earlystart	3	0 1 2

Response Profile
Ordered Value	chomj	Total Frequency
1	1	1999
2	0	1484

PROC GENMOD is modeling the probability that chomj='1'.

/* I want to model chomj=1 because that is the probability that a person is unemployed. */

Parameter Information
Parameter	Effect	earlystart
Prm1	Intercept
Prm2	earlystart	0
Prm3	earlystart	1
Prm4	earlystart	2

Algorithm converged.

GEE Model Information
Correlation Structure	Exchangeable
Subject Effect	NTT (1476 levels)
Number of Clusters	1476
Correlation Matrix Dimension	4
Maximum Cluster Size	4
Minimum Cluster Size	1

Covariance Matrix (Model-Based)
	Prm1	Prm2	Prm3
Prm1	0.004538	-0.004538	-0.004538
Prm2	-0.004538	0.009496	0.004538
Prm3	-0.004538	0.004538	0.01291

Covariance Matrix (Empirical)
	Prm1	Prm2	Prm3
Prm1	0.004691	-0.004691	-0.004691
Prm2	-0.004691	0.009653	0.004691
Prm3	-0.004691	0.004691	0.01225

Algorithm converged.

Working Correlation Matrix
	Col1	Col2	Col3	Col4
Row1	1.0000	0.3325	0.3325	0.3325
Row2	0.3325	1.0000	0.3325	0.3325
Row3	0.3325	0.3325	1.0000	0.3325
Row4	0.3325	0.3325	0.3325	1.0000

Exchangeable Working Correlation
Correlation	0.3324594755

GEE Fit Criteria
QIC	4764.3766
QICu	4761.2962

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Parameter	Level	Estimate	Standard Error	Lower 95% CL	Upper 95% CL	Z	Pr > |Z|
Intercept		0.4582	0.0685	0.3239	0.5924	6.69	<.0001
earlystart	0	-0.1410	0.0983	-0.3336	0.0516	-1.43	0.1513
earlystart	1	-0.0495	0.1107	-0.2665	0.1674	-0.45	0.6546
earlystart	2	0.0000	0.0000	0.0000	0.0000	.	.

/* This is what I find strange: why is the row for earlystart category 2 all zeros, with the Z and p-value blank? */

FreelanceReinh · Posted 04-12-2020 10:26 AM

Hello @katy-barry and welcome to the SAS Support Communities!

I've run your PROC GENMOD step on simple, randomly generated data and, as expected, the last row in the "Analysis Of GEE Parameter Estimates" table is identical to what you've shown. The reason is that, by default, PROC GENMOD uses GLM coding as the parameterization method for classification variables (see the PARAM= option in the documentation of the CLASS statement). This is explained in more detail in the documentation sections "GLM Parameterization of Classification Variables and Effects" and "Other Parameterizations", where it says: "Parameter estimates of CLASS main effects that use the GLM coding scheme estimate the difference in the effects of each level compared to the last level."

The "last level" of earlystart is 2 and, of course, the difference in the effect of the last level compared to itself is trivially zero.

If you had specified reference cell coding

class ntt earlystart(param=ref ref='2');

the reference level (here: 2) would have been just omitted, without changing the other parameters and statistics.

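Regardless of the parameterization method, the model has one more parameter than there are degrees of freedom: an intercept plus one effect for each of the three earlystart levels makes four parameters, but three groups can only identify three of them. Hence, it is correct that one parameter is set to zero or omitted.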
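To make the counting argument concrete, here is the model written out, using the estimates from the output above. The symbols \beta_0, \alpha_k and \eta_k are notation introduced only for this illustration.

```latex
\eta_k = \operatorname{logit} P(\text{chomj}=1 \mid \text{earlystart}=k)
       = \beta_0 + \alpha_k, \qquad k = 0, 1, 2

% Four unknowns (\beta_0, \alpha_0, \alpha_1, \alpha_2), but only three group logits.
% GLM coding resolves this by fixing \alpha_2 = 0, so that
\beta_0 = \eta_2 = 0.4582, \qquad
\alpha_0 = \eta_0 - \eta_2 = -0.1410, \qquad
\alpha_1 = \eta_1 - \eta_2 = -0.0495

% The implied group logits are therefore
\eta_0 = 0.4582 - 0.1410 = 0.3172, \qquad
\eta_1 = 0.4582 - 0.0495 = 0.4087, \qquad
\eta_2 = 0.4582
```

Any other constraint (a different reference level, or no intercept) yields the same three group logits; only the way they are split into parameters changes.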

Similarly, adding the NOINT option to the MODEL statement would have displayed in row "earlystart 2" what is shown in row "Intercept" in your output (and the intercept parameter would be zero instead).
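The two variants described above could be run as follows. This is a sketch that reuses the data set and variable names from the original post; only the CLASS and MODEL statements differ from the step shown there.

```sas
/* Variant 1 (sketch): reference cell coding with level 2 as the reference.
   The row for earlystart 2 is omitted from the parameter table. */
proc genmod data=finaldeleted descending;
  class ntt earlystart(param=ref ref='2');
  model chomj = earlystart / dist=bin link=logit;
  repeated subject=ntt / type=exch covb corrw;
run;

/* Variant 2 (sketch): default GLM coding without an intercept.
   Each earlystart row then estimates the logit for that level itself. */
proc genmod data=finaldeleted descending;
  class ntt earlystart;
  model chomj = earlystart / dist=bin link=logit noint;
  repeated subject=ntt / type=exch covb corrw;
run;
```

Choosing a different value in ref= only changes which level the other rows are compared against.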

katy-barry · Posted 04-12-2020 01:02 PM

Thank you! I just tried (param=ref ref='0') and it worked brilliantly. Thank you for your help 🙂

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Parameter	Level	Estimate	Standard Error	Lower 95% CL	Upper 95% CL	Z	Pr > |Z|
Intercept		0.3172	0.0704	0.1791	0.4552	4.50	<.0001
earlystart	1	0.0915	0.1119	-0.1279	0.3108	0.82	0.4138
earlystart	2	0.1410	0.0983	-0.0516	0.3336	1.43	0.1513
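As a cross-check, the estimates with reference level 0 can be derived by hand from the earlier output that used the default GLM coding (reference level 2). The sketch below does this arithmetic in a DATA step; the data set reparam_check and its variable names are hypothetical, and the input values are simply typed in from the first output table.

```sas
/* Sketch: derive the ref='0' estimates from the GLM-coded (ref = level 2) ones */
data reparam_check;
  /* estimates copied by hand from the first output table */
  int_glm = 0.4582;
  es0_glm = -0.1410;
  es1_glm = -0.0495;

  /* implied estimates with earlystart=0 as the reference */
  int_ref0 = int_glm + es0_glm;   /* 0.3172, the new Intercept */
  es1_ref0 = es1_glm - es0_glm;   /* 0.0915, the new earlystart 1 */
  es2_ref0 = -es0_glm;            /* 0.1410, the new earlystart 2 */

  /* fitted probability of chomj=1 in each group (inverse logit) */
  p_es0 = 1 / (1 + exp(-int_ref0));
  p_es1 = 1 / (1 + exp(-(int_ref0 + es1_ref0)));
  p_es2 = 1 / (1 + exp(-(int_ref0 + es2_ref0)));
run;

proc print data=reparam_check noobs;
  format _numeric_ 8.4;
run;
```

The three derived values match the table above, and the row for earlystart 2 keeps the same standard error and p-value as the earlier earlystart 0 row because it is the same contrast with the sign flipped.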
