# Using continuous variables in CLASS statement in PROC GENMOD

Hello,

I apologize if this is obvious, but I'm having trouble finding a clear answer. I am using PROC GENMOD to estimate incidence rates (adjusted for age distribution) for different regions, using the ESTIMATE statement. However, I am uncertain which variables to include in my CLASS statement. Since region is already coded as dummy variables for its levels, I know it doesn't need to be in the CLASS statement. But if I am adjusting for age, age^2 and age^-1, do I put those terms in the CLASS statement? When I leave the age terms out of the CLASS statement, my incidence rates seem too high.

If possible, please explain to me how the CLASS statement functions when estimating adjusted rates.  Doesn't it essentially make dummy variables across the levels of all of the variables included in the CLASS statement?  What I don't understand is why this is needed to estimate the incidence rates.

Code below:

PROC GENMOD DATA=work.Merge_CountPT_AgeRegion1;

CLASS age agesq ageneg1;

MODEL cases = age agesq ageneg1 region2 region3 region4 region5;

ESTIMATE "IR: Region 1" int 1 region2 0 region3 0 region4 0 region5 0;

ESTIMATE "IR: Region 2" int 1 region2 1 region3 0 region4 0 region5 0;

ESTIMATE "IR: Region 3" int 1 region2 0 region3 1 region4 0 region5 0;

ESTIMATE "IR: Region 4" int 1 region2 0 region3 0 region4 1 region5 0;

ESTIMATE "IR: Region 5" int 1 region2 0 region3 0 region4 0 region5 1;

RUN;

## Re: Using continuous variables in CLASS statement in PROC GENMOD

Take a look at the write-up for the ESTIMATE statement under Shared Concepts and Topics of the SAS/STAT documentation, as well as in the GENMOD documentation.

The CLASS statement will generate dummy codes, and there are at least 8 different ways these can be generated using the PARAM= option. It is critical to know which parameterization is in effect when writing ESTIMATE statements.
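As a rough illustration of why the parameterization matters, the sketch below fits region as a CLASS variable under two codings and writes the matching ESTIMATE for Region 2. It assumes a single hypothetical variable `region` with levels 1 to 5 (the original post has precoded dummies instead) and leaves out any DIST=, LINK=, or offset options, as the original code does.

```sas
/* Sketch only: `region` (levels 1-5) is a hypothetical single variable
   standing in for the precoded dummies region2-region5. */

/* GLM coding (the GENMOD default): one column per level, 5 in total. */
PROC GENMOD DATA=work.Merge_CountPT_AgeRegion1;
  CLASS region / PARAM=GLM;
  MODEL cases = region;
  ESTIMATE "Region 2 (GLM coding)" intercept 1 region 0 1 0 0 0;
RUN;

/* Reference coding with level 1 as the reference: 4 columns (levels 2-5). */
PROC GENMOD DATA=work.Merge_CountPT_AgeRegion1;
  CLASS region / PARAM=REF REF=FIRST;
  MODEL cases = region;
  ESTIMATE "Region 2 (REF coding)" intercept 1 region 1 0 0 0;
RUN;
```

The number and meaning of the region coefficients differ between the two codings, so the ESTIMATE coefficients have to be checked against the parameterization in use. A continuous variable placed in CLASS is treated the same way: it is expanded into dummy columns for its distinct values rather than given a single slope.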

I am curious as to why you have dummy variables precoded for region, when it is a classic candidate for a CLASS variable, and then use age, agesq and ageneg1 as continuous covariates.  The five estimate statements could be replaced with a single LSMEANS statement.
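A minimal sketch of the LSMEANS alternative mentioned above, assuming a hypothetical single `region` variable in place of the precoded dummies and keeping the age terms as continuous covariates (not in CLASS). Distribution, link, and offset options are omitted because the original post does not show them.

```sas
/* Sketch only: `region` is a hypothetical variable with one value per region. */
PROC GENMOD DATA=work.Merge_CountPT_AgeRegion1;
  CLASS region;                            /* only region is categorical */
  MODEL cases = age agesq ageneg1 region;  /* age terms stay continuous  */
  LSMEANS region / CL;                     /* one adjusted estimate per region, with confidence limits */
RUN;
```

By default LSMEANS evaluates continuous covariates at their means, so each region's estimate is adjusted to the same covariate values. With a log link, recent SAS/STAT releases also offer an ILINK option on LSMEANS to report the estimates on the original scale.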

Steve Denham
