lisahoward
Calcite | Level 5

Hello.

I am trying to calculate the RR using Poisson regression for a cohort of patients (treated=1, untreated=0) who have an outcome of X, using the syntax below. I calculated the incidence rate (IR) by exposure in a previous data step, as I need it in the dataset used by this procedure. The model is adjusted for some specified 0/1 covariates, which I pass in through the &covariates macro variable. I also need to include two other variables as covariates, Age_At_Index and Index_Year. My question is: do I need to put them in the CLASS statement? The model looks at Incidence Rate = Exposure.

Many thanks in advance for any help.

ods listing;
proc genmod data=IRsforCohort;
  class Exposure IndexYear Age_At_Index;
  model IR=Exposure &covariates / dist=poisson link=log type3;
  estimate ' 1 Vs 0' Exposure 1 -1;
  ods output ParameterEstimates=RR&infile._adj;
  title1 "Proc GENMOD for Adjusted";
run;

1 ACCEPTED SOLUTION

JacobSimonsen
Barite | Level 11

No, it's the opposite.

Continuous variables should not be included in the CLASS statement.

Categorical variables should be included in the CLASS statement.

18 REPLIES
Ksharp
Super User

Yes. If I understood correctly, you should put all the explanatory variables into CLASS when calculating a ratio.

lisahoward
Calcite | Level 5

Thank you. So I would put the &covariates in the CLASS statement as well? I had understood that I only needed to put continuous variables (such as Age and Year) in the CLASS statement, and that any other covariates that were categorical did not need to be listed there. The &covariates list contains Age and Year as well as all the 0/1 variables.

JacobSimonsen
Barite | Level 11

No, it's the opposite.

Continuous variables should not be included in the CLASS statement.

Categorical variables should be included in the CLASS statement.

Ksharp
Super User

Jacob,

The OP is fitting a model to estimate rates and rate ratios by Poisson regression.

Check the following URL. Note that AGE is included in the CLASS statement there.

http://support.sas.com/kb/24/188.html

Xia Keshan

JacobSimonsen
Barite | Level 11

The question is whether or not continuous variables should be included in the CLASS statement.

When a variable is not included in the CLASS statement, you assume that log(mean) changes linearly with the variable. This is meaningful for continuous variables.

When the variable is included in the CLASS statement, log(mean) is allowed to differ for each level of the variable, without any assumption about a linear slope. There is therefore one parameter for each level of the variable. If the variable truly is continuous, there can be as many parameters as there are observations, which is not meaningful.

If age or weight is measured exactly, then it is continuous and should not be included in CLASS. If the values are grouped (young/old or light/heavy), then the variable can be included in CLASS. This is not specific to Poisson regression; the same rule applies to most other regression models.
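The two cases can be written out as follows. This is only a sketch in notation that does not come from the thread: $\mu$ is the expected count, $x$ is the variable in question, and $a_1,\dots,a_K$ are its distinct levels, with $a_K$ assumed to be the reference level.

```latex
% x entered as a continuous covariate: a single slope parameter
\log \mu = \beta_0 + \beta_1 x

% x listed in CLASS: one parameter per non-reference level
\log \mu = \beta_0 + \sum_{k=1}^{K-1} \gamma_k \,\mathbf{1}[x = a_k]
```

With exact ages, $K$ can approach the number of observations, which is the problem described above.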

lisahoward
Calcite | Level 5

Dear Jacob,

Thank you so much for your help.

I will modify the syntax so that only the categorical variables are in the CLASS statement, and the MODEL statement includes the categorical variables plus age and year of index. I appreciate your help.

Could I ask your advice on the offset option, please? Should I include it? I am unsure what it does and how it will affect the RR. The RR I am trying to get is based on the following:

Treatment difference in the diagnosis (headache) incidence rate during the follow-up period (defined as 90 days) between the treated and untreated groups.

I am doing both unadjusted and adjusted relative rates.

Also, should I display the L'Beta estimate or the mean estimate in my report to represent the RR? I apologise, I am not a statistician, so any advice is truly appreciated.

Many thanks in advance.

JacobSimonsen
Barite | Level 11

Create a variable offset=log(personyears) and specify it as the offset in the MODEL statement. It is necessary because the expected number of events is proportional to person-years.

That means that, on average, events = exp(Xβ)*personyears.

Taking logs on both sides gives log(events) = Xβ + log(personyears). For this reason, log(personyears) should be used as the offset. Without the offset variable your results will be wrong.

I think most non-statisticians will not be interested in the L'Beta estimate, but only in the estimates of rates and rate ratios. With confidence intervals, of course.

If you have precise entry and exit information on your study individuals, then I would recommend Cox regression instead of Poisson regression.
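As a sketch of why the exponentiated exposure coefficient is the rate ratio once the offset is in place: the symbols $T$ (person-years), $\beta_E$ (coefficient of the 0/1 exposure indicator) and $Z\gamma$ (the remaining covariates) are assumptions introduced here, not names from the thread.

```latex
\begin{align*}
\log E[\text{events}] &= \beta_0 + \beta_E\,\text{Exposure} + Z\gamma + \log T \\
\frac{E[\text{events}]}{T} &= \exp(\beta_0 + \beta_E\,\text{Exposure} + Z\gamma) \\
\text{rate ratio (treated vs. untreated)} &=
  \frac{\exp(\beta_0 + \beta_E + Z\gamma)}{\exp(\beta_0 + Z\gamma)} = \exp(\beta_E)
\end{align*}
```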

lisahoward
Calcite | Level 5

Hi,

How would I go about creating offset=log(personyears)?

This is my current syntax:

ods listing;
proc genmod data=IRsforCohort;
  class TestRx &covariates;
  model IR=TestRX &covariates Age_At_Index IndexYear / dist=poisson link=log type3;
  Estimate ' 1 Vs 0' testrx 1 -1;
  ods output Estimates=RR&infile._adj;
  title1 "Proc GENMOD for Adjusted &infile &time";
run;

Do I just add the offset= option to the MODEL statement, like this:

ods listing;
proc genmod data=IRsforCohort;
  class TestRx &covariates;
  model IR=TestRX &covariates Age_At_Index IndexYear / dist=poisson offset= personyears link=log type3;
  Estimate ' 1 Vs 0' testrx 1 -1;
  ods output Estimates=RR&infile._adj;
  title1 "Proc GENMOD for Adjusted &infile &time";
run;

JacobSimonsen
Barite | Level 11

The offset variable should be created in a data step before PROC GENMOD. "Person-years" may not be the right word, but it should be a variable that measures how much observation time you have on each observation. Let's say you have an entry and exit time for each individual; then you create the offset like this:

data irsforcohort;
  set irsforcohort;
  myoffset=log(exit-entry);
run;

and the model statement in the genmod:

model IR=TestRX &covariates Age_At_Index IndexYear /  dist = poisson  link=log  type3 offset=myoffset;

If you have the same amount of observation time on each observation, then you can omit the offset.

lisahoward
Calcite | Level 5

Great, thank you so much. I am very appreciative of the time you have taken to answer my questions. I already created person-years in a previous data step so that I could calculate the incidence rate of the outcome. I use the IR in the model statement:

IR=Exposure. So I will include the person-years variable in the offset option. I presume I am correct in assuming it should be IR=Exposure and not Outcome=Exposure, as the spec wording I am going off in the protocol is:

Unadjusted model: Headache rate = Exposure;

Adjusted model: AMI rate = Exposure Group + Age + indexyear + baseline risk factors

I am assuming Headache rate is the incidence rate and not the outcome?

JacobSimonsen
Barite | Level 11

No, not person-years in the offset: LOG(personyears) in the offset.
And on the left side of the "=" you should have the number of events, which I assume is the outcome variable.

lisahoward
Calcite | Level 5

Sorry to bother you again. I changed the syntax to outcome=EXPOSURE instead of IR=EXPOSURE, but I think I have the 'offset=log(PersonYrs)' bit wrong, as I get an error. Please can you help? I have created PersonYrs, so this variable exists in the IRsforCohort dataset.

Thanks.

proc genmod data=IRsforCohort;
  class EXPOSURE &covariates;
  model outcome=EXPOSURE &covariates Age_At_Index IndexYear / dist=poisson offset=log(PersonYrs) link=log type3;
  Estimate ' 1 Vs 0' EXPOSURE 1 -1;
  ods output Estimates=RR&infile._adj;
  title1 "Proc GENMOD for Adjusted &infile &time";
run;

ERROR: Variable LOG not found.

ERROR 22-322: Syntax error, expecting one of the following: ;, A, AGGREGATE, ALPH, ALPHA, CICONV, CL, CLCONV, CODING, CONVERGE, CONVH,

              CORRB, COVB, D, DIAGNOSTICS, DIST, DSCALE, ERR, ERROR, EXPECTED, ID, INFLUENCE, INITIAL, INTERCEPT, ITPRINT, LINK, LOGNB,

              LRCI, LRCL, MAXIT, MAXITER, NOINT, NOLOGNB, NOPRINTCL, NOSCALE, OBSTAT, OBSTATS, OFFSET, P, PRED, PREDICTED, PSCALE, R,

              RESIDUAL, RORDER, SCALE, SCORING, SINGULAR, TYPE1, TYPE3, TYPE3WALD, WALD, WALDCI, WALDCL, XVARS.

JacobSimonsen
Barite | Level 11

You should create the offset variable in a data step beforehand, as I showed with "myoffset" above. The OFFSET= option takes a variable name, not an expression.
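A minimal sketch of that step, assuming the person-time variable is named PersonYrs as in the post above, that myoffset is just an illustrative name, and that &covariates holds only the 0/1 covariates. The guard on non-positive person-time is an addition of this sketch, not something discussed in the thread.

```sas
/* Step 1: build the offset in a data step, since OFFSET= accepts only a variable name */
data IRsforCohort;
  set IRsforCohort;
  /* log() is undefined for zero or negative person-time, so leave those rows missing */
  if PersonYrs > 0 then myoffset = log(PersonYrs);
  else myoffset = .;
run;

/* Step 2: model the event count and pass the precomputed offset */
proc genmod data=IRsforCohort;
  class Exposure &covariates;
  model outcome = Exposure &covariates Age_At_Index IndexYear
        / dist=poisson link=log offset=myoffset type3;
run;
```

Observations with a missing offset are not used in the fit, so it is worth checking how many rows have zero person-time before running the model.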

lisahoward
Calcite | Level 5

Thank you, Jacob. I created an offset variable and then included it in my procedure:

proc genmod data=IRsforCohort;
  class Exposure &covariates;
  model outcome=Exposure &covariates Age_At_Index IndexYear / dist=poisson offset=myoffset link=log type3;
  Estimate ' 1 Vs 0' Exposure 1 -1;
  ods output Estimates=RR&infile._adj;
  title1 "Proc GENMOD for Adjusted &infile &time";
run;

I replaced the IR variable I had created in a previous data step with the Outcome variable, so the model is now outcome=Exposure.

I get the following results. In your opinion, should I use the Mean rather than the L'Beta columns to show the relative risk for:

Treatment difference in headache rate during the 90-day follow-up period for the treated and untreated groups

                                 Contrast Estimate Results

              Mean          Mean            L'Beta  Standard            L'Beta
Label     Estimate   Confidence Limits    Estimate     Error  Alpha   Confidence Limits   Chi-Square  Pr > ChiSq
1 Vs 0      0.9714    0.5534    1.7049     -0.0291    0.2870   0.05   -0.5916    0.5335         0.01      0.9194
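On the Mean versus L'Beta question, the numbers in this output are consistent with the Mean columns being the L'Beta columns passed through the inverse of the log link, that is, exponentiated. A quick check using only the values printed above:

```latex
\begin{align*}
\exp(-0.0291) &\approx 0.9713 \quad (\text{reported Mean Estimate: } 0.9714) \\
\exp(-0.5916) &\approx 0.5534 \quad (\text{reported lower limit: } 0.5534) \\
\exp(0.5335)  &\approx 1.7049 \quad (\text{reported upper limit: } 1.7049)
\end{align*}
```

The small difference in the first line comes from L'Beta being printed to four decimals. On that reading, the Mean Estimate and its confidence limits are the rate ratio for the contrast as coded, and L'Beta is the same quantity on the log scale.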
