telc24
Obsidian | Level 7

Hello,

 

I'm using PROC GENMOD in SAS 9.4 to calculate adjusted prevalence ratios and 95% CIs. In my adjusted models, the two output tables give different estimates for the same comparisons, sometimes in opposite directions. I'm confused about why the estimates differ for some, but not all, variables, and which are the "true" estimates that I should report. My code and a brief explanation of the model are below. Thank you!

 

The code below estimates the prevalence ratio for my binary outcome, disease status (yes/no). My primary exposure of interest is sex (binary), and the covariates are age (binary) and race/ethnicity (categorical with 5 levels).

 

proc genmod data=analysis;
class id age2(ref='1') sex (ref='M') race_ethnicity (ref='1');
model disease (event='1') = age2 sex race_ethnicity / dist=poisson link=log;
repeated subject=id /type=unstr;
estimate "PR for age2" age2 1 -1/exp;
estimate "PR for sex" sex 1 -1/exp;
estimate "PR for NHB vs NHW" race_ethnicity 1 -1 0 0 0/exp;
estimate "PR for H vs NHW" race_ethnicity 1 0 -1 0 0/exp;
estimate "PR for NHA vs NHW" race_ethnicity 1 0 0 -1 0/exp;
estimate "PR for NHO vs NHW" race_ethnicity 1 0 0 0 -1/exp;
run;

Here is the output. The estimates for age2 and sex are the same in the top and bottom tables, but the race_ethnicity results differ, in some cases so much that the estimates point in opposite directions. For example, the estimate for race_ethnicity 4 is -0.0499 in the top table but 0.1609 in the bottom table.

telc24_0-1732560807342.png

Any documentation or other posts that you recommend would also be greatly appreciated! Thank you!

 

PaigeMiller
Diamond | Level 26

In the top table, you have race_ethnicity as numbers 1 through 5. It seems to me the names used in the text in the bottom table (example: "PR for H vs NHW") don't align with 1 through 5. Can you tell us the alignment? Because a different alignment could make the results in the bottom table align with the results in the top table.

--
Paige Miller
telc24
Obsidian | Level 7

Yes, sorry! race_ethnicity is a categorical variable with the following coding. NHW is the reference group.

1=NHW
2=NHB
3=H (shown as Hisp in the table, which uses a different coding)
4=NHA
5=NHO

 

Thank you!

PaigeMiller
Diamond | Level 26

In the top table, race_ethnicity = 2 shows an estimate of 0.2162 and this matches the bottom table estimate for NHO vs NHW.

 

So there is still some misalignment of numbers to labels, as far as I can tell. Perhaps having race_ethnicity (ref='1') affects this somehow. As a wild guess, re-run this code without race_ethnicity (ref='1') and see if the numbers match what you (and I) expect.

--
Paige Miller
telc24
Obsidian | Level 7

Great point about the race_ethnicity=2 estimate. 

 

Do you mean remove race_ethnicity(ref='1') from the CLASS statement completely? If I do that, will it be treated as a continuous variable? Alternatively, if I remove just the "(ref='1')", it automatically uses race_ethnicity=5 as the reference group.

PaigeMiller
Diamond | Level 26

Yes, remove the (ref='1') and see if the estimates are comparable.

--
Paige Miller
telc24
Obsidian | Level 7

Got it! When the only change I make is to remove the "(ref='1')", the estimates aren't comparable, except for the NHO group (like before). So now I'm even more puzzled about how to troubleshoot!

telc24_0-1732566260540.png

 

PaigeMiller
Diamond | Level 26

The estimates are comparable.

 

NHB vs NHW is estimate 1 minus estimate 2

Hisp vs NHW is estimate 1 minus estimate 3

and so on

 

 

So if you go back to the original tables, I think you will find something similar.

--
Paige Miller
telc24
Obsidian | Level 7

Ok I think I'm starting to understand! So I should use the exponentiated RR (PR) estimates in the second/bottom table, correct? Do you know why the ref='1' statement seems to be messing it up? 

 

As an aside, I think I found a mistake in my code: the reference group (which I want to be NHW) should get the "-1" instead of the "1".

proc genmod data=analysis;
class id age2(ref='1') sex (ref='M') race_ethnicity ;
model disease (event='1') = age2 sex race_ethnicity / dist=poisson link=log;
repeated subject=id /type=unstr;
estimate "PR for age2" age2 1 -1/exp;
estimate "PR for sex" sex 1 -1/exp;
estimate "PR for NHB vs NHW" race_ethnicity -1 1 0 0 0/exp;
estimate "PR for H vs NHW" race_ethnicity -1 0 1 0 0/exp;
estimate "PR for NHA vs NHW" race_ethnicity -1 0 0 1 0/exp;
estimate "PR for NHO vs NHW" race_ethnicity -1 0 0 0 1/exp;
run;

When I make that adjustment, the directionality of the estimates makes sense.

 

telc24_0-1732568024697.png

 

Thank you!!

PaigeMiller
Diamond | Level 26

"Do you know why the ref='1' statement seems to be messing it up?"

I don't know. I suspect @StatDave or @jiltao might have an explanation.

--
Paige Miller
StatDave
SAS Super FREQ

I believe the problem is that you want to make comparisons against your reference level, 1. The REF='1' option makes it the LAST level in the parameter estimates table, but the ESTIMATE statements in your original post contrast each level against the FIRST level. Try moving all the "1" values from the first position to the last position. For example, the first ESTIMATE statement for race_ethnicity would become:

estimate "PR for NHB vs NHW" race_ethnicity -1 0 0 0 1/exp;

This should make each value in the L'Beta column just the negative of the corresponding parameter estimate, since you are then estimating reference minus level_i differences, the reverse of what the parameters estimate (level_i minus reference).

 

The values for an effect in the ESTIMATE statement are applied to the parameter estimates in the order in which they appear in the parameter estimates table. 
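
As a sketch of that ordering rule, and not a tested program: if the ratio is wanted in the level-versus-reference direction that the labels describe, the signs can be flipped so that each non-reference level gets 1 and the reference gets -1 in the last position. This assumes the parameter estimates table lists race_ethnicity in the order 2, 3, 4, 5, 1 (reference last, as described above); check the table in your own output before copying the coefficients.

```sas
/* Sketch only. Assumes the Parameter Estimates table orders race_ethnicity
   as 2, 3, 4, 5, 1, so the reference level (1 = NHW) is in the LAST position. */
proc genmod data=analysis;
  class id age2(ref='1') sex(ref='M') race_ethnicity(ref='1');
  model disease(event='1') = age2 sex race_ethnicity / dist=poisson link=log;
  repeated subject=id / type=unstr;
  /* coefficient positions:                   2  3  4  5  1(ref) */
  estimate "PR for NHB vs NHW" race_ethnicity 1  0  0  0 -1 / exp;
  estimate "PR for H vs NHW"   race_ethnicity 0  1  0  0 -1 / exp;
  estimate "PR for NHA vs NHW" race_ethnicity 0  0  1  0 -1 / exp;
  estimate "PR for NHO vs NHW" race_ethnicity 0  0  0  1 -1 / exp;
run;
```

With the default GLM parameterization the reference-level parameter is set to 0, so each L'Beta value from these statements should equal the corresponding parameter estimate, and the exponentiated value is the prevalence ratio for that level relative to NHW.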

telc24
Obsidian | Level 7

That fixed it! Thank you!! I didn't realize that's what was occurring in the background when using ref='1'. Thank you for the explanation and your time!
