Hi, I am working on an epidemiologic analysis using SAS 9.4 and would appreciate guidance on how to correctly estimate a Relative Excess Risk due to Interaction (RERI) when the exposure variable has three categories.
Context of the analysis:
My question:
Is there a valid way in SAS to compute a RERI when the exposure has three categories after running PROC MI and PROC MIANALYZE?
Specifically, I am trying to understand whether:
If there is an example of code or recommended workflow for handling RERI with a three‑level exposure after multiple imputation, I would be grateful for any guidance.
Thank you very much for your help.
Best regards,
Here is my code so far:
RERI appears to be the same idea as the "difference in difference of means" but looking at the difference in difference of relative risks instead of probabilities (means). See this note on estimating the difference in difference of means. It is essentially the interaction between the predictors but using the mean, or relative risk, in each term of the RERI formula. The "Generalized Linear Models with a Non-Identity Link" section of the note uses a logistic model on binary predictors and response. It then computes the difference in difference of probabilities which is often thought to be more easily understood than log odds, odds ratios, or relative risks. Nevertheless, the NLMeans macro could be used compute the RERI for this model. You might need to download the latest version of the macro to obtain the functionality shown below.
For example, these statements fit the model, save the necessary components for NLMeans, and then NLMeans computes RERI by specifying its formula in f= as shown in flabel=.
proc logistic;
class a b / param=glm ref=first;
model y(event="1") = a b a*b;
lsmeans a*b / e ilink;
ods output coef=coeffs;
store log;
run;
%nlmeans(instore=log, coef=coeffs, link=logit, f=mu1/mu4-mu2/mu4-mu3/mu4+1,
title=RERI, flabel=RR(A,B)-RR(A-,B)-RR(A,B-)+1)
Note, however, that some would use a different model which defines the additivity of the predictors differently. For example, see Richardson DB, Kaufman JS. Estimation of the relative excess risk due to interaction and associated confidence bounds. Am J Epidemiol. 2009 Mar 15;169(6):756-60.
In the case of a multinomial predictor, as in your case, you could either compute a separate RERI for each non-reference level, or combine the non-reference levels to make the predictor binary.
Using the example in the note, suppose that A has three levels (2, 1, 0). The following computes the RERI for A=1 and then for A=2, considering A=0 as the reference level.
%nlmeans(instore=log, coef=coeffs, link=logit, f=mu1/mu6-mu2/mu6-mu5/mu6+1,
title=RERI(A1), flabel=RR(A1,B)-RR(A1-,B)-RR(A1,B-)+1)
%nlmeans(instore=log, coef=coeffs, link=logit, f=mu3/mu6-mu4/mu6-mu5/mu6+1,
title=RERI(A2), flabel=RR(A2,B)-RR(A2-,B)-RR(A2,B-)+1)
On further review and considering VanderWeele & Knol (2014), my approach with NLMeans might not be correct. As they show for a logistic model on two binary predictors (logit(p)=b0+b1*A+b2*B+b3*AB), RERI is defined as a function of odds ratios and computed as
RERI = OR(A1B1) - OR(A1B0) - OR(A0B1) + 1 = exp(b1+b2+b3) - exp(b1) - exp(b2) + 1
For example, using the logistic model and comparing the odds of A1B1 to A0B0:
logit(p_A1B1) = logodds(A1B1) = b0+b1+b2+b3
and
logit(p_A0B0) = logodds(A0B0) = b0
The difference is
logodds(A1B1)-logodds(A0B0) = log(OR(A1B1) = b1+b2+b3. So, OR(A1B1) = exp(b1+b2+b3). Similarly for OR(A1B0) and OR(A0B1).
Since the RERI is just a function of the logistic model parameters, it can be easily computed using the NLEST macro. For the binary predictors example in the note on difference in difference that I referred to earlier, these statements fit the model and compute RERI.
proc logistic;
class a b / param=glm ref=first;
model y(event="1") = a b a*b;
lsmeans a*b / e ilink;
ods output coef=coeffs;
store log;
run;
%nlest(instore=log, label=RERI_or, f=exp(b_p2+b_p4+b_p6)-exp(b_p2)-exp(b_p4)+1)
Or you could use NLMeans, but by defining RERI as a function of odds ratios rather than relative risks as I did earlier:
%nlmeans(instore=log, coef=coeffs, link=logit, flabel=RERI_or,
f=(mu1/(1-mu1))/(mu4/(1-mu4)) - (mu2/(1-mu2))/(mu4/(1-mu4)) - (mu3/(1-mu3))/(mu4/(1-mu4)) + 1)
This method reproduces the RERI computed for the second example in the Richardson & Kaufman paper that uses a different model form.
Again, for the multi-level predictor situation like yours, you can do the RERI computation for each of the non-reference levels. For the example of A with 3 levels (0, 1, 2) and B with 2 levels (0, 1):
proc logistic;
class a b / param=glm ref=first;
model y(event="1") = a b a*b;
store log;
run;
%nlest(instore=log, label=RERI_or_A2, f=exp(b_p3+b_p5+b_p9)-exp(b_p3)-exp(b_p5)+1)
%nlest(instore=log, label=RERI_or_A1, f=exp(b_p2+b_p5+b_p7)-exp(b_p2)-exp(b_p5)+1)
---
VanderWeele, T. J., & Knol, M. J. (2014). A tutorial on interaction. Epidemiologic methods, 3(1), 33-72.
Dive into keynotes, announcements and breakthroughs on demand.
Explore Now →ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.