Hello All! I have been computing relative risk for cohort data with the TABLES statement of PROC FREQ and have been a little confused by the meaning of the two different risk values. For "Relative Risk (Column 1)" I understand that this refers to the intersection of the first row and the first column. But what does the column 2 risk refer to? Is this the intersection of the first row and the second column or the second row and the second column? Any help is much appreciated!
Hello @corbinLA and welcome to the SAS Support Communities!
The relative risk estimates of PROC FREQ are ratios of row proportions ("Row Pct / 100") in the 2x2 table with the first row in the numerator and the second row in the denominator. If you denote these proportions for the first column with p1 (row 1) and p2 (row 2) and for the second column with q1 (row 1) and q2 (row 2), then p1/p2 is the (estimated) "Relative Risk (Column 1)" and q1/q2 = (1-p1)/(1-p2) is the (estimated) "Relative Risk (Column 2)."
Example:
Fictitious "cohort study" with 100,000 smokers and 100,000 non-smokers, who (by the end of the study) develop some sort of condition (cond=1) with probabilities 0.25+0.45=0.7 and 0.25, respectively.
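data have;
   call streaminit(27182818);   /* fixed seed so the simulation is reproducible */
   do smoker='Y','n';           /* in ASCII order 'Y' sorts before 'n', so smokers form row 1 */
      do _n_=1 to 100000;
         /* rand('table',p) returns 1 with probability p and 2 otherwise */
         cond=rand('table',0.25+0.45*(smoker='Y'));
         output;
      end;
   end;
run;

proc freq data=have;
   tables smoker*cond / relrisk;
run;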
Result:
Table of smoker by cond

smoker     cond

Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|  Total
---------+--------+--------+
Y        |  69938 |  30062 | 100000
         |  34.97 |  15.03 |  50.00
         |  69.94 |  30.06 |
         |  73.73 |  28.59 |
---------+--------+--------+
n        |  24918 |  75082 | 100000
         |  12.46 |  37.54 |  50.00
         |  24.92 |  75.08 |
         |  26.27 |  71.41 |
---------+--------+--------+
Total        94856   105144   200000
             47.43    52.57   100.00
Statistics for Table of smoker by cond

Odds Ratio and Relative Risks

Statistic                        Value     95% Confidence Limits
------------------------------------------------------------------
Odds Ratio                      7.0100       6.8733       7.1495
Relative Risk (Column 1)        2.8067       2.7746       2.8392
Relative Risk (Column 2)        0.4004       0.3964       0.4045

Sample Size = 200000
SAS doesn't know whether cond=1 or cond=2 is the condition of interest, so it estimates the relative risk for both columns. True values: 0.7/0.25=2.8 for column 1 and (1-0.7)/(1-0.25)=0.4 for column 2. Estimates: 0.69938/0.24918=2.8067 and 0.30062/0.75082=0.4004, respectively.
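To make the arithmetic concrete, here is a minimal sketch that recomputes the point estimates by hand from the four cell counts in the table above. The dataset name check and the variable names (n11, p1, rr_col1, and so on) are just illustrative choices, not anything PROC FREQ produces.

```sas
/* Hand check of the point estimates, using the cell counts from the table above. */
/* All names here are hypothetical; row 1 = smoker 'Y', row 2 = smoker 'n'.       */
data check;
   n11 = 69938; n12 = 30062;        /* row 1: cond=1, cond=2 */
   n21 = 24918; n22 = 75082;        /* row 2: cond=1, cond=2 */
   p1 = n11 / (n11 + n12);          /* row 1 proportion in column 1 */
   p2 = n21 / (n21 + n22);          /* row 2 proportion in column 1 */
   rr_col1 = p1 / p2;               /* Relative Risk (Column 1) */
   rr_col2 = (1 - p1) / (1 - p2);   /* Relative Risk (Column 2) */
   odds_ratio = rr_col1 / rr_col2;  /* algebraically equal to (n11*n22)/(n12*n21) */
run;

proc print data=check noobs;
   var p1 p2 rr_col1 rr_col2 odds_ratio;
   format p1 p2 rr_col1 rr_col2 odds_ratio 8.4;
run;
```

The printed values should agree with the "Value" column of the PROC FREQ output. The last assignment also shows how the three statistics hang together: the odds ratio is the column 1 relative risk divided by the column 2 relative risk.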
Thank you, this is an excellent explanation!
Thanks @FreelanceReinh
Also, if you want to display the relative risk for only one column, you can use the RELRISK COLUMN= option to specify which column. For example, this displays the column 1 relative risk together with score confidence limits.
tables smoker * cond / relrisk(column=1 cl=score);
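In case a complete step is helpful, here is a sketch that puts that TABLES statement into a full PROC FREQ call. It assumes you only have the four cell counts from the table earlier in the thread rather than one row per subject, so it reads them into a hypothetical dataset counts and uses a WEIGHT statement.

```sas
/* Cell counts copied from the cross-tabulation earlier in the thread. */
/* The dataset name counts and the variable name count are made up.    */
data counts;
   input smoker $ cond count;
   datalines;
Y 1 69938
Y 2 30062
n 1 24918
n 2 75082
;

proc freq data=counts;
   weight count;                                     /* each row stands for count subjects */
   tables smoker*cond / relrisk(column=1 cl=score);  /* column 1 only, score limits */
run;
```

As before, this relies on 'Y' sorting before 'n' (ASCII order) so that smokers end up in row 1. The point estimate should be the same 2.8067 as above; the score confidence limits may differ slightly from the default limits shown in the earlier output.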