Hello All! I have been computing relative risk for cohort data with the TABLES statement of PROC FREQ and am a little confused by the meaning of the two relative risk values. I understand that "Relative Risk (Column 1)" refers to the intersection of the first row and the first column. But what does the Column 2 risk refer to? Is it the intersection of the first row and the second column, or of the second row and the second column? Any help is much appreciated!
Accepted Solutions
Hello @corbinLA and welcome to the SAS Support Communities!
The relative risk estimates of PROC FREQ are ratios of row proportions ("Row Pct / 100") in the 2x2 table with the first row in the numerator and the second row in the denominator. If you denote these proportions for the first column with p1 (row 1) and p2 (row 2) and for the second column with q1 (row 1) and q2 (row 2), then p1/p2 is the (estimated) "Relative Risk (Column 1)" and q1/q2 = (1-p1)/(1-p2) is the (estimated) "Relative Risk (Column 2)."
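Written out in terms of cell counts, the two ratios look as follows. The symbols n_ij (count in row i, column j) and the RR labels are notation introduced here for illustration; they do not appear in the PROC FREQ output.

```latex
\[
p_1 = \frac{n_{11}}{n_{11}+n_{12}}, \qquad
p_2 = \frac{n_{21}}{n_{21}+n_{22}}, \qquad
q_1 = 1 - p_1, \qquad q_2 = 1 - p_2
\]
\[
\mathrm{RR}_{\text{Column 1}} = \frac{p_1}{p_2}, \qquad
\mathrm{RR}_{\text{Column 2}} = \frac{q_1}{q_2} = \frac{1-p_1}{1-p_2}
\]
\[
\mathrm{OR} = \frac{n_{11}\,n_{22}}{n_{12}\,n_{21}}
            = \frac{p_1/q_1}{p_2/q_2}
            = \frac{\mathrm{RR}_{\text{Column 1}}}{\mathrm{RR}_{\text{Column 2}}}
\]
```

The last line shows that the odds ratio is the quotient of the two relative risks, which gives a quick consistency check on the output.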
Example:
Fictitious "cohort study" with 100,000 smokers and 100,000 non-smokers, who (by the end of the study) develop some sort of condition (cond=1) with probabilities 0.25+0.45=0.7 and 0.25, respectively.
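data have;
  call streaminit(27182818);  /* fixed seed for reproducible results */
  /* 'Y' sorts before 'n' in ASCII, so smokers form row 1 of the table */
  do smoker='Y','n';
    do _n_=1 to 100000;
      /* cond=1 with probability 0.7 (smokers) or 0.25 (non-smokers), else cond=2 */
      cond=rand('table',0.25+0.45*(smoker='Y'));
      output;
    end;
  end;
run;

proc freq data=have;
  tables smoker*cond / relrisk;
run;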
Result:
Table of smoker by cond

smoker     cond

Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|  Total
---------+--------+--------+
Y        |  69938 |  30062 | 100000
         |  34.97 |  15.03 |  50.00
         |  69.94 |  30.06 |
         |  73.73 |  28.59 |
---------+--------+--------+
n        |  24918 |  75082 | 100000
         |  12.46 |  37.54 |  50.00
         |  24.92 |  75.08 |
         |  26.27 |  71.41 |
---------+--------+--------+
Total       94856   105144   200000
            47.43    52.57   100.00

Statistics for Table of smoker by cond

              Odds Ratio and Relative Risks

Statistic                        Value       95% Confidence Limits
------------------------------------------------------------------
Odds Ratio                      7.0100        6.8733        7.1495
Relative Risk (Column 1)        2.8067        2.7746        2.8392
Relative Risk (Column 2)        0.4004        0.3964        0.4045

Sample Size = 200000
SAS doesn't know whether cond=1 or cond=2 is the condition of interest, so it estimates the relative risk for both columns. Column 1: true value 0.7/0.25=2.8, estimate 0.69938/0.24918=2.8067. Column 2: true value (1-0.7)/(1-0.25)=0.4, estimate 0.30062/0.75082=0.4004.
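To check these estimates by hand, the point estimates can be recomputed from the four cell counts in the frequency table above. This is only a minimal sketch: the variable names (n11, p1, rr_col1, and so on) are made up for illustration, and the step reproduces the point estimates only, not the confidence limits.

```sas
data _null_;
  /* cell counts copied from the PROC FREQ table: n<row><column> */
  n11 = 69938;  n12 = 30062;   /* row 1: smoker='Y' */
  n21 = 24918;  n22 = 75082;   /* row 2: smoker='n' */

  /* row proportions for column 1 ("Row Pct / 100") */
  p1 = n11 / (n11 + n12);
  p2 = n21 / (n21 + n22);

  rr_col1    = p1 / p2;               /* Relative Risk (Column 1) */
  rr_col2    = (1 - p1) / (1 - p2);   /* Relative Risk (Column 2) */
  odds_ratio = rr_col1 / rr_col2;     /* equals (n11*n22)/(n12*n21) */

  /* write the three values to the log with four decimals */
  put rr_col1= 8.4 rr_col2= 8.4 odds_ratio= 8.4;
run;
```

The values written to the log should agree with the "Value" column of the "Odds Ratio and Relative Risks" table.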
Thank you, this is an excellent explanation!
Thanks @FreelanceReinh
Also, if you want to display the relative risk for only one column, you can use the COLUMN= option of RELRISK to specify which column. For example, this TABLES statement displays the column 1 relative risk together with score confidence limits:
tables smoker * cond / relrisk(column=1 cl=score);