corbinLA
Calcite | Level 5

Hello All! I have been computing relative risk for cohort data with the TABLES statement in PROC FREQ and am a little confused by the meaning of the two different relative risk values. I understand that "Relative Risk (Column 1)" refers to the intersection of the first row and the first column. But what does the column 2 risk refer to? Is it the intersection of the first row and the second column, or of the second row and the second column? Any help is much appreciated!

1 ACCEPTED SOLUTION

FreelanceReinh
Jade | Level 19

Hello @corbinLA and welcome to the SAS Support Communities!

 

The relative risk estimates of PROC FREQ are ratios of row proportions ("Row Pct / 100") in the 2x2 table with the first row in the numerator and the second row in the denominator. If you denote these proportions for the first column with p1 (row 1) and p2 (row 2) and for the second column with q1 (row 1) and q2 (row 2), then p1/p2 is the (estimated) "Relative Risk (Column 1)" and q1/q2 = (1-p1)/(1-p2) is the (estimated) "Relative Risk (Column 2)."
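Written out in terms of cell counts, the definitions above look as follows. The notation n_ij (count in row i, column j) is my own shorthand and does not appear in the PROC FREQ output; it also shows how the odds ratio relates to the two relative risks.

```latex
\[
p_1 = \frac{n_{11}}{n_{11}+n_{12}}, \qquad
p_2 = \frac{n_{21}}{n_{21}+n_{22}}, \qquad
q_1 = 1-p_1, \qquad
q_2 = 1-p_2
\]
\[
\mathrm{RR}_{\text{col 1}} = \frac{p_1}{p_2}, \qquad
\mathrm{RR}_{\text{col 2}} = \frac{q_1}{q_2} = \frac{1-p_1}{1-p_2}, \qquad
\mathrm{OR} = \frac{n_{11}\,n_{22}}{n_{12}\,n_{21}}
            = \frac{\mathrm{RR}_{\text{col 1}}}{\mathrm{RR}_{\text{col 2}}}
\]
```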

 

Example:

Fictitious "cohort study" with 100,000 smokers and 100,000 non-smokers, who (by the end of the study) develop some sort of condition (cond=1) with probabilities 0.25+0.45=0.7 and 0.25, respectively.

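data have;
call streaminit(27182818);  /* fix the seed so the simulation is reproducible */
do smoker='Y','n';  /* in ASCII, 'Y' sorts before lowercase 'n', so smokers form row 1 */
  do _n_=1 to 100000;
    /* cond=1 with probability 0.7 (smokers) or 0.25 (non-smokers), otherwise cond=2 */
    cond=rand('table',0.25+0.45*(smoker='Y'));
    output;
  end;
end;
run;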

proc freq data=have;
tables smoker*cond / relrisk;
run;

Result:

Table of smoker by cond

smoker     cond

Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|  Total
---------+--------+--------+
Y        |  69938 |  30062 | 100000
         |  34.97 |  15.03 |  50.00
         |  69.94 |  30.06 |
         |  73.73 |  28.59 |
---------+--------+--------+
n        |  24918 |  75082 | 100000
         |  12.46 |  37.54 |  50.00
         |  24.92 |  75.08 |
         |  26.27 |  71.41 |
---------+--------+--------+
Total       94856   105144   200000
            47.43    52.57   100.00


Statistics for Table of smoker by cond

                  Odds Ratio and Relative Risks

Statistic                        Value       95% Confidence Limits
------------------------------------------------------------------
Odds Ratio                      7.0100        6.8733        7.1495
Relative Risk (Column 1)        2.8067        2.7746        2.8392
Relative Risk (Column 2)        0.4004        0.3964        0.4045

Sample Size = 200000

SAS doesn't know whether cond=1 or cond=2 is the condition of interest, so it estimates the relative risk for both columns. The true values are 0.7/0.25=2.8 for column 1 and (1-0.7)/(1-0.25)=0.4 for column 2; the estimates are 0.69938/0.24918=2.8067 and 0.30062/0.75082=0.4004.
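If you would like to double-check these numbers by hand, a small DATA step along the following lines should do it. This is only a sketch: the cell counts are typed in from the table above, and the variable names (n11, rr_col1, etc.) are made up for illustration, not anything PROC FREQ produces.

```sas
data _null_;
  /* cell counts copied from the 2x2 table: rows = smoker (Y, n), columns = cond (1, 2) */
  n11 = 69938; n12 = 30062;
  n21 = 24918; n22 = 75082;

  /* row proportions for column 1 (i.e., "Row Pct / 100") */
  p1 = n11 / (n11 + n12);
  p2 = n21 / (n21 + n22);

  rr_col1    = p1 / p2;              /* Relative Risk (Column 1) */
  rr_col2    = (1 - p1) / (1 - p2);  /* Relative Risk (Column 2) */
  odds_ratio = rr_col1 / rr_col2;    /* equals n11*n22 / (n12*n21) */

  /* write the results to the log with four decimals */
  put rr_col1=8.4 rr_col2=8.4 odds_ratio=8.4;
run;
```

The values written to the log should agree with the "Value" column of the "Odds Ratio and Relative Risks" table.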


3 REPLIES
corbinLA
Calcite | Level 5

Thank you, this is an excellent explanation!

Watts
SAS Employee

Thanks @FreelanceReinh 

 

Also, if you want to display the relative risk for only one column, you can use the COLUMN= suboption of RELRISK to specify which column. For example, this displays the column 1 relative risk together with score confidence limits.

 

tables smoker * cond / relrisk(column=1 cl=score);
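For context, here is that TABLES statement inside a complete PROC FREQ step. This is just a sketch that assumes the simulated data set `have` from the example earlier in this thread and a release that supports the COLUMN= and CL= suboptions.

```sas
proc freq data=have;
  /* request only the column 1 relative risk, with score confidence limits */
  tables smoker * cond / relrisk(column=1 cl=score);
run;
```

Because smoker='Y' is row 1 and cond=1 is column 1 in that table, this reports the risk of cond=1 for smokers relative to non-smokers.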
