Statistical Procedures

corbinLA · Posted 11-13-2020 05:13 PM

Hello All! I have been computing relative risk for cohort data with the TABLES statement in PROC FREQ and am a little confused by the meaning of the two relative risk values. I understand that "Relative Risk (Column 1)" refers to the intersection of the first row and the first column. But what does the "Column 2" relative risk refer to? Is it the intersection of the first row and the second column, or of the second row and the second column? Any help is much appreciated!

FreelanceReinh · Posted 11-14-2020 05:27 AM

Hello @corbinLA and welcome to the SAS Support Communities!

The relative risk estimates of PROC FREQ are ratios of row proportions ("Row Pct / 100") in the 2x2 table with the first row in the numerator and the second row in the denominator. If you denote these proportions for the first column with p₁ (row 1) and p₂ (row 2) and for the second column with q₁ (row 1) and q₂ (row 2), then p₁/p₂ is the (estimated) "Relative Risk (Column 1)" and q₁/q₂ = (1-p₁)/(1-p₂) is the (estimated) "Relative Risk (Column 2)."

Example:

Fictitious "cohort study" with 100,000 smokers and 100,000 non-smokers, who (by the end of the study) develop some sort of condition (cond=1) with probabilities 0.25+0.45=0.7 and 0.25, respectively.

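data have;
call streaminit(27182818); /* fix the seed so the simulation is reproducible */
do smoker='Y','n';
  do _n_=1 to 100000;
    /* cond=1 with probability 0.7 (smokers) or 0.25 (non-smokers), otherwise cond=2 */
    cond=rand('table',0.25+0.45*(smoker='Y'));
    output;
  end;
end;
run;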

proc freq data=have;
tables smoker*cond / relrisk;
run;

Result:

Table of smoker by cond

smoker     cond

Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|  Total
---------+--------+--------+
Y        |  69938 |  30062 | 100000
         |  34.97 |  15.03 |  50.00
         |  69.94 |  30.06 |
         |  73.73 |  28.59 |
---------+--------+--------+
n        |  24918 |  75082 | 100000
         |  12.46 |  37.54 |  50.00
         |  24.92 |  75.08 |
         |  26.27 |  71.41 |
---------+--------+--------+
Total       94856   105144   200000
            47.43    52.57   100.00


Statistics for Table of smoker by cond

                  Odds Ratio and Relative Risks

Statistic                        Value       95% Confidence Limits
------------------------------------------------------------------
Odds Ratio                      7.0100        6.8733        7.1495
Relative Risk (Column 1)        2.8067        2.7746        2.8392
Relative Risk (Column 2)        0.4004        0.3964        0.4045

Sample Size = 200000

SAS doesn't know whether cond=1 or cond=2 is the condition of interest, so it estimates the relative risk for both columns. For column 1 the true value is 0.7/0.25=2.8 and the estimate is 0.69938/0.24918=2.8067; for column 2 the true value is (1-0.7)/(1-0.25)=0.4 and the estimate is 0.30062/0.75082=0.4004.
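
As a cross-check (a minimal sketch, not PROC FREQ's internal code), the values above can be reproduced by hand from the cell counts. The variable names are made up for illustration, and the confidence limits assume the usual log-scale Wald formula, which appears to match the default limits shown above.

```sas
data _null_;
  /* cell counts copied from the table above */
  n11 = 69938; n12 = 30062;   /* smoker='Y': cond=1, cond=2 */
  n21 = 24918; n22 = 75082;   /* smoker='n': cond=1, cond=2 */

  p1 = n11 / (n11 + n12);     /* row 1 proportion in column 1 */
  p2 = n21 / (n21 + n22);     /* row 2 proportion in column 1 */

  rr_col1    = p1 / p2;                   /* "Relative Risk (Column 1)" */
  rr_col2    = (1 - p1) / (1 - p2);       /* "Relative Risk (Column 2)" */
  odds_ratio = (n11 * n22) / (n12 * n21);

  /* assumed: Wald limits computed on the log scale, column 1 only */
  se    = sqrt((1 - p1) / n11 + (1 - p2) / n21);
  z     = quantile('normal', 0.975);
  lower = rr_col1 * exp(-z * se);
  upper = rr_col1 * exp( z * se);

  put rr_col1= 8.4 rr_col2= 8.4 odds_ratio= 8.4 lower= 8.4 upper= 8.4;
run;
```

The values written to the log should agree with the "Odds Ratio and Relative Risks" table. Swapping the two rows would invert both relative risks, which is why the row order matters.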

corbinLA · Posted 11-14-2020 11:51 AM

Thank you, this is an excellent explanation!

Watts · Posted 11-14-2020 12:08 PM

Thanks @FreelanceReinh

Also, if you want to display the relative risk for only one column, you can use the COLUMN= option of RELRISK to specify which column. For example, this displays the column 1 relative risk together with score confidence limits:

tables smoker * cond / relrisk(column=1 cl=score);
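
For completeness, here is a sketch of that statement inside a full step. It assumes you only have the four cell counts rather than one record per subject; the data set name counts and the variable count are hypothetical, and the numbers are taken from the table earlier in this thread.

```sas
/* one record per cell of the 2x2 table */
data counts;
  input smoker $ cond count;
  datalines;
Y 1 69938
Y 2 30062
n 1 24918
n 2 75082
;
run;

/* ORDER=DATA keeps the rows and columns in the order they were entered */
proc freq data=counts order=data;
  weight count;   /* treat count as the cell frequency */
  tables smoker*cond / relrisk(column=1 cl=score);
run;
```

The WEIGHT statement is what lets PROC FREQ rebuild the crosstabulation from aggregated counts. Only the column 1 relative risk should appear in the output.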
