I have a question about calculating 95% confidence intervals in SAS for multiple exposure groups, one of which is the reference group. The table below shows the number of people who became ill and the number who were exposed in each age group, along with the cumulative incidence, sickness rate and risk ratio.
I calculated the following risk ratios by hand with the 18 - 29 group being the reference group and I have the following questions:
- Is it possible to calculate confidence intervals for multiple exposure groups in SAS and if so, what proc and syntax would be used?
- Would I input the table below into SAS in order to calculate the confidence intervals, or would the risk ratios and confidence intervals have to be calculated in SAS from the raw data?
Age group | Sick | Exposed | Cum incidence | Rate | Risk ratio | Confidence interval
---|---|---|---|---|---|---
18-29 | 3 | 10323 | 0.000291 | 0.03 | |
30-49 | 23 | 13450 | 0.00171 | 0.2 | 5.9 |
50-64 | 37 | 5307 | 0.006972 | 0.7 | 24.0 |
65-79 | 77 | 2014 | 0.038232 | 3.8 | 131.6 |
80+ | 73 | 653 | 0.111792 | 11.2 | 384.7 |
This can be done using a Poisson model and the LSMEANS statement as shown below. See also this note. The Exponentiated columns in the first LSMEANS table reproduce your rates, and the Exponentiated columns in the table of differences give the ratios and their confidence intervals. Note the use of the log of the exposed counts as an offset in the Poisson model.
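data b;
/* List input reads only the first three values on each line; the rest are ignored */
input age $ Nsick Nexp;
/* Offset: log of the number exposed */
off=log(Nexp);
datalines;
18-29 3 10323 0.000291 0.03
30-49 23 13450 0.00171 0.2 5.9
50-64 37 5307 0.006972 0.7 24.0
65-79 77 2014 0.038232 3.8 131.6
80+ 73 653 0.111792 11.2 384.7
;
proc genmod data=b;
class age(ref="18-29");
/* Response is the sick count; the offset turns the model into a model for rates */
model Nsick=age / dist=poisson offset=off;
/* EXP exponentiates the log-scale estimates; CL adds confidence limits */
lsmeans age / diff=control("18-29") exp cl plots=none;
run;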
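
For anyone comparing this with a hand calculation, here is a rough sketch of what the model above estimates. The symbols are my own notation, not labels from the SAS output: $y_i$ is the number sick and $N_i$ the number exposed in age group $i$, and group $0$ is the 18-29 reference. The standard error formula assumes the saturated model fitted here (one parameter per age group).

```latex
% Poisson model with log link and offset log(N_i)
\log \mu_i = \log N_i + \beta_0 + \beta_i, \qquad \beta_{\text{18-29}} = 0

% One parameter per group, so the fitted rates equal the observed rates
\widehat{RR}_i = \exp(\hat\beta_i) = \frac{y_i / N_i}{y_0 / N_0}

% Wald standard error on the log scale and the 95% limits
\widehat{SE}(\hat\beta_i) = \sqrt{\frac{1}{y_i} + \frac{1}{y_0}}, \qquad
\text{95\% CI} = \exp\!\left(\hat\beta_i \pm 1.96\,\widehat{SE}(\hat\beta_i)\right)

% Worked example: 30-49 versus 18-29
\widehat{RR} = \frac{23/13450}{3/10323} \approx 5.88, \qquad
\widehat{SE} = \sqrt{\tfrac{1}{23} + \tfrac{1}{3}} \approx 0.614, \qquad
\text{95\% CI} \approx (1.77,\ 19.60)
```

The intervals are wide because the reference group has only 3 cases, so the $1/y_0$ term dominates every standard error.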
Here is what I got when I tried the code. Is there supposed to be a "Mean" column in the age Least Squares Means table?
Analysis Of Maximum Likelihood Parameter Estimates | ||||||||
---|---|---|---|---|---|---|---|---|
Parameter | | DF | Estimate | Standard Error | Wald 95% Lower Confidence Limit | Wald 95% Upper Confidence Limit | Wald Chi-Square | Pr > ChiSq
Intercept | | 1 | 0.0000 | 0.0098 | -0.0193 | 0.0193 | 0.00 | 1.0000
age | 30-49 | 1 | 0.0000 | 0.0131 | -0.0256 | 0.0256 | 0.00 | 1.0000
age | 50-64 | 1 | 0.0000 | 0.0169 | -0.0331 | 0.0331 | 0.00 | 1.0000
age | 65-79 | 1 | 0.0000 | 0.0244 | -0.0477 | 0.0477 | 0.00 | 1.0000
age | 80+ | 1 | 0.0000 | 0.0404 | -0.0791 | 0.0791 | 0.00 | 1.0000
age | 18-29 | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | .
Scale | | 0 | 1.0000 | 0.0000 | 1.0000 | 1.0000 | |
age Least Squares Means | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
age | Estimate | Standard Error | z Value | Pr > \|z\| | Alpha | Lower | Upper | Exponentiated | Exponentiated Lower | Exponentiated Upper |
30-49 | 0 | 0.008623 | 0.00 | 1.0000 | 0.05 | -0.01690 | 0.01690 | 1.0000 | 0.9832 | 1.0170 |
50-64 | 0 | 0.01373 | 0.00 | 1.0000 | 0.05 | -0.02690 | 0.02690 | 1.0000 | 0.9735 | 1.0273 |
65-79 | 0 | 0.02228 | 0.00 | 1.0000 | 0.05 | -0.04367 | 0.04367 | 1.0000 | 0.9573 | 1.0446 |
80+ | 0 | 0.03913 | 0.00 | 1.0000 | 0.05 | -0.07670 | 0.07670 | 1.0000 | 0.9262 | 1.0797 |
18-29 | 0 | 0.009842 | 0.00 | 1.0000 | 0.05 | -0.01929 | 0.01929 | 1.0000 | 0.9809 | 1.0195 |
Differences of age Least Squares Means | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
age | _age | Estimate | Standard Error | z Value | Pr > \|z\| | Alpha | Lower | Upper | Exponentiated | Exponentiated Lower | Exponentiated Upper |
30-49 | 18-29 | 0 | 0.01309 | 0.00 | 1.0000 | 0.05 | -0.02565 | 0.02565 | 1.0000 | 0.9747 | 1.0260 |
50-64 | 18-29 | 0 | 0.01689 | 0.00 | 1.0000 | 0.05 | -0.03311 | 0.03311 | 1.0000 | 0.9674 | 1.0337 |
65-79 | 18-29 | 0 | 0.02436 | 0.00 | 1.0000 | 0.05 | -0.04774 | 0.04774 | 1.0000 | 0.9534 | 1.0489 |
80+ | 18-29 | 0 | 0.04035 | 0.00 | 1.0000 | 0.05 | -0.07909 | 0.07909 | 1.0000 | 0.9240 | 1.0823 |
You must not have run the code exactly as shown in my previous post since all of your parameter estimates are zero. And no, there shouldn't be a Mean column since the ILINK option is not specified. The EXP option is taking care of that.
Here is what I typed into SAS. Where did I go wrong?
data a;
input age $ Hosp Nhosp;
off=log (Nhosp);
datalines;
18-29 3 10323 0.000291 0.03
30-49 23 13450 0.00171 0.2 5.9
50-64 37 5307 0.006972 0.7 24.0
65-79 77 2014 0.038232 3.8 131.6
80+ 73 653 0.111792 11.2 384.7
;
proc genmod;
class age(ref="18-29");
model Nhosp=age / dist=poisson offset=off;
lsmeans age / diff=control("18-29") exp cl plots=none;
run;
Here is the log:
1 data a;
2 input age $ Hosp Nhosp;
3 off=log (Nhosp);
4 datalines;
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.A has 5 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.47 seconds
cpu time 0.04 seconds
11 ;
NOTE: Writing HTML Body file: sashtml.htm
12 proc genmod;
13 class age(ref="18-29");
14 model Nhosp=age / dist=poisson offset=off;
15 lsmeans age / diff=control("18-29") exp cl plots=none;
16 run;
NOTE: Fitting saturated model. Scale will not be estimated.
NOTE: Algorithm converged.
NOTE: The scale parameter was held fixed.
NOTE: The structure of the LSMeans table has changed from an earlier release of SAS.
NOTE: The structure of the Diffs table has changed from an earlier release of SAS.
NOTE: PROCEDURE GENMOD used (Total process time):
real time 2.31 seconds
cpu time 0.29 seconds
I tried running it again and got a different result. Perhaps it's the spacing in the data section that made a difference? I noticed that there was spacing in your code for the datalines, but when I typed mine in, I didn't add any spacing.
Analysis Of Maximum Likelihood Parameter Estimates | ||||||||
---|---|---|---|---|---|---|---|---|
Parameter | | DF | Estimate | Standard Error | Wald 95% Lower Confidence Limit | Wald 95% Upper Confidence Limit | Wald Chi-Square | Pr > ChiSq
Intercept | | 1 | -8.1435 | 0.5774 | -9.2751 | -7.0119 | 198.95 | <.0001
age | 30-49 | 1 | 1.7723 | 0.6138 | 0.5692 | 2.9754 | 8.34 | 0.0039
age | 50-64 | 1 | 3.1777 | 0.6003 | 2.0011 | 4.3542 | 28.02 | <.0001
age | 65-79 | 1 | 4.8794 | 0.5885 | 3.7260 | 6.0329 | 68.75 | <.0001
age | 80+ | 1 | 5.9524 | 0.5891 | 4.7978 | 7.1070 | 102.10 | <.0001
age | 18-29 | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | .
Scale | | 0 | 1.0000 | 0.0000 | 1.0000 | 1.0000 | |
Here are the other two tables:
age Least Squares Means | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
age | Estimate | Standard Error | z Value | Pr > \|z\| | Alpha | Lower | Upper | Exponentiated | Exponentiated Lower | Exponentiated Upper |
30-49 | -6.3712 | 0.2085 | -30.56 | <.0001 | 0.05 | -6.7799 | -5.9626 | 0.001710 | 0.001136 | 0.002573 |
50-64 | -4.9659 | 0.1644 | -30.21 | <.0001 | 0.05 | -5.2881 | -4.6436 | 0.006972 | 0.005051 | 0.009623 |
65-79 | -3.2641 | 0.1140 | -28.64 | <.0001 | 0.05 | -3.4874 | -3.0407 | 0.03823 | 0.03058 | 0.04780 |
80+ | -2.1911 | 0.1170 | -18.72 | <.0001 | 0.05 | -2.4205 | -1.9617 | 0.1118 | 0.08888 | 0.1406 |
18-29 | -8.1435 | 0.5774 | -14.10 | <.0001 | 0.05 | -9.2751 | -7.0119 | 0.000291 | 0.000094 | 0.000901 |
Differences of age Least Squares Means | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
age | _age | Estimate | Standard Error | z Value | Pr > \|z\| | Alpha | Lower | Upper | Exponentiated | Exponentiated Lower | Exponentiated Upper |
30-49 | 18-29 | 1.7723 | 0.6138 | 2.89 | 0.0039 | 0.05 | 0.5692 | 2.9754 | 5.8842 | 1.7668 | 19.5975 |
50-64 | 18-29 | 3.1777 | 0.6003 | 5.29 | <.0001 | 0.05 | 2.0011 | 4.3542 | 23.9904 | 7.3971 | 77.8061 |
65-79 | 18-29 | 4.8794 | 0.5885 | 8.29 | <.0001 | 0.05 | 3.7260 | 6.0329 | 131.56 | 41.5138 | 416.91 |
80+ | 18-29 | 5.9524 | 0.5891 | 10.10 | <.0001 | 0.05 | 4.7978 | 7.1070 | 384.68 | 121.24 | 1220.48 |
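
A note on why the first run returned all zeros. Going by the posted code and log, the likely cause is the MODEL statement rather than the spacing in the datalines: the response was Nhosp, the same variable used to build the offset. With the response equal to the exposure count, every fitted rate is exactly 1, so every log-scale estimate is 0. The standard errors in that first output are consistent with this, since they equal 1/sqrt(Nhosp) (for example, 1/sqrt(10323) is about 0.0098). Below is a minimal sketch of the corrected program, keeping the poster's variable names and assuming Hosp holds the number sick and Nhosp the number exposed. The extra columns are dropped from the datalines only to keep the example short.

```sas
data a;
  input age $ Hosp Nhosp;   /* assumed: Hosp = number sick, Nhosp = number exposed */
  off = log(Nhosp);         /* offset: log of the exposed count */
  datalines;
18-29 3 10323
30-49 23 13450
50-64 37 5307
65-79 77 2014
80+ 73 653
;

proc genmod data=a;
  class age(ref="18-29");
  /* The response must be the sick count (Hosp), not the exposed count (Nhosp) */
  model Hosp = age / dist=poisson offset=off;
  lsmeans age / diff=control("18-29") exp cl plots=none;
run;
```

With the response set to the sick count, the Exponentiated columns in the differences table should give the risk ratios and 95% limits shown in the second set of output above.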