carmong
Obsidian | Level 7

I have a question regarding calculating confidence intervals (95%) in SAS for multiple exposure groups (with one being the reference group).  The numbers below represent the number of people who were exposed versus the number of people who became ill along with the cum incidence, sickness rate and risk ratio.

I calculated the following risk ratios by hand with the 18 - 29 group being the reference group and I have the following questions:

 

- Is it possible to calculate confidence intervals for multiple exposure groups in SAS and if so, what proc and syntax would be used?

- Would I input the table below into SAS in order to calculate the confidence intervals, or would the risk ratios and confidence intervals have to be calculated in SAS from the raw data?

 

Age group  Sick  Exposed  Cum. incidence  Rate   Risk ratio  Confidence interval
18-29         3    10323  0.000291         0.03
30-49        23    13450  0.00171          0.2      5.9
50-64        37     5307  0.006972         0.7     24.0
65-79        77     2014  0.038232         3.8    131.6
80+          73      653  0.111792        11.2    384.7
Accepted Solutions
StatDave
SAS Super FREQ

This can be done using a Poisson model and the LSMEANS statement as shown below. See also this note. The Exponentiated columns in the first LSMEANS table reproduce your rates, and those in the table of differences give the risk ratios and their confidence intervals. Note the use of the log of the exposed counts as an offset in the Poisson model.

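data b;
/* Only the first three fields on each line are read; the remaining columns are ignored. */
input age $ Nsick Nexp;
off=log(Nexp);   /* log of the exposed count, used as the offset */
datalines;
   18-29 3 10323 0.000291 0.03
   30-49 23 13450 0.00171 0.2 5.9
   50-64 37 5307 0.006972 0.7 24.0
   65-79 77 2014 0.038232 3.8 131.6
   80+ 73 653 0.111792 11.2 384.7
;
proc genmod;
class age(ref="18-29");   /* 18-29 is the reference group */
model Nsick=age / dist=poisson offset=off;
/* EXP exponentiates the estimates; CL adds confidence limits */
lsmeans age / diff=control("18-29") exp cl plots=none;
run;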
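
For readers wondering what the offset does, the model can be written out roughly as follows. The notation is an assumption added here for illustration and does not come from the SAS output: \mu_i is the expected number of sick people in age group i, N_i is the number exposed, and \beta_i is the age parameter, with \beta_{ref} = 0 for the 18-29 group.

```latex
\log \mu_i = \log N_i + \beta_0 + \beta_i
\quad\Longrightarrow\quad
\frac{\mu_i}{N_i} = \exp(\beta_0 + \beta_i),
\qquad
\frac{\mu_i / N_i}{\mu_{\mathrm{ref}} / N_{\mathrm{ref}}} = \exp(\beta_i)
```

Under this reading, exponentiating an age parameter gives the ratio of that group's rate to the reference group's rate, which is why the EXP option yields risk ratios in the table of differences.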

carmong
Obsidian | Level 7

Here is what I got when I tried the code. Is there supposed to be a "Mean" column in the age Least Squares Means table?

 

 

Analysis Of Maximum Likelihood Parameter Estimates
Parameter   DF Estimate Standard Error Wald 95% Confidence Limits Wald Chi-Square Pr > ChiSq
Intercept   1 0.0000 0.0098 -0.0193 0.0193 0.00 1.0000
age 30-49 1 0.0000 0.0131 -0.0256 0.0256 0.00 1.0000
age 50-64 1 0.0000 0.0169 -0.0331 0.0331 0.00 1.0000
age 65-79 1 0.0000 0.0244 -0.0477 0.0477 0.00 1.0000
age 80+ 1 0.0000 0.0404 -0.0791 0.0791 0.00 1.0000
age 18-29 0 0.0000 0.0000 0.0000 0.0000 . .
Scale   0 1.0000 0.0000 1.0000 1.0000  

 

age Least Squares Means
age Estimate Standard Error z Value Pr > |z| Alpha Lower Upper Exponentiated Exponentiated Lower Exponentiated Upper
30-49 0 0.008623 0.00 1.0000 0.05 -0.01690 0.01690 1.0000 0.9832 1.0170
50-64 0 0.01373 0.00 1.0000 0.05 -0.02690 0.02690 1.0000 0.9735 1.0273
65-79 0 0.02228 0.00 1.0000 0.05 -0.04367 0.04367 1.0000 0.9573 1.0446
80+ 0 0.03913 0.00 1.0000 0.05 -0.07670 0.07670 1.0000 0.9262 1.0797
18-29 0 0.009842 0.00 1.0000 0.05 -0.01929 0.01929 1.0000 0.9809 1.0195

 

Differences of age Least Squares Means
age _age Estimate Standard Error z Value Pr > |z| Alpha Lower Upper Exponentiated Exponentiated Lower Exponentiated Upper
30-49 18-29 0 0.01309 0.00 1.0000 0.05 -0.02565 0.02565 1.0000 0.9747 1.0260
50-64 18-29 0 0.01689 0.00 1.0000 0.05 -0.03311 0.03311 1.0000 0.9674 1.0337
65-79 18-29 0 0.02436 0.00 1.0000 0.05 -0.04774 0.04774 1.0000 0.9534 1.0489
80+ 18-29 0 0.04035 0.00 1.0000 0.05 -0.07909 0.07909 1.0000 0.9240 1.0823
StatDave
SAS Super FREQ

You must not have run the code exactly as shown in my previous post since all of your parameter estimates are zero. And no, there shouldn't be a Mean column since the ILINK option is not specified. The EXP option is taking care of that.

carmong
Obsidian | Level 7

Here is what I typed into SAS. Where did I go wrong?

data a;
input age $ Hosp Nhosp;
off=log (Nhosp);
datalines;

18-29 3 10323 0.000291 0.03
30-49 23 13450 0.00171 0.2 5.9
50-64 37 5307 0.006972 0.7 24.0
65-79 77 2014 0.038232 3.8 131.6
80+ 73 653 0.111792 11.2 384.7
;

proc genmod;
class age(ref="18-29");
model Nhosp=age / dist=poisson offset=off;
lsmeans age / diff=control("18-29") exp cl plots=none;
run;

 

Here is the log:

1 data a;
2 input age $ Hosp Nhosp;
3 off=log (Nhosp);
4 datalines;

NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.A has 5 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.47 seconds
cpu time 0.04 seconds


11 ;

NOTE: Writing HTML Body file: sashtml.htm
12 proc genmod;
13 class age(ref="18-29");
14 model Nhosp=age / dist=poisson offset=off;
15 lsmeans age / diff=control("18-29") exp cl plots=none;
16 run;

NOTE: Fitting saturated model. Scale will not be estimated.
NOTE: Algorithm converged.
NOTE: The scale parameter was held fixed.
NOTE: The structure of the LSMeans table has changed from an earlier release of SAS.
NOTE: The structure of the Diffs table has changed from an earlier release of SAS.
NOTE: PROCEDURE GENMOD used (Total process time):
real time 2.31 seconds
cpu time 0.29 seconds

StatDave
SAS Super FREQ
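The response should be Hosp, not Nhosp: use model Hosp=age, not model Nhosp=age. With Nhosp as both the response and (through its log) the offset, the model fits exactly with every parameter at zero, which is what your output shows.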
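
For reference, a sketch of the corrected step, using the variable names from the data step posted above (Hosp as the sick count, off as the log of Nhosp). The data=a option is an addition here, included only to make explicit which data set is analyzed.

```sas
proc genmod data=a;
  class age(ref="18-29");
  /* The response is the count of sick people; log(Nhosp) enters only as the offset. */
  model Hosp=age / dist=poisson offset=off;
  lsmeans age / diff=control("18-29") exp cl plots=none;
run;
```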
carmong
Obsidian | Level 7

I tried running it again and got a different result. Perhaps it's the spacing in the data section that made a difference? I noticed that there was spacing in your code for the datalines, but when I typed mine in, I didn't add any.

 

Analysis Of Maximum Likelihood Parameter Estimates
Parameter   DF Estimate Standard Error Wald 95% Confidence Limits Wald Chi-Square Pr > ChiSq
Intercept   1 -8.1435 0.5774 -9.2751 -7.0119 198.95 <.0001
age 30-49 1 1.7723 0.6138 0.5692 2.9754 8.34 0.0039
age 50-64 1 3.1777 0.6003 2.0011 4.3542 28.02 <.0001
age 65-79 1 4.8794 0.5885 3.7260 6.0329 68.75 <.0001
age 80+ 1 5.9524 0.5891 4.7978 7.1070 102.10 <.0001
age 18-29 0 0.0000 0.0000 0.0000 0.0000 . .
Scale   0 1.0000 0.0000 1.0000 1.0000

 

 

StatDave
SAS Super FREQ
Those are the parameter estimates I get from my code. The spacing isn't important. The LSMEANS tables should also be what I got and should be what you are looking for.
carmong
Obsidian | Level 7

Here are the other two tables: 

age Least Squares Means
age Estimate Standard Error z Value Pr > |z| Alpha Lower Upper Exponentiated Exponentiated Lower Exponentiated Upper
30-49 -6.3712 0.2085 -30.56 <.0001 0.05 -6.7799 -5.9626 0.001710 0.001136 0.002573
50-64 -4.9659 0.1644 -30.21 <.0001 0.05 -5.2881 -4.6436 0.006972 0.005051 0.009623
65-79 -3.2641 0.1140 -28.64 <.0001 0.05 -3.4874 -3.0407 0.03823 0.03058 0.04780
80+ -2.1911 0.1170 -18.72 <.0001 0.05 -2.4205 -1.9617 0.1118 0.08888 0.1406
18-29 -8.1435 0.5774 -14.10 <.0001 0.05 -9.2751 -7.0119 0.000291 0.000094 0.000901

 

Differences of age Least Squares Means
age _age Estimate Standard Error z Value Pr > |z| Alpha Lower Upper Exponentiated Exponentiated Lower Exponentiated Upper
30-49 18-29 1.7723 0.6138 2.89 0.0039 0.05 0.5692 2.9754 5.8842 1.7668 19.5975
50-64 18-29 3.1777 0.6003 5.29 <.0001 0.05 2.0011 4.3542 23.9904 7.3971 77.8061
65-79 18-29 4.8794 0.5885 8.29 <.0001 0.05 3.7260 6.0329 131.56 41.5138 416.91
80+ 18-29 5.9524 0.5891 10.10 <.0001 0.05 4.7978 7.1070 384.68 121.24 1220.48
StatDave
SAS Super FREQ
That's correct. Again, the Exponentiated columns in the first table give the risk estimates for each age group, and in the second table, the risk ratios comparing each group to the reference age group.
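
As a cross-check, the ratios and Wald limits in the table of differences can be approximated by hand in a DATA step. This is only a sketch: it assumes Poisson counts, so that the standard error of the log risk ratio is sqrt(1/sick + 1/reference sick), and the data set name check and the variables ref_sick, ref_exp, rr, se, z, lower and upper are made up for illustration.

```sas
data check;
  input age $ Nsick Nexp;
  retain ref_sick ref_exp;
  if _n_ = 1 then do;
    /* The first row is the 18-29 reference group; keep its counts. */
    ref_sick = Nsick;
    ref_exp  = Nexp;
  end;
  else do;
    rr    = (Nsick / Nexp) / (ref_sick / ref_exp);  /* risk ratio vs. reference */
    se    = sqrt(1/Nsick + 1/ref_sick);             /* SE of log(rr), Poisson assumption */
    z     = quantile('NORMAL', 0.975);              /* about 1.96 for 95% limits */
    lower = exp(log(rr) - z*se);
    upper = exp(log(rr) + z*se);
    output;                                         /* only non-reference rows are written */
  end;
  datalines;
18-29 3 10323
30-49 23 13450
50-64 37 5307
65-79 77 2014
80+ 73 653
;
proc print data=check noobs;
  var age rr se lower upper;
run;
```

For the 30-49 group, for example, sqrt(1/23 + 1/3) is about 0.6138, in line with the standard error GENMOD reports for that difference, so the printed limits should agree closely with the Exponentiated Lower and Upper columns.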
