Solved: How do I compute confidence intervals in SAS for sample proportions (percentages)

Eibhlin_w · Posted 11-01-2018 07:14 AM

Hi there, I am trying to compute 95% confidence intervals for sample proportions in SAS Enterprise Guide. I have computed overall percentages and divided these by 100 to produce p-hat (0.0328) using PROC SQL. I then tried to compute the confidence limits using the code below, but it does not produce the correct number: I have done the calculation correctly in Excel and should be getting .02884, but I am getting .03279 instead. The code below is for the lower confidence limit, and the same problem happens for the upper limit. The 7810 in the code is my sample size.

I am new to SAS, so I would appreciate any help or feedback. I have heard that PROC FREQ could be used (the overall percentages I have computed are derived from the original data; they are percentage differences of two totalled columns), but I don't know where to begin in computing the CIs this way. If that is a better way to do it, I have access to the raw data. Thanks so much in advance.

Proc sql;
create table CI as
select (p_hat-(1.96*SQRT(p_hat*(1-p_hat))/7810)) as CI
from P_hat_data;
Quit;

FreelanceReinh · Posted 11-01-2018 07:55 AM

Hi @Eibhlin_w and welcome to the SAS Support Communities!

I agree that common statistics such as a CI for a proportion don't need to be computed "by hand" (i.e. in PROC SQL or a DATA step). PROC FREQ offers a variety of confidence intervals for binomial proportions. Use the BINOMIAL option of the TABLES statement:

Example:

/* Create test data for demonstration */

data test;
do _n_=1 to 7810;
  c=2-(_n_<=256);
  output;
end;
run;

/* Compute proportions and their confidence intervals */

proc freq data=test;
tables c / binomial;
run;

/* Need more decimals? Use ODS output datasets. */

ods output binomial=bin;
proc freq data=test;
tables c / binomial;
run;

proc print data=bin;
format nvalue1 12.10;
run;

Results:

PROC FREQ:

The FREQ Procedure

                              Cumulative    Cumulative
c    Frequency     Percent     Frequency      Percent
------------------------------------------------------
1         256        3.28           256         3.28
2        7554       96.72          7810       100.00


      Binomial Proportion
             c = 1

Proportion                0.0328
ASE                       0.0020
95% Lower Conf Limit      0.0288
95% Upper Conf Limit      0.0367

Exact Conf Limits
95% Lower Conf Limit      0.0289
95% Upper Conf Limit      0.0370

PROC PRINT (using ODS output from PROC FREQ):

Obs     Table     Name1     Label1                  Value1         nValue1

 1     Table c    _BIN_     Proportion              0.0328    0.0327784891
 2     Table c    E_BIN     ASE                     0.0020    0.0020147999
 3     Table c    L_BIN     95% Lower Conf Limit    0.0288    0.0288295539
 4     Table c    U_BIN     95% Upper Conf Limit    0.0367    0.0367274244
 5     Table c                                                 .
 6     Table c              Exact Conf Limits                  .
 7     Table c    XL_BIN    95% Lower Conf Limit    0.0289    0.0289407165
 8     Table c    XU_BIN    95% Upper Conf Limit    0.0370    0.0369698724

As you can see, you get both approximate (Wald) confidence limits using the normal approximation [0.0288, 0.0367] and exact (Clopper-Pearson) confidence limits [0.0289, 0.0370]. With the CL= suboption of the BINOMIAL option you can request even more types of confidence intervals (e.g. Agresti-Coull or Wilson); see the PROC FREQ documentation.
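
To illustrate the CL= suboption mentioned above, here is a minimal sketch. It assumes a SAS/STAT release that supports BINOMIAL(CL=...). The dataset name counts and the variable n are made up for this example: they hold the same 256 and 7554 frequencies as the test data in aggregated form, so the block runs on its own. With one row per observation, the WEIGHT statement would simply be dropped.

```sas
/* Hypothetical aggregated counts standing in for the raw data */
/* (256 of 7810 observations fall in level c=1).               */
data counts;
  input c n;
  datalines;
1 256
2 7554
;
run;

/* Request several interval types at once for the first level of c. */
proc freq data=counts;
  weight n;   /* each input row represents n observations */
  tables c / binomial(cl=(wald wilson agresticoull exact));
run;
```

The Wald and exact limits requested here are the same two interval types shown in the results above, so they make a handy cross-check when trying the other types.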

Note that the formula (for the lower Wald confidence limit) you used in your PROC SQL step is incorrect: The denominator 7810 must be part of the argument of the SQRT function. Here is the correct formula:

p_hat-1.96*SQRT(p_hat*(1-p_hat)/7810)
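
As a rough check of why the placement of the denominator matters, the arithmetic can be worked through with the test data above (256 of 7810, so p-hat is about 0.03278). The figures below are rounded hand calculations, not SAS output.

```latex
% Wald limits: n belongs inside the square root
\hat{p} \pm z_{0.975}\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}},
\qquad \hat{p} = \tfrac{256}{7810} \approx 0.03278,\quad n = 7810

% Correct lower limit
\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}} \approx 0.002015
\;\Rightarrow\;
0.03278 - 1.96 \times 0.002015 \approx 0.0288

% Original formula: n outside the square root
\frac{\sqrt{\hat{p}\,(1-\hat{p})}}{n} \approx \frac{0.17806}{7810} \approx 0.0000228
\;\Rightarrow\;
0.03278 - 1.96 \times 0.0000228 \approx 0.0327
```

With n outside the square root the margin shrinks to about 0.00004, which is why the result in the question barely moved away from p-hat itself.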

Edit: SAS also has various functions for computing quantiles, so you don't need to hardcode them ("1.96"):

p_hat-probit(0.975)*SQRT(p_hat*(1-p_hat)/7810)
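
Putting the corrected formula back into a complete PROC SQL step, a sketch for both limits might look like the following. The one-row P_hat_data created here is only a stand-in for the table from the question, and the column names margin, lower_cl and upper_cl are invented for illustration.

```sas
/* Stand-in for the poster's table: one row holding the sample proportion. */
data P_hat_data;
  p_hat = 256/7810;
run;

proc sql;
  create table CI as
  select p_hat,
         /* half-width of the Wald interval; 7810 is inside SQRT */
         probit(0.975)*sqrt(p_hat*(1-p_hat)/7810) as margin,
         p_hat - calculated margin as lower_cl,
         p_hat + calculated margin as upper_cl
  from P_hat_data;
quit;

/* Show more decimals than the default format. */
proc print data=CI;
  format _numeric_ 12.10;
run;
```

Since this is the same Wald formula that PROC FREQ uses, lower_cl and upper_cl should agree with the L_BIN and U_BIN values in the ODS output above for the same data.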

RW9 · Posted 11-01-2018 07:21 AM

For statistical results, it's not common to compute these manually (in either a DATA step or SQL). You would use an appropriate procedure. For an example, here is PROC MEANS used:

https://communities.sas.com/t5/SAS-Statistical-Procedures/95-CI-for-means-direct-output-from-proc-me...

Eibhlin_w · Posted 11-01-2018 10:38 AM

Thank you for the advice and link.

Eibhlin_w · Posted 11-01-2018 10:42 AM

Hi FreelanceReinhard, thank you so much for the example of code and output, it's really appreciated. The proc freq worked for me and when I ran the solution you changed that worked too! 🙂
