Confidence limits in tabulate

RichardDeVen · Posted 08-21-2015 04:05 AM

I was asked to report the % of some event occurring and the 95% confidence interval about that %.

In this sample, a flag variable 'replied' is 0 or 1, and represents whether or not some survey question was answered.

'replied' is used categorically as a CLASS variable in TABULATE so that an across percentage can be specified in the TABLE statement.

'replied' cannot be reused analytically as a VAR variable, so replied_CL is created to have something to work with in TABULATE.

My question is this... are there any statistical problems with computing LCLM and UCLM from a two-valued variable?

Thanks for listening.

Richard

data have;
  do region = 'A', 'B';
    do year = 2005 to 2015;
      /* 100 to 150 simulated respondents per region and year */
      do _i = 1 to 100 + 50 * ranuni(123);
        id + 1;
        replied = ranuni(123) < 0.15;
        replied_CL = 100 * replied;
        output;
      end;
    end;
  end;
  drop _:;
run;

proc format;
  value replied 0='No Reply' 1='Reply';
run;

options nocenter;

proc tabulate data=have;
  class region year replied;
  var replied_CL;
  table
    year
    , region
      * ( replied='' * (N pctn<replied>=' % of region')
          replied_CL = '% of region Replied(*ESC*){newline}confidence interval' * ( LCLM='95% CI LB' UCLM='95% CI UB' )
        )
  ;
  format replied replied.;
run;

* I guess this is a more 'canonical' way to get a CI;
proc freq data=have;
  by region year;
  table replied / binomial;
  table replied_CL / binomial;
run;
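
One caveat on the PROC FREQ step, offered as a minimal sketch rather than as part of the original post: by default the BINOMIAL option reports the proportion for the first level of the table variable, which here is replied=0 (no reply). Assuming the reply rate is the quantity of interest, the LEVEL= suboption can select the other level. The sketch assumes the `have` dataset created above.

```sas
proc freq data=have;
  by region year;
  /* LEVEL='1' asks for the proportion of replied=1         */
  /* instead of the default first level (replied=0).        */
  tables replied / binomial(level='1');
run;
```

The output includes both asymptotic (Wald) and exact confidence limits for the selected proportion, as proportions between 0 and 1 rather than percentages.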

Ksharp · Posted 08-21-2015 08:00 AM

I was surprised that you can use LCLM and UCLM in PROC TABULATE, so I quickly checked the documentation. Here it is:

"Use both LCLM and UCLM to compute a two-sided confidence limit for the mean."

According to the documentation, LCLM and UCLM are based on the t statistic for the mean; in other words, they correspond to the test of H0: mu=0.

That is not the confidence limit you are talking about (the one for a binomial proportion). I think you should merge the binomial limits back into your original dataset.
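
To see how far apart the two kinds of limits are for a 0/1 flag, they can be computed side by side. The sketch below is not from the thread. It assumes the `have` dataset from the first post, and the names `limits`, `t_lower`, `t_upper`, `wald_lower` and `wald_upper` are made up for illustration. For a flag with sample proportion p, the standard error of the mean works out to sqrt(p(1-p)/(n-1)), so the t-based limits differ from the normal-approximation (Wald) binomial limits only through n-1 versus n and the t versus z quantile.

```sas
/* t-based limits for the mean of the 0/1 flag, per region and year */
proc means data=have noprint nway;
  class region year;
  var replied;
  output out=limits n=n mean=p lclm=t_lower uclm=t_upper;
run;

/* Wald (normal approximation) limits for the same proportion */
data limits;
  set limits;
  half = probit(0.975) * sqrt(p * (1 - p) / n);
  wald_lower = p - half;
  wald_upper = p + half;
  drop _type_ _freq_ half;
run;

proc print data=limits noobs;
run;
```

With 100 or more rows per cell, as in the sample data, the two sets of limits should be close. When a cell is small or the proportion is near 0 or 1, neither approximation is reliable, and the exact limits from PROC FREQ with the BINOMIAL option are the safer choice.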

ballardw · Posted 08-21-2015 11:35 AM

Another concern with any TABULATE output involving confidence limits and a survey relates to weights and which divisor to use in calculating the variance / standard deviation. The VARDEF option on the PROC TABULATE statement indicates which divisor to use.

Second, if your survey design involved any form of sampling other than a simple random sample, then the weights aren't quite applied correctly, and you should use one of the Survey procedures to generate confidence limits.

And since you didn't include a WEIGHT statement at all, your results cannot be applied to the population, only to the respondent pool, unless your data is an actual census of the population of interest.
