data_null__
Jade | Level 19

What is the CI for 2 values when they are equal?

Rick_SAS
SAS Super FREQ

A surprisingly subtle question. For constant data, I think the official answer is undefined (or missing).  I'll try to find a reference when I get into the office.

On the other hand, the LIMIT of this situation is a CI of zero width, which is probably why you are writing. In other words, if your data are {0, delta}, then the width of the confidence limits for the mean (CLM) approaches zero as delta approaches zero. You can see this graphically by running the following SAS code:

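data a;
   keep sample x;
   /* one two-point sample {0, y} for each y, with y shrinking from 1 toward 0 */
   do y = 1 to 0 by -0.05;
      sample + 1;
      x = 0; output;
      x = y; output;
   end;
   /* final sample: constant data {0, 0} */
   sample + 1;
   x = 0; output;
   x = 0; output;   /* exact zero */
run;

/* range, mean, and confidence limits for the mean of each sample */
proc means data=a noprint;
   by sample;
   var x;
   output out=out range=range lclm=lclm mean=mean uclm=uclm;
run;

/* band between LCLM and UCLM, plotted against the sample range */
proc sgplot data=out;
   band x=range upper=uclm lower=lclm;
run;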
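As a rough check on what the plot should show, the two-point case can be worked by hand. This is only a sketch: it assumes the default 95% level (alpha = 0.05) and uses the symbol \delta for the nonzero observation.

```latex
% Data \{0, \delta\}, so N = 2 and the range is |\delta|.
\begin{align*}
\bar{x} &= \frac{\delta}{2}, \\
s^2 &= \frac{(0 - \delta/2)^2 + (\delta - \delta/2)^2}{N - 1} = \frac{\delta^2}{2},
  \qquad s = \frac{|\delta|}{\sqrt{2}}, \\
\frac{s}{\sqrt{N}} &= \frac{|\delta|}{2}, \\
\text{CLM} &= \bar{x} \pm t_{0.975,\,1}\,\frac{s}{\sqrt{N}}
  \approx \frac{\delta}{2} \pm 12.706 \cdot \frac{|\delta|}{2}
  = \frac{\delta}{2} \pm 6.353\,|\delta|.
\end{align*}
```

So the width of the interval is proportional to the range, which is why the band narrows linearly to a point as delta goes to zero.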

Rick_SAS
SAS Super FREQ

I think it comes out of the derivation of the CLM formula.  You derive the formula by looking at the expression

t = (sampleAverage - populationMean)/ (sampleStdDev/sqrt(N))

You then argue that if N is large and the x are normally distributed (yada, yada, yada) then the statistic has a certain distribution (Student's t with N-1 degrees of freedom).

When you have constant data, the sample std dev is 0, and therefore the expression is undefined.

data_null__
Jade | Level 19

Thanks Rick that's what I thought too.  So what is being estimated by PROC UNIVARIATE CIBASIC?

Rick_SAS
SAS Super FREQ

I suspect they are just plugging into the formula

avg +/- t(1-alpha/2, N-1) * s/sqrt(N)

and since s=0, the CI is assigned zero width.
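A minimal sketch of how one might check this, assuming a hypothetical two-observation data set named const with both values equal to 5. The first step asks PROC UNIVARIATE for its CIBASIC table; the second plugs s=0 into the formula by hand for comparison. The data set names and the value 5 are made up for illustration.

```sas
/* hypothetical constant data: two equal values */
data const;
   input x @@;
   datalines;
5 5
;
run;

/* show only the "Basic Confidence Limits Assuming Normality" table */
proc univariate data=const cibasic;
   var x;
   ods select BasicIntervals;
run;

/* plug-in computation: avg +/- t(1-alpha/2, N-1) * s/sqrt(N) with s = 0 */
data plugin;
   n = 2; alpha = 0.05;
   avg = 5; s = 0;
   tcrit = quantile('T', 1 - alpha/2, n - 1);
   lower = avg - tcrit * s / sqrt(n);
   upper = avg + tcrit * s / sqrt(n);
   put lower= upper=;
run;
```

In the hand computation the s term vanishes, so lower and upper both equal avg. Comparing that with the limits for the mean in the BasicIntervals table is one way to test the plug-in guess.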

It's really not clear to me what the correct answer should be for these degenerate data. Both can be justified. 

Let's see how other statisticians weigh in.

SteveDenham
Jade | Level 19

If a sample standard error is zero, by implication the population standard error is zero (why? because it is the only inference you can make about the population parameter), and thus there is no variability in the population.  So any sample will give a CI of zero width.

This is one of the killers of small sample size, and why, in our shop, there was a caveat that there would be no analysis unless N>2 for every group.  (Note that this fails to account for internal replication when there are multiple groups.)

Steve Denham
