data_null__
Jade | Level 19

What is the CI for 2 values when they are equal?

Rick_SAS
SAS Super FREQ

A surprisingly subtle question. For constant data, I think the official answer is undefined (or missing).  I'll try to find a reference when I get into the office.

On the other hand, the LIMIT of this situation is zero, which is probably why you are writing. In other words, if your data are {0, delta}, then the width of the CLM approaches zero as delta approaches zero.  You can see this graphically by running the following SAS code:

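data a;
keep sample x;
/* each sample is the pair {0, y}; y shrinks from 1 toward 0 */
do y = 1 to 0 by -0.05;
   sample + 1;
   x = 0; output;
   x = y; output;
end;
/* final sample: both values are exactly zero */
sample + 1;
x=0; output; x=0; output; /* exact zero */
run;

/* confidence limits for the mean of each two-point sample */
proc means data=a noprint;
   by sample;
   var x;
   output out=out range=range lclm=lclm mean=mean uclm=uclm;
run;

/* plot the confidence band against the range of each sample */
proc sgplot data=out;
   band x=range upper=uclm lower=lclm;
run;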
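
As a rough check on what the plot should show, here is the standard t-interval worked out by hand for the two-point sample {0, delta}. It assumes the default 95% level (alpha = 0.05); the symbols \bar{x}, s, and \delta are just notation for this sketch.

```latex
% Two-point sample {0, \delta}, so N = 2
\bar{x} = \frac{\delta}{2}, \qquad
s = \sqrt{\frac{(0-\delta/2)^2 + (\delta-\delta/2)^2}{N-1}} = \frac{\delta}{\sqrt{2}}, \qquad
\frac{s}{\sqrt{N}} = \frac{\delta}{2}

% Confidence limits for the mean, with 1 degree of freedom
\bar{x} \pm t_{1-\alpha/2,\,1}\,\frac{s}{\sqrt{N}}
  = \frac{\delta}{2} \pm t_{0.975,\,1}\,\frac{\delta}{2},
\qquad t_{0.975,\,1} \approx 12.706

% Width of the interval
\text{width} = t_{0.975,\,1}\,\delta \approx 12.706\,\delta \;\to\; 0
\quad \text{as } \delta \to 0
```

Because the range of each sample is delta, the band should be a wedge whose width is proportional to the range and closes at range = 0.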

Rick_SAS
SAS Super FREQ

I think it comes out of the derivation of the CLM formula.  You derive the formula by looking at the expression

t = (sampleAverage - populationMean)/ (sampleStdDev/sqrt(N))

You then argue that if N is large and the x are normally distributed (yada, yada, yada), then the statistic has a certain distribution (Student's t with N-1 degrees of freedom).

When you have constant data, the sample std dev is 0, and therefore the expression is undefined.

data_null__
Jade | Level 19

Thanks, Rick, that's what I thought too.  So what is being estimated by PROC UNIVARIATE CIBASIC?

Rick_SAS
SAS Super FREQ

I suspect they are just plugging into the formula

avg +/- t(1-alpha/2, N-1) * s/sqrt(N)

and since s=0, the CI is assigned zero width.
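
One way to check this guess is to run both procedures on constant data and compare what they report. The following is only a minimal sketch: the data set name const and the value 5 are made up for illustration, and no particular output is claimed, since the handling of degenerate data may depend on the procedure and the SAS release.

```sas
/* Hypothetical test data: two identical observations */
data const;
   input x @@;
   datalines;
5 5
;
run;

/* CIBASIC requests confidence limits for the mean, std dev, and variance */
proc univariate data=const cibasic;
   var x;
run;

/* For comparison: confidence limits for the mean from PROC MEANS */
proc means data=const n mean std lclm uclm;
   var x;
run;
```

If the limits from PROC UNIVARIATE coincide with the mean, that would be consistent with the plug-in explanation above.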

It's really not clear to me what the correct answer should be for these degenerate data. Both can be justified. 

Let's see how other statisticians weigh in.

SteveDenham
Jade | Level 19

If a sample standard error is zero, by implication the population standard error is zero (why? because it is the only inference you can make about the population parameter), and thus there is no variability in the population.  So any sample will give a CI of zero width.

This is one of the killers of small sample size, and why, in our shop, there was a caveat that there would be no analysis unless N>2 for every group.  (Note that this fails to account for internal replication when there are multiple groups.)

Steve Denham

