05-03-2012 09:44 AM
I'm running pairwise comparisons to calculate p-values between various products that are being evaluated by several assessors. I'm trying to understand what SAS is doing, and I'm running into an issue in one particular situation: when one of the two items being compared (in my case SAMPLE) has complete data but the other does not. Here's an example:
data raw;
input JUDGE REP SAMPLE X1;
datalines;
1 1 1 3.0
1 2 1 4.0
2 1 1 7.0
2 2 1 2.0
1 1 2 1.0
1 2 2 6.0
2 1 2 3.0
;
run;
proc glm data = raw;
class judge sample rep;
model X1 = judge sample rep judge * sample / ss3;
lsmeans sample / stderr tdiff alpha=0.10 e=judge*sample;
run;
quit;
In this case JUDGE 2 is missing REP 2 for SAMPLE 2, while SAMPLE 1 has a complete set of data. When I calculate the pairwise comparison between SAMPLE 1 and SAMPLE 2, I get a p-value of 0.48973, whereas SAS reports exactly 0.5. To get my value I'm doing the following:
mse = interaction_sum_of_squares / degrees_of_freedom = 0.66666666 / 1
constant = sum over judges j of (1/n1j + 1/n2j) = (1/2 + 1/2 + 1/2 + 1) = 2.5
sd = sqrt((mse / num_judges^2) * constant) = 0.6454972
t-value = (lsmean1 - lsmean2) / sd = (4.0 - 3.3333333) / 0.6454972 = 1.032796
Meanwhile, the SAS t-value is 1.0.
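For anyone who wants to check my arithmetic, the hand calculation can be sketched in Python roughly as follows. This only reproduces the steps in this post and makes no claim about what SAS does internally. The variable names and the helper `two_sided_p_df1` are made up for illustration, and the closed-form p-value assumes the error term has exactly 1 degree of freedom (a t distribution with 1 df is the standard Cauchy distribution).

```python
import math

# Cell counts n[sample][judge], read off the data listing in this post.
# JUDGE 2 has only one observation for SAMPLE 2.
cell_counts = {1: {1: 2, 2: 2}, 2: {1: 2, 2: 1}}

# Values quoted in the post (taken as given, not recomputed here).
interaction_ss = 0.66666666
interaction_df = 1
lsmean1, lsmean2 = 4.0, 3.3333333
num_judges = 2

# Error mean square from the JUDGE*SAMPLE interaction.
mse = interaction_ss / interaction_df

# Sum of 1/n over all four JUDGE x SAMPLE cells.
constant = sum(1 / n for judges in cell_counts.values() for n in judges.values())

# Standard error of the lsmean difference and the resulting t statistic.
sd = math.sqrt((mse / num_judges**2) * constant)
t_value = (lsmean1 - lsmean2) / sd


def two_sided_p_df1(t):
    """Two-sided p-value for a t statistic with 1 degree of freedom.

    With 1 df the t distribution is the standard Cauchy, whose CDF is
    0.5 + atan(t) / pi, so no statistics library is needed.
    """
    return 1 - 2 * math.atan(abs(t)) / math.pi


print(f"constant   = {constant}")
print(f"sd         = {sd:.7f}")
print(f"t (mine)   = {t_value:.6f}, p = {two_sided_p_df1(t_value):.5f}")
print(f"t (SAS)    = 1.000000, p = {two_sided_p_df1(1.0):.5f}")
```

Running this gives my t-value and p-value, and plugging in the SAS t-value of 1.0 gives the 0.5 that SAS reports, so the p-value step itself agrees and the whole gap comes from the t-value.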
I'm matching the lsmeans that SAS is getting, and the interaction_sum_of_squares and the degrees of freedom match what SAS is putting out as well. That really only leaves the constant, or something else SAS is doing which I can't figure out.
I match SAS perfectly when the data is balanced. It's only when the data is unbalanced that my calculation stops matching.