I'm getting different results when using SAS (9.4) and Stata (v11) for jackknife regression. Anyone know why these are different?
Below are the key parts of my SAS and Stata code. If I run PROC SURVEYREG with the default Taylor series linearisation and the corresponding Stata code on the same data, I get near-identical results. With the jackknife, the standard errors in Stata are essentially the same as the Taylor series ones, but in SAS they are about 10% larger.
SAS CODE
title "Regression: Taylor series linearisation";
/* TOTAL= supplies the stratum population totals used for the FPC */
proc surveyreg data=temp1 total=lsac.stratum74with73;
  strata stratum74with73;
  cluster pcodes;
  weight defwt;
  model ready5_as = edl edm edh / noint;
run;

title "Regression: Jackknife method (10% larger SEs)";
/* Same design and model, jackknife variance estimation; no TOTAL= is given here */
proc surveyreg data=temp1 varmethod=jk;
  strata stratum74with73;
  cluster pcodes;
  weight defwt;
  model ready5_as = edl edm edh / noint;
run;
STATA CODE
svyset pcodes [pweight=defwt], strata(stratum74with73) fpc(totalstratums)
* Taylor series linearisation
svy: regress ready5_as edl edm edh, noconstant
* Jackknife calculation
svy jackknife: regress ready5_as edl edm edh, noconstant
I think it has something to do with the finite population correction (FPC). If I remove the FPC information from the code, all four approaches give essentially the same results (the larger standard errors). But presumably the optimal estimates should take the FPC into account? The SAS documentation implies that the jackknife estimate doesn't, and shouldn't, take account of the FPC, but the Stata code appears to do so in some way. Which is correct?
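To make the question concrete, here is a sketch of the usual textbook delete-one-PSU jackknife variance for a stratified design. The notation is my own and is not taken from either manual: $n_h$ is the number of sampled PSUs in stratum $h$, $N_h$ the population number of PSUs, $\hat\theta$ the full-sample estimate, and $\hat\theta_{(hi)}$ the estimate with PSU $i$ of stratum $h$ deleted. I am assuming that the two packages differ only in the factor $c_h$; centring conventions (full-sample estimate versus mean of the replicates) can also differ.

```latex
\hat{V}_{\mathrm{JK}}(\hat\theta)
  = \sum_{h=1}^{H} c_h \, \frac{n_h - 1}{n_h}
    \sum_{i=1}^{n_h} \bigl(\hat\theta_{(hi)} - \hat\theta\bigr)^2,
\qquad
c_h =
\begin{cases}
  1 & \text{(FPC ignored)} \\[4pt]
  1 - \dfrac{n_h}{N_h} & \text{(FPC applied)}
\end{cases}
```

If the sampling fraction were the same value $f$ in every stratum, the two versions of the standard error would differ by a factor of $\sqrt{1 - f}$, so a gap of about 10% would correspond to $f$ of roughly 0.17. That figure is purely illustrative and is not computed from the data above.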