Re: Unexpected increase in std error with covariate included

emorway · Posted 11-28-2010 08:13 AM

Hello,

I've added comments to the code below to try and explain the problem
I'm having. Basically, I have found that the average standard error
increased from case 1 to case 2, yet a covariate was included in the
second case.

I can provide the input datafile (kind of surprised I can't attach it
to this post) if one would like to run the code. I always like being
able to reproduce the problem myself.

My question is this: at worst, shouldn't the error term
in case 2 (code below "title 4") be less than or equal to the error
term in case 1 (code below "title 1")? I wouldn't expect adding a
covariate to increase the error. Average standard errors are given in
the comments in the code below.

Thanks,
Eric

/* Some preliminary code - Read in data and some additional */
/* processing */

data morway; infile 'c:\temp\All_Soil_Salinity_Data_For_AOV_Update_With_Covariants.txt'
expandtabs;
input loc $ field $ season $ year Ece logECe ECE_sd WTD ECgw SM Sand
Silt Clay Theta_resid Theta_sat Ks;

if season='Early' then time = 2*(year-1999)+1;
if season='Late' then time = 2*(year-1999)+2;
if loc ne 'loc';
*if year ge 2002;
proc print data=morway(obs=21);run;

proc sort data=morway; by loc;run;

/* Original analysis with odd behavior in standard error between */
/* analyses titled 1 & 4 for the 'loc=DS' case. */
/* Avg stderr in title 1, loc=DS = 0.02118134 */
/* Avg stderr in title 4, loc=DS = 0.021399158 */

title '1) reduced sample size - no cov - remove rows with missing
Sand, Silt, Clay, wtd, ECgw, & SM';
proc mixed data=morway; class field year loc season;by loc;
model logece = season|year /ddfm=kr; where not missing(Sand) and not
missing(Silt) and not missing(Clay) and not missing(wtd) and not
missing(ECgw) and not missing(SM) and not missing(Theta_resid) and not
missing(Theta_sat) and not missing(Ks);
repeated year*season/subject=field(loc) r rcorr type=sp(pow)(time);
lsmeans season*year/ adjust=tukey;
ods output lsmeans = lsmeans;
run;

title '4) reduced sample size - with cov SM - remove rows with
missing Sand, Silt, Clay, wtd, ECgw, & SM';
proc mixed data=morway; class field year loc season;by loc;
model logece = season|year SM /ddfm=kr; where not missing(Sand) and
not missing(Silt) and not missing(Clay) and not missing(wtd) and not
missing(ECgw) and not missing(SM) and not missing(Theta_resid) and not
missing(Theta_sat) and not missing(Ks);
repeated year*season/subject=field(loc) r rcorr type=sp(pow)(time);
lsmeans season*year/ adjust=tukey;
ods output lsmeans = lsmeans;
run;

/* Modified analysis with odd behavior persisting in standard */
/* error between analyses titled 1 & 4 for the 'loc=DS' case. */
/* ddfm=kr (Kenward-Roger) argument removed */
/* 'repeated' line commented out */
/* Avg stderr in title 1, loc=DS = 0.023698976 */
/* Avg stderr in title 4, loc=DS = 0.023820179 */

title '1) reduced sample size - no cov - remove rows with missing
Sand, Silt, Clay, wtd, ECgw, & SM';
proc mixed data=morway; class field year loc season;by loc;
model logece = season|year /; where not missing(Sand) and not
missing(Silt) and not missing(Clay) and not missing(wtd) and not
missing(ECgw) and not missing(SM) and not missing(Theta_resid) and not
missing(Theta_sat) and not missing(Ks);
*repeated year*season/subject=field(loc) r rcorr type=sp(pow)(time);
lsmeans season*year/ adjust=tukey;
ods output lsmeans = lsmeans;
run;

title '4) reduced sample size - with cov - SM';
proc mixed data=morway; class field year loc season;by loc;
model logece = season|year SM /; where not missing(Sand) and not
missing(Silt) and not missing(Clay) and not missing(wtd) and not
missing(ECgw) and not missing(SM) and not missing(Theta_resid) and not
missing(Theta_sat) and not missing(Ks);
*repeated year*season/subject=field(loc) r rcorr type=sp(pow)(time);
lsmeans season*year/ adjust=tukey;
ods output lsmeans = lsmeans;
run;
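
The comments above quote an average standard error per location, but the post does not show how those averages were obtained. One possible way is sketched below. It assumes the ODS LSMeans table carries its usual StdErr column plus the BY variable loc, and that each fit writes to its own data set. The names lsmeans1, lsmeans4, lsm_both, and model are hypothetical, since the code above sends both fits to the same data set, lsmeans, so the second overwrites the first.

```sas
/* Hypothetical: in each PROC MIXED step, use a distinct output name, e.g. */
/*   ods output lsmeans = lsmeans1;   (title 1, no covariate)              */
/*   ods output lsmeans = lsmeans4;   (title 4, covariate SM)              */

/* Stack the two LS-means tables and label which fit each row came from */
data lsm_both;
  length model $12;
  set lsmeans1(in=from1) lsmeans4;
  if from1 then model = 'no cov';
  else model = 'with SM';
run;

/* Average LS-mean standard error per location and per fit */
proc means data=lsm_both n mean;
  class loc model;
  var StdErr;
run;
```

Keeping both fits in one table makes the loc=DS rows for the two models appear side by side, which is the comparison the question is about.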

TMorville · Posted 11-29-2010 05:38 AM

Hi Eric.

I think you need to provide us with a small sample of your data and some code, or seriously rewrite your post. It's not understandable as written.

Are you redefining your model, or adding more restrictions with the covariates?

Is the variance in covariate matrix estimated or approximated?

We need more background info. If a covariate matrix is wrong, it might increase the error term as a result of more variance.

1) What kind of data are you working with?
2) What is your model?
3) How are you changing your model, and why do you have reason to believe that your covariate matrix will make the std smaller?

Message was edited by: TMorville

lvm · Posted 12-01-2010 04:12 PM

With the KR df option, almost anything can happen. I like the KR method for denominator df estimation with correlated data (and unbalanced data sets), but investigators must be aware that the df results depend on the model structure and the data (estimated variance-covariances for the random effects).
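
To make the point concrete, here is a rough sketch for a simplified setting, not the model fitted in this thread: an ordinary one-way ANCOVA with independent errors (closest to the runs with the repeated statement commented out), with N observations, p cell means, cell sizes n_i, and a single covariate x playing the role of SM. All symbols are introduced here for illustration only.

```latex
% Without the covariate: standard error of a cell mean
\mathrm{SE}(\bar{y}_i) = \sqrt{\frac{\hat{\sigma}^2}{n_i}},
\qquad
\hat{\sigma}^2 = \frac{\mathrm{SSE}}{N - p}

% With the covariate: LS-mean evaluated at the overall covariate mean \bar{x}
\mathrm{SE}(\bar{y}_{i,\mathrm{adj}}) =
\sqrt{\hat{\sigma}_c^2 \left[ \frac{1}{n_i} + \frac{(\bar{x}_i - \bar{x})^2}{S_{xx}} \right]},
\qquad
\hat{\sigma}_c^2 = \frac{\mathrm{SSE}_c}{N - p - 1},
\qquad
S_{xx} = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2
```

Although SSE_c can never exceed SSE, the residual variance estimate can still rise because one degree of freedom is spent on the slope, and the bracketed term is never smaller than 1/n_i. So when the covariate explains little of the residual variation, a small increase in the standard error, such as the roughly 0.5% to 1% differences reported in the first post, is possible even before the KR adjustment and the estimated covariance parameters come into play.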
