CMC-Stan
Fluorite | Level 6

I'm looking for a way to calculate adjusted degrees of freedom for a linear combination of variance components, with possible correlations. I know that there can be some distributional issues with this, so at best it's an approximate solution. The ESTIMATE statement in PROC MIXED does not work for this.

Rick_SAS
SAS Super FREQ

Please post the code that you are using (even though it does not work) so we can see the model you are fitting and the estimates you are trying to make. If you have sample data, that's even better.

CMC-Stan
Fluorite | Level 6
Rick, I did discuss this with Jill Tao, and she mentioned that the calculation I want to make is not available with the ESTIMATE statement. So I was wondering if someone had looked at this. I can send you the code, but it doesn't show the linear combination of fixed and variance components that I want to carry out. If you send me your email, I can send you the expression on a PowerPoint slide. I can't copy the equations here. My email is saltan@its,jnj.com.

Rick_SAS
SAS Super FREQ

Sorry, I don't give personal advice. Post the code here if you want someone to help.

 

Jill is top notch, so if she says that the ESTIMATE statement doesn't cover your use case, you can believe her. Sometimes, this situation is because you are trying to estimate a nonlinear function of the parameter. If you know the expression that you are trying to estimate, you might try PROC NLMIXED. For an example, see https://blogs.sas.com/content/iml/2018/10/17/parameter-estimates-for-different-parameterizations.htm... 

CMC-Stan
Fluorite | Level 6

The problem is that I'm trying to describe an inference space that is not covered by SAS. SAS has broad, intermediate, and narrow inference spaces, but the one I am trying to describe does not fit into this trichotomy. As a simple example, suppose I have a mixed model with a single variance component, v_b, plus residual error, v_r. Given a linear combination of the fixed effects, L'B0, I want an interval estimate based on Var(L'B0_hat) + v_b_hat. It's not a confidence interval; it's a population interval for the population of units belonging to random factor b. I know I'm not describing this very clearly in words, but I cannot copy notation to this page.

Rick_SAS
SAS Super FREQ

I think your chances of obtaining an answer will be improved if you post some sample data, show the model you are considering, and then ask the question in terms of the variables in the problem. 

 

> I'm trying to describe an inference space that is not covered by SAS.

That remains to be seen. The first step is to define the estimator. Even if there is not a built-in procedure that can estimate the interval, SAS supports programming languages such as SAS/IML that can implement any well-formulated algorithm.

 

> I don't know how to describe this in words very clearly 

If it is easier, perhaps you could link to a journal article or textbook in which your problem is described. 

 

CMC-Stan
Fluorite | Level 6

Here's some sample data and proc mixed code that describes the model and a provisional ESTIMATE statement. 

 

/* read in raw data for 9 batches */ 

data a0 ; 
input Month B01 B02 B05 B06 B08 B09 B10 B12 B20 ; 
datalines ; 
0  103.7 103.7 103.7 100.8 101.2 100.1  99.9 102.7 101.5
3   99.4 103.1 100.9 100.8  98.1 101.3 101.6 101.7 104.7
6  102.4 102.4 102.4  93.4  99.9 100.1  95.1  99.4 100.8
9  101.0  98.9  99.7 101.2  98.2  98.4  99.5  98.7  99.9
12 100.0 101.2  99.3  96.5  92.7  92.3  98.1  98.7  97.8
18  99.2  91.7  98.7  94.3  95.7  96.1  95.9 100.1  95.1
24  95.5  96.2  94.3  93.4  94.4  92.8  96.1  94.8  95.9
; 
run; 

proc print data=a0 ; 
title1 "Raw Data 9 Batches Page 8/22" ;
run ; 

/* Create vector of responses by batch */ 

data a1 ; set a0 ; 
array batx (9) B01 B02 B05 B06 B08 B09 B10 B12 B20 ;
array baty (9) $3 ('B01' 'B02' 'B05' 'B06' 'B08' 'B09' 'B10' 'B12' 'B20') ; 
do i = 1 to 9 ; 
y = batx(i) ; 
batch =baty(i) ; 
output; 
end ; 
keep batch month y; 
run ;   

proc print data=a1 ; 
title1 "Data for analysis" ;
run; 

proc mixed data=a1 nobound covtest ranks mmeq mmeqsol ;
   class  batch ;
   model y = month / solution covb outpm=out1 alphap=0.10 ddfm=kr ; 
   random intercept / subject=batch;
   estimate "A2 - Process Model 12 months Including batch component"
            intercept 1 month 12 | Intercept 1
            / cl alpha=0.10 e  ;
  ods output mmeqsol=invmmeq coef=e;
  title1 "Mixed regression model with random batch intercept only";
  title2 "Batch mean and Product calculations for 0.50 OOS, 95% Coverage" ;
run;

 

The point estimate from the ESTIMATE statement is correct. But what I want is 

      sqrt[Var(InterceptF + 12*Month) + Var(InterceptR)] 

where Var=Variance,  InterceptF is the fixed intercept estimate, Month is the fixed time covariate and InterceptR is the estimate of the random component on intercept due to Batch.  

 

This is not really a standard error as such, it's the standard deviation of the population of batches at time 12 months. The challenge here is to get the Satterthwaite adjusted degrees of freedom for this sum of variances. Actually this is the simple case. If we add the random component due to batch on the time covariate, Month, and assume an unstructured covariance matrix, then we have the sum of 3 variances where 2 are correlated. In addition, note that Var(InterceptF + 12*Month) is not a chi-square.    
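For reference, the usual Satterthwaite-type approximation for a quantity like this can be sketched as below. The symbols are assumptions introduced for illustration and are not from the post: θ is the vector of covariance parameters (here v_b and v_r), A is the asymptotic covariance matrix of their estimates (what the ASYCOV option in PROC MIXED prints), Q is the target variance, and g is the gradient of Q with respect to θ. The approximation comes from matching Var(Q_hat) to that of a scaled chi-square, for which Var(Q_hat) = 2Q^2/ν.

```latex
Q(\theta) = L'\left(X' V(\theta)^{-1} X\right)^{-1} L + v_b ,
\qquad
g = \left.\frac{\partial Q}{\partial \theta}\right|_{\theta = \hat\theta} ,
\qquad
\nu \approx \frac{2\, Q(\hat\theta)^{2}}{g' A \, g}
```

Under this form, correlation between the estimated variance components enters through the off-diagonal elements of A, and the fact that Var(L'B0_hat) is itself a function of the variance components enters through g.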

SteveDenham
Jade | Level 19

Not real sure how this is going to work.  It appears to me that you could get Var(interceptR) by adding solution as an option for the random statement.  From the BLUP, you could then calculate the combined variance.
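A minimal sketch of that combined-variance calculation, assuming the data set a1 and the model from the earlier post. The fixed-only ESTIMATE label, the data set names est, cp, and popsd, and the variables se_fixed, v_b, and sd_pop are illustrative choices, not something from the thread. Note also that this sketch reads the batch variance from the CovParms table rather than from the BLUPs.

```sas
/* Sketch: refit the posted model and save the two pieces of the combined variance */
proc mixed data=a1 nobound;
   class batch;
   model y = month / solution ddfm=kr;
   random intercept / subject=batch;
   /* Fixed-effects part only, so StdErr**2 estimates Var(InterceptF + 12*Month). */
   /* With ddfm=kr the reported StdErr includes the Kenward-Roger adjustment.     */
   estimate "Fixed part at 12 months" intercept 1 month 12;
   ods output Estimates=est CovParms=cp;
run;

data popsd;
   /* one row from the Estimates table, retained across the step */
   if _n_ = 1 then set est(keep=StdErr rename=(StdErr=se_fixed));
   /* batch (random intercept) variance component */
   set cp(where=(CovParm = 'Intercept') rename=(Estimate=v_b));
   /* SD of the population of batches at 12 months; missing if the sum is negative */
   sd_pop = sqrt(se_fixed**2 + v_b);
   keep se_fixed v_b sd_pop;
run;

proc print data=popsd;
run;
```

This gives only the point value; it does not supply degrees of freedom for it.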

 

SteveDenham

(this is a real shot in the dark, though)

CMC-Stan
Fluorite | Level 6
Yes, that could be done, and in fact it is what I would do. The problem is getting degrees of freedom for that quantity so that an uncertainty limit can be put on it. This is one of the reasons why a frequentist approach to this kind of problem is not good. You have to deal with these kinds of complications.
lvm
Rhodochrosite | Level 12

I see you are trying to calculate prediction intervals rather than confidence intervals. These are not straightforward for mixed models in SAS. I usually do this "brute force", but I have not dealt with the complexity of Satterthwaite or Kenward-Roger df adjustments (although I am a huge fan of KR adjustments, in general). These calculations can be quite challenging, but I think there may be some IML programs out there to do this (but maybe not for your problem). 

 

Your question made me think of the following article that may be of help. I have not studied it -- it is sitting on my desk to study at some point. It deals with prediction intervals for mixed models, and the online supplement has extensive SAS code. 

 

https://www.tandfonline.com/doi/full/10.1080/19466315.2020.1776762

 

lvm
Rhodochrosite | Level 12

And I just remembered this article that deals with prediction intervals for mixed models, also with extensive SAS code in the online supplement. I know that it addresses Satterthwaite.

https://onlinelibrary.wiley.com/doi/epdf/10.1002/sim.8386

 

SteveDenham
Jade | Level 19

This one isn't behind a paywall, and looks like it covers much of the same material as the other Francq paper.

 

SteveDenham

CMC-Stan
Fluorite | Level 6
Thank you for that reference. I am familiar with the Francq paper, he lays out the algebra in a clear way. Unfortunately he doesn't address the df question.
SteveDenham
Jade | Level 19

If you have 3 components, you might try doing 3 stepwise Satterthwaite calculations (A with B, and then the result with C; A with C, and then the result with B; and B with C, and then the result with A), and then averaging those to get approximate df.  Sounds a bit brute force, though.
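A small sketch of the stepwise idea, using the classical Satterthwaite formula for a sum of independent variance estimates. All numbers and variable names are hypothetical and chosen only for illustration. With this formula, (a + b)**2/df_ab equals a**2/df_a + b**2/df_b, so combining A with B and then with C reproduces the one-shot three-component result. The same holds for the other two orders, which means the average of the three equals the one-shot value. The order can only matter if the pairwise step is modified to account for correlation between components.

```sas
data satt;
   /* hypothetical variance estimates and their degrees of freedom */
   a = 2.0;  df_a = 8;
   b = 1.5;  df_b = 20;
   c = 0.7;  df_c = 54;

   /* one-shot Satterthwaite df for a + b + c (components assumed independent) */
   df_all = (a + b + c)**2 / (a**2/df_a + b**2/df_b + c**2/df_c);

   /* stepwise: A with B first, then that result with C */
   df_ab   = (a + b)**2 / (a**2/df_a + b**2/df_b);
   df_ab_c = (a + b + c)**2 / ((a + b)**2/df_ab + c**2/df_c);

   /* write both values to the log for comparison */
   put df_all= df_ab_c=;
run;
```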

 

SteveDenham
