charles_pignon1
Calcite | Level 5

Hi everyone,

I am running into problems when using the weight statement in proc mixed. I am trying to run a simple mixed model with two fixed effects (species and canopy_position) and their interaction, along with one random effect. Because of unequal variances between canopy_position groups, I want to run this model with an unstructured variance/covariance matrix, obtained from the repeated statement.

The values for phico2max are associated with some variability. To account for this I introduced the weight statement with the variable phico2max_weight, equal to (1/individual_variance)/(1/total_variance). This gives more power to the more precise (i.e. lower individual_variance) values of phico2max.

This model works when the weight statement is present but the random and repeated statements are left out, or when the random and/or repeated statements are present but the weight statement is left out. If the weight statement is present together with the random or repeated statements, the model fails and the following error message appears in the log:

NOTE: An infinite likelihood is assumed in iteration 0 because of a nonpositive
      definite estimated R matrix for Subject 1.

How can I get the weight statement to function with a random statement and unstructured variance/covariance?

proc mixed data=aq;
  class species canopy_position year;
  model phico2max = species|canopy_position;
  repeated / type=un;
  weight phico2max_weight;
  random year;
run;

1 ACCEPTED SOLUTION

SteveDenham
Jade | Level 19

My belief is that the sum of the weighting factors does not have to equal 1.  What happens when you use 1/individual_variance as the weight?  Do you still obtain an infinite likelihood error?

Use of type=un in the repeated statement will lead to estimating a covariance between the canopy_position levels, and may be the cause of the infinite likelihood, possibly because of the number of parameters being estimated. What about trying:

repeated /group=canopy_position;

which would estimate a separate residual variance for each level of canopy_position.  This is a more common way of modeling heteroskedasticity.

Steve Denham
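
Putting the two suggestions together, the revised call might look like the sketch below. It assumes the data set aq carries a column holding the squared standard error of each slope; the column name individual_variance and the names aq_w and w are hypothetical and do not appear in the original code.

```sas
/* Sketch only: individual_variance, aq_w and w are assumed names. */
data aq_w;
  set aq;
  /* Inverse-variance weight, left unscaled (it need not sum to 1). */
  if individual_variance > 0 then w = 1 / individual_variance;
run;

proc mixed data=aq_w;
  class species canopy_position year;
  model phico2max = species|canopy_position;
  random year;                       /* year treated as a random effect */
  repeated / group=canopy_position;  /* one residual variance per canopy position */
  weight w;
run;
```

With group=canopy_position, the covariance parameter estimates table should list one residual variance per canopy position in addition to the year variance.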

5 REPLIES
SteveDenham
Jade | Level 19

Walk me through the design, and see if I have it correct. You have multiple measures on phico2max, which have been summarized from individual measures, so that you know the total variation and the variation for the measure at that point. If so, then the weight should be 1/individual_variance, if you want unbiased estimates from the procedure. The total_variance will be estimated by the model.

I suspect the infinite likelihood is due to multiple measures within a subject, and you have not specified the subject= option in the repeated statement to disambiguate these multiple measurements. So, one more design question: what is the experimental unit here that has repeated measures? I believe that needs to be captured in your code.

Steve Denham

charles_pignon1
Calcite | Level 5

phico2max is a slope derived from a linear regression of a Y variable on an X variable. To simplify the model, neither Y nor X is included; instead, I am running my statistical analysis on the values of phico2max. Therefore each value of phico2max appears only once and has a unique identifier, and there are no repeated measures. The model is then simply phico2max = μ + canopy_position + species + canopy_position*species + year + ε

There is replication within each canopy_position, each species, and each year.

The weight statement is used because each value of phico2max was obtained from a linear regression, and so is associated with a standard error: I want to give more weight to slopes associated with a small standard error, hence the (1/individual_variance)/(1/total_variance).

The unstructured variance/covariance matrix is used because residual variance is unequal between canopy_positions.

The random statement is used because year is considered random.
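
Written out, the model described in this post might look as follows. This is a sketch under assumed notation: the indices i (species), j (canopy position), k (year), l (replicate) and the symbols u, w, and σ are introduced here for illustration, and a separate residual variance per canopy position is used as one way to express the unequal-variance requirement.

```latex
y_{ijkl} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + u_k + \varepsilon_{ijkl},
\qquad
u_k \sim N\!\left(0, \sigma_u^2\right),
\qquad
\varepsilon_{ijkl} \sim N\!\left(0, \frac{\sigma_j^2}{w_{ijkl}}\right)
```

Here w_{ijkl} is the weight attached to each slope, so a more precise slope (larger weight) is given a smaller residual variance.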

Using 1/individual_variance as the weighting factor actually allowed the code to run correctly, but I believe that for the weighting factor to be correct the sum of all weighting factors must be equal to 1. This isn't the case if the weighting factor is 1/individual_variance. Does my issue lie in the math, or in the code?

lvm
Rhodochrosite | Level 12

As Steve says, there is no reason to rescale the weights. Just use 1/(individual_variance).
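
One way to see why rescaling is unnecessary is the weighted least squares estimator, sketched here under assumed notation (X is the design matrix, W the diagonal matrix of weights, y the response, and c any positive constant; none of these symbols come from the thread):

```latex
\hat{\beta}_{cW}
= \left(X^\top (cW) X\right)^{-1} X^\top (cW)\, y
= \frac{1}{c}\left(X^\top W X\right)^{-1} c\, X^\top W y
= \left(X^\top W X\right)^{-1} X^\top W y
= \hat{\beta}_{W},
\qquad c > 0 .
```

The constant cancels, so the fixed-effect estimates do not depend on whether the weights sum to 1. In a weighted model the same reasoning suggests that the constant is absorbed into the estimated residual variance, which changes by the factor c.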

Using repeated /type=un will just give you a single variance (even though you specified unstructured). This is because you do not have a subject= option. Without this option, each observation is a single subject. If your goal is to have a different residual variance for each canopy position, then use the code given by Steve:

repeated / group = canopy_position;

charles_pignon1
Calcite | Level 5

Hello,

I looked back into this and you are both right: 1/(individual_variance) works. I was concerned that using weight values that did not sum to 1 would bias the parameter estimates; however, it turns out this is not an issue in regression analysis. Modeling a separate residual variance for each canopy_position is also closer to what I wanted to do (instead of a general unstructured variance-covariance matrix). Thanks for your help!

Charles
