KLS
Calcite | Level 5

I am trying to create an estimate statement for a nested term. See model below...

I have 7 families, 6 isolates, and 2 lineages. The 6 isolates are nested within lineages. Isolates 1, 2, and 3 belong to the NA1 lineage, and Isolates 4, 5, and 6 belong to the EU1 lineage.

However, Isolate is random and Lineage is fixed.

 

Proc mixed data = new covtest;
CLASS Family Isolate Lineage Tree;
Model Lesion = Lineage /ddfm=KR outp=dat;
random Family Isolate(Lineage) Family*Isolate(Lineage);
run;

 

These are the estimate statements that I am attempting to run. I am getting estimates, but they don't really make sense. For example, the isolates with the smallest lesions get very large estimates.

 

estimate "EU1 4" intercept 1 Lineage 1 0 | Isolate(Lineage) 0 0 0 1 0 0 /CL;
estimate "EU1 5" intercept 1 Lineage 1 0 | Isolate(Lineage) 0 0 0 0 1 0 /CL;
estimate "EU1 6" intercept 1 Lineage 1 0 | Isolate(Lineage) 0 0 0 0 0 1 /CL;
estimate "NA1 1" intercept 1 Lineage 0 1 | Isolate(Lineage) 1 0 0 0 0 0 /CL;
estimate "NA1 2" intercept 1 Lineage 0 1 | Isolate(Lineage) 0 1 0 0 0 0 /CL;
estimate "NA1 3" intercept 1 Lineage 0 1 | Isolate(Lineage) 0 0 1 0 0 0 /CL;
run;

 

I am also trying to get estimates for Family and for the interaction term Family*Isolate(Lineage). I thought I did Family right (see below), but I'm not sure these make sense either.

 

estimate "1" intercept 1 | Family 1 0 0 0 0 0 0 0 0 0 0 0 0 0 ;
estimate "2" intercept 1 | Family 0 1 0 0 0 0 0 0 0 0 0 0 0 0 ;
estimate "3" intercept 1 | Family 0 0 1 0 0 0 0 0 0 0 0 0 0 0 ;
estimate "4A" intercept 1 | Family 0 0 0 1 0 0 0 0 0 0 0 0 0 0 ;
estimate "4B" intercept 1 | Family 0 0 0 0 1 0 0 0 0 0 0 0 0 0 ;
estimate "4C" intercept 1 | Family 0 0 0 0 0 1 0 0 0 0 0 0 0 0 ;
estimate "4D" intercept 1 | Family 0 0 0 0 0 0 1 0 0 0 0 0 0 0 ;
estimate "4E" intercept 1 | Family 0 0 0 0 0 0 0 1 0 0 0 0 0 0 ;
estimate "4F" intercept 1 | Family 0 0 0 0 0 0 0 0 1 0 0 0 0 0 ;
estimate "4G" intercept 1 | Family 0 0 0 0 0 0 0 0 0 1 0 0 0 0 ;
estimate "G5" intercept 1 | Family 0 0 0 0 0 0 0 0 0 0 1 0 0 0 ;
estimate "GR1" intercept 1 | Family 0 0 0 0 0 0 0 0 0 0 0 1 0 0 ;
estimate "GR2" intercept 1 | Family 0 0 0 0 0 0 0 0 0 0 0 0 1 0 ;
estimate "Gr3" intercept 1 | Family 0 0 0 0 0 0 0 0 0 0 0 0 0 1 ;
run;

 

Any ideas on how the interaction term Family*Isolate(Lineage) should be set up? Are these estimate statements correct?

1 ACCEPTED SOLUTION
sld
Rhodochrosite | Level 12

Your Isolate(Lineage) coefficients are in the wrong places.

 

Add the SOLUTION option to the RANDOM statement so that you can see the implicit order of the Isolate(Lineage) levels:

 

random Family Isolate(Lineage) Family*Isolate(Lineage) / solution;

estimate "EU1 4" intercept 1 Lineage 1 0 | Isolate(Lineage) 1 0 0 0 0 0 /CL;
estimate "EU1 5" intercept 1 Lineage 1 0 | Isolate(Lineage) 0 1 0 0 0 0 /CL;
estimate "EU1 6" intercept 1 Lineage 1 0 | Isolate(Lineage) 0 0 1 0 0 0 /CL;
estimate "NA1 1" intercept 1 Lineage 0 1 | Isolate(Lineage) 0 0 0 1 0 0 /CL;
estimate "NA1 2" intercept 1 Lineage 0 1 | Isolate(Lineage) 0 0 0 0 1 0 /CL;
estimate "NA1 3" intercept 1 Lineage 0 1 | Isolate(Lineage) 0 0 0 0 0 1 /CL;
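
Put together, the corrected pieces might sit in a single step like the sketch below. It assumes the data set NEW and the variable names from the original post; the data set name blups is hypothetical. The ODS OUTPUT statement saves the random-effect solutions, and the E option on ESTIMATE prints the L coefficients, so the position of each 1 can be checked against the printed order instead of guessed.

```sas
/* Sketch only: assumes data set NEW and the variables from the original post. */
ods output SolutionR=blups;   /* blups is a hypothetical data set name */

proc mixed data=new covtest;
  class Family Isolate Lineage Tree;
  model Lesion = Lineage / ddfm=kr outp=dat;
  /* SOLUTION lists the random-effect levels in the order ESTIMATE expects */
  random Family Isolate(Lineage) Family*Isolate(Lineage) / solution;
  /* ESTIMATE statements go before RUN; E prints the L coefficients */
  estimate "EU1 4" intercept 1 Lineage 1 0 | Isolate(Lineage) 1 0 0 0 0 0 / cl e;
  estimate "NA1 1" intercept 1 Lineage 0 1 | Isolate(Lineage) 0 0 0 1 0 0 / cl e;
run;

/* Inspect the saved solutions, including the Family*Isolate(Lineage) rows */
proc print data=blups;
run;
```

The same printed order should also show where a coefficient for a given Family*Isolate(Lineage) level would have to go, although that statement is not written out here.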

I expect you'll get estimates closer to what you obtained with Excel, although not exactly the same: your dataset is unbalanced, and these are shrinkage estimates.
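
To see why a shrinkage estimate need not match a raw average, consider a much simpler one-way random-effects model with known variance components. This is an assumption made purely for illustration; the model in this thread has three random terms, so the actual weights are more involved.

```latex
y_{ij} = \mu + a_i + e_{ij}, \qquad a_i \sim N(0, \sigma_a^2), \quad e_{ij} \sim N(0, \sigma_e^2)

\hat{\mu} + \hat{a}_i = \hat{\mu} + \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2 / n_i} \, \bigl( \bar{y}_{i\cdot} - \hat{\mu} \bigr)
```

The weight lies between 0 and 1, so the predicted value for level i is pulled from its raw mean toward the overall estimate, and more strongly when n_i is small. With unbalanced data the n_i differ, so the levels are shrunk by different amounts.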

 

An excellent resource is Chapter 9 (Best Linear Unbiased Prediction) in the text by Walt Stroup:

https://www.crcpress.com/Generalized-Linear-Mixed-Models-Modern-Concepts-Methods-and-Applications/St...

 

Hope this gets you closer to where you want to be.

 


8 REPLIES
PaigeMiller
Diamond | Level 26

"but they don't really make sense"

 

Explain. We have no idea what you mean.

 

Also, I haven't gone through the math in my head, but the estimates you are requesting ... aren't these just the least squares means? What happens if you request the least squares means?

--
Paige Miller
KLS
Calcite | Level 5

I mean that the averages that are calculated in Excel don't even come close to the estimates that are given in SAS. Isolate 2 is supposed to have a very small average since all of my observations are very small, but the estimate that SAS gives me is very large.

 

LSMEANS doesn't work for random effects. Is there another way to get least squares means for random effects? This is the only other way I know of to get the "averages" for each term.

 

I have moved the term into the fixed effects, and tried LSmeans and those averages make more sense, but I can't really use that data since the term is a random effect.

PaigeMiller
Diamond | Level 26

LSMeans on Random effects? That's not kosher!

 

While my rabbi would not approve, there's nothing stopping you from running the model again by rewriting the model as a fixed effects model and removing the RANDOM statement so you can get LSMeans. But you did that already ... did you modify the model properly? If you really want LSMeans on RANDOM effects, I don't know why you would say "... but I can't really use that data since the term is a random effect."
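
A minimal sketch of what "rewriting the model as a fixed effects model" could look like, assuming the same data set and variable names as in the original post. It moves every term to the MODEL statement, which is exactly what the original poster objects to on design grounds, so it is only a cross-check against raw averages and not a replacement for the mixed model.

```sas
/* Sketch only: all terms are treated as fixed and there is no RANDOM statement. */
proc mixed data=new;
  class Family Isolate Lineage;
  model Lesion = Lineage Isolate(Lineage) Family Family*Isolate(Lineage);
  /* LSMEANS is accepted here because Isolate(Lineage) is now a fixed effect */
  lsmeans Isolate(Lineage) / cl;
run;
```

If some Family by Isolate combinations have no observations, some of these least squares means may be reported as non-estimable.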

 

You said:

 

"I mean that the averages that are calculated in Excel don't even come close to the estimates that are given in SAS"

 

Explain. We have no idea what you mean.

 

You have to provide us with enough information so we can answer the underlying problem. Provide the numbers. Attach the raw data as a text file, with appropriate SAS code to read in the data so we can reproduce the problem. Don't just attach the raw data. We have to have SAS code to read it in. Do not attach Excel or other MS Office files.

 

And why should Excel's answer be the same as an LSMean? Excel doesn't compute LSMeans or even know about your statistical model.

--
Paige Miller
KLS
Calcite | Level 5

I can't make my terms fixed effects if they are random! I chose my isolates randomly, so Isolate is a random effect. My model is not accurate if I just put terms wherever I want, instead of where they should go. I will not get accurate answers if I throw terms wherever I want them to go. Maybe I should just throw out my nested effect since that is what is confusing. I can't do that since Isolate is nested within Lineage. 

 

I tried LSmeans just to see what I would get, but I had to move the term to be a fixed effect instead of random. But I can't use that data since those terms aren't actually fixed effects. LSmeans does not work on random effects. The model does not run. It does not work in SAS. Therefore, I can't use LSmeans to get estimates. 

 

I am trying to explain that the estimates that I am getting while using the estimate statement do not seem like they are correct. FOR EXAMPLE, if all of my observations are between 1-5 and my estimate is 853, that doesn't really make sense. That is what I mean. 

 

Attached is my data file. 

This is the model I am running. I want to know if I am writing my estimate statement for the nested term correctly.

 

Proc mixed data = new covtest;
CLASS Family Isolate Lineage Tree;
Model Lesion = Lineage /ddfm=KR outp=dat;
random Family Isolate(Lineage) Family*Isolate(Lineage);

estimate "EU1 4" intercept 1 Lineage 1 0 | Isolate(Lineage) 0 0 0 1 0 0 /CL;
estimate "EU1 5" intercept 1 Lineage 1 0 | Isolate(Lineage) 0 0 0 0 1 0 /CL;
estimate "EU1 6" intercept 1 Lineage 1 0 | Isolate(Lineage) 0 0 0 0 0 1 /CL;
estimate "NA1 1" intercept 1 Lineage 0 1 | Isolate(Lineage) 1 0 0 0 0 0 /CL;
estimate "NA1 2" intercept 1 Lineage 0 1 | Isolate(Lineage) 0 1 0 0 0 0 /CL;
estimate "NA1 3" intercept 1 Lineage 0 1 | Isolate(Lineage) 0 0 1 0 0 0 /CL;
run;

 

This is the output I am getting.

 

SAS Output

Estimates

Label   Estimate   Standard Error     DF   t Value   Pr > |t|   Alpha    Lower    Upper
EU1 4     4.8177           0.7590   4.11      6.35     0.0029    0.05   2.7329   6.9026
EU1 5     4.2300           0.7590   4.11      5.57     0.0047    0.05   2.1452   6.3149
EU1 6     6.3887           0.7590   4.11      8.42     0.0010    0.05   4.3039   8.4736
NA1 1     3.0041           0.7589   4.11      3.96     0.0158    0.05   0.9193   5.0889
NA1 2     3.9355           0.7590   4.11      5.19     0.0061    0.05   1.8507   6.0202
NA1 3     4.1254           0.7591   4.11      5.43     0.0051    0.05   2.0406   6.2102

PaigeMiller
Diamond | Level 26

"I can't make my terms fixed effects if they are random! I chose my isolates randomly, so Isolate is a random effect. My model is not accurate if I just put terms wherever I want, instead of where they should go. I will not get accurate answers if I throw terms wherever I want them to go. Maybe I should just throw out my nested effect since that is what is confusing. I can't do that since Isolate is nested within Lineage."

 

Maybe you shouldn't be trying to get the equivalent of LSMeans on random effects. I think this is the real problem. Why do you want these linear combinations of effects anyway?

 

"Attached is my data file.

This is the model I am running. I want to know if I am writing my estimate statement for the nested term correctly."

 

Please read what I said earlier about providing data.

 

 

--
Paige Miller
KLS
Calcite | Level 5

"Maybe you shouldn't be trying to get the equivalent of LSMeans on random effects. I think this is the real problem. Why do you want these linear combinations of effects anyway?"

 

Really? I need the estimates in order to publish my research. I want to get the interaction to show that there is a Family by Isolate(Lineage) interaction, but I need to get Isolate(Lineage) to work first.

 

Attached is a text document of my data. I don't know how else to give you the SAS code.

sld
Rhodochrosite | Level 12

For future reference, here is a process by which you can provide data to the community:

https://blogs.sas.com/content/sastraining/2016/03/11/jedi-sas-tricks-data-to-data-step-macro/

 

Most people dislike downloading MS Office files (e.g., Excel or Word) with unknown content. And a text file (e.g., .txt or .csv) of data does not include the information that might be needed to import it into SAS.
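
For illustration, a post that carries its own data could look something like the sketch below: a DATA step with the values inline, so anyone can paste and run it. The variable names come from the CLASS and MODEL statements earlier in the thread; the variable types and every data row are invented placeholders, not the poster's data.

```sas
/* Sketch only: the rows below are invented placeholders, not the real data. */
data new;
  /* Family and Lineage are read as character; the types of Isolate and Tree are assumed */
  input Family $ Isolate Lineage $ Tree Lesion;
  datalines;
1   1 NA1 1 2.1
4A  4 EU1 1 3.4
GR1 6 EU1 2 5.0
;
run;
```

A step like this can be pasted straight into a reply, which avoids both the attachment problem and the missing import information.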

 
