Hello, I know there are similar questions on this forum, but being a SAS noob, I have been unable to get anything to work for my particular use case. Below I provide a simple example dataset that illustrates the issue. I have an "unbalanced" dataset with 3 species (sp1, sp2, and sp3) for which biomass was measured. sp1 was measured in months 6, 7, 8, and 9, while sp2 and sp3 were measured only in months 7, 8, and 9, not in month 6.
I understand why the lsmeans statement returns non-estimable for species 2 and 3: you cannot get a least-squares mean over empty cells. However, I want to write an estimate statement to get means for sp2 and sp3 over months 7, 8, and 9 only, ignoring month 6. I still get non-estimable in that case. I don't know whether I am writing the statement incorrectly or misunderstanding what the estimate statement does.
Thanks for your help!
DATA example;
INPUT pasture $ block $ month $ year $ species $ biomass;
cards;
3 1 6 2021 sp1 745
6 2 6 2021 sp1 1011
9 3 6 2021 sp1 1129
3 1 7 2021 sp1 1612
6 2 7 2021 sp1 2034
9 3 7 2021 sp1 2210
3 1 8 2021 sp1 2795
6 2 8 2021 sp1 2975
9 3 8 2021 sp1 2699
3 1 9 2021 sp1 499
6 2 9 2021 sp1 624
9 3 9 2021 sp1 759
1 1 7 2021 sp2 2422
2 1 7 2021 sp3 2685
4 2 7 2021 sp3 1276
5 2 7 2021 sp2 2303
7 3 7 2021 sp2 2269
8 3 7 2021 sp3 784
1 1 8 2021 sp2 6833
2 1 8 2021 sp3 3604
4 2 8 2021 sp3 2231
5 2 8 2021 sp2 8447
7 3 8 2021 sp2 6303
8 3 8 2021 sp3 1656
1 1 9 2021 sp2 3675
2 1 9 2021 sp3 1129
4 2 9 2021 sp3 1552
5 2 9 2021 sp2 6495
7 3 9 2021 sp2 5575
8 3 9 2021 sp3 647
;
proc glimmix data=example;
class species month pasture block;
model biomass=species|month / e dist=gamma link=log s;
random pasture(block);
lsmeans species; * Species 2 and 3 are not estimable which makes sense because they have no data for month 6;
estimate "Species 2, only months 7-9" species 0 1 0 month 0 1 1 1; * Also not estimable but I don't understand why not because I am not averaging over any empty cells;
run;
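One possible way to average only the non-empty cells, offered as a sketch and not as the confirmed answer to this thread. The estimate statement above puts coefficients only on the species and month main effects. In a model with an intercept and a species*month interaction, a mean also needs matching coefficients on those terms, which is probably why it comes back non-estimable. An alternative is to ask for a linear combination of the species*month cell means with lsmestimate. The bracketed nonpositional syntax [multiplier, species-level month-level] is my reading of the GLIMMIX documentation. The level numbers assume species sorts as sp1, sp2, sp3 and month as 6, 7, 8, 9 in the Class Level Information table, so check that table before trusting the result.

proc glimmix data=example;
class species month pasture block;
model biomass=species|month / dist=gamma link=log;
random pasture(block);
* [1, 2 2] = multiplier 1 for species level 2 (sp2) and month level 2 (month 7), and so on for months 8 and 9;
* divisor=3 turns the sum of the three cell means into their average;
lsmestimate species*month "Species 2, only months 7-9" [1, 2 2] [1, 2 3] [1, 2 4] / divisor=3 cl ilink;
run;

Note that the average is taken on the log link scale and ilink back-transforms it, so it is not the same as an arithmetic mean of the three cell means on the data scale. For sp3, the same pattern would use species level 3 in each bracket.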
Thank you for this very helpful comment! It got me 99% of the way there. The bylevel option did not give me exactly what I wanted, because I wanted the lsmean of each species across only the month 7-9 cells. Using bylevel gives the correct lsmean for species 2 and 3 but not for species 1. But your suggestion to use OM inspired me to look in the documentation. I found that if I create a new dataset that does not include month 6 and pass it as the obsmargins dataset in proc plm, I get the lsmeans I was looking for. See the code below.
* Copy of the data without month 6, used only to define the margins for lsmeans;
data nojune; set example;
where month ne '6';
run;
proc glimmix data=example;
class species month pasture block;
model biomass=species|month / dist=gamma link=log;
random pasture(block);
store out=examplemod;
run;
proc plm restore=examplemod;
lsmeans species / cl obsmargins=nojune pdiff adjust=tukey;
run;
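Since the obsmargins dataset decides which cells the lsmeans are averaged over, it may be worth confirming what nojune actually contains. A minimal check, assuming the nojune dataset created above: the cross-tabulation should show only months 7, 8, and 9, with 3 observations in every species-by-month cell.

proc freq data=nojune;
* Cell counts only, without row, column, or overall percentages;
tables species*month / norow nocol nopercent;
run;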