<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: LSMEANS oddity in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/LSMEANS-oddity/m-p/24239#M850</link>
    <description>Hey Peter,&lt;BR /&gt;
&lt;BR /&gt;
Look at this (code added to your original):&lt;BR /&gt;
&lt;BR /&gt;
proc format;&lt;BR /&gt;
value racfmt 1 = 'Black'&lt;BR /&gt;
2 = 'White'&lt;BR /&gt;
3 = 'Latino';&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data today;&lt;BR /&gt;
input catv1 catv2 catv3 @@;&lt;BR /&gt;
dv = catv1 * 3 + catv2 * 5 + catv3 + rannor(123);&lt;BR /&gt;
format catv3 racfmt.;&lt;BR /&gt;
datalines;&lt;BR /&gt;
0 1 2 0 1 1 0 1 3 1 0 1 0 0 3 0 1 3 0 1 2 1 0 3 1 0 1 0 1 2 0 0 3 &lt;BR /&gt;
1 1 2 1 1 1 0 1 3 1 1 1 1 0 3 0 1 3 1 1 2 1 0 3 1 0 1 0 1 2 1 1 3 &lt;BR /&gt;
0 0 2 1 0 1 1 1 3 0 0 1 0 0 3 0 0 3 0 0 2 1 0 3 1 0 1 0 1 2 1 1 3 &lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
title 'Version with all on CLASS statement';&lt;BR /&gt;
proc glm data = today;&lt;BR /&gt;
class catv1 catv2 catv3;&lt;BR /&gt;
model dv = catv1 catv2 catv3;&lt;BR /&gt;
lsmeans catv1 catv2 catv3;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
title 'Version with only race on CLASS statement';&lt;BR /&gt;
proc glm data = today;&lt;BR /&gt;
class catv3;&lt;BR /&gt;
model dv = catv1 catv2 catv3;&lt;BR /&gt;
lsmeans catv3;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;B&gt;proc means data=today;&lt;BR /&gt;
var dv catv1 catv2 catv3;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
title 'Version with only race on CLASS statement, with at=0.5';&lt;BR /&gt;
title2 'Results same as all on CLASS statement';&lt;BR /&gt;
proc glm data = today;&lt;BR /&gt;
class catv3;&lt;BR /&gt;
model dv = catv1 catv2 catv3;&lt;BR /&gt;
lsmeans catv3/at (catv1 catv2)=(0.5 0.5);&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
title 'Version with only race on CLASS statement, with at=mean values for catv1 and catv2';&lt;BR /&gt;
title2 'Results same as only race on CLASS statement';&lt;BR /&gt;
proc glm data = today;&lt;BR /&gt;
class catv3;&lt;BR /&gt;
model dv = catv1 catv2 catv3;&lt;BR /&gt;
lsmeans catv3/at (catv1 catv2)=(0.48484848485 0.5151515151515);&lt;BR /&gt;
run;&lt;/B&gt;&lt;BR /&gt;
&lt;BR /&gt;
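So the fitted model is the same either way; the difference is in how LSMEANS averages over catv1 and catv2.  When they are on the CLASS statement, the lsmeans for catv3 give each level of catv1 and catv2 equal weight, which for a 0-1 variable amounts to evaluating it at 0.5 (this was the whole point of Searle, Speed and Milliken (1980), I think).  When they are left off the CLASS statement, they are treated as continuous covariates and the lsmeans are calculated at their sample means (about 0.485 and 0.515 here).&lt;BR /&gt;
&lt;BR /&gt;
I would bet that the large differences in your real data arise from substantially unequal class sizes, which pull the means of the 0-1 variables well away from 0.5.&lt;BR /&gt;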
&lt;BR /&gt;
Good luck.</description>
    <pubDate>Tue, 28 Apr 2009 12:05:32 GMT</pubDate>
    <dc:creator>SteveDenham</dc:creator>
    <dc:date>2009-04-28T12:05:32Z</dc:date>
    <item>
      <title>LSMEANS oddity</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/LSMEANS-oddity/m-p/24238#M849</link>
      <description>I ran into this oddity at work.... I can't show the real data, but I made up some (see below).&lt;BR /&gt;
In this example, the differences are quite small, but in the data at work they were not so small.&lt;BR /&gt;
&lt;BR /&gt;
Suppose you have a data set with a dependent variable and some categorical variables.  Some of these&lt;BR /&gt;
are coded 0-1, some have several levels.  I was under the impression that with 0-1 variables, it did&lt;BR /&gt;
not matter whether you included them on the CLASS statement, and, indeed, the parameter estimates are&lt;BR /&gt;
identical for the two versions below.  But LSMEANS are not identical, even for the variable with more &lt;BR /&gt;
than 2 levels, which is always on the CLASS statement.&lt;BR /&gt;
&lt;BR /&gt;
So ...&lt;BR /&gt;
&lt;BR /&gt;
&lt;B&gt;proc format;&lt;BR /&gt;
 value racfmt 1 = 'Black'&lt;BR /&gt;
              2 = 'White'&lt;BR /&gt;
              3 = 'Latino';&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data today;&lt;BR /&gt;
  input catv1 catv2 catv3 @@;&lt;BR /&gt;
  dv = catv1 * 3 + catv2 * 5 + catv3 + rannor(123);&lt;BR /&gt;
  format catv3 racfmt.;&lt;BR /&gt;
  datalines;&lt;BR /&gt;
  0 1 2  0 1 1  0 1 3  1 0 1   0 0 3  0 1 3  0 1 2  1 0 3  1 0 1  0 1 2  0 0 3 &lt;BR /&gt;
  1 1 2  1 1 1  0 1 3  1 1 1   1 0 3  0 1 3  1 1 2  1 0 3  1 0 1  0 1 2  1 1 3 &lt;BR /&gt;
  0 0 2  1 0 1  1 1 3  0 0 1   0 0 3  0 0 3  0 0 2  1 0 3  1 0 1  0 1 2  1 1 3 &lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
title 'Version with all on CLASS statement';&lt;BR /&gt;
proc glm data = today;&lt;BR /&gt;
 class catv1 catv2 catv3;&lt;BR /&gt;
 model dv = catv1 catv2 catv3;&lt;BR /&gt;
 lsmeans catv1 catv2 catv3;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
title 'Version with only race on CLASS statement';&lt;BR /&gt;
proc glm data = today;&lt;BR /&gt;
 class catv3;&lt;BR /&gt;
 model dv = catv1 catv2 catv3;&lt;BR /&gt;
 lsmeans catv3;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;/B&gt;and there are differences ....&lt;BR /&gt;
&lt;BR /&gt;
I understand the models are parameterized a bit differently, with different intercepts, but&lt;BR /&gt;
shouldn't LSMEANS be the same?  And which are 'correct'?</description>
      <pubDate>Tue, 28 Apr 2009 01:27:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/LSMEANS-oddity/m-p/24238#M849</guid>
      <dc:creator>plf515</dc:creator>
      <dc:date>2009-04-28T01:27:41Z</dc:date>
    </item>
    <item>
      <title>Re: LSMEANS oddity</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/LSMEANS-oddity/m-p/24239#M850</link>
      <description>Hey Peter,&lt;BR /&gt;
&lt;BR /&gt;
Look at this (code added to your original):&lt;BR /&gt;
&lt;BR /&gt;
proc format;&lt;BR /&gt;
value racfmt 1 = 'Black'&lt;BR /&gt;
2 = 'White'&lt;BR /&gt;
3 = 'Latino';&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data today;&lt;BR /&gt;
input catv1 catv2 catv3 @@;&lt;BR /&gt;
dv = catv1 * 3 + catv2 * 5 + catv3 + rannor(123);&lt;BR /&gt;
format catv3 racfmt.;&lt;BR /&gt;
datalines;&lt;BR /&gt;
0 1 2 0 1 1 0 1 3 1 0 1 0 0 3 0 1 3 0 1 2 1 0 3 1 0 1 0 1 2 0 0 3 &lt;BR /&gt;
1 1 2 1 1 1 0 1 3 1 1 1 1 0 3 0 1 3 1 1 2 1 0 3 1 0 1 0 1 2 1 1 3 &lt;BR /&gt;
0 0 2 1 0 1 1 1 3 0 0 1 0 0 3 0 0 3 0 0 2 1 0 3 1 0 1 0 1 2 1 1 3 &lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
title 'Version with all on CLASS statement';&lt;BR /&gt;
proc glm data = today;&lt;BR /&gt;
class catv1 catv2 catv3;&lt;BR /&gt;
model dv = catv1 catv2 catv3;&lt;BR /&gt;
lsmeans catv1 catv2 catv3;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
title 'Version with only race on CLASS statement';&lt;BR /&gt;
proc glm data = today;&lt;BR /&gt;
class catv3;&lt;BR /&gt;
model dv = catv1 catv2 catv3;&lt;BR /&gt;
lsmeans catv3;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;B&gt;proc means data=today;&lt;BR /&gt;
var dv catv1 catv2 catv3;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
title 'Version with only race on CLASS statement, with at=0.5';&lt;BR /&gt;
title2 'Results same as all on CLASS statement';&lt;BR /&gt;
proc glm data = today;&lt;BR /&gt;
class catv3;&lt;BR /&gt;
model dv = catv1 catv2 catv3;&lt;BR /&gt;
lsmeans catv3/at (catv1 catv2)=(0.5 0.5);&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
title 'Version with only race on CLASS statement, with at=mean values for catv1 and catv2';&lt;BR /&gt;
title2 'Results same as only race on CLASS statement';&lt;BR /&gt;
proc glm data = today;&lt;BR /&gt;
class catv3;&lt;BR /&gt;
model dv = catv1 catv2 catv3;&lt;BR /&gt;
lsmeans catv3/at (catv1 catv2)=(0.48484848485 0.5151515151515);&lt;BR /&gt;
run;&lt;/B&gt;&lt;BR /&gt;
&lt;BR /&gt;
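So the fitted model is the same either way; the difference is in how LSMEANS averages over catv1 and catv2.  When they are on the CLASS statement, the lsmeans for catv3 give each level of catv1 and catv2 equal weight, which for a 0-1 variable amounts to evaluating it at 0.5 (this was the whole point of Searle, Speed and Milliken (1980), I think).  When they are left off the CLASS statement, they are treated as continuous covariates and the lsmeans are calculated at their sample means (about 0.485 and 0.515 here).&lt;BR /&gt;
&lt;BR /&gt;
I would bet that the large differences in your real data arise from substantially unequal class sizes, which pull the means of the 0-1 variables well away from 0.5.&lt;BR /&gt;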
&lt;BR /&gt;
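If you want to see the two weightings side by side, one way (just a sketch, not the only way to do it) is to rebuild a single lsmean by hand with ESTIMATE statements. The labels below are made up for illustration, and 0.4848485 and 0.5151515 are the means 16/33 and 17/33 of catv1 and catv2 in the example data, rounded, so expect agreement with the LSMEANS output only up to that rounding.&lt;BR /&gt;
&lt;BR /&gt;

```sas
/* Sketch: rebuild the lsmean for catv3 = Black by hand.              */
/* Labels are illustrative; 0.4848485 and 0.5151515 are 16/33 and     */
/* 17/33, the rounded sample means of catv1 and catv2 in data today.  */
title 'LSMEANS for Black rebuilt with ESTIMATE statements';
proc glm data = today;
class catv3;
model dv = catv1 catv2 catv3 / solution;
/* covariates fixed at 0.5: should agree with the all-on-CLASS lsmean */
estimate 'Black, catv1 and catv2 at 0.5' intercept 1 catv1 0.5 catv2 0.5 catv3 1 0 0;
/* covariates at their sample means: should agree with the default lsmean */
estimate 'Black, catv1 and catv2 at means' intercept 1 catv1 0.4848485 catv2 0.5151515 catv3 1 0 0;
run;
quit;
```

&lt;BR /&gt;
The coefficients 1 0 0 on catv3 assume Black is the first level, which it is here whether the levels are ordered by formatted or by internal value. For the other two races, check the order in the Class Level Information table before writing the coefficients.&lt;BR /&gt;
&lt;BR /&gt;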
Good luck.</description>
      <pubDate>Tue, 28 Apr 2009 12:05:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/LSMEANS-oddity/m-p/24239#M850</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2009-04-28T12:05:32Z</dc:date>
    </item>
    <item>
      <title>Re: LSMEANS oddity</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/LSMEANS-oddity/m-p/24240#M851</link>
      <description>Thanks Steve!&lt;BR /&gt;
&lt;BR /&gt;
That is very clear.&lt;BR /&gt;
&lt;BR /&gt;
Peter</description>
      <pubDate>Tue, 28 Apr 2009 14:30:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/LSMEANS-oddity/m-p/24240#M851</guid>
      <dc:creator>plf515</dc:creator>
      <dc:date>2009-04-28T14:30:01Z</dc:date>
    </item>
  </channel>
</rss>

