<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Calculating Mallow's CP correctly in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Calculating-Mallow-s-CP-correctly/m-p/15602#M353</link>
    <description>Just a guess.  PROC REG handles missing values such that if any variable needed for any regression is missing, the observation is excluded from all estimates.  If you had missing values for some of the independent variables that were NOT included in the final model, then the sample size, and hence Cp, would be different.&lt;BR /&gt;
&lt;BR /&gt;
Steve Denham</description>
    <pubDate>Fri, 17 Jun 2011 11:31:42 GMT</pubDate>
    <dc:creator>SteveDenham</dc:creator>
    <dc:date>2011-06-17T11:31:42Z</dc:date>
    <item>
      <title>Calculating Mallow's CP correctly</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Calculating-Mallow-s-CP-correctly/m-p/15601#M352</link>
      <description>Hi all,&lt;BR /&gt;
&lt;BR /&gt;
To compare several candidate subsets of predictors for a multiple regression, I wanted to use, among other criteria, Mallows' Cp.  However, something strange seems to be happening: when I run the following two commands on exactly the same data set, I get different Cp values for the same model. The first command:&lt;BR /&gt;
&lt;BR /&gt;
&lt;I&gt;proc reg data=model2 ;&lt;BR /&gt;
  model lny = X3-X8 X12-X22/selection=rsquare adjrsq cp press mse sse;&lt;BR /&gt;
run;&lt;BR /&gt;
quit;&lt;/I&gt;&lt;BR /&gt;
&lt;BR /&gt;
As intended, this generated a list of all possible subsets of the X variables, together with the requested selection criteria, such as Cp.  I then ran a second command focusing on one particular model:&lt;BR /&gt;
&lt;BR /&gt;
&lt;I&gt;proc reg data=model2 outest=temp ;&lt;BR /&gt;
  model lny = X3 X5 X12 X14 X15 X16 X17 X19 X20 X22/cp;&lt;BR /&gt;
run;&lt;BR /&gt;
quit;&lt;/I&gt;&lt;BR /&gt;
&lt;BR /&gt;
The Cp value that the first command reports for this specific model differs from the one reported by the second command.  The only cause I can think of is that the two commands use different definitions of Cp, perhaps because of the "selection" option?&lt;BR /&gt;
&lt;BR /&gt;
Can anyone explain what might be causing this?</description>
      <pubDate>Thu, 16 Jun 2011 22:19:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Calculating-Mallow-s-CP-correctly/m-p/15601#M352</guid>
      <dc:creator>peterdbr</dc:creator>
      <dc:date>2011-06-16T22:19:41Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating Mallow's CP correctly</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Calculating-Mallow-s-CP-correctly/m-p/15602#M353</link>
      <description>Just a guess.  PROC REG handles missing values such that if any variable needed for any regression is missing, the observation is excluded from all estimates.  If you had missing values for some of the independent variables that were NOT included in the final model, then the sample size, and hence Cp, would be different.&lt;BR /&gt;
&lt;BR /&gt;
Steve Denham</description>
      <pubDate>Fri, 17 Jun 2011 11:31:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Calculating-Mallow-s-CP-correctly/m-p/15602#M353</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2011-06-17T11:31:42Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating Mallow's CP correctly</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Calculating-Mallow-s-CP-correctly/m-p/15603#M354</link>
      <description>Dear Steve,&lt;BR /&gt;
&lt;BR /&gt;
Thanks for your fast reply.&lt;BR /&gt;
&lt;BR /&gt;
The data set I was using to run both commands is quite "clean" in the sense that no missing values are present: for each case all variables have a specified value, so I don't think the problem is related to that aspect...&lt;BR /&gt;
&lt;BR /&gt;
Kind regards,&lt;BR /&gt;
&lt;BR /&gt;
Peter</description>
      <pubDate>Fri, 17 Jun 2011 13:06:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Calculating-Mallow-s-CP-correctly/m-p/15603#M354</guid>
      <dc:creator>peterdbr</dc:creator>
      <dc:date>2011-06-17T13:06:37Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating Mallow's CP correctly</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Calculating-Mallow-s-CP-correctly/m-p/15604#M355</link>
      <description>From the SAS online documentation, here is the definition of Cp:&lt;BR /&gt;
&lt;BR /&gt;
 &amp;nbsp; &amp;nbsp; &amp;nbsp; Cp = [(SSEp)/(s2)] - (N - 2p) &lt;BR /&gt;
&lt;BR /&gt;
 &amp;nbsp; where s2[=s**2] is the MSE for the full model, and SSEp is the&lt;BR /&gt;
 &amp;nbsp; sum-of-squares error for a model with p parameters&lt;BR /&gt;
&lt;BR /&gt;
Because s2 is the MSE for the full model (the one that includes all candidate variables), changing the set of candidate variables changes the value of s2.  Hence, you can expect to get a different value for Mallows' Cp if you change the set of candidate variables.&lt;BR /&gt;
&lt;BR /&gt;
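As a rough illustration of that dependence, here is a minimal sketch in Python rather than SAS.  All of the numbers (N and both SSE values) are made up for illustration and are not taken from the original data, and the function name mallows_cp is just a label for the formula above.  It assumes that p counts the intercept, and that a run without a selection method treats the fitted model as its own full model.&lt;BR /&gt;

```python
def mallows_cp(sse_p, mse_full, n, p):
    """Cp = SSEp / s2 - (N - 2p), where p counts the intercept."""
    return sse_p / mse_full - (n - 2 * p)


# Hypothetical numbers, for illustration only (not from the poster's data).
n = 50          # number of observations
p = 11          # 10 predictors plus the intercept in the subset model
sse_p = 20.0    # error sum of squares of the subset model

# Second command: the subset model is its own full model,
# so s2 = SSEp / (N - p) and Cp reduces to p.
s2_run2 = sse_p / (n - p)
cp_run2 = mallows_cp(sse_p, s2_run2, n, p)

# First command: the full model holds all 17 candidates (18 parameters).
# Its error sum of squares, 18.0, is an assumed value.
s2_run1 = 18.0 / (n - 18)
cp_run1 = mallows_cp(sse_p, s2_run1, n, p)

print("Cp, model as its own full model:", cp_run2)
print("Cp, s2 from 17-candidate model: ", cp_run1)
```

With these hypothetical inputs, the same 10-variable model gets a Cp of 11 (up to floating-point rounding) when it is its own full model, and about 7.56 when s2 comes from the 17-candidate model.&lt;BR /&gt;
&lt;BR /&gt;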
Now, if your restricted set of candidate variables includes all of the important variables, then the expectation for s2 in the restricted and complete variable sets should be the same.  So, you might not see much difference in Mallows' Cp if the restricted variable list contains all of the important predictors.  But if the restricted set results in the loss of important predictors, then E(s2) for the restricted variable set will be larger than E(s2) for the full variable set.  In that case, Mallows' Cp should go down in the restricted variable set.</description>
      <pubDate>Wed, 22 Jun 2011 20:25:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Calculating-Mallow-s-CP-correctly/m-p/15604#M355</guid>
      <dc:creator>Dale</dc:creator>
      <dc:date>2011-06-22T20:25:19Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating Mallow's CP correctly</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Calculating-Mallow-s-CP-correctly/m-p/516923#M26359</link>
      <description>Could you please explain how to plot Cp versus p?</description>
      <pubDate>Thu, 29 Nov 2018 00:25:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Calculating-Mallow-s-CP-correctly/m-p/516923#M26359</guid>
      <dc:creator>shahd</dc:creator>
      <dc:date>2018-11-29T00:25:11Z</dc:date>
    </item>
  </channel>
</rss>

