<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hosmer-Lemeshow Test in Enterprise Miner in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Hosmer-Lemeshaw-Test-in-Enterprise-Miner/m-p/34772#M193</link>
    <description>Before using EM, I often tackled classification problems with PROC LOGISTIC. With the right options, I could request a Hosmer-Lemeshow test, which I believe gave me a good idea of how well my model predicted on a 'decile by decile' basis (as opposed to the misclassification rate, which is just an overall measure of fit).&lt;BR /&gt;
&lt;BR /&gt;
Is there any way to request this test in EM, for example in the model assessment node or elsewhere?&lt;BR /&gt;
&lt;BR /&gt;
Correct me if I am wrong about the interpretation and use of the HL test. (I know it has been criticized for low power when n &amp;lt; 400, but I typically deal with thousands of observations.)&lt;BR /&gt;
&lt;BR /&gt;
Thanks.</description>
    <pubDate>Wed, 17 Nov 2010 16:28:41 GMT</pubDate>
    <dc:creator>SlutskyFan</dc:creator>
    <dc:date>2010-11-17T16:28:41Z</dc:date>
    <item>
      <title>Hosmer-Lemeshow Test in Enterprise Miner</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Hosmer-Lemeshaw-Test-in-Enterprise-Miner/m-p/34772#M193</link>
      <description>Before using EM, I often tackled classification problems with PROC LOGISTIC. With the right options, I could request a Hosmer-Lemeshow test, which I believe gave me a good idea of how well my model predicted on a 'decile by decile' basis (as opposed to the misclassification rate, which is just an overall measure of fit).&lt;BR /&gt;
&lt;BR /&gt;
Is there any way to request this test in EM, for example in the model assessment node or elsewhere?&lt;BR /&gt;
&lt;BR /&gt;
Correct me if I am wrong about the interpretation and use of the HL test. (I know it has been criticized for low power when n &amp;lt; 400, but I typically deal with thousands of observations.)&lt;BR /&gt;
&lt;BR /&gt;
Thanks.</description>
      <pubDate>Wed, 17 Nov 2010 16:28:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Hosmer-Lemeshaw-Test-in-Enterprise-Miner/m-p/34772#M193</guid>
      <dc:creator>SlutskyFan</dc:creator>
      <dc:date>2010-11-17T16:28:41Z</dc:date>
    </item>
    <item>
      <title>Re: Hosmer-Lemeshow Test in Enterprise Miner</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Hosmer-Lemeshaw-Test-in-Enterprise-Miner/m-p/387920#M5799</link>
      <description>&lt;H3&gt;&lt;FONT size="4"&gt;The short answer is that SAS Enterprise Miner does not provide a Hosmer-Lemeshow test; the only related statistic it offers is Mallows' Cq. &amp;nbsp;You could write code to run the LOGISTIC procedure in a SAS Code node, but the results would stay within that node, so doing so would likely offer little benefit over simply running the procedure in SAS.&lt;/FONT&gt;&lt;/H3&gt;
&lt;H3&gt;&amp;nbsp;&lt;/H3&gt;
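&lt;P&gt;&lt;FONT size="4"&gt;For readers who want to see what the test computes, here is a minimal Python sketch of the Hosmer-Lemeshow statistic. It assumes you have a list of observed 0/1 outcomes and the matching predicted probabilities, all strictly between 0 and 1. The function name hosmer_lemeshow, its arguments, and the sample numbers are hypothetical, and the grouping rule used here (near-equal slices of the sorted cases) may not match PROC LOGISTIC exactly, so treat it as an illustration rather than a replacement for the procedure.&lt;/FONT&gt;&lt;/P&gt;

```python
def hosmer_lemeshow(y, p_hat, groups=10):
    """Return (statistic, degrees_of_freedom) for a Hosmer-Lemeshow style test.

    y      -- observed outcomes coded 0/1
    p_hat  -- predicted event probabilities, assumed strictly between 0 and 1
    groups -- number of risk groups (10 gives the usual "deciles of risk")
    """
    pairs = sorted(zip(p_hat, y))  # order cases by predicted probability
    n = len(pairs)
    stat = 0.0
    for k in range(groups):
        # Near-equal-sized slices of the sorted cases.
        chunk = pairs[k * n // groups:(k + 1) * n // groups]
        if not chunk:
            continue
        n_k = len(chunk)
        expected = sum(pr for pr, _ in chunk)    # expected number of events
        observed = sum(obs for _, obs in chunk)  # observed number of events
        variance = expected * (1.0 - expected / n_k)
        stat += (observed - expected) ** 2 / variance
    return stat, groups - 2


# Hypothetical scored data, used only to show the call.
y_obs = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1]
p_pred = [0.1, 0.15, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.85, 0.9]
hl_stat, hl_df = hosmer_lemeshow(y_obs, p_pred, groups=4)
print("HL statistic:", round(hl_stat, 3), "df:", hl_df)
```

&lt;P&gt;&lt;FONT size="4"&gt;The statistic is usually compared with a chi-square distribution with groups - 2 degrees of freedom; a large value suggests that predicted and observed event counts disagree in some risk groups.&lt;/FONT&gt;&lt;/P&gt;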
&lt;H3&gt;&lt;FONT size="4"&gt;Looking further, a quick search of the SAS Enterprise Miner help utility (available by clicking on &lt;STRONG&gt;Help&lt;/STRONG&gt; --&amp;gt; &lt;STRONG&gt;Contents&lt;/STRONG&gt; and then clicking on the magnifying glass to reveal the search window) shows only one place where the term "Hosmer-Lemeshow" appears, and it relates to Mallows' Cq. &amp;nbsp; Here is an extract from the help:&amp;nbsp;&lt;/FONT&gt;&lt;/H3&gt;
&lt;H3&gt;&amp;nbsp;&lt;/H3&gt;
&lt;H3&gt;&lt;A name="p1vpd4lqgjv71xn1rnxpqgqak1q2" target="_blank"&gt;&lt;/A&gt; Statistical Measures&lt;/H3&gt;
&lt;P&gt;&lt;A name="n1gv6xl9y5b64nn13j4vy1iwonzn" target="_blank"&gt;&lt;/A&gt; Statistical measures of model performance are based on both model error and degrees of freedom. Some SAS Enterprise Miner models, such as Decision Tree models, do not output degrees of freedom and are not suitable for benchmarking using the statistical measures listed here. The information that follows pertains to Mallows’ Cq, Akaike’s Information Criterion, Bayesian Information Criterion, and Kolmogorov-Smirnov Statistic and is suitable only for specific models.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Mallows’ Cq&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Mallows' Cq (Hosmer and Lemeshow, 2000) is a variant of Mallows' Cp measure (1973), which is used to assess linear regression models. Hosmer and Lemeshow derived the corresponding Cq statistic to evaluate logistic regression models, using the following equation:&lt;/P&gt;
&lt;DIV id="n18sc1ed19v2l7n1el3hr2fnparl" class="xis-figure"&gt;&lt;IMG src="https://communities.sas.com/../images/cq_equation.png" border="0" alt="C_q = \frac{X^2 + \lambda^*}{X^2/(n - p - 1)} + 2(q + 1) - n" /&gt;&lt;/DIV&gt;
&lt;P&gt;where&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;I&gt;q&lt;/I&gt; is the number of selected variables,&lt;/LI&gt;
&lt;LI&gt;&lt;I&gt;p&lt;/I&gt; is the number of candidate variables&lt;/LI&gt;
&lt;LI&gt;&lt;IMG src="https://communities.sas.com/../images/pearson_chi_square_stat.png" border="0" alt="X^2 = \sum_i \frac{(y_i - \hat{\pi}_i)^2}{\hat{\pi}_i (1 - \hat{\pi}_i)}" /&gt;
&lt;P&gt;is the Pearson chi-square statistic for the model with p variables&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;λ* is the multivariable Wald statistic that measures the significance of the p - q coefficients that are not in the model&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The expected value of C&lt;SUB&gt;q&lt;/SUB&gt; is q + 1. Models with C&lt;SUB&gt;q&lt;/SUB&gt; values near q + 1 are candidates for final models.&lt;/P&gt;
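&lt;P&gt;The arithmetic in the extract above can be sketched in a few lines of Python. This is a minimal illustration, not SAS Enterprise Miner's implementation: the function names pearson_chi_square and mallows_cq and all of the input numbers are hypothetical, and the Wald statistic is assumed to be computed elsewhere.&lt;/P&gt;

```python
def pearson_chi_square(y, p_hat):
    """Pearson chi-square: sum of (y_i - p_i)^2 / (p_i * (1 - p_i))."""
    return sum((yi - pi) ** 2 / (pi * (1.0 - pi)) for yi, pi in zip(y, p_hat))


def mallows_cq(x2, wald, n, p, q):
    """C_q = (X^2 + lambda*) / (X^2 / (n - p - 1)) + 2(q + 1) - n.

    x2   -- Pearson chi-square of the model with all p candidate variables
    wald -- Wald statistic for the p - q coefficients left out of the model
    n    -- number of observations
    p    -- number of candidate variables
    q    -- number of selected variables
    """
    return (x2 + wald) / (x2 / (n - p - 1)) + 2 * (q + 1) - n


# Hypothetical numbers, chosen only to illustrate the arithmetic.
n, p, q = 1000, 10, 4
x2 = 989.0   # assumed Pearson chi-square for the full model
wald = 6.5   # assumed Wald statistic for the 6 excluded coefficients
cq = mallows_cq(x2, wald, n, p, q)
print("Cq:", round(cq, 3), "target q + 1:", q + 1)
```

&lt;P&gt;With these made-up inputs the result lands close to q + 1, which is the pattern the extract describes for a candidate final model.&lt;/P&gt;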
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="4"&gt;In general, data mining problems involve very large numbers of variables, which makes missing values highly likely. Given the typical data size, several things are often true of these problems:&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="4"&gt;* missing values must be imputed&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="4"&gt;* the imputed data will have a large number of observations&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="4"&gt;* the number of usable observations will be inflated by the imputation&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="4"&gt;* the presence of imputed data makes many of the classical estimates of error more questionable&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="4"&gt;* holdout data is available to validate/test the fitted model (empirical validation, which makes statistical validation less critical)&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="4"&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="4"&gt;For these reasons, classical goodness-of-fit statistics such as the Hosmer-Lemeshow test are not routinely calculated in SAS Enterprise Miner.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;I hope this helps!&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="4"&gt;Doug&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 14 Aug 2017 19:07:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Hosmer-Lemeshaw-Test-in-Enterprise-Miner/m-p/387920#M5799</guid>
      <dc:creator>DougWielenga</dc:creator>
      <dc:date>2017-08-14T19:07:35Z</dc:date>
    </item>
  </channel>
</rss>

