<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Enterprise Miner Assessment  Score Rankings Output Interpretation in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Assessment-Score-Rankings-Output-Interpretation/m-p/10531#M32</link>
    <description>Replying to my own post: I figured out the Mean Posterior Probability calculation. My mind was stuck on calculating cumulative quantities, but this column is the mean within each percentile bin, not a running mean. Running the following code once per bin, selecting the row numbers that correspond to the given percentile, matches the output Miner gives for Mean Posterior Probability. &lt;BR /&gt;
&lt;BR /&gt;
For the 10th Percentile (in the chart I linked to earlier - &lt;A href="http://econometricsense.blogspot.com/2011/02/sample-assessment-score-rankings-for.html" target="_blank"&gt;http://econometricsense.blogspot.com/2011/02/sample-assessment-score-rankings-for.html&lt;/A&gt; ) &lt;BR /&gt;
&lt;BR /&gt;
%LET FROM = 10; &lt;BR /&gt;
%LET TO = 20; &lt;BR /&gt;
&lt;BR /&gt;
PROC MEANS DATA = VALIDATE; &lt;BR /&gt;
VAR Predicted__TERM_GPA_Less_than_1_; &lt;BR /&gt;
WHERE ROW_NUM &amp;gt; &amp;amp;FROM AND ROW_NUM LE &amp;amp;TO; &lt;BR /&gt;
RUN;&lt;BR /&gt;
&lt;BR /&gt;
I can't believe I bothered technical support over this. I just don't trust my results without solid documentation, or just checking the raw data myself.</description>
    <pubDate>Tue, 15 Feb 2011 21:00:48 GMT</pubDate>
    <dc:creator>SlutskyFan</dc:creator>
    <dc:date>2011-02-15T21:00:48Z</dc:date>
    <item>
      <title>Enterprise Miner Assessment  Score Rankings Output Interpretation</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Assessment-Score-Rankings-Output-Interpretation/m-p/10529#M30</link>
      <description>The documentation for output in EM is not thorough. Basically I'd like to know &lt;BR /&gt;
&lt;BR /&gt;
1) how the column Mean Posterior Probability is computed. I'd also like to verify &lt;BR /&gt;
&lt;BR /&gt;
2) that the values under Cumulative % Response are based on *OBSERVED* class values, not predicted. &lt;BR /&gt;
&lt;BR /&gt;
Below is my output and the base SAS code I used to try to verify my conception of this output using the scored validation data from EM, but I'm not able to do so. &lt;BR /&gt;
&lt;BR /&gt;
Could someone help me with interpretation?  For more details, say I have the following Output From EM:&lt;BR /&gt;
&lt;BR /&gt;
Percentile    Gain    Lift    Cumulative Lift    % Response    Cumulative % Response    Number of Observations    Mean Posterior Probability&lt;BR /&gt;
 &lt;BR /&gt;
      5     143.210  2.43210    2.43210    100.000    100.000        10         0.81433&lt;BR /&gt;
     10     106.728  1.70247    2.06728     70.000     85.000        10         0.69886&lt;BR /&gt;
     15      78.354  1.21605    1.78354     50.000     73.333        10         0.64560&lt;BR /&gt;
     20      64.167  1.21605    1.64167     50.000     67.500        10         0.59028&lt;BR /&gt;
     25      50.790  0.97284    1.50790     40.000     62.000        10         0.53578&lt;BR /&gt;
     30      45.926  1.21605    1.45926     50.000     60.000        10         0.51555&lt;BR /&gt;
     35      44.516  1.35117    1.44516     55.556     59.420         9         0.47502&lt;BR /&gt;
     40      35.459  0.72963    1.35459     30.000     55.696        10         0.44177&lt;BR /&gt;
     45      28.437  0.72963    1.28437     30.000     52.809        10         0.41554&lt;BR /&gt;
     50      20.377  0.48642    1.20377     20.000     49.495        10         0.39732&lt;BR /&gt;
     55      18.258  0.97284    1.18258     40.000     48.624        10         0.36956&lt;BR /&gt;
     60      16.495  0.97284    1.16495     40.000     47.899        10         0.33354&lt;BR /&gt;
     65      18.777  1.45926    1.18777     60.000     48.837        10         0.31359&lt;BR /&gt;
     70      18.080  1.08093    1.18080     44.444     48.551         9         0.30182&lt;BR /&gt;
     75      13.388  0.48642    1.13388     20.000     46.622        10         0.27938&lt;BR /&gt;
     80       9.291  0.48642    1.09291     20.000     44.937        10         0.25124&lt;BR /&gt;
     85       4.233  0.24321    1.04233     10.000     42.857        10         0.20888&lt;BR /&gt;
     90       2.476  0.72963    1.02476     30.000     42.135        10         0.15998&lt;BR /&gt;
     95       2.200  0.97284    1.02200     40.000     42.021        10         0.08486&lt;BR /&gt;
    100       0.000  0.54047    1.00000     22.222     41.117         9         0.00652&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
By exporting the scored validation data from EM to CSV, reading it into SAS, sorting by descending predicted probability, and using PROC SQL to add a row number (call this data 'VALIDATE' for reference), I can recreate the values under 'Cumulative % Response' with &lt;BR /&gt;
&lt;BR /&gt;
/* CUM % RESP FOR 10TH PERCENTILE */&lt;BR /&gt;
&lt;BR /&gt;
PROC FREQ DATA = VALIDATE;&lt;BR /&gt;
TABLES OBS_TARGET;&lt;BR /&gt;
WHERE ROW_NUM LE 20;&lt;BR /&gt;
RUN;&lt;BR /&gt;
&lt;BR /&gt;
This tells me that the value in 'Cumulative % Response' is based on the *OBSERVED* response, not the predicted response. (I can write code to get the predicted response outside EM in base SAS if necessary.)&lt;BR /&gt;
&lt;BR /&gt;
I am trying to interpret the values under Mean Posterior Probability. I just assumed it was the mean of the predicted probability for the target class level in that slice of the sorted data. However, if I invoke the following in base SAS code:&lt;BR /&gt;
&lt;BR /&gt;
PROC MEANS DATA = VALIDATE;&lt;BR /&gt;
VAR  Predicted_Term_GPA_Less_than_1_; /* CREATED IN EM */&lt;BR /&gt;
WHERE ROW_NUM LE 20;&lt;BR /&gt;
RUN;&lt;BR /&gt;
&lt;BR /&gt;
I do not get the same number reported in the Mean Posterior Probability column for the 10th percentile. I do match exactly for the 5th percentile, but it must be a coincidence. &lt;BR /&gt;
&lt;BR /&gt;
Can someone tell me how this value is computed?</description>
      <pubDate>Tue, 15 Feb 2011 17:06:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Assessment-Score-Rankings-Output-Interpretation/m-p/10529#M30</guid>
      <dc:creator>SlutskyFan</dc:creator>
      <dc:date>2011-02-15T17:06:52Z</dc:date>
    </item>
    <item>
      <title>Re: Enterprise Miner Assessment  Score Rankings Output Interpretation</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Assessment-Score-Rankings-Output-Interpretation/m-p/10530#M31</link>
      <description>Unfortunately, the chart did not publish the way it was pasted in the editor; see &lt;BR /&gt;
&lt;BR /&gt;
&lt;A href="http://econometricsense.blogspot.com/2011/02/sample-assessment-score-rankings-for.html" target="_blank"&gt;http://econometricsense.blogspot.com/2011/02/sample-assessment-score-rankings-for.html&lt;/A&gt;&lt;BR /&gt;
&lt;BR /&gt;
for more readable output. Thanks.</description>
      <pubDate>Tue, 15 Feb 2011 17:17:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Assessment-Score-Rankings-Output-Interpretation/m-p/10530#M31</guid>
      <dc:creator>SlutskyFan</dc:creator>
      <dc:date>2011-02-15T17:17:49Z</dc:date>
    </item>
    <item>
      <title>Re: Enterprise Miner Assessment  Score Rankings Output Interpretation</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Assessment-Score-Rankings-Output-Interpretation/m-p/10531#M32</link>
      <description>Replying to my own post: I figured out the Mean Posterior Probability calculation. My mind was stuck on calculating cumulative quantities, but this column is the mean within each percentile bin, not a running mean. Running the following code once per bin, selecting the row numbers that correspond to the given percentile, matches the output Miner gives for Mean Posterior Probability. &lt;BR /&gt;
&lt;BR /&gt;
For the 10th Percentile (in the chart I linked to earlier - &lt;A href="http://econometricsense.blogspot.com/2011/02/sample-assessment-score-rankings-for.html" target="_blank"&gt;http://econometricsense.blogspot.com/2011/02/sample-assessment-score-rankings-for.html&lt;/A&gt; ) &lt;BR /&gt;
&lt;BR /&gt;
%LET FROM = 10; &lt;BR /&gt;
%LET TO = 20; &lt;BR /&gt;
&lt;BR /&gt;
PROC MEANS DATA = VALIDATE; &lt;BR /&gt;
VAR Predicted__TERM_GPA_Less_than_1_; &lt;BR /&gt;
WHERE ROW_NUM &amp;gt; &amp;amp;FROM AND ROW_NUM LE &amp;amp;TO; &lt;BR /&gt;
RUN;&lt;BR /&gt;
&lt;BR /&gt;
I can't believe I bothered technical support over this. I just don't trust my results without solid documentation, or just checking the raw data myself.</description>
      <pubDate>Tue, 15 Feb 2011 21:00:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Assessment-Score-Rankings-Output-Interpretation/m-p/10531#M32</guid>
      <dc:creator>SlutskyFan</dc:creator>
      <dc:date>2011-02-15T21:00:48Z</dc:date>
    </item>
  </channel>
</rss>

