<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Why does LOGISTIC show predictions outside the range of independent variable? in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Why-does-LOGISTIC-show-predictions-outside-the-range-of/m-p/289528#M15356</link>
    <description>&lt;P&gt;PROC LOGISTIC fits a model. The model is valid for any values of the continuous variables, but most statisticians agree that you should not extrapolate a model to values outside the range of the observed data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, SAS does not enforce that restriction when scoring a model. If you &lt;A title="Techniques for scoring a regression model in SAS" href="http://blogs.sas.com/content/iml/2014/02/19/scoring-a-regression-model-in-sas.html" target="_self"&gt;score a regression model&lt;/A&gt; by using the SCORE statement, PROC SCORE, or PROC PLM, you can input any value you want for a continuous variable. For PROC SCORE and PROC PLM, the input (a data set or an item store, respectively) does not even contain a copy of the original data; it contains only information about the fitted model, such as the parameter estimates.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The graph that you are seeing in this example is a view of the MODEL, not a view of the data. In many cases, this is what the analyst wants to see. &amp;nbsp;Notice that this model is evaluated&amp;nbsp;for Duration=16.73, which is not even a value of Duration in the data set.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you want to see ONLY the predicted values at the data points, you can use the OUTPUT= option to create an output data set that contains the predicted probabilities. You can then use PROC SGPLOT to visualize the predicted values. You should really use a scatter plot for this, but many analysts try to "connect the dots" by using a series plot. Note, however, that if you have repeated values, you might get jagged lines if you attempt to connect the data points.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For your example, the SAS code follows:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
proc logistic data=Neuralgia plots(only)=effectplot;
   class Treatment Sex;
   model Pain= Treatment Sex Treatment*Sex Age Duration / expb;
   output out=LogiPred predicted=PredProb;
run;

data M;
   set LogiPred;
   group = catx(" ", Treatment, Sex);   /* CATX strips the trailing blanks that pad Treatment */
run;

proc sort data=M;
   by group age;
run;

title "Predicted Probabilities for Data";
proc sgplot data=M;
   series x=age y=PredProb / group=group markers;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/4432iBBA5E9858AFFA594/image-size/original?v=v2&amp;amp;px=-1" border="0" alt="predprob.png" title="predprob.png" /&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 04 Aug 2016 16:53:05 GMT</pubDate>
    <dc:creator>Rick_SAS</dc:creator>
    <dc:date>2016-08-04T16:53:05Z</dc:date>
    <item>
      <title>Why does LOGISTIC show predictions outside the range of independent variable?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Why-does-LOGISTIC-show-predictions-outside-the-range-of/m-p/289505#M15354</link>
      <description>&lt;P&gt;A user on Twitter asks:&lt;/P&gt;
&lt;BLOCKQUOTE class="twitter-tweet" data-lang="en"&gt;
&lt;P dir="ltr" lang="en"&gt;.&lt;A href="https://twitter.com/SASsoftware" target="_blank"&gt;@SASsoftware&lt;/A&gt; Why does the output graph in PROC LOGISTIC PLOTS=ALL show predicted values outside of the range of the independent variable?&lt;/P&gt;
— Sam Van Horne (@LearningPlaces) &lt;A href="https://twitter.com/LearningPlaces/status/761022582862716928" target="_blank"&gt;August 4, 2016&lt;/A&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I think the explanation might be longer than can fit in a tweet, so I thought I'd post here to see what the experts say. &amp;nbsp;I'm hoping I understand his question correctly. &amp;nbsp;I tried to create an example &lt;A href="http://support.sas.com/documentation/cdl/en/statug/68162/HTML/default/viewer.htm#statug_logistic_examples02.htm" target="_self"&gt;using the SAS documentation&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data Neuralgia;
   input Treatment $ Sex $ Age Duration Pain $ @@;
   datalines;
P  F  68   1  No   B  M  74  16  No  P  F  67  30  No
P  M  66  26  Yes  B  F  67  28  No  B  F  77  16  No
A  F  71  12  No   B  F  72  50  No  B  F  76   9  Yes
A  M  71  17  Yes  A  F  63  27  No  A  F  69  18  Yes
B  F  66  12  No   A  M  62  42  No  P  F  64   1  Yes
A  F  64  17  No   P  M  74   4  No  A  F  72  25  No
P  M  70   1  Yes  B  M  66  19  No  B  M  59  29  No
A  F  64  30  No   A  M  70  28  No  A  M  69   1  No
B  F  78   1  No   P  M  83   1  Yes B  F  69  42  No
B  M  75  30  Yes  P  M  77  29  Yes P  F  79  20  Yes
A  M  70  12  No   A  F  69  12  No  B  F  65  14  No
B  M  70   1  No   B  M  67  23  No  A  M  76  25  Yes
P  M  78  12  Yes  B  M  77   1  Yes B  F  69  24  No
P  M  66   4  Yes  P  F  65  29  No  P  M  60  26  Yes
A  M  78  15  Yes  B  M  75  21  Yes A  F  67  11  No
P  F  72  27  No   P  F  70  13  Yes A  M  75   6  Yes
B  F  65   7  No   P  F  68  27  Yes P  M  68  11  Yes
P  M  67  17  Yes  B  M  70  22  No  A  M  65  15  No
P  F  67   1  Yes  A  M  67  10  No  P  F  72  11  Yes
A  F  74   1  No   B  M  80  21  Yes A  F  69   3  No
;

ods graphics on;

/* capture value ranges */
proc means data=Neuralgia min max; run;

proc logistic data=Neuralgia plots=all;
   class Treatment Sex;
   model Pain= Treatment Sex Treatment*Sex Age Duration / expb;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Even though MAX of Age is 83, the prediction plot goes beyond that (looks like through Age 90).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/4430iFEC6BBE690D87197/image-size/large?v=v2&amp;amp;px=-1" border="0" alt="img10.png" title="img10.png" /&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 04 Aug 2016 12:42:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Why-does-LOGISTIC-show-predictions-outside-the-range-of/m-p/289505#M15354</guid>
      <dc:creator>ChrisHemedinger</dc:creator>
      <dc:date>2016-08-04T12:42:15Z</dc:date>
    </item>
    <item>
      <title>Re: Why does LOGISTIC show predictions outside the range of independent variable?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Why-does-LOGISTIC-show-predictions-outside-the-range-of/m-p/289520#M15355</link>
      <description>&lt;P&gt;Because what if an 84 year old walks into the neurologist's office tomorrow? &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The sigmoid shape of the curve does ensure that predictions fall between 0 and 1, though.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 04 Aug 2016 13:25:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Why-does-LOGISTIC-show-predictions-outside-the-range-of/m-p/289520#M15355</guid>
      <dc:creator>rayIII</dc:creator>
      <dc:date>2016-08-04T13:25:56Z</dc:date>
    </item>
    <item>
      <title>Re: Why does LOGISTIC show predictions outside the range of independent variable?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Why-does-LOGISTIC-show-predictions-outside-the-range-of/m-p/289528#M15356</link>
      <description>&lt;P&gt;PROC LOGISTIC fits a model. The model is valid for any values of the continuous variables, but most statisticians agree that you should not extrapolate a model to values outside the range of the observed data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, SAS does not enforce that restriction when scoring a model. If you &lt;A title="Techniques for scoring a regression model in SAS" href="http://blogs.sas.com/content/iml/2014/02/19/scoring-a-regression-model-in-sas.html" target="_self"&gt;score a regression model&lt;/A&gt; by using the SCORE statement, PROC SCORE, or PROC PLM, you can input any value you want for a continuous variable. For PROC SCORE and PROC PLM, the input (a data set or an item store, respectively) does not even contain a copy of the original data; it contains only information about the fitted model, such as the parameter estimates.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
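&lt;P&gt;As a minimal sketch of that point (the item store name PainModel, the data sets NewPatients and NewScored, and the two made-up patients are assumptions for illustration, not part of the original example), you can save the fitted model with the STORE statement and then score ages that lie beyond the observed maximum of 83:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
/* fit the model and save it in an item store (hypothetical name) */
proc logistic data=Neuralgia;
   class Treatment Sex;
   model Pain= Treatment Sex Treatment*Sex Age Duration;
   store work.PainModel;
run;

/* hypothetical patients whose ages exceed the largest age in the data */
data NewPatients;
   input Treatment $ Sex $ Age Duration;
   datalines;
A F 84 10
P M 90 20
;

/* PROC PLM reads only the item store; ILINK requests predicted probabilities */
proc plm restore=work.PainModel;
   score data=NewPatients out=NewScored predicted=PredProb / ilink;
run;

proc print data=NewScored;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Nothing in this sketch checks whether the new values lie inside the range of the original data; that judgment is left to the analyst.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;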
&lt;P&gt;The graph that you are seeing in this example is a view of the MODEL, not a view of the data. In many cases, this is what the analyst wants to see. &amp;nbsp;Notice that this model is evaluated&amp;nbsp;for Duration=16.73, which is not even a value of Duration in the data set.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you want to see ONLY the predicted values at the data points, you can use the OUTPUT= option to create an output data set that contains the predicted probabilities. You can then use PROC SGPLOT to visualize the predicted values. You should really use a scatter plot for this, but many analysts try to "connect the dots" by using a series plot. Note, however, that if you have repeated values, you might get jagged lines if you attempt to connect the data points.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For your example, the SAS code follows:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
proc logistic data=Neuralgia plots(only)=effectplot;
   class Treatment Sex;
   model Pain= Treatment Sex Treatment*Sex Age Duration / expb;
   output out=LogiPred predicted=PredProb;
run;

data M;
   set LogiPred;
   group = catx(" ", Treatment, Sex);   /* CATX strips the trailing blanks that pad Treatment */
run;

proc sort data=M;
   by group age;
run;

title "Predicted Probabilities for Data";
proc sgplot data=M;
   series x=age y=PredProb / group=group markers;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
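&lt;P&gt;For the scatter-plot alternative mentioned above, a minimal sketch (assuming the data set M created by the preceding code is still available) replaces the SERIES statement with a SCATTER statement, so that no lines are drawn between repeated Age values. The image that follows shows the series plot from the original code.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
title "Predicted Probabilities for Data (Scatter Plot)";
proc sgplot data=M;
   /* markers only: one point per observation, grouped by Treatment and Sex */
   scatter x=age y=PredProb / group=group;
run;
&lt;/CODE&gt;&lt;/PRE&gt;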
&lt;P&gt;&amp;nbsp;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/4432iBBA5E9858AFFA594/image-size/original?v=v2&amp;amp;px=-1" border="0" alt="predprob.png" title="predprob.png" /&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 04 Aug 2016 16:53:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Why-does-LOGISTIC-show-predictions-outside-the-range-of/m-p/289528#M15356</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2016-08-04T16:53:05Z</dc:date>
    </item>
    <item>
      <title>Re: Why does LOGISTIC show predictions outside the range of independent variable?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Why-does-LOGISTIC-show-predictions-outside-the-range-of/m-p/289537#M15357</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;. &amp;nbsp;I knew that wouldn't fit in a tweet. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 04 Aug 2016 13:58:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Why-does-LOGISTIC-show-predictions-outside-the-range-of/m-p/289537#M15357</guid>
      <dc:creator>ChrisHemedinger</dc:creator>
      <dc:date>2016-08-04T13:58:40Z</dc:date>
    </item>
    <item>
      <title>Re: Why does LOGISTIC show predictions outside the range of independent variable?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Why-does-LOGISTIC-show-predictions-outside-the-range-of/m-p/289545#M15358</link>
      <description>&lt;P&gt;Plots show model. Can get pred vals on data by using OUTPUT stmt. [LINK]&lt;/P&gt;</description>
      <pubDate>Thu, 04 Aug 2016 14:18:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Why-does-LOGISTIC-show-predictions-outside-the-range-of/m-p/289545#M15358</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2016-08-04T14:18:25Z</dc:date>
    </item>
  </channel>
</rss>

