<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Overfitting in logistic regression! in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/655480#M22563</link>
    <description>&lt;P&gt;If your SAS version is above 9.4M4, you could try the GOF option.&lt;/P&gt;
&lt;P&gt;model .........../gof ;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;&amp;nbsp; wrote a blog post about it that compares the statistical and machine learning approaches.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your SAS version is older, try the LACKFIT option.&lt;/P&gt;
&lt;P&gt;model ........../ lackfit&amp;nbsp; &amp;nbsp;firth ;&lt;/P&gt;</description>
    <pubDate>Tue, 09 Jun 2020 11:39:10 GMT</pubDate>
    <dc:creator>Ksharp</dc:creator>
    <dc:date>2020-06-09T11:39:10Z</dc:date>
    <item>
      <title>Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654543#M22538</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm a new SAS user, so first of all I'm sorry if these questions are dumb. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm fitting a binary logistic regression to predict my target variable (inactive=0, active=1), and I've randomly split the data into training (70%) and testing (30%) sets.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I used PROC LOGISTIC to run the logistic regression, and now I need to understand whether my model is overfitting or underfitting the data.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does anyone have any suggestions for analyzing overfitting with PROC LOGISTIC? Is it possible to produce learning curves?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc logistic data=train;&lt;/P&gt;&lt;P&gt;class country gender / param=glm;&lt;/P&gt;&lt;P&gt;model y(event='1')=income var2 var3 /link=logit ctable&lt;/P&gt;&lt;P&gt;selection=backward slstay=0.05 hierarchy=single technique=fisher outroc=troc maxiter=50;&lt;/P&gt;&lt;P&gt;score data=test out=valpred outroc=vroc;&lt;/P&gt;&lt;P&gt;roc; roccontrast;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you all in advance,&lt;/P&gt;&lt;P&gt;Joana&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2020 14:41:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654543#M22538</guid>
      <dc:creator>joanatomeribeir</dc:creator>
      <dc:date>2020-06-08T14:41:56Z</dc:date>
    </item>
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654547#M22539</link>
      <description>ROC and ROC contrast curves are the ones usually used. Is that what you mean by learning curves?&lt;BR /&gt;&lt;BR /&gt;You can also look at PROC PLM.</description>
      <pubDate>Mon, 08 Jun 2020 14:48:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654547#M22539</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-06-08T14:48:50Z</dc:date>
    </item>
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654561#M22542</link>
      <description>&lt;P&gt;You can compare the training and validation data sets using PROC LOGISTIC&lt;/P&gt;
&lt;P&gt;&lt;A href="http://support.sas.com/kb/39/724.html" target="_blank"&gt;http://support.sas.com/kb/39/724.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2020 15:19:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654561#M22542</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-06-08T15:19:19Z</dc:date>
    </item>
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654562#M22543</link>
      <description>By learning curves, I meant to plot the loss of the train and test over time to understand if the model is overfitted or not.</description>
      <pubDate>Mon, 08 Jun 2020 15:21:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654562#M22543</guid>
      <dc:creator>joanatomeribeir</dc:creator>
      <dc:date>2020-06-08T15:21:18Z</dc:date>
    </item>
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654601#M22556</link>
      <description>I did that: I scored the data with the SCORE statement in PROC LOGISTIC, but I still want to understand whether the model is overfitted.</description>
      <pubDate>Mon, 08 Jun 2020 16:52:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654601#M22556</guid>
      <dc:creator>joanatomeribeir</dc:creator>
      <dc:date>2020-06-08T16:52:08Z</dc:date>
    </item>
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654634#M22559</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/317395"&gt;@joanatomeribeir&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;By learning curves, I meant to plot the loss of the train and test over time to understand if the model is overfitted or not.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;What do you mean by "loss of the train and test over time"? The general definition of overfitting does not include a time-related component.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2020 17:54:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/654634#M22559</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-06-08T17:54:59Z</dc:date>
    </item>
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/655325#M22561</link>
      <description>&lt;P&gt;Sorry, I didn't mean over time. I meant over training set size, to understand whether the model is overfitting, underfitting, or fitting the data well...&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="joanatomeribeir_0-1591693274688.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/41469i2AEAB7569F254DF9/image-size/medium?v=v2&amp;amp;px=400" role="button" title="joanatomeribeir_0-1591693274688.png" alt="joanatomeribeir_0-1591693274688.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 09 Jun 2020 09:02:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/655325#M22561</guid>
      <dc:creator>joanatomeribeir</dc:creator>
      <dc:date>2020-06-09T09:02:06Z</dc:date>
    </item>
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/655464#M22562</link>
      <description>&lt;P&gt;What are the two lines in these graphs? Are they the confidence intervals of the logistic regression model coefficients? Please be specific.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Jun 2020 10:51:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/655464#M22562</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-06-09T10:51:57Z</dc:date>
    </item>
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/655480#M22563</link>
      <description>&lt;P&gt;If your SAS version is above 9.4M4, you could try the GOF option.&lt;/P&gt;
&lt;P&gt;model .........../gof ;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;&amp;nbsp; wrote a blog post about it that compares the statistical and machine learning approaches.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your SAS version is older, try the LACKFIT option.&lt;/P&gt;
&lt;P&gt;model ........../ lackfit&amp;nbsp; &amp;nbsp;firth ;&lt;/P&gt;</description>
      <pubDate>Tue, 09 Jun 2020 11:39:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/655480#M22563</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2020-06-09T11:39:10Z</dc:date>
    </item>
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/655481#M22564</link>
      <description>&lt;P&gt;I'm sorry, you are right! Basically, the two lines are the training set and the validation set, and I need to compare them to understand the bias and variance of the model.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="joanatomeribeir_0-1591701833589.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/41584iCC36C268F83DA4AC/image-size/medium?v=v2&amp;amp;px=400" role="button" title="joanatomeribeir_0-1591701833589.png" alt="joanatomeribeir_0-1591701833589.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;More specifically, for a binary logistic regression, I would like to understand whether it makes sense to plot the log loss (y axis) against the training set size (x axis) instead of plotting the MSE.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="joanatomeribeir_1-1591702209451.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/41585i9FEA374CD8D1A279/image-size/medium?v=v2&amp;amp;px=400" role="button" title="joanatomeribeir_1-1591702209451.png" alt="joanatomeribeir_1-1591702209451.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm quite lost on this issue...&lt;/P&gt;</description>
      <pubDate>Tue, 09 Jun 2020 11:43:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/655481#M22564</guid>
      <dc:creator>joanatomeribeir</dc:creator>
      <dc:date>2020-06-09T11:43:50Z</dc:date>
    </item>
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/655497#M22565</link>
      <description>&lt;P&gt;As far as I know, there is no built-in method in PROC LOGISTIC or PROC HPLOGISTIC to do this.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Jun 2020 12:22:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/655497#M22565</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-06-09T12:22:11Z</dc:date>
    </item>
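<!--
The thread asks about plotting log loss against training set size, and the reply above notes that no built-in method is known for this. As a rough, language-neutral illustration of the computation (not SAS code and not anything the posters proposed), the sketch below refits a model on growing subsets of the training data and records the log loss on both the subset and the held-out set. The names log_loss, learning_curve, and fit_base_rate are hypothetical, and the base-rate "model" is only a stand-in for a real logistic regression fit.

```python
import math
import random


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def learning_curve(train, test, fit, sizes, seed=0):
    """Return (n, train_loss, test_loss) for each training-set size n.

    train and test are lists of (x, y) pairs with y equal to 0 or 1.
    fit(rows) must return a function mapping x to a predicted P(y = 1).
    """
    rows = train[:]
    random.Random(seed).shuffle(rows)  # fixed seed keeps the subsets reproducible
    curve = []
    for n in sizes:
        subset = rows[:n]
        predict = fit(subset)
        train_loss = log_loss([y for _, y in subset], [predict(x) for x, _ in subset])
        test_loss = log_loss([y for _, y in test], [predict(x) for x, _ in test])
        curve.append((n, train_loss, test_loss))
    return curve


def fit_base_rate(rows):
    """Placeholder model (assumption): always predicts the training event rate."""
    rate = sum(y for _, y in rows) / len(rows)
    return lambda x: rate
```

Plotting the second and third elements of each tuple against the first gives the two curves discussed in the thread: a persistent gap between them suggests overfitting, while two high curves that sit close together suggest underfitting.
-->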
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/656377#M22575</link>
      <description>&lt;A href="https://blogs.sas.com/content/iml/2019/01/30/model-validation-machine-learning.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2019/01/30/model-validation-machine-learning.html&lt;/A&gt;</description>
      <pubDate>Wed, 10 Jun 2020 12:05:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/656377#M22575</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2020-06-10T12:05:31Z</dc:date>
    </item>
    <item>
      <title>Re: Overfitting in logistic regression!</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/658021#M22625</link>
      <description>You could try PROC HPGENSELECT with the PARTITION statement, as described in Rick's blog post that I referred to.</description>
      <pubDate>Fri, 12 Jun 2020 11:32:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Overfitting-in-logistic-regression/m-p/658021#M22625</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2020-06-12T11:32:46Z</dc:date>
    </item>
  </channel>
</rss>

