<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: In k fold CV how use trained model (logistic regression) to compare test data set and draw ROC in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/In-k-fold-CV-how-use-trained-model-logistic-regression-to/m-p/312073#M4683</link>
    <description>&lt;P&gt;Thanks very much for your reply. Links in another reply helped me find the answer (drawing the ROC curve and calculating the AUC). But since I am still new to SAS and STAT, I was wondering whether you think the procedure below has any problems.&lt;/P&gt;&lt;P&gt;Sorry that my post seemed a little confusing. I want to compare two different models for their predictive power. Because the response is binary, I used logistic regression, and I used 10-fold cross-validation to evaluate the two models. For each model, I used 9 folds of data to train the model and then used the held-out data to draw the ROC curve. I was wondering whether AUC is a good indicator for evaluating the two models.&lt;/P&gt;&lt;P&gt;Thanks again!&lt;/P&gt;</description>
    <pubDate>Wed, 16 Nov 2016 18:49:00 GMT</pubDate>
    <dc:creator>Zinan</dc:creator>
    <dc:date>2016-11-16T18:49:00Z</dc:date>
    <item>
      <title>In k fold CV how use trained model (logistic regression) to compare test data set and draw ROC</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/In-k-fold-CV-how-use-trained-model-logistic-regression-to/m-p/311920#M4678</link>
      <description>&lt;P&gt;Hi, I am new to SAS and need suggestions on my code.&lt;/P&gt;&lt;P&gt;I want to compare two logistic regression models, using 10-fold cross-validation for both. I searched online and found the code below, which generates the folds and runs logistic regression on my training data sets. But I do not know how to use the trained model to predict my test data set and compare the predictions with the observed values. If possible, I would like to draw the ROC curve and calculate the AUC.&lt;/P&gt;&lt;P&gt;My current code is below. Please help me with my next step. Thanks very much!&lt;/P&gt;&lt;PRE&gt;
%let K=10;
%let rate=%sysevalf((&amp;amp;K-1)/&amp;amp;K);

*Build model with all data;
proc logistic data=cold;
  model y(event="1")=temp time time*temp;
run;

*Generate the cross validation sample;
proc surveyselect data=cold out=cv seed=231258
  samprate=&amp;amp;rate outall reps=10;
run;

/* The variable selected is an automatic variable generated by surveyselect. */
/* If selected is true, then new_y gets the value of y; otherwise it is missing. */
data cv;
  set cv;
  if selected then new_y=y;
run;

/* Get predicted values for the missing new_y in each replicate */
ods output ParameterEstimates=ParamEst;
proc logistic data=cv;
  model new_y(event="1")=temp time time*temp/outroc=roc;
  by replicate;
  output out=out1(where=(new_y=.)) predicted=y_hat;
run;
&lt;/PRE&gt;</description>
      <pubDate>Wed, 16 Nov 2016 07:42:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/In-k-fold-CV-how-use-trained-model-logistic-regression-to/m-p/311920#M4678</guid>
      <dc:creator>Zinan</dc:creator>
      <dc:date>2016-11-16T07:42:46Z</dc:date>
    </item>
    <item>
      <title>Re: In k fold CV how use trained model (logistic regression) to compare test data set and draw ROC</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/In-k-fold-CV-how-use-trained-model-logistic-regression-to/m-p/311931#M4679</link>
      <description>&lt;P&gt;What exactly is your final model though?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You ran a regression for each slice, so you have multiple models, so to speak. How are you planning to settle on a final model?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 16 Nov 2016 08:46:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/In-k-fold-CV-how-use-trained-model-logistic-regression-to/m-p/311931#M4679</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2016-11-16T08:46:18Z</dc:date>
    </item>
    <item>
      <title>Re: In k fold CV how use trained model (logistic regression) to compare test data set and draw ROC</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/In-k-fold-CV-how-use-trained-model-logistic-regression-to/m-p/311998#M4680</link>
      <description>&lt;PRE&gt;
Use the SCORE statement to predict the test data.

https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-to-do-k-fold-CV-with-replacements-replication/m-p/303938#U303938
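
As a minimal sketch of the SCORE approach for a single fold, not a definitive implementation: train and test are hypothetical data set names for the 9 training folds and the held-out fold of cold, and P_1 is assumed to be the name PROC LOGISTIC gives the predicted probability of y=1 in the scored output.

```sas
/* Sketch only: train and test are hypothetical data sets split from cold. */
proc logistic data=train;
  /* Same model as in the question, fitted on the training folds only */
  model y(event="1") = temp time time*temp;
  /* SCORE applies the fitted model to the held-out fold;
     test_scored is a hypothetical output name and should
     contain P_1, the predicted probability that y=1. */
  score data=test out=test_scored;
run;
```

Repeating this once per fold gives an out-of-sample prediction for every held-out observation.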


Calculate the AUC.

http://blogs.sas.com/content/iml/2011/07/29/computing-an-roc-curve-from-basic-principles.html

http://blogs.sas.com/content/iml/2011/06/03/a-statistical-application-of-numerical-integration-the-area-under-an-roc-curve.html

http://blogs.sas.com/content/iml/2011/07/08/the-area-under-a-density-estimate-curve-nonparametric-estimates.html

http://blogs.sas.com/content/iml/2011/05/27/obtaining-area-from-a-set-of-points-on-a-curve.html
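
One possible way to get a held-out AUC without coding the integration by hand is sketched below. It assumes the out1 data set and the y_hat variable from the code in the original question; the label "Held-out fold" is arbitrary. The NOFIT option skips model fitting, and the ROC statement with PRED= builds the curve from the supplied probabilities.

```sas
/* Assumes out1 holds the held-out rows of each replicate,
   with the observed y and the predicted probability y_hat. */
proc logistic data=out1;
  by replicate;
  /* NOFIT: no model is estimated in this step */
  model y(event="1") = y_hat / nofit;
  /* PRED= tells ROC to use y_hat as the predicted probability */
  roc "Held-out fold" pred=y_hat;
run;
```

The area for each replicate should appear in the ROC association statistics output, and comparing those areas across the two candidate models is one way to judge them. The OUTROC= data set in the question, by contrast, is built from the rows used to fit the model, so it describes the training data rather than the held-out data.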
&lt;/PRE&gt;</description>
      <pubDate>Wed, 16 Nov 2016 14:43:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/In-k-fold-CV-how-use-trained-model-logistic-regression-to/m-p/311998#M4680</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2016-11-16T14:43:39Z</dc:date>
    </item>
    <item>
      <title>Re: In k fold CV how use trained model (logistic regression) to compare test data set and draw ROC</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/In-k-fold-CV-how-use-trained-model-logistic-regression-to/m-p/312070#M4682</link>
      <description>&lt;P&gt;Thanks very much! Your links helped me find ways to calculate the AUC. This is awesome!&lt;/P&gt;</description>
      <pubDate>Wed, 16 Nov 2016 18:42:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/In-k-fold-CV-how-use-trained-model-logistic-regression-to/m-p/312070#M4682</guid>
      <dc:creator>Zinan</dc:creator>
      <dc:date>2016-11-16T18:42:42Z</dc:date>
    </item>
    <item>
      <title>Re: In k fold CV how use trained model (logistic regression) to compare test data set and draw ROC</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/In-k-fold-CV-how-use-trained-model-logistic-regression-to/m-p/312073#M4683</link>
      <description>&lt;P&gt;Thanks very much for your reply. Links in another reply helped me find the answer (drawing the ROC curve and calculating the AUC). But since I am still new to SAS and STAT, I was wondering whether you think the procedure below has any problems.&lt;/P&gt;&lt;P&gt;Sorry that my post seemed a little confusing. I want to compare two different models for their predictive power. Because the response is binary, I used logistic regression, and I used 10-fold cross-validation to evaluate the two models. For each model, I used 9 folds of data to train the model and then used the held-out data to draw the ROC curve. I was wondering whether AUC is a good indicator for evaluating the two models.&lt;/P&gt;&lt;P&gt;Thanks again!&lt;/P&gt;</description>
      <pubDate>Wed, 16 Nov 2016 18:49:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/In-k-fold-CV-how-use-trained-model-logistic-regression-to/m-p/312073#M4683</guid>
      <dc:creator>Zinan</dc:creator>
      <dc:date>2016-11-16T18:49:00Z</dc:date>
    </item>
  </channel>
</rss>

