<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Predicting a binary response variable in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630483#M30255</link>
    <description>&lt;P&gt;I don't see a need for Time Series ARIMA or AUTOREG if there are only two measurements per student.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Simple logistic regression of the measurement in the fall to predict end_of_may test score. The two different years could be used as an additional predictor variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In any event, I would combine both years of data, and randomly select individuals to be the training data set, and other randomly selected individuals to be the validation data set.&lt;/P&gt;</description>
    <pubDate>Sun, 08 Mar 2020 15:27:07 GMT</pubDate>
    <dc:creator>PaigeMiller</dc:creator>
    <dc:date>2020-03-08T15:27:07Z</dc:date>
    <item>
      <title>Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630404#M30251</link>
      <description>My objective is to predict if a student will be flagged to attend a summer reading camp that is determined by a test score generated during end-of-year testing in May. The variable used to predict is a reading score earned in the Fall. &lt;BR /&gt;I call the response variable camp_flag and the fall score f_read. &lt;BR /&gt;My model (I’m assuming) is something like:&lt;BR /&gt;&lt;BR /&gt;model camp_flag = f_read&lt;BR /&gt;&lt;BR /&gt;I have 2 years of data, so I want to use one year to create the model and use the other year to test the accuracy of the model’s ability to predict camp_flag. Camp_flag is 0 or 1. &lt;BR /&gt;&lt;BR /&gt;My online search is a bit overwhelming. I just need a suggestion on which procedure to learn to accomplish this.</description>
      <pubDate>Sat, 07 Mar 2020 19:16:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630404#M30251</guid>
      <dc:creator>GreggB</dc:creator>
      <dc:date>2020-03-07T19:16:41Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630405#M30252</link>
      <description />
      <pubDate>Sat, 07 Mar 2020 19:17:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630405#M30252</guid>
      <dc:creator>GreggB</dc:creator>
      <dc:date>2020-03-07T19:17:50Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630408#M30253</link>
      <description>&lt;P&gt;PROC LOGISTIC models binary outcomes.&lt;/P&gt;
&lt;P&gt;However, you also have time, which makes it more complicated.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PROC AUTOREG and ARIMA are probably your starting point.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13712"&gt;@GreggB&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;My objective is to predict if a student will be flagged to attend a summer reading camp that is determined by a test score generated during end-of-year testing in May. The variable used to predict is a reading score earned in the Fall. &lt;BR /&gt;I call the response variable camp_flag and the fall score f_read. &lt;BR /&gt;My model (I’m assuming) is something like:&lt;BR /&gt;&lt;BR /&gt;model camp_flag = f_read&lt;BR /&gt;&lt;BR /&gt;I have 2 years of data, so I want to use one year to create the model and use the other year to test the accuracy of the model’s ability to predict camp_flag. Camp_flag is 0 or 1. &lt;BR /&gt;&lt;BR /&gt;My online search is a bit overwhelming. I just need a suggestion on which procedure to learn to accomplish this.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 07 Mar 2020 20:08:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630408#M30253</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-03-07T20:08:58Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630482#M30254</link>
      <description>&lt;P&gt;Is the time issue because the 2 tests are several months apart or because my 2 data sets are from 2 different years?&lt;/P&gt;</description>
      <pubDate>Sun, 08 Mar 2020 12:36:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630482#M30254</guid>
      <dc:creator>GreggB</dc:creator>
      <dc:date>2020-03-08T12:36:22Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630483#M30255</link>
      <description>&lt;P&gt;I don't see a need for Time Series ARIMA or AUTOREG if there are only two measurements per student.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Simple logistic regression of the measurement in the fall to predict end_of_may test score. The two different years could be used as an additional predictor variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In any event, I would combine both years of data, and randomly select individuals to be the training data set, and other randomly selected individuals to be the validation data set.&lt;/P&gt;</description>
      <pubDate>Sun, 08 Mar 2020 15:27:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630483#M30255</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-03-08T15:27:07Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630507#M30256</link>
      <description>&lt;P&gt;For some reason I thought that time would be a factor. Can students be sent to the camp more than once? Does their previous attendance affect their likelihood to attend again?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If not, I totally agree with&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;that you should combine both years and take a random sample, BUT make sure to either include or exclude each student entirely. A single student shouldn't have records in both the test and training data sets.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 08 Mar 2020 18:37:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630507#M30256</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-03-08T18:37:02Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630510#M30257</link>
      <description>&lt;P&gt;They would attend only once.&amp;nbsp; I can deduplicate by Student ID to make sure.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I think I read about what you're saying - the data is divided into 2 sets using ranuni. One set is used to create the model and the other half is used for prediction?&lt;/P&gt;</description>
      <pubDate>Sun, 08 Mar 2020 19:41:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630510#M30257</guid>
      <dc:creator>GreggB</dc:creator>
      <dc:date>2020-03-08T19:41:49Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630512#M30258</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13712"&gt;@GreggB&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;They would attend only once.&amp;nbsp; I can deduplicate by Student ID to make sure.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I think I read about what you're saying - the data is divided into 2 sets using ranuni. One set is used to create the model and the other half is used for prediction?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;BR /&gt;Yes, that's one way to do it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
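&lt;P&gt;A minimal sketch of such a split, done at the student level so that no student ends up in both sets. The data set name twoyears, the variable name ID, the 50/50 cutoff and the seed are all assumptions for illustration, not something fixed by the thread.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: twoyears and ID are assumed names; the seed 7 is arbitrary */
proc sort data=twoyears out=twoyears_sorted;
by ID;
run;

data train valid;
set twoyears_sorted;
by ID;
retain u;
if _n_ = 1 then call streaminit(7);
/* draw one random number per student and reuse it for all of that student's rows */
if first.ID then u = rand('uniform');
if u &amp;lt;= .5 then output train; else output valid;
drop u;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If ID is already unique in the combined data, this behaves like a plain row-level random split; the BY-group logic only matters when a student has more than one record.&lt;/P&gt;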
&lt;P&gt;And make sure that the different years are a categorical predictor variable in the model.&lt;/P&gt;</description>
      <pubDate>Sun, 08 Mar 2020 20:13:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630512#M30258</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-03-08T20:13:08Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630525#M30259</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc logistic data = twoyears outest=estimates_2yrs;
model camp_flag = RIT;
run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;twoyears looks like so:&amp;nbsp; (ID is unique; termName has 2 possible values; camp_flag is 0 or 1)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;termName&amp;nbsp; &amp;nbsp; &amp;nbsp;ID&amp;nbsp; &amp;nbsp; &amp;nbsp; RIT&amp;nbsp; &amp;nbsp; camp_flag&lt;/P&gt;
&lt;P&gt;2016-2017&amp;nbsp; &amp;nbsp; &amp;nbsp;001&amp;nbsp; &amp;nbsp; 249&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&lt;/P&gt;
&lt;P&gt;2017-2018&amp;nbsp; &amp;nbsp; &amp;nbsp;002&amp;nbsp; &amp;nbsp; 279&amp;nbsp; &amp;nbsp; &amp;nbsp; 1&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. You're saying my model should be&amp;nbsp; camp_flag = termName RIT ?&lt;/P&gt;
&lt;P&gt;2. I want to make sure&amp;nbsp; my objective is clear:&amp;nbsp; I have a 3rd data set (termName = 2019-2020) that contains RIT and I want to predict the camp_flag value so that students most likely to have a value of 0 based on their end-of-year test can be identified now and receive academic intervention.&amp;nbsp; My next step?&lt;/P&gt;</description>
      <pubDate>Sun, 08 Mar 2020 22:00:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630525#M30259</guid>
      <dc:creator>GreggB</dc:creator>
      <dc:date>2020-03-08T22:00:11Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630530#M30260</link>
      <description>&lt;P&gt;What is RIT?&lt;/P&gt;</description>
      <pubDate>Sun, 08 Mar 2020 22:49:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630530#M30260</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-03-08T22:49:26Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630531#M30261</link>
      <description>&lt;P&gt;My mistake. It is the fall reading score&amp;nbsp; &amp;nbsp;I referred to as f_read earlier.&lt;/P&gt;</description>
      <pubDate>Sun, 08 Mar 2020 22:58:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630531#M30261</guid>
      <dc:creator>GreggB</dc:creator>
      <dc:date>2020-03-08T22:58:03Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630532#M30262</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13712"&gt;@GreggB&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc logistic data = twoyears outest=estimates_2yrs;
model camp_flag = RIT;
run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;twoyears looks like so:&amp;nbsp; (ID is unique; termName has 2 possible values; camp_flag is 0 or 1)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;termName&amp;nbsp; &amp;nbsp; &amp;nbsp;ID&amp;nbsp; &amp;nbsp; &amp;nbsp; RIT&amp;nbsp; &amp;nbsp; camp_flag&lt;/P&gt;
&lt;P&gt;2016-2017&amp;nbsp; &amp;nbsp; &amp;nbsp;001&amp;nbsp; &amp;nbsp; 249&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&lt;/P&gt;
&lt;P&gt;2017-2018&amp;nbsp; &amp;nbsp; &amp;nbsp;002&amp;nbsp; &amp;nbsp; 279&amp;nbsp; &amp;nbsp; &amp;nbsp; 1&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. You're saying my model should be&amp;nbsp; camp_flag = termName RIT ?&lt;/P&gt;
&lt;P&gt;2. I want to make sure&amp;nbsp; my objective is clear:&amp;nbsp; I have a 3rd data set (termName = 2019-2020) that contains RIT and I want to predict the camp_flag value so that students most likely to have a value of 0 based on their end-of-year test can be identified now and receive academic intervention.&amp;nbsp; My next step?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;OL&gt;
&lt;LI&gt;Yes&lt;/LI&gt;
&lt;LI&gt;So ... third data set ... this was not mentioned before. If the model finds no difference between the years, then take year out of the model, re-fit the model without year and use the SCORE statement in PROC LOGISTIC to predict the results of the individuals in the third data set. If the two years are statistically different in the fitted model, then I don't think you can use the model to predict a different year that was not in the model.&lt;/LI&gt;
&lt;/OL&gt;
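&lt;P&gt;A rough sketch of the second point, assuming year has been dropped from the model. The names camp_model, newyear and newyear_scored are hypothetical; newyear stands for the third data set and only needs to contain rit.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: camp_model, newyear and newyear_scored are hypothetical names */
/* re-fit without year and save the fitted model */
proc logistic data=twoyears outmodel=camp_model;
model camp_flag = rit;
run;

/* later: read the saved model back and score the third data set */
proc logistic inmodel=camp_model;
score data=newyear out=newyear_scored;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The OUT= data set should then hold a predicted probability for each response level (variables named along the lines of P_0 and P_1) next to the original columns, so the students can be ranked by predicted risk.&lt;/P&gt;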
</description>
      <pubDate>Sun, 08 Mar 2020 23:06:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630532#M30262</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-03-08T23:06:55Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630534#M30263</link>
      <description>&lt;P&gt;updated code:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc logistic data = twoyears outest=estimates_2yrs;
class termname;
model camp_flag = termname rit;
run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Since termname is not numeric I used a CLASS statement. Is this correct?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If so, I interpret this as TermName not being significant.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class="branch"&gt;
&lt;DIV&gt;
&lt;DIV align="center"&gt;
&lt;TABLE class="table" summary="Procedure Logistic: Parameter Estimates" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class="c b header" colspan="7" scope="colgroup"&gt;Analysis of Maximum Likelihood Estimates&lt;/TH&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TH class="l b header" scope="col"&gt;Parameter&lt;/TH&gt;
&lt;TH class="l b header" scope="col"&gt;&amp;nbsp;&lt;/TH&gt;
&lt;TH class="r b header" scope="col"&gt;DF&lt;/TH&gt;
&lt;TH class="r b header" scope="col"&gt;Estimate&lt;/TH&gt;
&lt;TH class="r b header" scope="col"&gt;Standard&lt;BR /&gt;Error&lt;/TH&gt;
&lt;TH class="r b header" scope="col"&gt;Wald&lt;BR /&gt;Chi-Square&lt;/TH&gt;
&lt;TH class="r b header" scope="col"&gt;Pr&amp;nbsp;&amp;gt;&amp;nbsp;ChiSq&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TH class="l rowheader" scope="row"&gt;Intercept&lt;/TH&gt;
&lt;TH class="l rowheader" scope="row"&gt;&amp;nbsp;&lt;/TH&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;18.7084&lt;/TD&gt;
&lt;TD class="r data"&gt;1.8919&lt;/TD&gt;
&lt;TD class="r data"&gt;97.7879&lt;/TD&gt;
&lt;TD class="r data"&gt;&amp;lt;.0001&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TH class="l rowheader" scope="row"&gt;TermName&lt;/TH&gt;
&lt;TH class="l rowheader" scope="row"&gt;Fall 2016-2017&lt;/TH&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;-0.1980&lt;/TD&gt;
&lt;TD class="r data"&gt;0.1377&lt;/TD&gt;
&lt;TD class="r data"&gt;2.0676&lt;/TD&gt;
&lt;TD class="r data"&gt;0.1505&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TH class="l rowheader" scope="row"&gt;rit&lt;/TH&gt;
&lt;TH class="l rowheader" scope="row"&gt;&amp;nbsp;&lt;/TH&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;-0.1225&lt;/TD&gt;
&lt;TD class="r data"&gt;0.0113&lt;/TD&gt;
&lt;TD class="r data"&gt;118.1675&lt;/TD&gt;
&lt;TD class="r data"&gt;&amp;lt;.0001&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 08 Mar 2020 23:20:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630534#M30263</guid>
      <dc:creator>GreggB</dc:creator>
      <dc:date>2020-03-08T23:20:12Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630537#M30264</link>
      <description>&lt;P&gt;Yes, that's correct.&lt;/P&gt;</description>
      <pubDate>Sun, 08 Mar 2020 23:40:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630537#M30264</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-03-08T23:40:06Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630542#M30265</link>
      <description>&lt;P&gt;So if a student has already attended they cannot attend again or they’re not going to be recommended to attend even if their test scores warrant it?&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13712"&gt;@GreggB&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;They would attend only once.&amp;nbsp; I can deduplicate by Student ID to make sure.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I think I read about what you're saying - the data is divided into 2 sets using ranuni. One set is used to create the model and the other half is used for prediction?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 09 Mar 2020 00:26:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630542#M30265</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-03-09T00:26:51Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630545#M30266</link>
      <description>&lt;P&gt;&lt;SPAN&gt;The summer camp is for grade 3 only. The only way a student would attend twice would be if they are retained in grade 3 and they score low enough both times to be flagged for attendance at the summer camp. Since all the data sets have a unique student ID I can easily find scenarios like this if they occurred&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 09 Mar 2020 00:48:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630545#M30266</guid>
      <dc:creator>GreggB</dc:creator>
      <dc:date>2020-03-09T00:48:46Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630548#M30267</link>
      <description>So it's not the same students each year; that's better, then. I'd definitely remove the records of any students retained in grade 3, but you do need to account for them somehow.</description>
      <pubDate>Mon, 09 Mar 2020 01:55:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630548#M30267</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-03-09T01:55:46Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630759#M30268</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* split the data randomly with 50/50 split */
data train valid;
set twoyears; /* 2 years of data combined */
if ranuni(7) &amp;lt;= .5 then output train; else output valid;
run;
/*compare the 2 data sets */
proc logistic data = train outest=estimates_train;
model camp_flag = rit;
run;
quit;
proc logistic data = valid outest=estimates_valid;
model camp_flag = rit;
run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN&gt;Based on what I have studied I believe this is the next step. Here is the % concordant for train and valid, respectively. Is PROC SCORE my next step, using "twoyears"?&amp;nbsp;&amp;nbsp;I'm not sure which portion of the output to look at to determine if I have a model that's good for prediction.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class="branch"&gt;
&lt;DIV&gt;
&lt;DIV align="center"&gt;
&lt;TABLE class="table" summary="Procedure Logistic: Association Statistics" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class="c b header" colspan="4" scope="colgroup"&gt;Association of Predicted Probabilities and&lt;BR /&gt;Observed Responses&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TH class="l rowheader" scope="row"&gt;Percent Concordant&lt;/TH&gt;
&lt;TD class="r data"&gt;94.3&lt;/TD&gt;
&lt;TH class="l rowheader" scope="row"&gt;Somers' D&lt;/TH&gt;
&lt;TD class="r data"&gt;0.892&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TH class="l rowheader" scope="row"&gt;Percent Discordant&lt;/TH&gt;
&lt;TD class="r data"&gt;5.1&lt;/TD&gt;
&lt;TH class="l rowheader" scope="row"&gt;Gamma&lt;/TH&gt;
&lt;TD class="r data"&gt;0.898&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TH class="l rowheader" scope="row"&gt;Percent Tied&lt;/TH&gt;
&lt;TD class="r data"&gt;0.6&lt;/TD&gt;
&lt;TH class="l rowheader" scope="row"&gt;Tau-a&lt;/TH&gt;
&lt;TD class="r data"&gt;0.099&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TH class="l rowheader" scope="row"&gt;Pairs&lt;/TH&gt;
&lt;TD class="r data"&gt;29455&lt;/TD&gt;
&lt;TH class="l rowheader" scope="row"&gt;c&lt;/TH&gt;
&lt;TD class="r data"&gt;0.946&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="branch"&gt;
&lt;DIV&gt;
&lt;DIV align="center"&gt;
&lt;TABLE class="table" summary="Procedure Logistic: Association Statistics" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class="c b header" colspan="4" scope="colgroup"&gt;Association of Predicted Probabilities and&lt;BR /&gt;Observed Responses&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TH class="l rowheader" scope="row"&gt;Percent Concordant&lt;/TH&gt;
&lt;TD class="r data"&gt;89.0&lt;/TD&gt;
&lt;TH class="l rowheader" scope="row"&gt;Somers' D&lt;/TH&gt;
&lt;TD class="r data"&gt;0.788&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TH class="l rowheader" scope="row"&gt;Percent Discordant&lt;/TH&gt;
&lt;TD class="r data"&gt;10.1&lt;/TD&gt;
&lt;TH class="l rowheader" scope="row"&gt;Gamma&lt;/TH&gt;
&lt;TD class="r data"&gt;0.795&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TH class="l rowheader" scope="row"&gt;Percent Tied&lt;/TH&gt;
&lt;TD class="r data"&gt;0.9&lt;/TD&gt;
&lt;TH class="l rowheader" scope="row"&gt;Tau-a&lt;/TH&gt;
&lt;TD class="r data"&gt;0.063&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TH class="l rowheader" scope="row"&gt;Pairs&lt;/TH&gt;
&lt;TD class="r data"&gt;23648&lt;/TD&gt;
&lt;TH class="l rowheader" scope="row"&gt;c&lt;/TH&gt;
&lt;TD class="r data"&gt;0.894&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Mon, 09 Mar 2020 21:36:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630759#M30268</guid>
      <dc:creator>GreggB</dc:creator>
      <dc:date>2020-03-09T21:36:05Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630870#M30269</link>
      <description>&lt;P&gt;You want to fit a model to the Training data set, and then apply the fitted model from the training data set to the validation data set. This is not what you have done ... you have fit a whole new model to the validation data set.&lt;/P&gt;
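&lt;P&gt;As a rough sketch of that workflow, using the train and valid data sets from the code above: valid_scored is a hypothetical output name, and the F_ and I_ variable names are what I would expect the SCORE statement to create, so check them against your own output.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: valid_scored is a hypothetical output data set name */
proc logistic data=train;
model camp_flag = rit;
/* apply the model fitted on train to the held-out students;
   FITSTAT requests fit statistics for the scored data */
score data=valid out=valid_scored fitstat;
run;

/* cross-tabulate observed (F_) against predicted (I_) response levels */
proc freq data=valid_scored;
tables F_camp_flag * I_camp_flag;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The point is that only one model is fitted, on the training data; the validation data is used purely to see how well that model predicts.&lt;/P&gt;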
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is an example of how to apply the fitted model to the validation data set: &lt;A href="http://support.sas.com/kb/39/724.html" target="_blank"&gt;http://support.sas.com/kb/39/724.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 10 Mar 2020 11:43:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630870#M30269</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-03-10T11:43:52Z</dc:date>
    </item>
    <item>
      <title>Re: Predicting a binary response variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630963#M30270</link>
      <description>You use PROC SCORE or PROC PLM to score your new data set. PLM has more options these days, as it's the newer procedure. Remember to specify the ILINK option for logistic regression, though; otherwise the predictions stay on the logit scale instead of being converted to probabilities.</description>
      <pubDate>Tue, 10 Mar 2020 15:34:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Predicting-a-binary-response-variable/m-p/630963#M30270</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-03-10T15:34:17Z</dc:date>
    </item>
  </channel>
</rss>

