<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: A small percentage of response : Please Help Thank you in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120228#M1017</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;First of all, the methodology is logistic regression, but there are two ways to do the prediction:&lt;/P&gt;&lt;P&gt;1. Select the whole database as your target population. In this case, since you have only a 1% response rate, the predicted probabilities won't be high (p_1 = 0.1 could be high enough to say that this customer will buy a ticket). &lt;/P&gt;&lt;P&gt;2. Select part of your database as your target population. In this case, you have to do some preliminary data mining to reduce the data size and increase the response rate; the predicted probabilities will then increase too.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Keep in mind that neither approach will capture all potential buyers. There is no way to reach every buyer unless you contact the whole database. &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Thu, 25 Apr 2013 15:05:46 GMT</pubDate>
    <dc:creator>jf</dc:creator>
    <dc:date>2013-04-25T15:05:46Z</dc:date>
    <item>
      <title>A small percentage of response : Please Help Thank you</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120227#M1016</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi All, I would like to predict which customers have a propensity to buy basketball tickets. My whole database has 1,800,000 customers, and only 23,509 (about 1%) have purchased tickets in the past. How should I proceed? Your help would be much appreciated. Many thanks&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 25 Apr 2013 10:55:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120227#M1016</guid>
      <dc:creator>Question</dc:creator>
      <dc:date>2013-04-25T10:55:58Z</dc:date>
    </item>
    <item>
      <title>Re: A small percentage of response : Please Help Thank you</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120228#M1017</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;First of all, the methodology is logistic regression, but there are two ways to do the prediction:&lt;/P&gt;&lt;P&gt;1. Select the whole database as your target population. In this case, since you have only a 1% response rate, the predicted probabilities won't be high (p_1 = 0.1 could be high enough to say that this customer will buy a ticket). &lt;/P&gt;&lt;P&gt;2. Select part of your database as your target population. In this case, you have to do some preliminary data mining to reduce the data size and increase the response rate; the predicted probabilities will then increase too.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Keep in mind that neither approach will capture all potential buyers. There is no way to reach every buyer unless you contact the whole database. &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 25 Apr 2013 15:05:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120228#M1017</guid>
      <dc:creator>jf</dc:creator>
      <dc:date>2013-04-25T15:05:46Z</dc:date>
    </item>
    <item>
      <title>Re: A small percentage of response : Please Help Thank you</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120229#M1018</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;When the rate (expectation) is so small, modeling should be based on the Poisson distribution, right?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 25 Apr 2013 15:33:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120229#M1018</guid>
      <dc:creator>marxyst</dc:creator>
      <dc:date>2013-04-25T15:33:56Z</dc:date>
    </item>
    <item>
      <title>Re: A small percentage of response : Please Help Thank you</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120230#M1019</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Never heard of that one; what is the reason?&lt;/P&gt;&lt;P&gt;Poisson is usually used for count data instead.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You can always oversample your data and then use Bayesian priors to correct for the oversampling. &lt;/P&gt;&lt;P&gt;I'd make sure to use several different samples/simulations to get a better idea. This approach has its benefits and drawbacks, which can be found through some googling &lt;img id="smileyhappy" class="emoticon emoticon-smileyhappy" src="https://communities.sas.com/i/smilies/16x16_smiley-happy.png" alt="Smiley Happy" title="Smiley Happy" /&gt;.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You can use proc logistic; I think proc discrim is also an option. &lt;/P&gt;&lt;P&gt;Are you using JMP, EG, EM or Base SAS?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 25 Apr 2013 15:40:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120230#M1019</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2013-04-25T15:40:47Z</dc:date>
    </item>
    <item>
      <title>Re: A small percentage of response : Please Help Thank you</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120231#M1020</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Poisson regression assumes the dependent variable follows a Poisson distribution, which means Y takes non-negative integer values. In this case, Y has only two values: buy or not. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, &lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;23,509&lt;/SPAN&gt; is not a small number.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 25 Apr 2013 16:18:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120231#M1020</guid>
      <dc:creator>jf</dc:creator>
      <dc:date>2013-04-25T16:18:28Z</dc:date>
    </item>
    <item>
      <title>Re: A small percentage of response : Please Help Thank you</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120232#M1021</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;proc discrim may do the same job as logistic regression, but since LR is a method well designed for this case, LR is the best and easiest choice. &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;To get better results, deep data mining and modeling skills are necessary. &lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 25 Apr 2013 16:31:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120232#M1021</guid>
      <dc:creator>jf</dc:creator>
      <dc:date>2013-04-25T16:31:17Z</dc:date>
    </item>
    <item>
      <title>Re: A small percentage of response : Please Help Thank you</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120233#M1022</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The limiting distribution of the Binomial(p,N) when p is small and N is large, with pN staying moderate, is Poisson(pN). That's probably the origin of the confusion. Poisson regression could model the number of buyers per group of 10,000 randomly selected persons, for instance.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;hth&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 25 Apr 2013 16:36:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/A-small-percentage-of-response-Please-Help-Thank-you/m-p/120233#M1022</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2013-04-25T16:36:20Z</dc:date>
    </item>
  </channel>
</rss>

