<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic regression in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/regression/m-p/464990#M7065</link>
    <description>&lt;P&gt;Hi there. I am facing a problem in a marketing application. Basically, we send customers a product and they either buy x units of it or they don't. I am trying to build a predictive scoring model with the sales generated as the target variable. Because the response rate of such a campaign is roughly 2%, I am trying to build a regressor on data where 98% of the sales values are 0 and the rest are positive. I have separately developed a classification model that works well in determining whether a customer will buy or not, but with the one I am trying to build, I'd like to score the customers from 1 to 10 depending on which decile of predicted sales they fall into. I feel like I am not approaching the problem the right way (with this regression) since, in the same way a classifier will be biased by the predominance of 0s, my regressor (even with resampling) will be biased by all the zero sales. Any ideas? Many thanks in advance. Nicolas&lt;/P&gt;</description>
    <pubDate>Fri, 25 May 2018 08:34:11 GMT</pubDate>
    <dc:creator>NicolasC</dc:creator>
    <dc:date>2018-05-25T08:34:11Z</dc:date>
    <item>
      <title>regression</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/regression/m-p/464990#M7065</link>
      <description>&lt;P&gt;Hi there. I am facing a problem in a marketing application. Basically, we send customers a product and they either buy x units of it or they don't. I am trying to build a predictive scoring model with the sales generated as the target variable. Because the response rate of such a campaign is roughly 2%, I am trying to build a regressor on data where 98% of the sales values are 0 and the rest are positive. I have separately developed a classification model that works well in determining whether a customer will buy or not, but with the one I am trying to build, I'd like to score the customers from 1 to 10 depending on which decile of predicted sales they fall into. I feel like I am not approaching the problem the right way (with this regression) since, in the same way a classifier will be biased by the predominance of 0s, my regressor (even with resampling) will be biased by all the zero sales. Any ideas? Many thanks in advance. Nicolas&lt;/P&gt;</description>
      <pubDate>Fri, 25 May 2018 08:34:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/regression/m-p/464990#M7065</guid>
      <dc:creator>NicolasC</dc:creator>
      <dc:date>2018-05-25T08:34:11Z</dc:date>
    </item>
    <item>
      <title>Re: regression</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/regression/m-p/465160#M7068</link>
      <description>&lt;P&gt;One thing you could try is the TwoStage model node. You need to define two target variables for that: one binary (bought or not), and the other representing the number of units for those who bought (missing for those who didn't). Then you can fit a Sequential model using Filter=Non-Events (you need to define the event for the binary target as those who did not buy).&lt;/P&gt;</description>
      <pubDate>Fri, 25 May 2018 17:19:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/regression/m-p/465160#M7068</guid>
      <dc:creator>WendyCzika</dc:creator>
      <dc:date>2018-05-25T17:19:03Z</dc:date>
    </item>
    <item>
      <title>Re: regression</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/regression/m-p/465358#M7075</link>
      <description>&lt;P&gt;Hi Wendy. Thanks for your reply. I am not familiar with the two-stage model node. In the current process, I fit the 'unbalanced' regression and test it, and it seems fine: in terms of scoring there is a hierarchy, with the top 10% (em_segment = 1) having a higher sum of sales than those in em_segment 2, and so on. Yet I am convinced I can do better than that.&lt;/P&gt;&lt;P&gt;I also tried having a response model (built on responders and non-responders) and a sales model (a regression on the buyers only), and combining the two by applying the regression to the buyers predicted by the response model. I tested it (as before, on a more recent campaign), and in this case it does not work.&lt;/P&gt;&lt;P&gt;Does the two-stage model work in a somewhat similar way to what I just described? When applying the score created for this two-stage model, does it predict the target from the regression (sales) or from the classification (response)? Since I have unbalanced data, I assume the overall process before using this node is the same as before (sampling, treatment of missing values, data partition)? Thanks for your help.&lt;/P&gt;</description>
      <pubDate>Sun, 27 May 2018 20:28:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/regression/m-p/465358#M7075</guid>
      <dc:creator>NicolasC</dc:creator>
      <dc:date>2018-05-27T20:28:09Z</dc:date>
    </item>
  </channel>
</rss>

