<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Please provide suggestion for RFM using SAS EG for extremely skewed distribution. in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208643#M2843</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you Ray for your clear explanation. I believe this is a very good starting point. I'll start experimenting with the number of bins.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;-Avinash&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 24 Mar 2015 16:48:37 GMT</pubDate>
    <dc:creator>AvinashRdy</dc:creator>
    <dc:date>2015-03-24T16:48:37Z</dc:date>
    <item>
      <title>Please provide suggestion for RFM using SAS EG for extremely skewed distribution.</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208638#M2838</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a customer data set with approximately 5 million records, collected from customers over the past 10 to 15 years. My goal is to divide the customers into RFM bins. The recency, frequency, and monetary value variables are all extremely skewed.&lt;/P&gt;&lt;P&gt;For example, for recency, about 60-70% of the customers have recency 1, 5-10% have recency between 2 and 5, 2% between 5 and 10, and so on, with about 0.1% above 100.&lt;/P&gt;&lt;P&gt;The case is similar for monetary value, which ranges from 0 to 10,000,000. There are&lt;/P&gt;&lt;P&gt;about 30% of the customers who spent &amp;lt; $5,&lt;/P&gt;&lt;P&gt;about 20% of the customers who spent between 5 and 10,&lt;/P&gt;&lt;P&gt;30% between 10 and 100, 20% between 100 and 1000,&lt;/P&gt;&lt;P&gt;10% between 100 and 10,000, and&lt;/P&gt;&lt;P&gt;8% between 10,000 and 100,000, and so on, with about 0.01% &amp;gt; 1,000,000.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;The scenario is similar for the Recency variable.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;I need to decide how the split should be done. I have access to SAS EG. Any idea or solution is much appreciated.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thank you so much in advance for your time.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;- Avinash&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 03:29:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208638#M2838</guid>
      <dc:creator>AvinashRdy</dc:creator>
      <dc:date>2015-03-24T03:29:27Z</dc:date>
    </item>
    <item>
      <title>Re: Please provide suggestion for RFM using SAS EG for extremely skewed distribution.</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208639#M2839</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Maybe you should consider computing their percentiles.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 13:31:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208639#M2839</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2015-03-24T13:31:51Z</dc:date>
    </item>
    <item>
      <title>Re: Please provide suggestion for RFM using SAS EG for extremely skewed distribution.</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208640#M2840</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi, Avinash.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If you are concerned that your RFM scores will be skewed, don't worry. The scores are computed from percentiles (ranks), so even with skewed data (which is the norm for variables like total purchase amount) you will get a pretty uniform distribution of R, F, and M scores.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;And just in case you hadn't seen it, EG has an RFM task that will make getting RFM scores very easy. Look under Tasks &amp;gt; Data Mining &amp;gt; Recency, Frequency, Monetary.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I hope this helps.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Ray&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 13:40:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208640#M2840</guid>
      <dc:creator>rayIII</dc:creator>
      <dc:date>2015-03-24T13:40:17Z</dc:date>
    </item>
    <item>
      <title>Re: Please provide suggestion for RFM using SAS EG for extremely skewed distribution.</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208641#M2841</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Xia Keshan / Ray Wright,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you for your response. Unfortunately, I have SAS EG 5.1, which doesn't have the option to compute RFM scores directly. Can you suggest any other way to do this?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 16:16:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208641#M2841</guid>
      <dc:creator>AvinashRdy</dc:creator>
      <dc:date>2015-03-24T16:16:09Z</dc:date>
    </item>
    <item>
      <title>Re: Please provide suggestion for RFM using SAS EG for extremely skewed distribution.</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208642#M2842</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Yes, you can get started with Proc Rank. Replace the dataset and 'var' list with your own dataset and variables. If you want to use more or fewer than 5 bins, then adjust the 'groups' value.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc rank data=yourData out=ranks groups=5;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;var mostRecent NumberofPurchases totalamount;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;ranks recency frequency monetary;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;/* Proc Rank numbers its groups starting at 0, so add 1 to make the scores start at 1 */&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;data ranks;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;set ranks;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;recency + 1;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;frequency + 1;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;monetary + 1;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc univariate data=ranks;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;var recency frequency monetary;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You might also want to look at the TIES= option for Proc Rank.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This example assumes your data rows represent customers. If they are transactional, then you would need to aggregate the rows to one per customer before doing this.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;As I said, this is just a start. The RFM task in EG 6.1 gives you the option to use transactional- or customer-level data, as well as several binning options and plots.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I hope this helps.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Ray&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 16:42:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208642#M2842</guid>
      <dc:creator>rayIII</dc:creator>
      <dc:date>2015-03-24T16:42:28Z</dc:date>
    </item>
    <item>
      <title>Re: Please provide suggestion for RFM using SAS EG for extremely skewed distribution.</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208643#M2843</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you Ray for your clear explanation. I believe this is a very good starting point. I'll start experimenting with the number of bins.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;-Avinash&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 16:48:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Please-provide-suggestion-for-RFM-using-SAS-EG-for-extremely/m-p/208643#M2843</guid>
      <dc:creator>AvinashRdy</dc:creator>
      <dc:date>2015-03-24T16:48:37Z</dc:date>
    </item>
  </channel>
</rss>

