<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How can i treat 300 binary product variables in classification case? in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-i-treat-300-binary-product-variables-in-classification/m-p/194372#M2467</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks a lot for the replies, guys. I have enough to move forward now &lt;img id="smileyhappy" class="emoticon emoticon-smileyhappy" src="https://communities.sas.com/i/smilies/16x16_smiley-happy.png" alt="Smiley Happy" title="Smiley Happy" /&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Fri, 29 May 2015 19:09:35 GMT</pubDate>
    <dc:creator>Buaskes</dc:creator>
    <dc:date>2015-05-29T19:09:35Z</dc:date>
    <item>
      <title>How can i treat 300 binary product variables in classification case?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-i-treat-300-binary-product-variables-in-classification/m-p/194368#M2463</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi there.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am working on an online fraud classification case where I have 364 variables and 50,000 observations. 300 of these variables are binary product variables, each indicating whether or not the purchase was of a specific product. I think there must be some information hidden in these variables, but I can't figure out a good way of dealing with them. Does anyone have an idea?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 27 May 2015 17:25:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-can-i-treat-300-binary-product-variables-in-classification/m-p/194368#M2463</guid>
      <dc:creator>Buaskes</dc:creator>
      <dc:date>2015-05-27T17:25:28Z</dc:date>
    </item>
    <item>
      <title>Re: How can i treat 300 binary product variables in classification case?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-i-treat-300-binary-product-variables-in-classification/m-p/194369#M2464</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;MBA (market basket analysis): which products are likely to be purchased together?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If only one of the 300 is set for each observation, then change the data structure to have a single product variable instead?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 27 May 2015 18:26:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-can-i-treat-300-binary-product-variables-in-classification/m-p/194369#M2464</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2015-05-27T18:26:36Z</dc:date>
    </item>
    <item>
      <title>Re: How can i treat 300 binary product variables in classification case?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-i-treat-300-binary-product-variables-in-classification/m-p/194370#M2465</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;As Reeza pointed out, maybe you could encode that categorical variable as a numeric variable with PROC GLMSELECT, then fit it in the model.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 28 May 2015 14:36:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-can-i-treat-300-binary-product-variables-in-classification/m-p/194370#M2465</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2015-05-28T14:36:41Z</dc:date>
    </item>
    <item>
      <title>Re: How can i treat 300 binary product variables in classification case?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-i-treat-300-binary-product-variables-in-classification/m-p/194371#M2466</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;There is a data mining approach for rare events that is often used to flag fraud. Give it a try without transforming or rejecting variables just yet. Try clustering your data, and if you have a few flagged or confirmed fraud cases, you can train a predictive model for each cluster. &lt;SPAN style="font-size: 13.3333330154419px; line-height: 1.5em;"&gt;The hope is that your fraudsters have different patterns from the rest of your customers, so that certain clusters have a higher concentration of fraudsters.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;Make sure your clusters make sense and decide whether you need to standardize or tweak your clustering. You do not need to standardize your 300 binary variables, but do standardize if you have other inputs on very different scales.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A __default_attr="831252" __jive_macro_name="user" class="jive_macro jive_macro_user" href="https://communities.sas.com/"&gt;&lt;/A&gt; presented this approach at SAS Global Forum 2015. 
Take a look at his paper:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;STRONG&gt;SAS® Does Data Science: How to Succeed in a Data Science Competition &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="http://support.sas.com/resources/papers/proceedings15/SAS2520-2015.pdf" title="http://support.sas.com/resources/papers/proceedings15/SAS2520-2015.pdf"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/A&gt;&lt;A href="http://support.sas.com/resources/papers/proceedings15/SAS2520-2015.pdf" target="_blank"&gt;http://support.sas.com/resources/papers/proceedings15/SAS2520-2015.pdf&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Compare this approach to Reeza's and Xia's suggestions.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Good luck,&lt;/P&gt;&lt;P&gt;Miguel&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 29 May 2015 14:58:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-can-i-treat-300-binary-product-variables-in-classification/m-p/194371#M2466</guid>
      <dc:creator>M_Maldonado</dc:creator>
      <dc:date>2015-05-29T14:58:18Z</dc:date>
    </item>
    <item>
      <title>Re: How can i treat 300 binary product variables in classification case?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-i-treat-300-binary-product-variables-in-classification/m-p/194372#M2467</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks a lot for the replies, guys. I have enough to move forward now &lt;img id="smileyhappy" class="emoticon emoticon-smileyhappy" src="https://communities.sas.com/i/smilies/16x16_smiley-happy.png" alt="Smiley Happy" title="Smiley Happy" /&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 29 May 2015 19:09:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-can-i-treat-300-binary-product-variables-in-classification/m-p/194372#M2467</guid>
      <dc:creator>Buaskes</dc:creator>
      <dc:date>2015-05-29T19:09:35Z</dc:date>
    </item>
  </channel>
</rss>

