<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SAS EMiner- Variable Selection in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Variable-Selection/m-p/396507#M6038</link>
    <description>&lt;P&gt;This information is really helpful. I do appreciate this. I changed a few things after reading the response and am getting better results now.&lt;/P&gt;&lt;P&gt;Truly appreciate all your help&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks much&lt;/P&gt;&lt;P&gt;Soma&lt;/P&gt;</description>
    <pubDate>Fri, 15 Sep 2017 22:48:49 GMT</pubDate>
    <dc:creator>SGhosh</dc:creator>
    <dc:date>2017-09-15T22:48:49Z</dc:date>
    <item>
      <title>SAS EMiner- Variable Selection</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Variable-Selection/m-p/395773#M6023</link>
      <description>&lt;P&gt;I am new to SAS EMiner, so any response on this would be very helpful&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How can I select variables in the data node of SAS EMiner? To define it more clearly, I would say if I have ~50 variables, how could I select / determine the strongest variables for my model?&lt;/P&gt;&lt;P&gt;When I run my model package, the log shows that "Pr &amp;gt; ChiSq" is 1.0000 for most of the variables, which suggests my variables have some issue. But how do I fix it? For example, I have a variable called claim_count, and instead of keeping the values as continuous I grouped them into buckets such as 1-10, 11-20, etc.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance&lt;/P&gt;</description>
      <pubDate>Wed, 13 Sep 2017 21:19:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Variable-Selection/m-p/395773#M6023</guid>
      <dc:creator>SGhosh</dc:creator>
      <dc:date>2017-09-13T21:19:56Z</dc:date>
    </item>
    <item>
      <title>Re: SAS EMiner- Variable Selection</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Variable-Selection/m-p/396400#M6035</link>
      <description>&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;How can I select variables in the data node of SAS EMiner? To define it more clearly, I would say if I have ~50 variables, how could I select / determine the strongest variables for my model?&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;By 'data node' I assume you mean the Input Data Source node. &amp;nbsp;In general, you would be ill-advised to remove variables from consideration there unless you know they are most likely unsuitable for direct use in modeling (e.g., ID information, date/timestamp information, SKU numbers, zip codes, etc.). &amp;nbsp;It is also not necessary to choose variables in this node, since SAS Enterprise Miner provides a wealth of methods for choosing variables for your model, such as the following:&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;* the Variable Selection node provides Regression and Tree-based methods for choosing variables&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;* the Tree node performs its own variable selection so it does not need prior variable selection&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;* the Regression node allows you to add candidate terms and offers a set of stepwise methods for variable selection&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;* the Variable Clustering node provides an alternate way of removing variables which carry duplicate or highly similar information&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;When I run my model package, the log shows that "Pr &amp;gt; ChiSq" is 1.0000 for most of the variables, which suggests my variables have some issue. But how do I fix it? For example, I have a variable called claim_count, and instead of keeping the values as continuous I grouped them into buckets such as 1-10, 11-20, etc.&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Bucketing manually without any numerical evaluation might actually do more harm than good. &amp;nbsp;SAS Enterprise Miner provides a variety of bucketing algorithms which can take into account the relationship to the target variable. &amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; * the Transform Variables node allows you to create buckets with an optimal relationship to the target variable (a Tree-based method)&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; * Interactive Grouping allows you to create groups interactively&lt;/SPAN&gt;&lt;/P&gt;
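&lt;P&gt;&lt;SPAN&gt;As a rough illustration of what target-aware, tree-style bucketing does, here is a minimal Python sketch. It is not the algorithm SAS Enterprise Miner actually uses: it only searches for the single cut point on claim_count that minimizes weighted Gini impurity for a binary target. The data, the binary target, and the function names gini and best_split are all assumptions made up for the example.&lt;/SPAN&gt;&lt;/P&gt;

```python
def gini(labels):
    """Gini impurity of a list of 0/1 target values."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, targets):
    """Return the cut point with the lowest weighted Gini impurity."""
    pairs = sorted(zip(values, targets))
    n = len(pairs)
    best_cut, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot cut between two equal values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best_score > score:
            # place the cut halfway between the two neighbouring values
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_cut


# Hypothetical toy data, invented purely for illustration
claim_count = [1, 2, 3, 4, 6, 8, 9, 25]
target = [0, 0, 0, 0, 1, 1, 1, 1]
print(best_split(claim_count, target))
```

&lt;P&gt;&lt;SPAN&gt;On this toy data the chosen cut is 5.0, which falls inside a manual 1-10 bucket, so that bucket would mix both target values.&lt;/SPAN&gt;&lt;/P&gt;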
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Please note that bucketing summarizes information and can (possibly) result in a predictor that is less capable than the original data. &amp;nbsp; The buckets, however, do provide the additional capability of helping to model non-linearity which might improve how the variable information can be used. &amp;nbsp;In practice, I recommend including both the original interval variable and the bucketed version prior to variable selection so that the information is used in the best possible way.&lt;/SPAN&gt;&lt;/P&gt;
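&lt;P&gt;&lt;SPAN&gt;One way to picture this recommendation outside of Enterprise Miner is the small Python sketch below. It assumes fixed-width buckets like the 1-10, 11-20 grouping in the question, and it adds the bucketed version next to the original claim_count rather than replacing it. The helper names bucket_label and add_bucketed_input, the claim_count_bucket column, and the sample rows are hypothetical.&lt;/SPAN&gt;&lt;/P&gt;

```python
def bucket_label(claim_count, width=10):
    """Return a fixed-width bucket label such as '1-10' or '11-20'.

    Assumes claim_count is a non-negative integer; zero gets its own label.
    """
    if claim_count == 0:
        return "0"
    index = (claim_count - 1) // width
    low = index * width + 1
    high = low + width - 1
    return f"{low}-{high}"


def add_bucketed_input(rows, width=10):
    """Keep claim_count and add claim_count_bucket beside it."""
    return [
        dict(row, claim_count_bucket=bucket_label(row["claim_count"], width))
        for row in rows
    ]


# Hypothetical sample rows, invented purely for illustration
rows = [{"claim_count": 3}, {"claim_count": 9}, {"claim_count": 14}]
candidates = add_bucketed_input(rows)
for row in candidates:
    print(row["claim_count"], row["claim_count_bucket"])
```

&lt;P&gt;&lt;SPAN&gt;Counts 3 and 9 get the same label, which is the information loss mentioned above, while the original column still tells them apart. Both columns can then be offered to whichever variable selection method you use.&lt;/SPAN&gt;&lt;/P&gt;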
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Hope this helps!&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Doug&lt;/SPAN&gt;&lt;/P&gt;
</description>
      <pubDate>Fri, 15 Sep 2017 15:37:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Variable-Selection/m-p/396400#M6035</guid>
      <dc:creator>DougWielenga</dc:creator>
      <dc:date>2017-09-15T15:37:29Z</dc:date>
    </item>
    <item>
      <title>Re: SAS EMiner- Variable Selection</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Variable-Selection/m-p/396507#M6038</link>
      <description>&lt;P&gt;This information is really helpful. I do appreciate this. I changed a few things after reading the response and am getting better results now.&lt;/P&gt;&lt;P&gt;Truly appreciate all your help&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks much&lt;/P&gt;&lt;P&gt;Soma&lt;/P&gt;</description>
      <pubDate>Fri, 15 Sep 2017 22:48:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Variable-Selection/m-p/396507#M6038</guid>
      <dc:creator>SGhosh</dc:creator>
      <dc:date>2017-09-15T22:48:49Z</dc:date>
    </item>
  </channel>
</rss>

