<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Decision tree for dimensional reduction in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172458#M1976</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks for the discussion and information.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 15 Oct 2014 20:18:46 GMT</pubDate>
    <dc:creator>husseinmazaar</dc:creator>
    <dc:date>2014-10-15T20:18:46Z</dc:date>
    <item>
      <title>Decision tree for dimensional reduction</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172452#M1970</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Dear members,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a classification problem with 6 classes and 14500 features (all interval). I need to reduce the dimensions.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What is the optimal method to do that?&lt;/P&gt;&lt;P&gt;Decision tree, PCA, ...?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 05 Oct 2014 17:13:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172452#M1970</guid>
      <dc:creator>husseinmazaar</dc:creator>
      <dc:date>2014-10-05T17:13:36Z</dc:date>
    </item>
    <item>
      <title>Re: Decision tree for dimensional reduction</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172453#M1971</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Hussein,&lt;/P&gt;&lt;P&gt;A great thing about Enterprise Miner is that you can try multiple techniques at once. Since two flows on a diagram run in parallel, you are also making the most of your time.&lt;/P&gt;&lt;P&gt;Note that nodes like decision tree do variable selection (they keep a subset of the original inputs), while nodes like PCA do dimension reduction (they build new variables from combinations of the inputs).&lt;/P&gt;&lt;P&gt;&lt;A href="https://support.sas.com/edu/schedules.html?ctry=us&amp;amp;id=862#s1=1" title="https://support.sas.com/edu/schedules.html?ctry=us&amp;amp;id=862#s1=1"&gt;Advanced Predictive Modeling Using SAS Enterprise Miner&lt;/A&gt; is a course that explains advanced topics very well, including unsupervised (PCA, variable clustering, etc.) and supervised (PLS, LARS, LASSO, etc.) dimension reduction techniques. Highly recommended!&lt;/P&gt;&lt;P&gt;A best practice is to try several techniques and select the one that suits your target, number of observations, and number of input variables. Also check this discussion (&lt;A __default_attr="58368" __jive_macro_name="thread" class="jive_macro jive_macro_thread" href="https://communities.sas.com/" modifiedtitle="true" title="what is the optimal way to use variable selection node"&gt;what is the optimal way to use variable selection node&lt;/A&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;), where you can find a list of nodes that you can use for variable selection.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, what kind of classification problem are you trying to solve? Are you dealing with missing values in the input variables? And what is the distribution of those 6 classes? You might even want to use a two-step model.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I hope it helps,&lt;/P&gt;&lt;P&gt;Miguel&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 07 Oct 2014 18:16:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172453#M1971</guid>
      <dc:creator>M_Maldonado</dc:creator>
      <dc:date>2014-10-07T18:16:49Z</dc:date>
    </item>
    <item>
      <title>Re: Decision tree for dimensional reduction</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172454#M1972</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Hussein,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I agree with Miguel, and I even use the book from the course he recommends above in my day-to-day work.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;From my perspective, PCA is a great way to reduce the dimensions, although the problem with the technique is that it often becomes difficult to explain to a client or user of the model what exactly the PCA variables mean and how they relate to the parameters of the model. I would maybe use a decision tree to select useful variables and go from there (there's a specific way to configure the decision tree node to select variables). You can also use regression to select variables (especially forward selection, which is good at detecting strong interactions). Just note that these nodes may run for a considerable amount of time because of your large dataset. Also (something I use a lot), since your inputs are interval variables, you could cluster them into groups using Enterprise Miner's clustering node and then use one representative from each cluster for subsequent modelling. Enterprise Miner automatically exports the cluster representatives if you set the Variable Selection option in the clustering node to "Best Variables". Something I also sometimes use (although it's not as successful as the other techniques mentioned earlier) is the variable selection node, with the minimum Chi Square lower bound set much lower than Enterprise Miner's default value so that it selects a larger number of important variables.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Hope you succeed!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Jacques&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 10 Oct 2014 13:43:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172454#M1972</guid>
      <dc:creator>JakesVenter</dc:creator>
      <dc:date>2014-10-10T13:43:55Z</dc:date>
    </item>
    <item>
      <title>Re: Decision tree for dimensional reduction</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172455#M1973</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;One thing I'd add is that I wouldn't trust one decision tree to do variable selection for me. Decision trees are highly unstable models, and small changes in the data (e.g. even a change in SEED) can produce vastly different variable selections, especially when many of your variables are correlated. That is why I'd rather use a Random Forest to get a feeling for variable importance. Not sure EM has random forests though, I haven't used EM in a while.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 15 Oct 2014 16:32:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172455#M1973</guid>
      <dc:creator>adjgiulio</dc:creator>
      <dc:date>2014-10-15T16:32:03Z</dc:date>
    </item>
    <item>
      <title>Re: Decision tree for dimensional reduction</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172456#M1974</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Yes, beginning in SAS Enterprise Miner 13.1, random forests can be used for variable selection with the HP Forest node.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 15 Oct 2014 18:17:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172456#M1974</guid>
      <dc:creator>WendyCzika</dc:creator>
      <dc:date>2014-10-15T18:17:09Z</dc:date>
    </item>
    <item>
      <title>Re: Decision tree for dimensional reduction</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172457#M1975</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Generally speaking, the HP Forest procedure has been available beginning with Enterprise Miner 7.1.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;As Wendy has pointed out, the HP Forest node is capable of variable selection beginning with Enterprise Miner 13.1. Enterprise Miner 13.1 was released in December 2013.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 15 Oct 2014 18:28:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172457#M1975</guid>
      <dc:creator>RalphAbbey</dc:creator>
      <dc:date>2014-10-15T18:28:28Z</dc:date>
    </item>
    <item>
      <title>Re: Decision tree for dimensional reduction</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172458#M1976</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks for the discussion and information.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 15 Oct 2014 20:18:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172458#M1976</guid>
      <dc:creator>husseinmazaar</dc:creator>
      <dc:date>2014-10-15T20:18:46Z</dc:date>
    </item>
    <item>
      <title>Re: Decision tree for dimensional reduction</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172459#M1977</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I have EM 6.2, so this is a problem for me.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 15 Oct 2014 20:19:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Decision-tree-for-dimensional-reduction/m-p/172459#M1977</guid>
      <dc:creator>husseinmazaar</dc:creator>
      <dc:date>2014-10-15T20:19:32Z</dc:date>
    </item>
  </channel>
</rss>

