01-25-2012 02:50 AM
Just curious: is there any algorithm or code that produces the output of a decision tree without the GUI? Just the data in Excel is sufficient. I am not looking for an interactive tree representation. I just want to feed in data with N variables and N observations, and the algorithm should do the work that a DTREE does: run the chi-square test, compute entropy reduction and Gini reduction, and give me an intermediate output for analysis.
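For reference, the chi-square step on its own is small enough to write by hand. Below is a minimal sketch in plain Python (not SAS), assuming a categorical input and a categorical target; the function name chi_square is hypothetical, and this is the textbook Pearson statistic, not necessarily what any SAS product computes internally.

```python
from collections import Counter

def chi_square(values, targets):
    """Pearson chi-square statistic and degrees of freedom for the
    contingency table of one categorical input against the target."""
    n = len(values)
    cell = Counter(zip(values, targets))   # observed count per (level, class)
    row = Counter(values)                  # totals per input level
    col = Counter(targets)                 # totals per target class
    stat = 0.0
    for v in row:
        for t in col:
            expected = row[v] * col[t] / n
            observed = cell.get((v, t), 0)
            stat += (observed - expected) ** 2 / expected
    dof = (len(row) - 1) * (len(col) - 1)
    return stat, dof

# Hypothetical toy data: the input separates the two classes perfectly.
stat, dof = chi_square(["a", "a", "b", "b"], ["yes", "yes", "no", "no"])
print(stat, dof)
```

Running this for every input variable gives one statistic per variable, which is the kind of intermediate table asked about above.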
01-25-2012 10:35 AM
If you can include SAS/IML on your list, SAS once supported a macro that is still available called treedisc.
It also uses SAS/OR, but that is only for obtaining the printed tree.
01-26-2012 04:35 AM
Thanks to both for the reply. My question is this: if we understand the way SAS EM works, can't we break it down into a set of Base SAS algorithms? I have found a few algorithms for running the chi-square test and entropy reduction, and I am still searching for one for Gini reduction. So why can't we map one output to the next, so that we get the required output in Excel or CSV format for further analysis? The SAS EM decision tree surely runs on a logical algorithm, so why not try to break it into pieces that produce the same output? I have also seen a module on PROC ARBORETUM. Why am I not able to use it in Base SAS? Does it require SAS EM too?
I am not a statistician, so I cannot easily read and understand the formulas available on the internet, but I am sure these are the formulas used in the tree algorithm.
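To make the formulas concrete, here is a rough sketch in plain Python (not SAS) of Gini reduction and entropy reduction for one candidate split, collected into a per-variable table that can be written to CSV. It assumes categorical inputs and a multiway split on every level of a variable; the names gini, entropy, reduction, split_table and write_csv are all hypothetical, and the calculations in SAS EM may differ in detail.

```python
import csv
import math
from collections import Counter, defaultdict

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class proportions, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def reduction(values, targets, impurity):
    """Parent impurity minus the size-weighted impurity of the children."""
    groups = defaultdict(list)
    for v, t in zip(values, targets):
        groups[v].append(t)
    n = len(targets)
    weighted = sum(len(g) / n * impurity(g) for g in groups.values())
    return impurity(targets) - weighted

def split_table(rows, inputs, target):
    """One (variable, gini_reduction, entropy_reduction) tuple per input."""
    targets = [r[target] for r in rows]
    table = []
    for name in inputs:
        values = [r[name] for r in rows]
        table.append((name,
                      reduction(values, targets, gini),
                      reduction(values, targets, entropy)))
    return table

def write_csv(table, path):
    """Write the table to a CSV file that Excel can open."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["variable", "gini_reduction", "entropy_reduction"])
        writer.writerows(table)
```

Here rows would be a list of dicts, one per observation, for example as read by csv.DictReader. The variable with the largest reduction is the one a tree would split on first, and repeating the calculation within each resulting subset gives the next level.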
01-26-2012 08:51 AM
If SAS put Enterprise Miner into Base SAS then they couldn't license EMiner. That is a business decision, not a technical one.
There is one technical side to it, however. Many of the EMiner algorithms do not have closed form solutions; they require the use of iterative search algorithms. To implement them effectively, the data needs to be in memory, and that runs contrary to one of SAS' core strengths -- its ability to handle arbitrarily large data sets. EMiner uses the SEMMA approach (see its documentation) to break the data into smaller pieces and circle in on a reasonable approximate answer. That is appropriate for activities that are necessarily approximate, but not for things that should have closed form solutions.