I am working on segmenting a data set (around 2500 observations) to design a scorecard model. I have created all the variables required for segmenting.
When I start the interactive decision tree, I don't get all the variables in the Split Node window. Can someone guide me on how to tackle this problem?
This depends on the Subtree Method you chose in the decision tree options. The default method (Assessment) gives you the smallest subtree with the best assessment value. Check the other options, Largest and N. The final decision tree can be affected by the other options too.
I think changing the Subtree Method only has an effect when a decision tree is trained automatically, but in this case I want an interactive decision tree.
Anyway, I changed the Subtree Method to Largest and to N, but I still couldn't see all the variables in the Split Node window.
Is there an upper limit on the number of variables in the input data set? (I have around 3300 variables.)
Did you try opening the interactive mode right after connecting the node to the data set, without running the Decision Tree node? You should run the data set node first, though.
Also, I sometimes find that after working in the interactive mode I need to prune the tree and reopen it to see all the variables again, because the refresh option in Split Node does not work.
Yes, I did not run the Decision Tree node before opening the interactive decision tree window, but some of the variables are still missing from the Split Node window.
And yes, sometimes only some of the variables are shown, since that depends on the number specified in Number of Rules (default value is 5) in the Node properties of the Decision Tree node.
Can you try running the decision tree on the same data set but with only some of the variables (for example, only 50) instead of all 3300? Just to make sure the cause is the number of variables and not something else in your data.
I tried with fewer variables (only 100) and, surprisingly, every variable showed up in the Split Node window. But I want to try segmentation with all the variables.
Is there any way around for that?
If the remaining variables are similar to the 100 you chose previously, then I think it is the number of variables. Hopefully WendyCzika can get back to you about any limit. In the meantime, I hope you keep increasing the number and tell us the limit you find.
And it is always valid advice to try one of the dimension reduction techniques that EM provides.
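If you want a rough pre-screen outside EM before building the tree, something like the Python sketch below could shortlist inputs. This is only an illustration under my own assumptions, not an EM feature: `abs_corr` and `prescreen` are hypothetical helper names, the inputs are assumed to be numeric with no missing values, and ranking by absolute Pearson correlation with the target is just one simple screening choice.

```python
import math

def abs_corr(x, y):
    """Absolute Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant column carries no information for splitting.
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)

def prescreen(inputs, target, max_keep=2000):
    """Return up to max_keep input names, strongest correlation with target first.

    inputs: dict mapping variable name -> list of numeric values
    target: list of numeric target values (e.g. 0/1 for a scorecard)
    max_keep: hypothetical cap; adjust to whatever your tool can display
    """
    scores = {name: abs_corr(values, target) for name, values in inputs.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:max_keep]

# Tiny usage example with made-up data.
example_inputs = {
    "a": [0, 1, 0, 1],   # identical to the target
    "b": [1, 1, 1, 1],   # constant
    "c": [0, 0, 1, 1],   # uncorrelated with the target
}
example_target = [0, 1, 0, 1]
shortlist = prescreen(example_inputs, example_target, max_keep=1)
```

A univariate screen like this can miss variables that only matter in combination with others, so treat the shortlist as a starting point rather than a final selection.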
Yes there seems to be some sort of limit around 2000 variables in the UI. I tried changing various properties - Significance Level (to 1), Leaf Size (to 1), and Exhaustive (to a very large number), turning off all p-value adjustments... but it still only shows around 2000 inputs for me as well in the Split Node dialog.
Can you check in the Variables dialog for the Decision Tree node that all the variables you are expecting to see are set as Input?
Hi WendyCzika,
I have checked the Variables dialog, and all the variables except the one target variable are set as Input.