pulkit3478
Calcite | Level 5

I am working on segmenting a data set (around 2,500 observations) to design a scorecard model. I have created all the variables required for segmentation.

When I start the interactive decision tree, I don't see all the variables in the Split Node window. Can someone guide me on how to tackle this problem?

10 REPLIES
mohamed_zaki
Barite | Level 11

This depends on the Subtree Method you chose in the decision tree options. The default method (Assessment) gives you the smallest subtree with the best assessment value. Check the other options, Largest and N. The final decision tree could be affected by the other options too.

pulkit3478
Calcite | Level 5

I think changing the Subtree Method will only have an effect if one is training the decision tree automatically, but in this case I want an interactive decision tree.

Anyway, I changed the Subtree Method to Largest and to N, but I still couldn't see all the variables in the Split Node window.

Is there an upper limit on the number of variables in the input data set? (I have around 3,300 variables.)

mohamed_zaki
Barite | Level 11

Did you try opening the interactive mode just after connecting the node to the data set, without running the Decision Tree node? You should run the data set node first, though.

Also, I find that sometimes, after working in interactive mode, I need to prune the tree and reopen it to see all the variables again, because the refresh option in Split Node does not work.

pulkit3478
Calcite | Level 5

Yes, I opened the interactive decision tree window without running the Decision Tree node, but some of the variables are still missing from the Split Node window.

And yes, sometimes only some of the variables are shown, because that depends on the number specified in the Number of Rules property of the Decision Tree node (the default value is 5).

mohamed_zaki
Barite | Level 11

Can you try running the decision tree on the same data set, but with only some of the variables (for example, only 50) instead of all 3,300? That would confirm whether the cause is the number of variables and not something else in your data.
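As a rough illustration of that test, here is a minimal Python sketch that draws a reproducible random subset of input names to keep as Input, with the rest rejected by hand in the Variables dialog. The names `var1` to `var3300` and the helper `pick_subset` are hypothetical, and this is plain Python for bookkeeping, not Enterprise Miner code.

```python
import random


def pick_subset(names, k, seed=0):
    """Return k names drawn without replacement, kept in their original order."""
    rng = random.Random(seed)  # fixed seed so the same subset can be reproduced
    chosen = set(rng.sample(names, k))
    return [name for name in names if name in chosen]


# Hypothetical variable names standing in for the real 3,300 inputs.
all_inputs = [f"var{i}" for i in range(1, 3301)]

subset = pick_subset(all_inputs, 50)
print(len(subset), "variables kept as Input for the test run")
```

Drawing the subset at random, rather than taking the first 50 columns, makes it less likely that the test only covers one family of similar variables.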

pulkit3478
Calcite | Level 5

I tried with fewer variables (only 100) and, surprisingly, every variable appeared in the Split Node window. But I want to try segmentation with all the variables.

Is there any way around this?

mohamed_zaki
Barite | Level 11

If the remaining variables are similar to the 100 you chose previously, then I think the cause is the number of variables. Hope

WendyCzika
SAS Employee

Yes, there seems to be some sort of limit of around 2,000 variables in the UI. I tried changing various properties (Significance Level to 1, Leaf Size to 1, Exhaustive to a very large number, and turning off all p-value adjustments), but the Split Node dialog still shows only around 2,000 inputs for me as well.
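One possible workaround, offered as an assumption rather than anything from the product documentation, is to split the inputs into batches that stay under the observed limit and explore each batch in its own Decision Tree node. The Python sketch below only illustrates the bookkeeping: the cap of 1,500 per batch is an arbitrary safety margin below the roughly 2,000 inputs seen in the dialog, and the variable names and the helper `make_batches` are hypothetical.

```python
def make_batches(names, max_per_batch):
    """Split a list of names into consecutive batches of at most max_per_batch."""
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    return [
        names[start:start + max_per_batch]
        for start in range(0, len(names), max_per_batch)
    ]


# Hypothetical variable names standing in for the real 3,300 inputs.
all_inputs = [f"var{i}" for i in range(1, 3301)]

# Assumed cap, chosen to sit comfortably below the observed limit.
batches = make_batches(all_inputs, 1500)

for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {len(batch)} variables, {batch[0]} to {batch[-1]}")
```

Each batch would still need the target variable added back before it is used, and the strongest candidates from every batch could then be combined into one final set small enough for the Split Node window.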

WendyCzika
SAS Employee

Can you check in the Variables dialog for the Decision Tree node that all the variables you are expecting to see are set as Input?

pulkit3478
Calcite | Level 5

Hi WendyCzika,

I have checked the Variables dialog, and all the variables except the one target variable are set as Input.

