Nikhil
Calcite | Level 5

Hi,

I am trying to build an interactive decision tree using SAS EM 6.2. The tree uses only 20,000 records for building, while my dataset contains over 100,000 records.

Can anyone please suggest how I can make the tree take all of my records into consideration? I am stuck at this point.

Thanks,

Nikhil

Accepted Solutions
WayneThompson
SAS Employee

Some users may wish to override the default Enterprise Miner interactive decision tree sampling strategy. Enterprise Miner provides two macro variables that you can set in your project startup code to modify how the input data is sampled for interactive decision trees:

%let EM_INTERACTIVE_TREE_MAXOBS= <max-number-of-observations-in-sample>;
%let EM_INTERACTIVE_TREE_SAMPLEMETHOD=<RANDOM | FIRSTN>;

The first macro variable specifies the maximum number of observations allowed in an Interactive Decision Tree node sample. Use it if you want to control the sample size manually. Otherwise, Enterprise Miner uses its own algorithms to sample the data for your interactive decision tree.

The second macro variable specifies the sampling method used to create an Interactive Decision Tree node sample. Use it if you want to control the method manually. By default, Enterprise Miner uses random sampling for interactive decision trees; this macro variable lets you choose between RANDOM and FIRSTN. The EM_INTERACTIVE_TREE_MAXOBS macro variable sets the number of observations for both the RANDOM and FIRSTN sampling methods.
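
As a minimal sketch of what such startup code could look like: the value 100000 is only an illustrative assumption, and the %put line is an optional addition for checking the log, not something Enterprise Miner requires.

```sas
/* Illustrative project startup code; the values are assumptions, not defaults */
%let EM_INTERACTIVE_TREE_MAXOBS=100000;        /* upper limit on the sample size */
%let EM_INTERACTIVE_TREE_SAMPLEMETHOD=RANDOM;  /* or FIRSTN */

/* Optional: echo both settings to the SAS log to confirm they were set */
%put NOTE: MAXOBS=&EM_INTERACTIVE_TREE_MAXOBS METHOD=&EM_INTERACTIVE_TREE_SAMPLEMETHOD;
```

Setting the maximum at or above the number of rows in the training data should, on this reading, let the interactive tree use every record. The startup code is typically entered through the project's Project Start Code property and generally has to be run before the interactive tree is opened.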

13 Replies
art297
Opal | Level 21

Could it simply be that you or the algorithm selected a particular size for a training model?

Nikhil
Calcite | Level 5

Hi Wayne,

Thanks for the help, but I am new to EM. Could you please guide me on how to update the project startup code?

Thanks,

Nikhil

Nikhil
Calcite | Level 5

Hi Wayne,

Thanks a lot. I figured out how to update the startup code. I need help with one more thing.

When I run the interactive decision tree, the prediction value is shown as 0. How can I change it to prediction = 1?

Thanks,

Nikhil

WayneThompson
SAS Employee

Hi Nikhil,

My fault, but I am probably not understanding the question well.

When you run startup code like this for the project:

%let EM_INTERACTIVE_TREE_MAXOBS=100000;

%let EM_INTERACTIVE_TREE_SAMPLEMETHOD=RANDOM;

and you are modeling a binary response, do you have 1s and 0s distributed in the root node?

Is your target variable indeed a binary target and set to binary in the input node, or is it set as interval?

Thanks
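
One way to check the questions above outside Enterprise Miner is a quick frequency table of the target. This is only a sketch: the library mylib, the table train, and the variable target are hypothetical names standing in for your own data.

```sas
/* Hypothetical names: mylib.train is the modeling table, target is the 0/1 response */
proc freq data=mylib.train;
  tables target / missing;   /* counts and percentages of each target value, including missing */
run;
```

If the table shows more than two distinct non-missing values, or the 1s are very rare, that would help explain what the root node reports.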

Nikhil
Calcite | Level 5

I am sorry, Wayne. I should have been clearer.

The whole picture is:

I have a project to build a logistic regression model (binary response). So my target variable is binary, and the explanatory variables are interval as well as binary.

Now the problem is that I want to build a decision tree to get an idea of which variables I can use to create new variables for the predictive model. So the target in my root node is binary, and my inputs are binary as well as interval. But when I build the tree, it shows prediction = 0 as the target. I want to change it to prediction = 1, so I need your help with this.

Thanks,

Nikhil

Nikhil
Calcite | Level 5

I also specified the order as descending for my target variable, in the place where I define the role of each variable.

WayneThompson
SAS Employee

Prediction=0 in the table refers to the prediction for the selected node. Assuming you are starting with the root node and you have more 0s than 1s, the predicted classification for the root node is 0. When you split the root node and continue growing the tree interactively, you will hopefully end up with some nodes (leaves) with prediction = 1.

In some cases, when you have a rare target event (1s) and little if any signal in the data, the null root node can end up being the final classification. In this case, try using the inverse priors option under "Decisions" for the Input Data Source node, or obtain some additional predictors if they are available. Anyway, I could still be off base with your question. Hope this helps.
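
To illustrate why inverse priors can help with a rare event, here is a small arithmetic sketch. It rests on two assumptions that are not stated in this thread: an event rate of 5%, and a decision rule that assigns the class with the larger weighted probability.

```sas
/* Sketch only: assumes an event rate of 5% and a rule that picks the class
   with the larger weighted posterior probability */
data _null_;
  prior1 = 0.05;          /* assumed proportion of 1s */
  prior0 = 1 - prior1;    /* proportion of 0s */
  w1 = 1 / prior1;        /* inverse prior weight for class 1 */
  w0 = 1 / prior0;        /* inverse prior weight for class 0 */
  /* a node is labeled 1 when p1*w1 > (1-p1)*w0, that is, when p1 > cutoff */
  cutoff = w0 / (w0 + w1);
  put w1= w0= cutoff=;
run;
```

Algebraically the cutoff reduces to prior1, so under these assumptions a node would be labeled 1 as soon as its share of 1s exceeds the overall event rate rather than 50%, which makes leaves with prediction = 1 more likely.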

Nikhil
Calcite | Level 5

Thanks a lot for the input, Wayne. It helped me a lot. I will touch base with you if I need any further help.

Thanks a lot again.

Nikhil

Nikhil
Calcite | Level 5

Hi,

I have come across a situation where I need to use cluster analysis in SAS EM 6.2. I have run the cluster analysis on the dataset but am unable to interpret the output. Please help.

geetsharma
Calcite | Level 5

Hi Nikhil,

I am new to EM too... How did you update the project startup code?

I want to run the classification tree on my entire data set.

Regards,

Geetika

Debbielu
Calcite | Level 5

Answer from SAS:

http://support.sas.com/kb/47/220.html

Regards,
Debbie

CamillaGua
Fluorite | Level 6

Hello Debbielu,

I did this: I put %let EM_INTERACTIVE_TREE_MAXOBS= 10000000; in the code, but it didn't work 😞 I don't know what to do. Could you please explain step by step what I have to do?

Thanks!
