jnvickery
Obsidian | Level 7

Hello,

 

I'm using Enterprise Miner 12.3.

 

I am trying to understand when and if I need to use a "Decisions node" in conjunction with oversampling my input data. I have read this very helpful tip (Tip: How to model a rare target using an oversample approach in SAS® Enterprise Miner™) but I am still not exactly sure.

 

My question is this: Do I need to use the "Decisions Node" if I applied decisions with the input data node?

 

Here is the process that I have followed:

  1. Input data node
    1. Within the input data node, I have set prior probabilities. I have also selected "Yes" on the "Decisions" tab and chosen "Default with Inverse Prior Weights".
  2. Sample node to oversample to create balanced sample
  3. Data partition node
  4. Decision tree node 
    1. In the "Split search" property, I have "Use decisions" = Yes and "Use priors" = Yes.

 

Is the above process equivalent to this?

  1. Input data node (without using the decisions in the property panel)
  2. Sample node to oversample
  3. Data partition node
  4. Decision node
  5. Decision tree node

 

Thanks!

John

1 ACCEPTED SOLUTION

pepecura
SAS Employee

Hi John

 

The Decisions node is very useful when you want your probability scores and statistical metrics such as lift to reflect the values in your real population. After you stratify and oversample your dataset, you can use the node to reverse the effect of the oversampling on your model comparison charts. In your flow, using the node might be more complicated than required. If the ranking of your customer scores matters more than the scores themselves (as is mostly the case in marketing models), you can use the model outputs as is, without any transformation. You can apply ranking algorithms (available as a task in SAS Enterprise Guide) to create groups of customers with similar scores in order to take targeted actions.
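To make the point about rankings concrete, here is a minimal Python sketch of the standard prior-correction formula for oversampled data. It is an illustration only, not code taken from Enterprise Miner: the function name `adjust_for_priors` and the 50%/5% event rates are assumptions chosen for the example. It shows that the rescaled probabilities are much smaller than the raw ones, while the order of the scores stays the same.

```python
def adjust_for_priors(p_sample, sample_rate, population_rate):
    """Rescale an event probability from an oversampled sample to the population.

    p_sample        : event probability predicted on the oversampled data
    sample_rate     : event proportion in the oversampled training data
    population_rate : event proportion (prior) in the real population
    """
    event = p_sample * population_rate / sample_rate
    non_event = (1.0 - p_sample) * (1.0 - population_rate) / (1.0 - sample_rate)
    return event / (event + non_event)


# Hypothetical setup: balanced 50/50 sample, 5% events in the real population.
raw_scores = [0.2, 0.5, 0.8]
adjusted = [adjust_for_priors(p, 0.5, 0.05) for p in raw_scores]

for raw, adj in zip(raw_scores, adjusted):
    print(f"raw={raw:.2f}  adjusted={adj:.4f}")

# The adjustment is monotone, so sorting by either score gives the same order.
rank_raw = sorted(range(len(raw_scores)), key=lambda i: raw_scores[i])
rank_adj = sorted(range(len(adjusted)), key=lambda i: adjusted[i])
print("same ranking:", rank_raw == rank_adj)
```

In this sketch a raw score of 0.5 on the balanced sample maps back to the assumed 5% population rate, so only the scale of the scores changes. If you only need to rank customers, either set of scores leads to the same groups.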

 

Please find attached a simple flow showing the usage of the Decisions node as an example.

 

Hope this helps. 

Best

Tuba.

jnvickery
Obsidian | Level 7

Thank you, Tuba! Your response was very helpful. Exactly what I needed to know.

 

- John
