cmajorros
Calcite | Level 5

How should I handle a factor that has a high Information Value (>0.1) but whose WOE graph looks like this?

1.jpg

In practice, the WOE graph should have a linear slope, like this:

2.jpg

In my case, if I regroup the attributes in “Interactive Grouping” so that the WOE graph is linear like the one above, the Information Value becomes low.

Besides, if I continue to use this result, it will affect my score results in the Scorecard node.

For example,

Percentage of down payment (DP)

DP < 15%           = 10
15% <= DP < 30%    = 20
30% <= DP < 50%    = 15
DP >= 50%          = -5

From this result, I don’t think we can use this factor: customers who make a high down payment should get a high score.

I have 9 potential factors with IV exceeding 0.1, but almost all of them show the result I described. Do you have any ideas for tackling this problem?

Thanks in advance for all your help.

Best Regards,

Ros

M_Maldonado
Barite | Level 11

Hi Ros,

To rephrase your question: you are looking for monotonicity in your weight of evidence (WOE) curves, and you want to know what to do when an input variable (factor) does not show it.

To give monotonicity priority, try these two grouping methods: monotonic event rate and constrained optimal.
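As a rough illustration of what "monotonic WOE" means, here is a minimal Python sketch that computes WOE and IV per group and checks the trend. It assumes the convention WOE = ln(share of non-events / share of events); the bin labels and counts are made up for illustration and are not taken from the poster's data, and the helper names `woe_iv` and `is_monotonic` are hypothetical.

```python
import math

def woe_iv(bins):
    """bins: list of (label, n_nonevent, n_event), ordered by the input variable.

    Assumes WOE = ln(share of non-events / share of events) and
    IV = sum((share of non-events - share of events) * WOE).
    Every bin needs at least one event and one non-event.
    """
    total_non = sum(non for _, non, _ in bins)
    total_evt = sum(evt for _, _, evt in bins)
    rows, iv = [], 0.0
    for label, non, evt in bins:
        p_non = non / total_non
        p_evt = evt / total_evt
        woe = math.log(p_non / p_evt)
        iv += (p_non - p_evt) * woe
        rows.append((label, woe))
    return rows, iv

def is_monotonic(values):
    """True if the sequence never decreases or never increases."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)

# Hypothetical counts per down-payment group: (label, non-events, events)
bins = [("<15%", 300, 60), ("15-30%", 400, 40),
        ("30-50%", 200, 30), (">=50%", 100, 40)]

rows, iv = woe_iv(bins)
for label, woe in rows:
    print(f"{label:>7}  WOE = {woe:+.3f}")
print(f"IV = {iv:.3f}, monotonic WOE: {is_monotonic([w for _, w in rows])}")
```

With these made-up counts the IV is above 0.1, yet the WOE rises from the first group to the second and then falls, which is the same shape as the scorecard points in the question.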

If you are looking for a quick fix and you are absolutely sure that the WOE curve does not represent your data, or future data, you can override with manual WOE on the coarse detail options. Simply adjust the weight of evidence using the Manual WOE columns. This can help to deal with the scorecard points of the down payment variable you mentioned in your example, but remember that you are overriding the ratio of events/non-events according to your business knowledge, not to what is in your data.

Quick fix example: Default grouping WOE vs manual WOE

defaultIGN.png

manualwoe.png

Monotonic event rate grouping

Use this grouping option if you are looking for a monotonic event rate. It does a great job, it is really hard to beat with groupings you come up with on your own, and it will save you a lot of time. But only use it if monotonicity is really important for the variable in question. Some variables are expected to have a linear trend, while others are expected to have an inverted U curve. Be careful to impose this constraint only when it makes sense.

For this particular example, notice that IV decreases to 0.10. The default grouping had a higher IV, which means that the previous grouping carried somewhat more predictive information. But if monotonicity is very important for this variable, this is the way to go.

ermonotonicity.png
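To see why forcing monotonicity tends to cost some IV, here is a small Python sketch. It is not the algorithm Enterprise Miner uses; it is a simple pool-adjacent-violators style merge, run on made-up counts, with hypothetical helper names (`merge_to_monotonic`, `information_value`). It merges neighbouring groups until the event rate is monotonic and then compares IV before and after.

```python
import math

def event_rate(group):
    non, evt = group
    return evt / (non + evt)

def merge_to_monotonic(bins, increasing=True):
    """Merge adjacent (n_nonevent, n_event) groups until the event rate is monotonic.

    Illustrative pool-adjacent-violators merge, not the SAS implementation.
    """
    merged = []
    for non, evt in bins:
        merged.append([non, evt])
        # Pool the last two groups while they break the requested direction.
        while len(merged) >= 2:
            prev, last = event_rate(merged[-2]), event_rate(merged[-1])
            violates = last < prev if increasing else last > prev
            if not violates:
                break
            non2, evt2 = merged.pop()
            merged[-1][0] += non2
            merged[-1][1] += evt2
    return [tuple(g) for g in merged]

def information_value(bins):
    """IV with WOE = ln(share of non-events / share of events) (assumed convention)."""
    total_non = sum(non for non, _ in bins)
    total_evt = sum(evt for _, evt in bins)
    iv = 0.0
    for non, evt in bins:
        p_non, p_evt = non / total_non, evt / total_evt
        iv += (p_non - p_evt) * math.log(p_non / p_evt)
    return iv

# Hypothetical (non-events, events) counts, ordered by the input variable
fine_bins = [(300, 60), (400, 40), (200, 30), (100, 40)]
merged_bins = merge_to_monotonic(fine_bins, increasing=True)

iv_fine = information_value(fine_bins)
iv_merged = information_value(merged_bins)
print("merged groups:", merged_bins)
print(f"IV before: {iv_fine:.3f}, IV after: {iv_merged:.3f}")
```

Merging adjacent groups can only keep IV the same or lower it, so some drop after imposing monotonicity is expected; the question is whether the remaining IV is still acceptable for the variable.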

Constrained optimal grouping

A third grouping method is constrained optimal. This method has an OR approach and it will impose several constraints (you can enable, disable, or modify them through the constraint options or advanced constraint options menus). Monotonicity has priority among those constraints. This grouping is my personal favorite because I can choose which constraints should be applied to which variables using the advanced constraint options menu.

Below are the results for the same variable in our example using the constrained optimal grouping. Notice that it is a nice answer because these groups have a monotonic WOE trend while they also represent the event count of the graph on the left very well. I also find four groups more useful for this variable in a scorecard. Finally, even if I decided to adjust these WOE values based on business knowledge for a more linear trend (and more differentiated scorecard points), the manual adjustments would be minor compared to the manual adjustment in the "quick fix" example.

constrainedoptimal.png

   

For more details on these options, check the reference help (press the F1 key while you are in Enterprise Miner).

Huge favor, please don't forget to rate this answer and to comment back on how these grouping options work for you.

This thread may also open up a deeper discussion of when WOE monotonicity should be given priority, which deserves a whole new thread of its own.

Thanks,

Miguel

cmajorros
Calcite | Level 5

Thanks so much. I am very new to this field. Your answer is so helpful.
