09-05-2014 02:20 AM
How should I handle a factor that has a high Information Value (>0.1) but whose WOE graph looks like this?
In practice, the WOE graph should have a linear slope, like this.
In my case, if I regroup the attributes in “Interactive Grouping” so that the WOE graph becomes linear as above, the Information Value becomes low.
Besides, if I continue with this result, it will affect my scores in the Scorecard node.
Percentage of down payment (DP) and resulting points:
DP < 15%: 10
15% <= DP < 30%: 20
30% <= DP < 50%: 15
DP >= 50%: -5
From this result, I don’t think we can use this factor. Customers who make a high down payment should get a high score.
I have 9 potential factors whose IV exceeds 0.1, but almost all of them show the result I described. Do you have any ideas for tackling this problem?
Thanks in advance for all your help.
09-05-2014 02:59 PM
Rephrasing your question: you are looking for monotonicity in your weight of evidence curves, and you want to know what to do when an input variable (factor) does not show it.
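To make the terms concrete, here is a minimal Python sketch (outside Enterprise Miner) of how WOE and IV can be computed per group and how a monotonicity check might look. It assumes one common convention, WOE = ln(share of non-events / share of events) and IV = sum of (share of non-events - share of events) * WOE. The counts are made up for illustration and are not the poster's data, and the function names are my own.

```python
import math

def woe_iv(groups):
    """Return (list of WOE values, IV) for ordered (events, non_events) groups.

    Assumes every group has at least one event and one non-event.
    """
    total_events = sum(e for e, _ in groups)
    total_non_events = sum(n for _, n in groups)
    woes = []
    iv = 0.0
    for events, non_events in groups:
        p_event = events / total_events
        p_non_event = non_events / total_non_events
        woe = math.log(p_non_event / p_event)
        woes.append(woe)
        iv += (p_non_event - p_event) * woe
    return woes, iv

def is_monotonic(values):
    """True if the values never decrease or never increase."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)

# Hypothetical (events, non_events) counts for four ordered groups.
groups = [(40, 160), (30, 270), (35, 265), (45, 155)]
woes, iv = woe_iv(groups)

print([round(w, 3) for w in woes])
print("IV:", round(iv, 3), "monotonic:", is_monotonic(woes))
```

With counts like these, the IV can clear the 0.1 threshold even though the WOE curve goes up and then down, which is the situation described in the question.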
To give monotonicity priority, try these two grouping methods: monotonic event rate and constrained optimal.
If you are looking for a quick fix and you are absolutely sure that the WOE curve does not represent your data, or future data, you can override it with manual WOE in the coarse detail options. Simply adjust the weight of evidence using the Manual WOE columns. This can help with the scorecard points of the down payment variable you mentioned in your example, but remember that you are overriding the ratio of events to non-events according to your business knowledge, not according to what is in your data.
Quick fix example: Default grouping WOE vs manual WOE
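To illustrate why changing the WOE changes the scorecard points, here is a rough sketch using the textbook points scaling, where each attribute's points are a linear function of its WOE. Everything here is an assumption for illustration: the WOE values, the regression coefficient `beta`, the intercept, and the scaling parameters (`pdo`, `base_points`, `base_odds`) are hypothetical, and the Scorecard node has its own scaling options that this does not reproduce.

```python
import math

def attribute_points(woes, beta, intercept, n_vars,
                     pdo=20.0, base_points=600.0, base_odds=50.0):
    """Points per group, assuming points scale linearly with WOE.

    factor and offset follow the usual 'points to double the odds' scaling.
    The intercept and offset are spread evenly over the n_vars variables.
    """
    factor = pdo / math.log(2)
    offset = base_points - factor * math.log(base_odds)
    return [round((beta * w + intercept / n_vars) * factor + offset / n_vars)
            for w in woes]

# Hypothetical WOE values for four down payment groups, lowest DP first.
data_woe = [0.10, 0.45, 0.30, -0.50]     # shape taken from the data
manual_woe = [-0.50, -0.10, 0.20, 0.45]  # manual override, forced to increase

points_data = attribute_points(data_woe, beta=1.0, intercept=1.5, n_vars=9)
points_manual = attribute_points(manual_woe, beta=1.0, intercept=1.5, n_vars=9)

print(points_data)
print(points_manual)
```

With a positive coefficient, the points follow the same ordering as the WOE values, so a monotonic manual WOE gives monotonic points. The price is the one already mentioned: the manual values no longer reflect the event/non-event ratio in the data.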
Monotonic event rate grouping
Use this grouping option if you are looking for a monotonic event rate. This option does a great job and is really hard to beat if you try to come up with the groupings on your own. It will save you a lot of time. But only use it if monotonicity is really important for all your variables. Some variables are expected to have a linear trend, while others are expected to have an inverted U curve. Be careful to impose this constraint only when it makes sense.
For this particular example, notice that IV decreases to 0.10. The default grouping had a higher IV, which means that it was somewhat more useful. But if monotonicity is very important for this variable, this is the way to go.
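To see why forcing monotonicity tends to cost some IV, here is a rough Python sketch that pools adjacent groups until the event rate is monotonic and then compares IV before and after. This is a simple pool-adjacent-violators style merge written for illustration, not the algorithm Enterprise Miner uses; the counts and function names are hypothetical, and IV uses the convention WOE = ln(share of non-events / share of events).

```python
import math

def information_value(groups):
    """IV for (events, non_events) groups; assumes no zero counts."""
    total_e = sum(e for e, _ in groups)
    total_n = sum(n for _, n in groups)
    iv = 0.0
    for e, n in groups:
        p_e, p_n = e / total_e, n / total_n
        iv += (p_n - p_e) * math.log(p_n / p_e)
    return iv

def event_rate(group):
    events, non_events = group
    return events / (events + non_events)

def merge_to_monotonic(groups, increasing=True):
    """Pool adjacent groups until event rates are monotonic in one direction."""
    merged = []
    for group in groups:
        merged.append(group)
        # Pool backwards while the last two groups break the required order.
        while len(merged) > 1:
            prev, last = merged[-2], merged[-1]
            if increasing:
                violates = event_rate(last) < event_rate(prev)
            else:
                violates = event_rate(last) > event_rate(prev)
            if not violates:
                break
            merged[-2:] = [(prev[0] + last[0], prev[1] + last[1])]
    return merged

# Hypothetical (events, non_events) counts for four ordered groups.
default_groups = [(40, 160), (30, 270), (35, 265), (45, 155)]

# Try both directions and keep the monotonic grouping with the higher IV.
candidates = [merge_to_monotonic(default_groups, inc) for inc in (True, False)]
monotonic_groups = max(candidates, key=information_value)

print(default_groups, round(information_value(default_groups), 3))
print(monotonic_groups, round(information_value(monotonic_groups), 3))
```

Merging groups can only keep IV the same or lower it, so a drop after imposing monotonicity is expected. The question is whether the loss is acceptable for the variable at hand.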
Constrained optimal grouping
A third grouping option is constrained optimal. This method takes an OR approach and imposes several constraints (you can enable, disable, or modify them through the constraint options or advanced constraint options menus). Monotonicity has priority among those constraints. This grouping is my personal favorite because I can choose which constraints should be applied to which variables using the advanced constraint options menu.
Below are the results for the same variable in our example using the constrained optimal grouping. Notice that it is a nice answer because these groups have a monotonic WOE trend while they also represent the event count in the graph on the left very well. I also find four groups more useful for this variable in a scorecard. Finally, even if I decided to adjust these WOE values based on business knowledge for a more linear trend (and more differentiated scorecard points), the manual adjustments would be minor compared to the manual adjustment in the "quick fix" example.
For more details on these options, check the reference help (press the F1 key while you are in Enterprise Miner).
Huge favor, please don't forget to rate this answer and to comment back on how these grouping options work for you.
This thread may also open up a deeper discussion of when WOE monotonicity should be given priority, which deserves a whole new thread of its own.