Is there something in SAS like what H2O offers? H2O (H2OGeneralizedLinearEstimator) has a non_negative option that does this, and glum (a Python package) can constrain each coefficient with a lower bound or an upper bound. When we build a risk scorecard, all the parameter coefficients should point in the same direction (all negative or all positive) after the variables are WOE-coded.
Hello,
You could use the HPGENSELECT Procedure and RESTRICT statement therein.
The RESTRICT statement enables you to specify linear equality or inequality constraints among the parameters of a model. These restrictions are incorporated into the maximum likelihood analysis.
You might expect all WOE variables to have the same sign (based on the bivariate relationship between target and input), but in the full model one or two WOE variables can still get a sign opposite to the one you expect, for example because of multicollinearity.
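To illustrate both points (a sign that flips in the full model, and what a bound on the coefficients does to it), here is a minimal sketch in plain Python. It is not HPGENSELECT and not the RESTRICT statement: it is a hand-rolled projected gradient ascent on the binomial log-likelihood, and the grouped data, the function name `fit_logistic`, and the `lower_bound` argument are all made up for illustration. Two correlated binary inputs stand in for WOE-coded variables. Each one is positively related to the event on its own, but x2 gets a negative coefficient once x1 is in the model.

```python
import math

# Hypothetical grouped data: (x1, x2, n_obs, n_events).
# x1 and x2 agree in 200 of 240 rows, so they are strongly correlated.
CELLS = [
    (0, 0, 100, 20),
    (1, 1, 100, 50),
    (1, 0, 20, 14),
    (0, 1, 20, 2),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(cells, lower_bound=None, lr=1.0, n_iter=5000):
    """Maximize the binomial log-likelihood by (projected) gradient ascent.

    If lower_bound is given, the two slope coefficients are clipped to be
    >= lower_bound after every step; the intercept is left unconstrained.
    Returns [intercept, b1, b2].
    """
    n_total = sum(n for _, _, n, _ in cells)
    beta = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        grad = [0.0, 0.0, 0.0]
        for x1, x2, n, events in cells:
            p = sigmoid(beta[0] + beta[1] * x1 + beta[2] * x2)
            resid = events - n * p          # observed minus expected events
            grad[0] += resid
            grad[1] += resid * x1
            grad[2] += resid * x2
        # Gradient step on the mean log-likelihood.
        beta = [b + lr * g / n_total for b, g in zip(beta, grad)]
        # Projection step: push the slopes back into the feasible region.
        if lower_bound is not None:
            beta[1] = max(lower_bound, beta[1])
            beta[2] = max(lower_bound, beta[2])
    return beta

free = fit_logistic(CELLS)
bounded = fit_logistic(CELLS, lower_bound=0.0)
print("unconstrained:", [round(b, 3) for b in free])
print("slopes >= 0:  ", [round(b, 3) for b in bounded])
```

In this toy data the bound does not turn the x2 coefficient into a positive one; it pins it at zero, so the constrained fit is effectively a model without x2.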
See also here (although this is PROC REG):
Restricted least squares regression in SAS
By Rick Wicklin on The DO Loop September 16, 2020
https://blogs.sas.com/content/iml/2020/09/16/restricted-regression-sas.html
Koen
If the original variables are used in a GLM regression, the sign of each coefficient reflects whether that variable is positively or negatively related to the target variable. After WOE coding, however, the signs should all be consistent, and the strategy team that reviews the model also finds a model whose parameters all have the same sign easier to accept.
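For readers less familiar with WOE coding, here is a minimal sketch of the calculation in plain Python. The bins and counts are made up, and the sign convention (goods in the numerator) is an assumption; some scorecard tools use the opposite one.

```python
import math

# Hypothetical bins for one variable: (label, n_good, n_bad).
BINS = [
    ("low income", 100, 50),
    ("mid income", 200, 40),
    ("high income", 300, 10),
]

def weight_of_evidence(bins):
    """Return {label: WOE} with WOE = ln(share of goods / share of bads)."""
    total_good = sum(g for _, g, _ in bins)
    total_bad = sum(b for _, _, b in bins)
    return {
        label: math.log((g / total_good) / (b / total_bad))
        for label, g, b in bins
    }

woe = weight_of_evidence(BINS)
for label, good, bad in BINS:
    bad_rate = bad / (good + bad)
    print(f"{label:12s} bad rate = {bad_rate:.3f}  WOE = {woe[label]:+.3f}")
```

Under this convention a bin's WOE is its log-odds of "good" minus the overall log-odds, so a lower bad rate always means a higher WOE, whatever the raw variable was. That is why a common sign is expected when each WOE variable is looked at on its own; it says nothing yet about the signs in a model with several variables together.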
Hello,
I have a similar query. As part of a scorecard development, I have transformed all explanatory variables using WOE and established a monotonic relationship (increasing or decreasing) of each variable with the response variable. Should all coefficients be negative when I apply a logistic regression to the transformed explanatory variables?
@MikaellaK wrote:
Hello,
I have a similar query. As part of a scorecard development, I have transformed all explanatory variables using WOE and established a monotonic relationship (increasing or decreasing) of each variable with the response variable. Should all coefficients be negative when I apply a logistic regression to the transformed explanatory variables?
Does not the above discussion answer this question?
No. Absolutely not.
@Rick_SAS wrote a wonderful blog post, "Simpson's Paradox", that explains this problem.
https://blogs.sas.com/content/iml/2023/03/27/simpsons-paradox.html
Suppose X stands for income per month and Y stands for the probability of default.
Across all the data, you could see a linearly decreasing relationship between X and Y.
But once you take the AGE variable into account, you would see the reverse result. Surprised?
So you can't constrain all the parameters to be positive or negative.
And Rick also pointed out that doing so would reduce the accuracy of the model's predictions.
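To make the reversal concrete, here is a small sketch in plain Python with made-up numbers (they are not taken from the blog post). It uses an ordinary least-squares slope on default probabilities rather than a logistic model, just to keep the arithmetic visible; the two age groups and the helper `ols_slope` are assumptions for illustration.

```python
def ols_slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Hypothetical (monthly income in thousands, probability of default) pairs.
young = [(1, 0.30), (2, 0.32), (3, 0.34)]
older = [(6, 0.10), (7, 0.12), (8, 0.14)]

slope_young = ols_slope(young)          # within the younger group
slope_older = ols_slope(older)          # within the older group
slope_pooled = ols_slope(young + older) # ignoring AGE altogether

print("young :", round(slope_young, 4))
print("older :", round(slope_older, 4))
print("pooled:", round(slope_pooled, 4))
```

Within each age group the slope is positive, yet the pooled slope is negative, because the older group has both higher incomes and lower default probabilities. The sign on income depends on whether AGE is in the model.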
Adding to the comments from @Ksharp
Let's suppose you have at least these two variables in your scorecard model: FICO and number of delinquencies in the last 24 months. These have opposite effects ... as FICO goes up, the score should go up; and as the number of delinquencies in the last 24 months goes up, the score should go down. The coefficients in the model for these two variables SHOULD have opposite signs; there is nothing wrong with this. In fact, restricting the coefficients to have the same sign would make the model fit worse and cause the scores to do illogical things.
With respect, I have never made any public statements about building scorecards or using WOE. KSharp, when you want to encourage an OP to read something that I wrote, it would be good to provide a link so that the OP can read exactly what I said, including the context in which I said it.
I encourage the OP to think about the comments from Paige and others who have pointed out that the parameters in a regression model should not be artificially constrained without a good reason. Doing so can reduce the accuracy of the model's predictions.