Solved: SAS fitting logistic regression, how to make all parameters coefficien...

RiskViya · Posted 02-22-2023 02:20 AM

Does SAS have something like the non_negative option in H2O (H2OGeneralizedLinearEstimator), or like Glum (a Python package), which can constrain coefficients with a lower or upper bound? When we model risk scorecard, all the parameter coefficients are the same direction (all negative or positive) after variable woe.

sbxkoenk · Posted 02-22-2023 06:53 PM

Hello,

You could use the HPGENSELECT Procedure and RESTRICT statement therein.

The RESTRICT statement enables you to specify linear equality or inequality constraints among the parameters of a model. These restrictions are incorporated into the maximum likelihood analysis.

You might expect all WOE variables to have the same sign (based on the bivariate relationship between target and input), but because of multicollinearity, for example, one or two WOE variables in the full model can still get a sign opposite to the one you expect.

See also here (although this is PROC REG) :
Restricted least squares regression in SAS
By Rick Wicklin on The DO Loop September 16, 2020
https://blogs.sas.com/content/iml/2020/09/16/restricted-regression-sas.html
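
To illustrate the two points above (a sign flip caused by collinearity, and the effect of a sign restriction), here is a minimal Python sketch. It is not SAS code and does not use HPGENSELECT or RESTRICT; it uses plain projected gradient descent as a stand-in for a bound-constrained maximum likelihood fit. The segment data, the generating slopes 2 and -1, and the names `rows`, `fit`, and `mean_loglik` are all made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical grouped data: each row is (x1, x2, bad_rate) for one segment.
# x2 = x1 + d, so the two inputs are strongly collinear.
# Bad rates are generated from logit = 2*x1 - 1*x2 purely for illustration.
rows = []
for x1 in (-2.0, -1.0, 0.0, 1.0, 2.0):
    for d in (-0.5, 0.0, 0.5):
        x2 = x1 + d
        rows.append((x1, x2, sigmoid(2.0 * x1 - 1.0 * x2)))

def fit(rows, nonneg, steps=5000, lr=0.5):
    """Gradient descent on the mean cross-entropy.

    If nonneg is True, both slopes are projected onto >= 0 after every step
    (the intercept is left free).
    """
    b = [0.0, 0.0, 0.0]  # intercept, slope for x1, slope for x2
    n = len(rows)
    for _ in range(steps):
        g = [0.0, 0.0, 0.0]
        for x1, x2, y in rows:
            err = sigmoid(b[0] + b[1] * x1 + b[2] * x2) - y
            g[0] += err
            g[1] += err * x1
            g[2] += err * x2
        b = [bj - lr * gj / n for bj, gj in zip(b, g)]
        if nonneg:
            b[1] = max(0.0, b[1])
            b[2] = max(0.0, b[2])
    return b

def mean_loglik(rows, b):
    """Mean binomial log-likelihood per segment (equal segment sizes assumed)."""
    total = 0.0
    for x1, x2, y in rows:
        p = sigmoid(b[0] + b[1] * x1 + b[2] * x2)
        total += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(rows)

free = fit(rows, nonneg=False)
bounded = fit(rows, nonneg=True)
print("unconstrained:", [round(v, 3) for v in free])
print("non-negative: ", [round(v, 3) for v in bounded])
print("mean log-likelihood:", mean_loglik(rows, free), mean_loglik(rows, bounded))
```

Because the toy bad rates were generated from slopes 2 and -1, the unconstrained fit recovers a negative slope for x2, even though x2 taken on its own tends to rise with the bad rate. With the non-negativity bound, that slope is pinned at 0 and the log-likelihood is lower, so the restricted model fits this data worse.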

Koen

Ksharp · Posted 02-22-2023 06:47 AM

“When we model risk scorecard, all the parameter coefficients are the same direction (all negative or positive) after variable woe.”

That is NOT true or right. @Rick_SAS discussed this topic in one of his blog posts.

RiskViya · Posted 02-22-2023 10:41 AM

If the original variables are used in a GLM regression, the sign of each coefficient depends on how that variable relates to the target variable, but after WOE coding, the sign of the correlation should be consistent, and a model whose parameters all have the same sign is also more acceptable to the strategy team that reviews the model.

MikaellaK · Posted 04-07-2023 08:00 AM

Hello,

I have a similar query. As part of a scorecard development, I have transformed all explanatory variables using WOE and established a monotonic relationship (increasing or decreasing) of each variable with the response variable. Should all coefficients be negative when I apply a logistic regression to the transformed explanatory variables?

PaigeMiller · Posted 04-07-2023 08:09 AM

@MikaellaK wrote:

Hello,

I have a similar query. As part of a scorecard development, I have transformed all explanatory variables using WOE and established a monotonic relationship (increasing or decreasing) of each variable with the response variable. Should all coefficients be negative when I apply a logistic regression to the transformed explanatory variables?

Does not the above discussion answer this question?

--
Paige Miller

Ksharp · Posted 04-07-2023 08:32 AM

No. Absolutely not.

@Rick_SAS wrote a wonderful blog post, "Simpson's Paradox", that explains this problem.

https://blogs.sas.com/content/iml/2023/03/27/simpsons-paradox.html

Suppose X stands for monthly income and Y stands for the probability of default.

Across all the data, you could see a linearly decreasing relationship between X and Y.

But once you take the AGE variable into account, you could see the reverse result. Surprised?

So you can't constrain all the parameters to be positive or negative.

And Rick also pointed out that doing so would reduce the accuracy of the model's predictions.
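
The reversal described above can be reproduced with a few numbers. The following Python sketch uses made-up (income, default rate) pairs for two hypothetical AGE groups; the data and the helper name `ols_slope` are assumptions for illustration and are not taken from the linked blog post.

```python
def ols_slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Hypothetical (income, default rate) pairs, split by an AGE group.
young = [(1.0, 0.30), (2.0, 0.32), (3.0, 0.34)]
older = [(6.0, 0.10), (7.0, 0.12), (8.0, 0.14)]

pooled_slope = ols_slope(young + older)  # ignores AGE
young_slope = ols_slope(young)           # within the young group
older_slope = ols_slope(older)           # within the older group

print("pooled:", pooled_slope)
print("young:", young_slope, "older:", older_slope)
```

In this toy data the pooled slope is negative while the slope inside each AGE group is positive, so the sign of the income effect depends on whether AGE is in the model.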

PaigeMiller · Posted 02-22-2023 08:25 AM

Adding to the comments from @Ksharp

Let's suppose you have at least these two variables in your scorecard model: FICO and number of delinquencies in the last 24 months. These have opposite effects: as FICO goes up, the score should go up; and as the number of delinquencies in the last 24 months goes up, the score should go down. The coefficients in the model for these two variables SHOULD have opposite signs; there is nothing wrong with this. In fact, restricting the coefficients to have the same sign would make the model fit worse and cause the scores to do illogical things.

--
Paige Miller

RiskViya · Posted 02-22-2023 10:37 AM

The variables are monotonically divided into bins and WOE encoded, which is consistent with the correlation of the target variable. Therefore, the case you mentioned above concerns the original variables, whereas the scorecard was developed using the WOE-encoded variables.
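
To make the WOE argument concrete, here is a small Python sketch that computes WOE per bin for two inputs whose raw directions are opposite. The bin counts and the names `bins` and `woe_table` are hypothetical, and the sketch assumes the convention WOE = ln(share of goods / share of bads); some tools use the reverse ratio, which flips every sign.

```python
import math

# Hypothetical bin counts (label, goods, bads) for two scorecard inputs.
bins = {
    "FICO": [("low", 100, 50), ("mid", 300, 40), ("high", 600, 10)],
    "delinquencies_24m": [("0", 700, 20), ("1-2", 250, 30), ("3+", 50, 50)],
}

def woe_table(rows):
    """Return (label, woe, bad_rate) per bin, with WOE = ln(%good / %bad)."""
    total_good = sum(g for _, g, _ in rows)
    total_bad = sum(b for _, _, b in rows)
    out = []
    for label, g, b in rows:
        woe = math.log((g / total_good) / (b / total_bad))
        out.append((label, woe, b / (g + b)))
    return out

tables = {name: woe_table(rows) for name, rows in bins.items()}
for name, table in tables.items():
    print(name)
    for label, woe, bad_rate in table:
        print(f"  bin {label:>4}: WOE = {woe:+.3f}, bad rate = {bad_rate:.3f}")
```

Under this convention the log-odds of being bad in a bin equal ln(total bads / total goods) minus the bin's WOE, so a logistic regression on a single WOE variable has slope -1 (or +1 under the opposite convention), whatever the direction of the raw variable. That fixed sign is a one-variable property; it does not by itself fix the signs once several correlated WOE variables enter the same model.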

Ksharp · Posted 02-23-2023 06:51 AM

1) "but after WOE coding, the sign of the correlation should be consistent"
Why do you think that would happen? Whether you use the original variables
or the WOE variables to build the model, both fall under the generalized LINEAR model.
Note that both fit a LINEAR effect, not a non-linear effect.

2) "a model whose parameters all have the same sign is also more acceptable to the strategy team that reviews the model."
Maybe you are right. But
that is a business consideration (i.e. a scorecard that is easier to explain and fits the business rules better),
NOT a statistical or theoretical one.
In statistical theory there is no need to constrain coefficients to be positive or negative.

3) "The variables are monotonically divided into bins and WOE encoded, which is consistent with the correlation of the target variable."
They are the same thing. Both fit a LINEAR effect, since, as you said, the relationship between X and Y is monotonic.
Even worse, after you bin an original variable into WOE, you lose information compared with the original variable.
That is why @Rick_SAS suggests using the original variables, not WOE, to build the model.
But in a scorecard, WOE can give a better explanation than the original variables.
And that is no reason to think the estimated coefficients should all be positive or all negative.
And if you check the SAS documentation on scorecards,
there is also an example that includes both positive and negative coefficients.
By your reasoning, is the SAS documentation WRONG?

P.S. I totally agree with Paige's opinion.

4) "The variables are monotonically divided into bins and WOE encoded, which is consistent with the correlation of the target variable."
But under the influence of other variables/WOEs, the correlation could reverse (i.e. positive becomes negative).

Rick_SAS · Posted 02-23-2023 07:05 AM

With respect, I have never made any public statements about building scorecards or using WOE. KSharp, when you want to encourage an OP to read something that I wrote, it would be good to provide a link so that the OP can read exactly what I said, including the context in which I said it.

I encourage the OP to think about the comments from Paige and others who have pointed out that the parameters in a regression model should not be artificially constrained without a good reason. Doing so can reduce the accuracy of the model's predictions.
