About Top_Katz

Top_Katz · ‎05-25-2021

Thank you for responding @SASKiwi . If what you say is true, and as of now I have no reason to doubt it, I do feel disappointed. After all, Enterprise Guide can communicate between my local drive and the server. Why can't SAS do it too? I understand that EG sits on my computer and knows how to locate the grid server because I gave it the grid server address. But then EG should be able to tell SAS where my computer is located; of course, SAS would have to have a way to use that information to reach back from the server to my computer, and it doesn't appear to have that capability. Is my understanding correct?

Top_Katz · ‎05-24-2021

Hi! The Enterprise Guide Import Data Wizard makes it very easy to read files into a SAS Grid Server session directly from my local C: (or other) drive. Can this be done with a SAS program? When I look at the generated code from the Wizard, there's a crucial step missing. The local file gets sucked up onto the server. The generated code shows how to read the server copy of the file into SAS, but not how to get the file from the local drive onto the server. Can this all be achieved within a SAS program? Thanks!

Top_Katz · ‎01-02-2020

Hi! I'm not sure where the best place to ask this question is, but I figured I'd start here. I am a very inexperienced LaTeX user. I want to create a LaTeX document with SAS code as it appears in the EG editor (with keyword coloration and Courier font). Of course, I could paste it in as an image, and it would have the right appearance, but then it couldn't be copied as text from the document. Do any of you have a LaTeX template that makes SAS code look more or less like it would in the EG editor? Thanks!

Top_Katz · ‎05-09-2019

Hi @Ksharp ! If you have a predictor variable with a measurable effect on your outcome and you don't include it in your model, then your model likely will be mis-specified. Of course, if the effect is non-monotonic and you're building a regression model, then you have to transform the predictor first to linearize the relationship. You can do that with bins or splines or other functions. The transformed variable will have the proper WoE relationship by design. I don't think I'm telling you anything you don't already know. I also understand that for credit scorecards, people will be suspicious of non-monotonic relationships and won't want to try to transform them; luckily there don't seem to be many meaningful non-monotonic relationships in credit scorecarding. But marketing is a different beast.

Top_Katz · ‎05-09-2019

Hi @Ksharp ! A hockey stick is not necessarily non-monotonic. I think the idea is that you have a flat stretch and then a sudden increase, rather than constant proportional increases. For example, suppose you're offering discounts on a product. Perhaps very few people buy at 5% or 10% or 15% or 20%, but something about a 25% discount grabs them and you get a sudden uptick, 40% even more, 50% way more, 60% through the roof. I'm sure you could imagine such scenarios. Also, not every effect is monotonic. For example, if you're selling Buicks (a General Motors brand of automobile), low income people probably don't buy very many. Middle income you start to see more interest, and it likely peaks in the very high middle income range. But wealthier people buy more luxury brands, like Cadillac and Lexus and Mercedes. If you wanted to model likelihood to purchase Buicks using income level as a linear predictor, you could use binning (or some other non-linear transformation) to transform the raw income levels.

Top_Katz · ‎05-09-2019

Hi @Ksharp ! Thank you for following up. I think what Siddiqi said makes sense; it looks like he more or less agrees with my views about linearity, where I assume he's talking about visual linearity of equally spaced bin WoE values. What is your disagreement with him?

Top_Katz · ‎05-08-2019

Right you are @Ksharp ! Univariate models are rarely useful in practice, just suitable for demonstration purposes. My point is that you can always bin a continuous variable, even one with a non-linear, non-monotonic relationship to the target, and get a transformed variable which is usable in a logistic regression model. (And once again, users of binned variables as model predictors should be aware of the risks incurred.)

Top_Katz · ‎05-08-2019

Hi @Ksharp ! As I demonstrated with my example program in response to your previous post, woe monotonicity is completely unnecessary for the original predictor variable. You just need to transform it to a variable that has woe monotonicity. When you create bins for a continuous predictor, just use the bin woe value as the transformed variable value, as I have shown you. That transformed variable will be ready for use as a predictor in a logistic regression.
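A minimal sketch of that transformation, using the bin edges and WoE values quoted for the group_ks binning in the 05-06-2019 post. The input data set name work.german_credit is a hypothetical placeholder, and treating each quoted upper edge as inclusive is an assumption:

```sas
/* Sketch only: work.german_credit is a hypothetical name for the raw data. */
data work.german_credit_groups ;
    set work.german_credit ;
    /* Upper bin edges and WoE values as quoted in the 05-06-2019 post.   */
    /* Caution: a missing Credit_Amount compares below every number and   */
    /* would land in bin 1; handle missing values separately if present.  */
    if      Credit_Amount <= 3578 then do ; group_ks = 1 ; woe_ks =  0.167231311070365  ; end ;
    else if Credit_Amount <= 5743 then do ; group_ks = 2 ; woe_ks = -0.0441497846129298 ; end ;
    else if Credit_Amount <= 7127 then do ; group_ks = 3 ; woe_ks = -0.371874163672129  ; end ;
    else                               do ; group_ks = 4 ; woe_ks = -0.72951482473082   ; end ;
run ;

/* The WoE-coded variable goes into the model as an ordinary numeric predictor. */
proc logistic data = work.german_credit_groups descending ;
    model Creditability = woe_ks / lackfit ;
run ;
```

The same pattern works for any binning: one IF/ELSE branch per bin, with the bin's WoE as the assigned value.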

Top_Katz · ‎05-08-2019

Hi @Ksharp ! The variable you call woe_amount is the negative of the variable I called woe_ks, and it looks like the variable you call good_bad is the same as the variable I called Creditability, so your logistic regression of good_bad against woe_amount is exactly equivalent to my second regression from above:

    %let runid = 02 ;
    %let indata&runid. = work.german_credit_groups ;
    %let binary_target_&runid. = Creditability ;
    %let bin_predictor_&runid. = woe_ks ;
    title2 "PROC LOGISTIC &runid. &&binary_target_&runid.. by &&bin_predictor_&runid.. DATA = &&indata&runid.." ;
    PROC LOGISTIC
        DATA = &&indata&runid..
        DESCENDING
        OUTEST = work.gc_logit_&&bin_predictor_&runid.._coefs_&runid.
        OUTMODEL = work.gc_logit_&&bin_predictor_&runid.._outmodel_&runid.
        ;
        model &&binary_target_&runid.. = &&bin_predictor_&runid.. / lackfit rsquare ;
        store work.gc_logit_&&bin_predictor_&runid.._store_&runid. ;
        output
            out = work.gc_logit_&&bin_predictor_&runid.._output_&runid.
            predicted = predicted&runid.
            xbeta = xbeta&runid.
            reschi = reschi&runid.
            resdev = resdev&runid.
            reslik = reslik&runid.
            ;
    run ;
    title2 ;

And the fact that the Hosmer-Lemeshow test shows perfect agreement between observed and expected is not a coincidence. Creating a bin transformation variable with values equal to the woe or the log odds will always do that, even if the woe is non-monotonic. You can test it and see.

The program I have attached (_try_nonmono_woe_.sas) creates a thirty-six record data set with eighteen events and eighteen non-events and three bins, each with twelve records. The first bin has six events and six non-events, woe=0. The second bin has nine events and three non-events, woe=1.099. The third bin has three events and nine non-events, woe=-1.099. Very non-monotonic woe in bin sequence. Logistic regression against the bin sequence number is a failure, and the Hosmer-Lemeshow test shows significant disagreement between observed and expected. But once again, for logistic regression against the woe transformation, the Hosmer-Lemeshow test shows perfect agreement between observed and expected, as asserted.
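For readers without the attachment, here is a rough sketch of how a data set with those counts could be built and tested. It is not the attached _try_nonmono_woe_.sas program; the data set name work.nonmono and the variable names bin, woe, and event are made up for illustration:

```sas
/* Illustrative reconstruction; names are hypothetical, counts are from the post. */
data work.nonmono ;
    do bin = 1 to 3 ;
        n_events = choosen(bin, 6, 9, 3) ;          /* events per 12-record bin   */
        /* WoE = log( share of all events / share of all non-events ), 18 each */
        woe = log( (n_events / 18) / ((12 - n_events) / 18) ) ;
        do i = 1 to 12 ;
            event = (i <= n_events) ;               /* 1 = event, 0 = non-event   */
            output ;
        end ;
    end ;
    drop i n_events ;
run ;

/* Bin sequence number as a numeric predictor */
proc logistic data = work.nonmono descending ;
    model event = bin / lackfit ;
run ;

/* WoE value as a numeric predictor */
proc logistic data = work.nonmono descending ;
    model event = woe / lackfit ;
run ;
```

The reason the WoE version can fit exactly: the log odds in each bin equal the bin's WoE plus the overall log odds, log(18/18) = 0 here. An intercept of 0 and a slope of 1 therefore reproduce the three bin event rates (0.50, 0.75, 0.25), and no two-parameter fit can do better than that.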

Top_Katz · ‎05-07-2019

Hi @Ksharp ! Sorry for being unclear. Yes, for the set of bins you just posted, which I copied from your paper, I created two variables:

1. group_ks, which just assigns the corresponding "Obs" column number (1, 2, 3, 4) to each of the 1,000 data records, depending on which bin it falls into;
2. woe_ks, which assigns the corresponding "woe" column value (0.1672, -0.0442, -0.3719, -0.7295) to each of the 1,000 data records, depending on which bin it falls into.

Here is the other set of bins I used:

    Obs      i    j  wsize  wones  wzero   wrate     sos      iv      woe     ent     css  diffwrate  sigdifrate  diffwoe
    1        1  668    739    550    189  0.7443  0.1407  0.0344   0.2209  0.6878  0.1278          .           .        .
    2      669  845    183    114     69  0.6230  0.0430  0.0232  -0.3452  0.1985  0.0415    -0.1213      0.0769  -0.5661
    3      846  923     78     36     42  0.4615  0.0194  0.0887  -1.0015  0.0881  0.0194    -0.1614      0.1310  -0.6562
    Total            1,000    700    300          0.2030  0.1463           0.9745  0.1887

I also created two variables for this set of bins:

3. group_sd, which just assigns the corresponding "Obs" column number (1, 2, 3) to each of the 1,000 data records, depending on which bin it falls into;
4. woe_sd, which assigns the corresponding "woe" column value (0.2209, -0.3452, -1.0015) to each of the 1,000 data records, depending on which bin it falls into.

Then I ran separate regressions of the binary target variable, Creditability, against each of the four new variables, used as numeric variables, and also used as class variables. If you regress against the group_ks or group_sd variable as a numeric variable, that's where the visual linearity by sequence number corresponds to linearity between target and predictor. But if you use the binned predictor as a class variable, or if you use the woe_ks or woe_sd numeric version, visual linearity becomes irrelevant and the fit is better, too.

The "_import_WORK.GERMAN_CREDIT_GROUPS.sas" program I uploaded with my previous post reads in the German Credit data along with the variables I added, and then the code embedded in my post runs the series of regressions, including Hosmer-Lemeshow tests (from the "lackfit" option on the model statement in PROC LOGISTIC). All of the H-L tests show good agreement between observed and expected, but the ones using the woe_ks, woe_sd, or class variable predictors show perfect agreement.

Top_Katz · ‎05-06-2019

Hi @Ksharp ! I did an experiment with the German Credit data to illustrate my point about linearity of target and predictor versus visual linearity of bin WoE values with bin sequence numbers. The attachment imports the full thousand record German Credit data set, sorted by increasing order of Credit_Amount (the original interval-valued predictor), with five added variables:

1. idnum is just a sequence number for the original order of the observations
2. group_ks gives bin sequence numbers for the binning in your paper (group_ks = 1 for the first 699 records, up to Credit_Amount = 3578, group_ks = 2 for the next 139 records, up to Credit_Amount = 5743, group_ks = 3 for the next 60 records, up to Credit_Amount = 7127, group_ks = 4 for the last 102 records, up to Credit_Amount = 18424)
3. woe_ks gives the corresponding WoE value for each of the bins from your paper (woe_ks = 0.167231311070365 when group_ks = 1, woe_ks = -0.0441497846129298 when group_ks = 2, woe_ks = -0.371874163672129 when group_ks = 3, woe_ks = -0.72951482473082 when group_ks = 4)
4. group_sd gives bin sequence numbers for the maximum information value binning with all bin event rate differences significant with 95% confidence and 5% minimum distribution requirement with group size at least 60 (group_sd = 1 for the first 739 records, up to Credit_Amount = 3905, group_sd = 2 for the next 183 records, up to Credit_Amount = 7758, group_sd = 3 for the last 78 records, up to Credit_Amount = 18424)
5. woe_sd gives the corresponding WoE value for each of the group_sd bins (woe_sd = 0.2208734028 when group_sd = 1, woe_sd = -0.345205917 when group_sd = 2, woe_sd = -1.00144854 when group_sd = 3)

Then I ran logistic regressions of Creditability, the binary target, against, respectively, group_ks as a numeric variable, woe_ks as a numeric variable, group_ks as a class variable, group_sd as a numeric variable, woe_sd as a numeric variable, group_sd as a class variable. In each case, I had PROC LOGISTIC run the Hosmer-Lemeshow test; it's only on development data because there is no extra test data.

    /* Run 01: group_ks as a numeric predictor */
    %let runid = 01 ;
    %let indata&runid. = work.german_credit_groups ;
    %let binary_target_&runid. = Creditability ;
    %let bin_predictor_&runid. = group_ks ;
    title2 "PROC LOGISTIC &runid. &&binary_target_&runid.. by &&bin_predictor_&runid.. DATA = &&indata&runid.." ;
    PROC LOGISTIC
        DATA = &&indata&runid..
        DESCENDING
        OUTEST = work.gc_logit_&&bin_predictor_&runid.._coefs_&runid.
        OUTMODEL = work.gc_logit_&&bin_predictor_&runid.._outmodel_&runid.
        ;
        model &&binary_target_&runid.. = &&bin_predictor_&runid.. / lackfit rsquare ;
        store work.gc_logit_&&bin_predictor_&runid.._store_&runid. ;
        output
            out = work.gc_logit_&&bin_predictor_&runid.._output_&runid.
            predicted = predicted&runid.
            xbeta = xbeta&runid.
            reschi = reschi&runid.
            resdev = resdev&runid.
            reslik = reslik&runid.
            ;
    run ;
    title2 ;

    /* Run 02: woe_ks as a numeric predictor */
    %let runid = 02 ;
    %let indata&runid. = work.german_credit_groups ;
    %let binary_target_&runid. = Creditability ;
    %let bin_predictor_&runid. = woe_ks ;
    title2 "PROC LOGISTIC &runid. &&binary_target_&runid.. by &&bin_predictor_&runid.. DATA = &&indata&runid.." ;
    PROC LOGISTIC
        DATA = &&indata&runid..
        DESCENDING
        OUTEST = work.gc_logit_&&bin_predictor_&runid.._coefs_&runid.
        OUTMODEL = work.gc_logit_&&bin_predictor_&runid.._outmodel_&runid.
        ;
        model &&binary_target_&runid.. = &&bin_predictor_&runid.. / lackfit rsquare ;
        store work.gc_logit_&&bin_predictor_&runid.._store_&runid. ;
        output
            out = work.gc_logit_&&bin_predictor_&runid.._output_&runid.
            predicted = predicted&runid.
            xbeta = xbeta&runid.
            reschi = reschi&runid.
            resdev = resdev&runid.
            reslik = reslik&runid.
            ;
    run ;
    title2 ;

    /* Run 03: group_ks as a class predictor */
    %let runid = 03 ;
    %let indata&runid. = work.german_credit_groups ;
    %let binary_target_&runid. = Creditability ;
    %let bin_predictor_&runid. = group_ks ;
    title2 "PROC LOGISTIC &runid. &&binary_target_&runid.. by &&bin_predictor_&runid.. (class) DATA = &&indata&runid.." ;
    PROC LOGISTIC
        DATA = &&indata&runid..
        DESCENDING
        OUTEST = work.gc_logit_&&bin_predictor_&runid.._coefs_&runid.
        OUTMODEL = work.gc_logit_&&bin_predictor_&runid.._outmodel_&runid.
        ;
        class &&bin_predictor_&runid.. ;
        model &&binary_target_&runid.. = &&bin_predictor_&runid.. / lackfit rsquare ;
        store work.gc_logit_&&bin_predictor_&runid.._store_&runid. ;
        output
            out = work.gc_logit_&&bin_predictor_&runid.._output_&runid.
            predicted = predicted&runid.
            xbeta = xbeta&runid.
            reschi = reschi&runid.
            resdev = resdev&runid.
            reslik = reslik&runid.
            ;
    run ;
    title2 ;

    /* Run 04: group_sd as a numeric predictor */
    %let runid = 04 ;
    %let indata&runid. = work.german_credit_groups ;
    %let binary_target_&runid. = Creditability ;
    %let bin_predictor_&runid. = group_sd ;
    title2 "PROC LOGISTIC &runid. &&binary_target_&runid.. by &&bin_predictor_&runid.. DATA = &&indata&runid.." ;
    PROC LOGISTIC
        DATA = &&indata&runid..
        DESCENDING
        OUTEST = work.gc_logit_&&bin_predictor_&runid.._coefs_&runid.
        OUTMODEL = work.gc_logit_&&bin_predictor_&runid.._outmodel_&runid.
        ;
        model &&binary_target_&runid.. = &&bin_predictor_&runid.. / lackfit rsquare ;
        store work.gc_logit_&&bin_predictor_&runid.._store_&runid. ;
        output
            out = work.gc_logit_&&bin_predictor_&runid.._output_&runid.
            predicted = predicted&runid.
            xbeta = xbeta&runid.
            reschi = reschi&runid.
            resdev = resdev&runid.
            reslik = reslik&runid.
            ;
    run ;
    title2 ;

    /* Run 05: woe_sd as a numeric predictor */
    %let runid = 05 ;
    %let indata&runid. = work.german_credit_groups ;
    %let binary_target_&runid. = Creditability ;
    %let bin_predictor_&runid. = woe_sd ;
    title2 "PROC LOGISTIC &runid. &&binary_target_&runid.. by &&bin_predictor_&runid.. DATA = &&indata&runid.." ;
    PROC LOGISTIC
        DATA = &&indata&runid..
        DESCENDING
        OUTEST = work.gc_logit_&&bin_predictor_&runid.._coefs_&runid.
        OUTMODEL = work.gc_logit_&&bin_predictor_&runid.._outmodel_&runid.
        ;
        model &&binary_target_&runid.. = &&bin_predictor_&runid.. / lackfit rsquare ;
        store work.gc_logit_&&bin_predictor_&runid.._store_&runid. ;
        output
            out = work.gc_logit_&&bin_predictor_&runid.._output_&runid.
            predicted = predicted&runid.
            xbeta = xbeta&runid.
            reschi = reschi&runid.
            resdev = resdev&runid.
            reslik = reslik&runid.
            ;
    run ;
    title2 ;

    /* Run 06: group_sd as a class predictor */
    %let runid = 06 ;
    %let indata&runid. = work.german_credit_groups ;
    %let binary_target_&runid. = Creditability ;
    %let bin_predictor_&runid. = group_sd ;
    title2 "PROC LOGISTIC &runid. &&binary_target_&runid.. by &&bin_predictor_&runid.. (class) DATA = &&indata&runid.." ;
    PROC LOGISTIC
        DATA = &&indata&runid..
        DESCENDING
        OUTEST = work.gc_logit_&&bin_predictor_&runid.._coefs_&runid.
        OUTMODEL = work.gc_logit_&&bin_predictor_&runid.._outmodel_&runid.
        ;
        class &&bin_predictor_&runid.. ;
        model &&binary_target_&runid.. = &&bin_predictor_&runid.. / lackfit rsquare ;
        store work.gc_logit_&&bin_predictor_&runid.._store_&runid. ;
        output
            out = work.gc_logit_&&bin_predictor_&runid.._output_&runid.
            predicted = predicted&runid.
            xbeta = xbeta&runid.
            reschi = reschi&runid.
            resdev = resdev&runid.
            reslik = reslik&runid.
            ;
    run ;
    title2 ;

The first model uses your bin sequence numbers as a single variable, group_ks. Because you tried to create linearity, the Hosmer-Lemeshow test shows pretty close agreement between observed and expected, chi-square = 0.2012 and p value = 0.9043. But if you run the second model, using woe_ks, the WoE values for the same set of bins, as a single variable, you get perfect agreement between observed and expected in Hosmer-Lemeshow, chi-square = 0 and p value = 1. Same perfect result for the third model, where you use group_ks as a class variable.

group_sd has similar results, although it does slightly better at model fit and classification / rank ordering than group_ks, and the only requirements for group_sd were that the event rate differences between neighboring groups should be significant (with event rates monotonically decreasing) and the minimum group size be 60 (with the 5% minimum distribution requirement too). group_sd made no attempt to show visual linearity.

So, if you just use WoE values to represent the bins, or make the bins class variables, you get the best linear response between predictor and target; there's no need to try to obtain visual linearity.
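Since the six runs differ only in the run id, the predictor, and whether a CLASS statement is present, they could also be driven by one macro. This is a sketch of that idea, not the code that was actually run; the macro name run_logit and its parameter names are invented here, while the options, statements, and output data set naming follow the runs above:

```sas
/* Hypothetical wrapper: run_logit and its parameter names are illustrative. */
%macro run_logit(runid=, predictor=, as_class=N,
                 indata=work.german_credit_groups, target=Creditability) ;
    %local class_note ;
    %if %upcase(&as_class.) = Y %then %let class_note = (class) ;

    title2 "PROC LOGISTIC &runid. &target. by &predictor. &class_note. DATA = &indata." ;
    proc logistic data = &indata. descending
        outest   = work.gc_logit_&predictor._coefs_&runid.
        outmodel = work.gc_logit_&predictor._outmodel_&runid. ;
        /* Treat the bin variable as categorical only when requested */
        %if %upcase(&as_class.) = Y %then %do ;
            class &predictor. ;
        %end ;
        model &target. = &predictor. / lackfit rsquare ;
        store work.gc_logit_&predictor._store_&runid. ;
        output out = work.gc_logit_&predictor._output_&runid.
            predicted = predicted&runid.
            xbeta     = xbeta&runid.
            reschi    = reschi&runid.
            resdev    = resdev&runid.
            reslik    = reslik&runid. ;
    run ;
    title2 ;
%mend run_logit ;

%run_logit(runid=01, predictor=group_ks)
%run_logit(runid=02, predictor=woe_ks)
%run_logit(runid=03, predictor=group_ks, as_class=Y)
%run_logit(runid=04, predictor=group_sd)
%run_logit(runid=05, predictor=woe_sd)
%run_logit(runid=06, predictor=group_sd, as_class=Y)
```

Using plain macro parameters instead of run-numbered macro variables avoids the double-ampersand indirection, at the cost of not keeping each run's settings around as global macro variables afterwards.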

Top_Katz · ‎05-03-2019

Hi @Ksharp ! "Pearson chi-square/DF =1 is testing if the data is over-disperse ,has nothing to do with GOF." Ha-ha, well, if you want to argue with Paul Allison, go ahead. Anyway, I've invited you to apply any of your favorite GOF measures to the data. You've already invested a lot of time in your argument, and haven't shown even a single bit of quantitative proof. I've given you fact after fact. So here's your chance! "What I try to do is to make score more distinguish (-10 -5 V.S. -10 -8) ,and make better GOF ,although IV is not bigger that yours." You haven't shown how or why making the score more distinguished is better, nor have you shown a better GOF. And in my example, the event rates are already distinguished with 99% confidence, anyway. "What I do with linearity of equally spaced bin is trying to not break assumption violation of GLM and get better GOF ." I don't mean to sound harsh, but as I've already repeated several times, equal spaced linearity is COMPLETELY IRRELEVANT to GLM assumptions. And you still haven't proved that it gives better GOF. I'm waiting. For proof.

Top_Katz · ‎05-03-2019

Hi @Ksharp ! You often mention GOF. Okay, which GOF statistics do you want to use? You can apply them to the examples I gave and tell me which one is better. Or you can create your own examples. Paul Allison has a nice article called "Measures of Fit for Logistic Regression" and the first goodness of fit test he refers to is Pearson chi-square (as you probably know, Paul Allison is a distinguished statistician and educator). The Pearson chi-square value for the more linear looking / lower IV set of bins is 1,192. The Pearson chi-square value for the improved fit higher IV set of bins is 1,305.

As for distinguishing the scores, that is a good goal up to the point where it degrades your accuracy. In the example I gave, it's not an issue because the event rates for every neighboring pair of bins in both the higher and lower IV sets are different from each other with 99% confidence according to the standard difference of ratios test.

The points I'm trying to get across are that:

1. Visual linearity of equally spaced bin WoE values is completely unrelated to the linearity of the relationship between the predictor and the target, and serves no analytic purpose. It just looks nice for story telling.
2. For both prediction accuracy and rank ordering, it is nearly always better to use a metric such as: maximum IV, maximum chi-square, minimum entropy, minimum sum of squares, etc., rather than visual linearity, as a guide. In particular, for binning with binary targets, minimum entropy is equivalent to logistic regression maximum likelihood.

You're certainly correct when you say: "for a model , you can't just stand on a simple variable analysis" but that still doesn't justify picking your bins with an arbitrary methodology.
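The equivalence between minimum entropy and maximum likelihood can be written out in a few lines. This is a sketch under two assumptions that the post does not spell out: "entropy" means the count-weighted binary entropy within each bin, and the logistic model has one free parameter per bin (for example, the bin variable entered as a class variable). The symbols e_b (events in bin b), n_b (records in bin b), and p_b (modeled event probability in bin b) are introduced here for illustration.

```latex
\begin{align*}
\ell(p_1,\dots,p_B) &= \sum_{b=1}^{B}\Bigl[e_b \ln p_b + (n_b - e_b)\ln(1 - p_b)\Bigr]
  && \text{(binomial log-likelihood)}\\
\hat p_b &= \frac{e_b}{n_b}
  && \text{(maximizer, bin by bin)}\\
\ell(\hat p_1,\dots,\hat p_B) &= \sum_{b=1}^{B} n_b\Bigl[\hat p_b \ln \hat p_b + (1-\hat p_b)\ln(1-\hat p_b)\Bigr]
  = -\sum_{b=1}^{B} n_b\, H(\hat p_b),\\
H(p) &= -p\ln p - (1-p)\ln(1-p).
\end{align*}
```

Under those assumptions, the maximized log-likelihood is exactly the negative of the total weighted entropy, so the binning with the lowest entropy is the one with the highest likelihood.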

Top_Katz · ‎05-02-2019

Hi @Ksharp ! The example was completely hypothetical, made up data to show you the possible consequences of making a trade-off based on aesthetic rather than analytical considerations. You preferred the binning that didn't fit as well based purely on visual appeal. In this example, with two different ways of binning the same data, the one with better fitness statistics did a better job of classifying the data, even though it didn't have the visual linearity you prefer. If the differences are significant, that will nearly always be the case (in both of the example binnings, the event rate differences between successive bins are all statistically significant with 99% confidence). You have to use the data that's available to support your decision. You can't assume that maybe a different data sample will behave the way you want it to, but a careful analyst estimates the error / confidence level in the expected results because some amount of difference may occur. Your method of looking for linearity is based on evenly spacing the bin results along your horizontal axis; if the spacing were different, the graph wouldn't look linear. But why should the spacing be even? The bins are not likely of equal width or equal frequency. It's an aesthetic choice; where are the analytics behind it? Can you show me an example where a binning with visual linearity outperforms (in some measurable way) a binning of the same data that doesn't have visual linearity, but has higher IV and Somers' D and chi-square than the binning with visual linearity?

Top_Katz · ‎05-01-2019

Hi @Ksharp ! Great. So let's look at an example and see the consequences of your decision. Here is our set of bins with the linear trend:

    Bin ID   events   non-events      WoE      IV
    1         3,977        2,137    0.621   0.131
    2         1,230        1,000    0.207   0.005
    3         1,513        1,861   -0.207   0.008
    4         2,000        3,722   -0.621   0.123
    Total     8,720        8,720            0.267

But it turns out that the 283 leftmost members of bin three are all events. If you shift them to the right side of bin two, you get the following:

    Bin ID   events   non-events      WoE      IV
    1         3,977        2,137    0.621   0.131
    2         1,513        1,000    0.414   0.024
    3         1,230        1,861   -0.414   0.030
    4         2,000        3,722   -0.621   0.123
    Total     8,720        8,720            0.308

Not only is the information value higher, although you say you don't care so much about that, but now you've correctly classified 283 (p = 0.60) events that you misclassified (p = 0.45) in the original binning. Was the linear appearance worth the misclassifications? I guess that's your call.
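The WoE and IV columns in both tables can be checked with a few lines of SAS. This is a sketch: the data set and variable names (work.bins, binning, bin, events, nonevents) are made up for the check, the counts are copied from the two tables, and WoE is taken as the log of the ratio of the bin's share of events to its share of non-events:

```sas
/* Hypothetical names; counts copied from the two binnings in the post. */
data work.bins ;
    input binning $ bin events nonevents ;
    datalines ;
linear  1 3977 2137
linear  2 1230 1000
linear  3 1513 1861
linear  4 2000 3722
shifted 1 3977 2137
shifted 2 1513 1000
shifted 3 1230 1861
shifted 4 2000 3722
;
run ;

proc sql ;
    /* Per-bin WoE, IV contribution, and event rate; SUM() is taken within */
    /* each binning and remerged onto the detail rows by PROC SQL.         */
    create table work.bin_stats as
    select binning, bin, events, nonevents,
           log( (events / sum(events)) / (nonevents / sum(nonevents)) )
               as woe format = 8.3,
           ( events / sum(events) - nonevents / sum(nonevents) )
             * log( (events / sum(events)) / (nonevents / sum(nonevents)) )
               as iv format = 8.3,
           events / (events + nonevents) as event_rate format = 8.3
    from work.bins
    group by binning
    order by binning, bin ;

    select * from work.bin_stats ;

    /* Total information value for each binning */
    select binning, sum(iv) as total_iv format = 8.3
    from work.bin_stats
    group by binning ;
quit ;
```

The event_rate column also shows where the quoted probabilities come from: the 283 shifted events sit in a bin with event rate 1,513 / 3,374 (about 0.45) in the first binning and 1,513 / 2,513 (about 0.60) in the second.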


Re: Binning (categorize continuous var into categories)

Re: Univariate Box Cox Transformation algorithm, PROC TRANSREG, and mi...

Univariate Box Cox Transformation algorithm, PROC TRANSREG, and minimu...

Re: How to graph overlapping bell curves?

Re: Trying to use PROC OPTMODEL for monotonic supervised optimal binni...

Re: Newbie question on using PROC OPTMODEL for simple two-index MILP m...

Newbie question on using PROC OPTMODEL for simple two-index MILP minim...

Re: How can I import data from local drive to Grid server programmatic...

How can I import data from local drive to Grid server programmatically...

Is there a "SAS EG Editor" LaTeX template for SAS code?
