If there’s one thing in today’s world that baffles the average person more than trying to decode the latest tech gadget’s manual, it’s the credit score. This elusive number, tucked away in the depths of the financial system, wields more influence over our lives than we’d like to admit. It can decide whether you land that dream home, drive off in a new car, or even whether you can snag a new sofa on a payment plan.
For the most part, we treat it like a cryptic code from a lost civilization—something we don’t understand but know holds sway over our financial lives. And the most frustrating part for most people? It can shift without warning, often for reasons that seem completely arbitrary. Let’s delve into why credit scores are the ultimate puzzle in personal finance and how they have the power to make even the calmest among us anxious.
The idea of a credit score seems straightforward. It’s meant to gauge how reliable you are at repaying borrowed money. Easy enough, you think. I pay my bills on time, I’m financially responsible, so my score should be excellent! Right? Well, not so fast.
Take this scenario: you responsibly pay off your car loan ahead of schedule. You’re expecting some credit score applause, right? Nope, not quite. Instead, your score dips. Why? Because you’ve lost the diversity in your “credit mix” (the different types of credit accounts you have: mortgages, loans, credit cards, etc.). Apparently, being financially responsible doesn’t always make you a favorite with lenders. It’s like discovering that acing your exams somehow made you less popular with the cool kids.
And let's not forget about credit inquiries. You decide to check your credit score out of curiosity (or sheer panic) to see where you stand. Amazingly, your score takes a slight hit just because you dared to take a peek (although not entirely true, as we find out later in this post). It’s like getting in trouble for simply saying “hi” to your parents.
At this point, some of us might think it’s safer to just fly under the radar. The less we tinker with our credit, the better. But here’s the kicker: even doing nothing can backfire. A short credit history is also frowned upon. It’s like being penalized for not having enough excitement in your life.
Then there are the numbers themselves. A score above 750? You’re in great shape. Between 650 and 750? You’re doing fine, but don’t get too comfortable. Below 650? You’re skating on thin financial ice. And if you’re under 600? The bank views you like you just requested a loan to acquire a nuclear sub.
For the average person, credit scores represent an invisible boss we’re constantly trying to please – without ever getting clear feedback. They’re always lurking in the background, subtly influencing our lives. Most often there’s an acute lack of understanding of how these numbers are calculated. It’s like the financial world’s version of a mystery novel where the plot shifts every time you turn the page.
Setting aside our comical entwinement with credit scores for the time being, we can all appreciate that they are crucial to aid lenders in granting consumer credit. The basic idea involves identifying key factors that influence the probability of default (PD) and combining or weighting them into a quantitative score. This score can then be interpreted directly as the probability of default or used as a basis for a classification system.
Renowned British statistician and geneticist Ronald Fisher is widely recognized for establishing the foundation of modern statistical methods, which have become integral to credit scoring. His introduction of discriminant analysis in the 1930s is especially pertinent to the development of credit scoring models.
At the onset of World War II, financial institutions faced significant challenges in credit management. With credit analysts being drafted into military service, there was a critical shortage of skilled professionals. To address this, experienced credit underwriters documented their numerous “rules of thumb”, effectively creating a paper-based expert system that could be used by non-experts.
The introduction of credit cards in the late 1960s highlighted the value of credit scoring for banks and credit card issuers. The sheer volume of daily credit card applications made it economically and logistically impractical to manually process each application. Advances in computing power facilitated automation of lending decisions, and organizations discovered that credit scoring provided more accurate predictions than any judgment-based methods available at the time.
Further advancements in computing power and statistical software during the 1980s enabled scorecard developers to explore various statistical techniques. Logistic regression emerged as the most commonly used method, while expert systems and neural networks were experimented with, showing mixed results. Additionally, the 1980s marked a broadening of scoring applications beyond traditional credit and response areas, extending into retention, attrition, collections, insurance, and other fields.
Let’s explore how credit scores are calculated. They are essentially probability estimates of delinquency, derived from statistical models that correlate credit report data (among other things) with past debt performance.
I. The Data
Credit report data offers a detailed snapshot of an individual's financial behavior and credit history, compiled by credit bureaus. It encompasses personal details, including name and national identification number (e.g. social security number), as well as comprehensive information on credit accounts like loans and credit cards. The report tracks payment patterns, highlighting any late or missed payments, and includes public financial records such as bankruptcies or liens. Additionally, it lists credit inquiries, showing who has reviewed the report. This information is vital for lenders in evaluating creditworthiness, playing a key role in decisions regarding loan approvals, interest rates, and other financial opportunities.
While credit reports generally serve as the primary data source for constructing scorecards, they are not the only information utilized. Contemporary scoring systems also incorporate alternative data points like social media engagement, mobile phone usage, and transactional records to achieve a more thorough risk evaluation. In practice, risk analysts tend to employ any legally permissible information available, recognizing that legal restrictions and guidelines can vary across different jurisdictions.
II. The Appropriate Variables
Let's take a look at some of the common variables employed when building a scorecard. These are closely related to what confuses us about how credit scores move. The confusion arises mainly because credit scoring is not an exact science. While it relies on sophisticated algorithms and statistical models to evaluate creditworthiness, it involves a level of estimation and uncertainty. Human financial behavior is complex and influenced by numerous factors, making it difficult to predict with absolute accuracy.
Predictive indicators in credit risk assessment, such as frequency of hard inquiries, absence of credit cards, high credit utilization, or multiple credit applications, demonstrate a statistical correlation with delinquency. This correlation is not based on intuition but on empirical evidence derived from extensive historical data analysis. These indicators have consistently shown higher default rates across various time periods.
However, it's crucial to distinguish between correlation and causation in risk modeling. While these factors are statistically significant predictors, they don't guarantee default on an individual basis. For instance, extending credit to individuals with no prior credit history doesn't always result in default. Rather, this demographic has historically exhibited higher default rates.
In credit scoring models, our primary goal is to determine whether a borrower will encounter difficulties in repaying a loan – this is our dependent variable (Y). The Y variable, typically representing credit risk or probability of default, is usually binary (e.g., default/non-default) or ordinal (e.g., risk grades).
X variables, or predictors, should be chosen based on their predictive power, stability, and logical relationship to credit risk. Common categories include payment history, credit utilization, length of credit history, types of credit accounts, and recent credit inquiries. However, as noted earlier, modern credit scoring increasingly incorporates alternative data sources, such as utility payments or social media data, to enhance predictive power.
When choosing X variables, it's crucial to ensure regulatory compliance by excluding prohibited variables such as race or gender. Feature engineering and selection methods, like Weight of Evidence (WOE) and Information Value (IV), are useful for identifying the most predictive variables.
III. Binning, Weight of Evidence (WOE) and Information Value (IV)
Transforming raw data into meaningful formats is crucial for developing accurate and reliable models. Key preprocessing techniques such as Binning, Weight of Evidence (WOE), and Information Value (IV) play a vital role in improving the performance of parametric models used in credit scoring.
Binning is the method of dividing continuous or categorical variables into distinct intervals or groups. This approach helps simplify data, manage outliers, and reduce noise. In credit scoring, binning enables the classification of variables like age or income into specific categories, helping the model identify complex, non-linear patterns between the variable and the target outcome (in this case, predicting credit default).
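As a minimal sketch of binning, the snippet below assigns a continuous variable (age) to intervals. The bin edges, labels, and the helper name `assign_bin` are illustrative assumptions, not values from any real scorecard.

```python
from bisect import bisect_right

# Hypothetical bin edges and labels for applicant age (illustrative only).
AGE_EDGES = [25, 35, 50, 65]
AGE_LABELS = ["<25", "25-34", "35-49", "50-64", "65+"]


def assign_bin(value, edges, labels):
    """Return the label of the interval that `value` falls into.

    Intervals are closed on the left: with edges [25, 35], the value 25
    belongs to the second bin and 24.9 to the first.
    """
    return labels[bisect_right(edges, value)]


ages = [19, 25, 34, 35, 49, 64, 65, 90]
binned = [assign_bin(a, AGE_EDGES, AGE_LABELS) for a in ages]
print(list(zip(ages, binned)))
```

Note how an extreme value such as 90 simply lands in the open-ended top bin, which is one way binning limits the influence of outliers.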
Information Value (IV) is a valuable metric for selecting variables for models, as it indicates the predictive strength of a variable. A high IV suggests that the variable is significant in predicting outcomes. IV is calculated by assigning a Weight of Evidence (WOE) to each bin and summing these WOE values, weighted by the difference in the distribution of good and bad cases. This sum reflects how effective the variable is at predicting defaults.
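The calculation described above can be sketched as follows. This assumes the common convention WOE = ln(share of goods in the bin / share of bads in the bin); some references flip the ratio, which changes the sign of WOE but not the IV. The counts are made up for illustration, and the function name `woe_iv` is hypothetical.

```python
import math


def woe_iv(bins):
    """Compute WOE per bin and the variable's IV.

    `bins` maps a bin label to (n_good, n_bad). Assumes every bin has at
    least one good and one bad case; otherwise the logarithm is undefined.
    """
    total_good = sum(good for good, _ in bins.values())
    total_bad = sum(bad for _, bad in bins.values())
    woe = {}
    iv = 0.0
    for label, (good, bad) in bins.items():
        dist_good = good / total_good  # share of all goods in this bin
        dist_bad = bad / total_bad     # share of all bads in this bin
        woe[label] = math.log(dist_good / dist_bad)
        iv += (dist_good - dist_bad) * woe[label]
    return woe, iv


# Hypothetical (good, bad) counts per age bin.
counts = {"<25": (100, 50), "25-49": (500, 40), "50+": (400, 10)}
woe, iv = woe_iv(counts)
print(woe)
print(iv)
```

Under this convention, a negative WOE marks a bin with a disproportionate share of bad cases, and each bin's contribution to IV is non-negative.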
IV. Models
The most commonly used credit scoring models include statistical models, machine learning models, and expert systems.
Expert systems utilize a set of predefined rules and expert knowledge to evaluate credit risk. These systems are more heuristic in nature and rely less on data-driven analysis, making them particularly useful in specialized situations where data availability is limited or incomplete.
The end objective of both statistical and machine learning models in credit scoring is to estimate the probability that a borrower will fail to meet their financial obligations. From this probability we obtain the odds of defaulting, which can then be transformed into a standardized credit score using scaling techniques.
Logistic regression is widely used in credit scoring due to its simplicity, transparency, and strong predictive power in assessing credit risk. Compared to more complex models, logistic regression offers straightforward insights into how various factors influence a borrower’s probability of default, making it a favored option among financial institutions.
Although more advanced models like machine learning have gained traction, logistic regression continues to be widely used because of its robustness, user-friendliness, and clarity in interpretation, as well as its broad regulatory acceptance. It effectively balances precision and transparency, making it a dependable choice in credit scoring.
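Assuming that all the selected predictor or X variables have been WOE coded, the estimated logistic regression can be specified as:
$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 \text{WOE}_1 + \beta_2 \text{WOE}_2 + \dots + \beta_k \text{WOE}_k$$
where p is the probability of default, WOE_i is the WOE-coded value of the i-th predictor, and the β's are the estimated coefficients.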
V. Log Odds, Scaling and Scorecards
Log odds are the natural logarithm of the odds of an event occurring, such as the odds of a borrower defaulting. Given a probability of default p, the odds of defaulting are calculated as p/(1−p), and the log odds are expressed as ln(p/(1−p)). Logistic regression models produce log odds directly, which are then transformed into more intuitive and usable credit scores.
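A small sketch of these conversions, using only the definitions above. The function names are illustrative.

```python
import math


def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)


def log_odds(p):
    """Natural log of the odds, i.e. the logit of p."""
    return math.log(odds(p))


def prob_from_log_odds(z):
    """Invert the logit: recover the probability from log odds z."""
    return 1.0 / (1.0 + math.exp(-z))


p_default = 0.2
print(odds(p_default))      # odds of defaulting
print(log_odds(p_default))  # negative, because p_default is below 0.5
print(prob_from_log_odds(log_odds(p_default)))
```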
Scaling is the process of converting log odds into a credit score that ranges within a predefined scale, such as 300 to 850 (like Fair Isaac Corporation (FICO) scores), which is more understandable for lenders and borrowers. This is done using a linear transformation where the log odds are scaled to fit within the desired score range.
VI. Which log odds to use?
In calculating credit scores, the formula typically uses the log odds of not defaulting to defaulting, even though popular models like logistic regression will directly output the log odds of defaulting to not defaulting (assuming the event of interest is credit default).
Hence, to obtain credit scores we use
$$\ln\left(\frac{1-p}{p}\right) = -\ln\left(\frac{p}{1-p}\right)$$
where p is the probability of default. This is done for the sake of consistency with credit scores: higher log odds imply better creditworthiness. A lower probability of default (p) results in a higher log odds value, which translates into a higher credit score.
Intuitively, when p is small, the ratio (1−p)/p is large, resulting in a large positive log value and, hence, higher scores.
The transformation employed to calculate credit scores is:
$$\text{Score} = \text{Offset} + \text{Factor} \times \ln\left(\frac{1-p}{p}\right)$$
The Offset is also called the Base Score or simply constant.
We can obtain the values of Offset and Factor in two ways:
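Method I
$$\text{Factor} = \frac{\text{Score Inc}}{\ln(\text{Odds Inc})} = \frac{\text{PDO}}{\ln(2)}$$
And
$$\text{Offset} = \text{Ref Score} - \text{Factor} \times \ln(\text{Ref Odds})$$
where,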
Ref Score = Reference Score associated with a given Ref Odds
Ref Odds = Reference Odds associated with a given Ref Score
Odds Inc = Reference Odds Increment. In our situation this will be 2, given that we specify the score points required to double the odds.
Score Inc = Score Increment
PDO = The number of points required for the odds to double.
Let’s use the default Scaling Options in the Scorecard node in SAS Model Studio as an example for calculating the Offset and Factor values.
Note: The Scorecard node is available in SAS Model Studio when the SAS Risk Modeling Add-on is also installed.
The screenshot above shows that the Ref Score = 200, Ref Odds = 50 and PDO = 20. As noted above, Odds Inc = 2. Therefore, a score of 200 corresponds to the odds 50:1 of being a good client, and a score of 220 corresponds to the odds 100:1 of being a good client.
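Hence,
$$\text{Factor} = \frac{20}{\ln(2)} \approx 28.8539$$
And,
$$\text{Offset} = 200 - 28.8539 \times \ln(50) \approx 87.1229$$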
Method II
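In the second method, we can use the fact that under the default settings
$$200 = \text{Offset} + \text{Factor} \times \ln(50)$$
And,
$$220 = \text{Offset} + \text{Factor} \times \ln(100)$$
Hence, after subtracting the smaller score-value equation from the larger score-value equation we get
$$20 = \text{Factor} \times \left(\ln(100) - \ln(50)\right) = \text{Factor} \times \ln(2) \quad\Rightarrow\quad \text{Factor} = \frac{20}{\ln(2)} \approx 28.8539$$
And,
$$\text{Offset} = 200 - 28.8539 \times \ln(50) \approx 87.1229$$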
which is the same as what we had obtained using Method I.
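Both methods can be checked numerically. The sketch below uses the default settings quoted above (Ref Score = 200, Ref Odds = 50, PDO = 20). The function names are illustrative and are not part of any SAS API.

```python
import math


def scaling_method_1(ref_score, ref_odds, pdo):
    """Method I: Factor from the PDO, Offset from the reference point."""
    factor = pdo / math.log(2)
    offset = ref_score - factor * math.log(ref_odds)
    return offset, factor


def scaling_method_2(score_a, odds_a, score_b, odds_b):
    """Method II: solve Score = Offset + Factor * ln(odds) through two points."""
    factor = (score_b - score_a) / (math.log(odds_b) - math.log(odds_a))
    offset = score_a - factor * math.log(odds_a)
    return offset, factor


offset_1, factor_1 = scaling_method_1(200, 50, 20)
# Doubling the odds from 50:1 to 100:1 adds PDO = 20 points.
offset_2, factor_2 = scaling_method_2(200, 50, 220, 100)

print(round(factor_1, 4), round(offset_1, 4))
print(round(factor_2, 4), round(offset_2, 4))
```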
Given the offset and factor values, we can now obtain the credit score corresponding to any value of the probability of default.
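For instance, here is a minimal sketch that maps a probability of default to a score and back. It assumes the default scaling discussed above (Ref Score = 200, Ref Odds = 50, PDO = 20) and odds expressed as good to bad; the function names are illustrative.

```python
import math

# Scaling constants under the assumed defaults: Ref Score 200, Ref Odds 50, PDO 20.
FACTOR = 20 / math.log(2)
OFFSET = 200 - FACTOR * math.log(50)


def score_from_pd(p):
    """Score = Offset + Factor * ln((1 - p) / p), with p the probability of default."""
    good_odds = (1.0 - p) / p
    return OFFSET + FACTOR * math.log(good_odds)


def pd_from_score(score):
    """Invert the scaling: recover the probability of default from a score."""
    good_odds = math.exp((score - OFFSET) / FACTOR)
    return 1.0 / (1.0 + good_odds)


for p in (0.5, 0.05, 1 / 51, 1 / 101):
    print(p, round(score_from_pd(p), 2))
```

A probability of default of 1/51 corresponds to good odds of 50:1 and therefore maps to the reference score of 200; halving the bad odds again (1/101) adds the 20 PDO points.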
VII. Allocation of Attribute Points
The next important question is how to distribute the overall credit score across the various attributes.
Let us assume that there are only two attributes that determine credit scores: Payment History and Credit History. The various bins associated with the two attributes are given in the table below.
Let’s also assume that we have built a logistic regression model using historical data to predict the odds of defaulting. Given the estimated parameters in a logistic regression model, we can exponentiate them to obtain the odds ratios. An odds ratio is the ratio of two odds and represents the change in odds from the reference bin to the specific bin of interest. Specifically,
And,
The points allocated to each bin indicate the level of risk associated with that category of the attribute. Bins with lower risk are given higher points, while those with higher risk receive fewer points. An individual's final credit score is calculated by summing the points from all relevant bins.
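One widely used way to allocate points, sketched here as an assumption and not necessarily the odds-ratio formulation used in this post, spreads the intercept and the Offset evenly across the n attributes: points = −(WOE × β + intercept / n) × Factor + Offset / n. The coefficients and WOE values below are hypothetical, the model is assumed to predict the log odds of default on WOE-coded inputs, and WOE is assumed to be ln(share of goods / share of bads).

```python
import math

# Scaling constants under the assumed defaults: Ref Score 200, Ref Odds 50, PDO 20.
FACTOR = 20 / math.log(2)
OFFSET = 200 - FACTOR * math.log(50)

# Hypothetical fitted model for the log odds of DEFAULT on WOE-coded inputs.
INTERCEPT = -3.0
COEFS = {"payment_history": -0.9, "credit_history": -0.7}

# Hypothetical WOE values of the bins this applicant falls into.
APPLICANT_WOE = {"payment_history": 0.8, "credit_history": -0.4}


def attribute_points(woe, beta, intercept, n_attrs):
    """Points for one bin; intercept and Offset are split evenly across attributes."""
    return -(woe * beta + intercept / n_attrs) * FACTOR + OFFSET / n_attrs


n = len(COEFS)
points = {
    name: attribute_points(APPLICANT_WOE[name], COEFS[name], INTERCEPT, n)
    for name in COEFS
}
total_score = sum(points.values())

# Cross-check: score computed directly from the model's log odds of default.
logit_default = INTERCEPT + sum(COEFS[k] * APPLICANT_WOE[k] for k in COEFS)
direct_score = OFFSET - FACTOR * logit_default

print(points)
print(total_score, direct_score)
```

The summed attribute points equal the score obtained by scaling the model's log odds directly, and the lower-risk bin (positive WOE) receives more points than the higher-risk one.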
So, if we assume that an individual's credit attributes fall into the following bins, then the final credit score would be:
Various credit risk modeling vendors (in the US), including FICO, Experian, and Equifax, provide scorecards that link credit scores to the probability of default (PD). As noted in the previous section, this relationship is determined by the reference odds, reference score, and points to double the odds (PDO). Below is an overview of how these parameters vary among different vendors:
The above list is by no means comprehensive, but it serves to illustrate that all vendors generally adhere to a similar methodology.
SAS Model Studio in SAS Viya provides pipeline templates (only if SAS Risk Modeling Add-on is installed) for both PD Application Scorecards and for PD Behavioral Scorecards.
A PD Application Scorecard is employed during the initial loan application stage to assess the risk associated with new applicants. It analyzes factors like credit history, income, employment status, and demographic details. Its primary objective is to estimate the probability of default for individuals who are seeking credit for the first time.
Conversely, a PD Behavioral Scorecard is designed for customers with an established credit history at the institution. It utilizes data related to payment patterns, account activity, and recent interactions with existing credit products. This scorecard is frequently updated to account for shifts in a customer's financial behavior, offering a more precise forecast of potential future defaults.
For more information on SAS Risk Modeling visit the software information page here.
For more information on curated learnings paths on SAS Solutions and SAS Viya, visit the SAS Training page. You can also browse the catalog of SAS courses here.
Find more articles from SAS Global Enablement and Learning here.