Stress Without a Past: Building Forward-Looking Books in a Backward-Looking World

Financial stress testing has traditionally relied on the comfort of the past – abundant loan histories, detailed default data, and steady macroeconomic indicators. Yet, the financial world is changing faster than data can keep up. New product lines, regulatory shifts, and newer borrower profiles often lack the historical depth required to support robust model estimation.

This raises a core question: How do analysts stress-test portfolios that barely exist?

In other words, how do we construct forward-looking stress tests when the backward-looking data needed for calibration is missing or incomplete?

This post explores this challenge through the concept of synthetic front book generation, where new loan portfolios – those without deep histories – are simulated and stressed based on modeled or proxy data. The end objective is to showcase how financial institutions can project credit losses even in the absence of past performance.

The Problem: Stress Testing Without History

Stress testing frameworks such as IFRS 9 (International Financial Reporting Standard 9) or CCAR (Comprehensive Capital Analysis and Review) rely on historical patterns to link macroeconomic shocks (like GDP contraction or interest rate hikes) to portfolio performance metrics (like Default Rates and Loss Given Default (LGD)).

But front-book portfolios – newly originated loans or segments – present three challenges:

Data sparsity: New products lack historical default or loss data.
Changing borrower profiles: Emerging customer bases differ from legacy ones.
Structural shifts: Macroeconomic linkages (such as the relationship between GDP growth or unemployment and default rates) and underwriting standards (the rules lenders use to assess borrower risk) evolve over time.

Without data continuity, traditional regression-based approaches falter because there is too little history to estimate them on. The key is to synthesize forward-looking exposure data: what might these new loans look like, and how might they behave under stress?

The Concept of Synthetic Front Books

A synthetic front book is a modeled representation of a future or newly originated loan portfolio. It mimics the statistical and economic properties of real-world portfolios by combining:

Historical back-book distributions (for reference)
Planned origination assumptions (for volume, loan size, and product mix)
Expert judgment or model overlays (manual adjustments applied to PD (Probability of Default), LGD, and EAD (Exposure at Default) to capture extreme conditions such as those in a stress test)

Once generated, this synthetic portfolio can be subjected to scenario-based shocks, yielding stress-test results even when actual observations are unavailable.

Synthetic Front Book Data Generation Approaches

The Early Phase

Early synthetic data generation relied on rule-based systems, where deterministic algorithms mirrored credit policy logic. For example, if the loan-to-value (LTV) ratio exceeded 80%, the model might raise the borrower’s PD or cap the loan amount based on income. An LTV above 80% means the borrower is financing more than 80% of the property’s value with borrowed money and contributing less than 20% as their own equity (down payment). With less equity at stake, the lender’s risk exposure is higher, so the model treats these loans as riskier, either by increasing the default likelihood or by restricting how much can be borrowed.

While effective for producing consistent datasets, these methods failed to replicate the intricate dynamics of financial markets.
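A rule of the kind described above can be sketched in a few lines. This is only an illustration: the 50% PD uplift, the 4x income cap, the dataset name, and the three sample loans are all assumptions, not values from any actual credit policy.

```sas
/* Hypothetical rule-based generator: thresholds and uplifts are illustrative */
data rule_based_loans;
    input account_id $ property_value loan_amt income base_PD;
    ltv = loan_amt / property_value;

    /* Rule 1: an LTV above 80% raises PD by an assumed 50% uplift */
    if ltv > 0.80 then PD = base_PD * 1.5;
    else PD = base_PD;

    /* Rule 2: cap the loan at an assumed 4x annual income */
    capped_amt = min(loan_amt, 4 * income);
    datalines;
R001 200000 150000 60000 0.010
R002 200000 180000 60000 0.010
R003 150000 135000 30000 0.015
;
run;

proc print data=rule_based_loans noobs;
    title "Rule-Based Synthetic Loans (Illustrative)";
run;
```

Every record that passes through the same rules gets the same treatment, which is what makes the output consistent and also what makes it rigid.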

The Rise of Statistical and Simulation-Based Methods

To overcome these limitations, probabilistic approaches such as Monte Carlo simulations and copula-based models were introduced.

Monte Carlo simulation became central to synthetic data generation under uncertainty. By repeatedly sampling from probability distributions of key risk drivers – like income, loan amount, and macroeconomic factors – analysts could construct a large number of hypothetical front book portfolios, each representing a potential future scenario.
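As a minimal sketch of this idea, the code below draws 1,000 hypothetical front books of 10 loans each and summarizes the spread of portfolio ECL. The distributions, their parameters, and the dataset names (mc_front_books, mc_totals) are assumptions chosen for illustration only.

```sas
/* Illustrative Monte Carlo: all distribution parameters are assumptions */
data mc_front_books;
    call streaminit(2025);
    do sim = 1 to 1000;            /* 1,000 hypothetical front books */
        do loan = 1 to 10;         /* 10 loans per book */
            loan_amt = 50000 * (1 + rand("Normal", 0, 0.10));
            PD       = 0.02 + 0.02 * rand("Uniform");   /* 2% to 4% */
            LGD      = 0.30 + 0.10 * rand("Uniform");   /* 30% to 40% */
            ECL      = PD * LGD * loan_amt;
            output;
        end;
    end;
run;

/* Total ECL per simulated book */
proc means data=mc_front_books noprint nway;
    class sim;
    var ECL;
    output out=mc_totals(drop=_type_ _freq_) sum=Total_ECL;
run;

/* Distribution of portfolio ECL across the simulated books */
proc means data=mc_totals mean std p5 median p95;
    var Total_ECL;
    title "Simulated Portfolio ECL Across 1,000 Front Books";
run;
```

The percentiles of Total_ECL give a range of plausible outcomes rather than a single point estimate.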

Copula-based models further enhanced realism by capturing complex, nonlinear relationships between variables. Unlike a single linear correlation coefficient, copulas describe the full joint behavior of the variables – such as how defaults in one segment tend to coincide with defaults in another – using dependence functions like the Gaussian copula or the t-copula, the latter also allowing for joint extreme outcomes (tail dependence).

Note: A copula is a mathematical function that connects (or “couples”) the marginal distributions of individual variables to form their joint multivariate distribution.
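The coupling step can be illustrated with a two-variable Gaussian copula built from Base SAS functions. This is a sketch under stated assumptions: the dependence parameter rho, the lognormal income marginal, and the beta PD marginal are hypothetical choices, not calibrated values.

```sas
/* Gaussian copula sketch: rho and both marginals are assumed values */
data copula_sample;
    call streaminit(4321);
    rho = -0.5;                  /* assumed dependence: higher income, lower PD */
    do i = 1 to 1000;
        /* Step 1: draw correlated standard normals */
        z1 = rand("Normal");
        z2 = rho * z1 + sqrt(1 - rho**2) * rand("Normal");

        /* Step 2: map to uniforms; the pair (u1, u2) is the copula sample */
        u1 = cdf("Normal", z1);
        u2 = cdf("Normal", z2);

        /* Step 3: apply each marginal's inverse CDF */
        income = exp(10.8 + 0.4 * quantile("Normal", u1));  /* lognormal marginal */
        PD     = quantile("Beta", u2, 2, 60);               /* beta marginal, mean about 3.2% */
        output;
    end;
    drop i z1 z2 rho;
run;

/* Rank correlation survives the change of marginals */
proc corr data=copula_sample spearman;
    var income PD;
run;
```

The marginals can be swapped without touching the dependence structure, which is the practical appeal of the approach.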

AI-Driven Synthetic Data

Today, artificial intelligence has revolutionized synthetic data generation. Advanced generative models – such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and diffusion models – learn the joint distribution of financial variables directly from data. These models can produce highly realistic synthetic portfolios that preserve nonlinear relationships, capture rare events, and adapt to stress scenarios.

Building a Synthetic Front Book

Let us walk through a simple example to illustrate an end-to-end process. We will create a 10-loan synthetic front book for retail exposures, apply macroeconomic stress factors, and compute expected credit losses (ECLs) under base, adverse, and severe scenarios.

Step 1: Define the Back-Book Structure

Let us first define a small reference back book that represents the characteristics of past loans.

data back_book;
input account_id $ product $ loan_amt PD LGD;
datalines;
B001 Home  90000 0.01 0.20
B002 Auto  40000 0.02 0.30
B003 Pers  15000 0.05 0.45
B004 Home 120000 0.015 0.25
B005 Auto  30000 0.03 0.35
B006 Pers  10000 0.06 0.50
B007 Home  95000 0.012 0.22
B008 Auto  50000 0.025 0.33
B009 Pers  20000 0.04 0.40
B010 Home 110000 0.013 0.21
;
run;

proc print data=back_book noobs;
title "Historical Back Book Portfolio";
run;

This small dataset contains historical PD and LGD values for three product types.
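Before sampling, it can help to summarize the back book into per-product ranges that serve as calibration anchors. The sketch below assumes the back_book table created above; the output table name (back_book_ranges) and its column names are illustrative.

```sas
/* Per-product PD and LGD ranges and average loan size from the back book */
proc means data=back_book noprint nway;
    class product;
    var PD LGD loan_amt;
    output out=back_book_ranges(drop=_type_ _freq_)
        min(PD LGD)=PD_min LGD_min
        max(PD LGD)=PD_max LGD_max
        mean(loan_amt)=avg_loan_amt;
run;

proc print data=back_book_ranges noobs;
    title "Back-Book Calibration Ranges by Product";
run;
```

These ranges can then be compared with the sampling bounds used for the front book in the next step.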

Step 2: Define Origination Plan for the Front Book

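Next, assume the origination plan for the next quarter is as follows (figures as used in the code below):

Home: 4 loans, base loan amount of 100,000
Auto: 3 loans, base loan amount of 50,000
Pers: 3 loans, base loan amount of 20,000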

We now create the synthetic front book based on these assumptions.

data front_book_raw;
length account_id $8 product $4;
call streaminit(12345);
do product = "Home", "Auto", "Pers";
/* Planned loan count and base loan amount per product */
select (product);
when ("Home") do; n=4; base_amt=100000; end;
when ("Auto") do; n=3; base_amt=50000; end;
when ("Pers") do; n=3; base_amt=20000; end;
otherwise;
end;

do i=1 to n;
/* Running counter gives each loan a unique ID; _n_ stays at 1 in this step */
seq + 1;
account_id = cats("F", put(seq, z3.));
/* Sample PD, LGD from historical range with small variation */
/* IFN returns numeric values (IFC returns character) */
PD = rand("Uniform") * (0.02) + (ifn(product="Home",0.01,
ifn(product="Auto",0.02,0.04)));
LGD = rand("Uniform") * (0.05) + (ifn(product="Home",0.22,
ifn(product="Auto",0.33,0.45)));
loan_amt = base_amt * (1 + rand("Normal", 0, 0.05)); /* 5% volatility */
output;
end;
end;
drop n base_amt i seq;
run;

proc print data=front_book_raw noobs;
title "Synthetic Front Book Portfolio (10 Loans)";
run;

This code:

Generates 10 new loans (4 home loans, 3 auto loans, 3 personal loans).
Randomly assigns PD and LGD near product-specific benchmarks.
Creates loan amounts with small random variation.

At this point, we have a forward-looking exposure profile – even though no real data exists yet.
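A quick sanity check can confirm that the generated book matches the plan before any stress is applied. The sketch below assumes the front_book_raw table from this step exists; the column aliases (n_loans, n_unique_ids) are illustrative.

```sas
/* Loan count should equal the number of distinct account IDs */
proc sql;
    select count(*) as n_loans,
           count(distinct account_id) as n_unique_ids
    from front_book_raw;
quit;

/* Sampled parameters should sit inside the intended product ranges */
proc means data=front_book_raw n min max mean;
    class product;
    var PD LGD loan_amt;
    title "Front Book Sanity Check by Product";
run;
```

If the two counts differ, account IDs are being reused across loans, which would distort any later join or aggregation keyed on account_id.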

Step 3: Define Stress Scenarios

Next, we define macroeconomic scenarios that scale PDs and LGDs to simulate worsening conditions.

data scenarios;
input scenario $ PD_Factor LGD_Factor;
datalines;
Base 1.0 1.0
Adverse 1.5 1.1
Severe 2.0 1.2
;
run;

proc print data=scenarios noobs;
title "Macroeconomic Stress Scenarios";
run;

Step 4: Apply Stress and Calculate ECL

For simplicity, ECL = PD × LGD × EAD.

We multiply PD and LGD by scenario factors to simulate stress impacts.

proc sql;
create table stressed_front_book as
select a.account_id, a.product, a.loan_amt as EAD,
a.PD, a.LGD,
b.scenario,
a.PD * b.PD_Factor as stressed_PD,
a.LGD * b.LGD_Factor as stressed_LGD,
(a.PD * b.PD_Factor * a.LGD * b.LGD_Factor * a.loan_amt) as ECL
from front_book_raw as a, scenarios as b;
quit;

proc print data=stressed_front_book;
title "Front Book ECL under Base, Adverse, and Severe Scenarios";
run;

Each loan now has three ECL outcomes – one per scenario.
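To make the arithmetic concrete, here is a worked example for a single hypothetical loan with EAD = 100,000, PD = 2%, and LGD = 25%, using the scenario factors from Step 3. The min(1, ...) cap is an addition that is not part of the Step 4 query; it only matters when large factors would push a stressed parameter above 1.

```sas
/* Worked ECL example for one assumed loan under the three scenarios */
data ecl_worked_example;
    EAD = 100000; PD = 0.02; LGD = 0.25;      /* assumed inputs */
    length scenario $8;
    do scenario = "Base", "Adverse", "Severe";
        select (scenario);
            when ("Base")    do; PD_Factor = 1.0; LGD_Factor = 1.0; end;
            when ("Adverse") do; PD_Factor = 1.5; LGD_Factor = 1.1; end;
            otherwise        do; PD_Factor = 2.0; LGD_Factor = 1.2; end;
        end;
        /* Cap stressed parameters at 1 so they remain valid rates */
        stressed_PD  = min(1, PD  * PD_Factor);
        stressed_LGD = min(1, LGD * LGD_Factor);
        ECL = stressed_PD * stressed_LGD * EAD;   /* 500, 825, 1200 */
        output;
    end;
run;

proc print data=ecl_worked_example noobs;
    title "Worked ECL Example for One Loan";
run;
```

The Adverse result is 1.65 times the Base value and the Severe result is 2.4 times, because the PD and LGD factors multiply.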

Step 5: Portfolio-Level Summary

Finally, we summarize the results by product and scenario.

proc sql;
create table portfolio_summary as
select product, scenario,
count(*) as Num_Loans,
sum(EAD) as Total_EAD,
sum(ECL) as Total_ECL,
calculated Total_ECL / calculated Total_EAD as ECL_Rate
from stressed_front_book
group by product, scenario;
quit;

data portfolio_sorted;
set portfolio_summary;
select (Scenario);
when ('Base') ScenarioOrder = 1;
when ('Adverse') ScenarioOrder = 2;
when ('Severe') ScenarioOrder = 3;
otherwise ScenarioOrder = 99;
end;
run;

proc sort data=portfolio_sorted;
by Product ScenarioOrder;
run;

proc print data=portfolio_sorted (drop=ScenarioOrder) noobs;
title "Portfolio-Level ECL Summary by Product and Scenario";
run;

[Figure: Portfolio-level ECL summary by product and scenario]

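The result provides a forward-looking stress view, showing how ECL rates increase across scenarios. Because ECL is the product of PD and LGD, the scenario factors compound: every product’s ECL rate is 1.65 times its Base level under Adverse (1.5 × 1.1) and 2.4 times under Severe (2.0 × 1.2).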
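The size of the uplift can be read directly from the summary table by dividing each scenario’s ECL rate by the Base rate for the same product. This sketch assumes the portfolio_summary table from the query above; the output table name (ecl_uplift) and the column Multiple_of_Base are illustrative.

```sas
/* Ratio of each scenario's ECL rate to the Base rate, by product */
proc sql;
    create table ecl_uplift as
    select s.product, s.scenario,
           s.ECL_Rate,
           s.ECL_Rate / b.ECL_Rate as Multiple_of_Base
    from portfolio_summary as s
         inner join portfolio_summary as b
         on s.product = b.product and b.scenario = "Base"
    order by s.product, Multiple_of_Base;
quit;

proc print data=ecl_uplift noobs;
    title "ECL Rate as a Multiple of Base";
run;
```

With uniform scenario factors the multiple is the same for every product; product-specific factors would show up here as different multiples.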

Step 6: Visualizing Synthetic Front Book Stress Results

ECL Rate by Scenario

/* ECL Rate by Scenario (Bar Chart) */

title "ECL Rate by Product and Scenario";
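proc sgplot data=portfolio_sorted;
/* ECL_Rate is stored as a fraction, so display it as a percentage */
format ECL_Rate percent8.2;
/* portfolio_sorted plus grouporder=data keeps Base, Adverse, Severe in order */
vbar product / response=ECL_Rate group=scenario groupdisplay=cluster grouporder=data datalabel;
yaxis label="ECL Rate" grid;
xaxis label="Product Type";
run;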

[Figure: ECL rate by product and scenario]

The ECL rate increases progressively from the Base to Adverse to Severe scenarios, indicating a consistent rise in expected losses under worsening economic conditions. In absolute terms, unsecured personal loans show the largest increase because they start from the highest base ECL rate, while home loans move the least. In relative terms every product scales by the same multiple, because the scenario factors are applied uniformly across products.

Total ECL by Scenario

/* Total ECL by Scenario (Stacked Column Chart) */

title "Total Expected Credit Loss by Scenario";
proc sgplot data=portfolio_summary;
vbar scenario / response=Total_ECL group=product groupdisplay=stack datalabel;
yaxis label="Total ECL (₹)" grid;
/* Single XAXIS statement so the label and the ordering are applied together */
xaxis label="Scenario" discreteorder=data values=("Base" "Adverse" "Severe");
run;

The stacked bar chart highlights the overall portfolio composition and shows how losses evolve across products. Under the Severe scenario, total losses more than double, reaching 2.4 times the Base level, illustrating the amplified impact of economic stress. Such visualizations allow us to clearly assess the capital impact and risk concentration across different loan products.

Scenario Sensitivity Table

/* Scenario Sensitivity Table (Heat Map) */

title "Scenario Sensitivity: ECL Rate Heat Map";
proc sgplot data=portfolio_summary;
heatmap x=scenario y=product / colorresponse=ECL_Rate colormodel=(paoy bibg);
gradlegend / title="ECL Rate";
xaxis discreteorder=data values=("Base" "Adverse" "Severe");
run;

The heat map illustrates the relative changes in ECL rates across three macroeconomic scenarios – Base, Adverse, and Severe – for three key retail product segments: Auto, Home, and Personal (Pers) loans. The color gradient, from lighter to darker shades, represents increasing ECL rates, providing a visual summary of credit risk sensitivity under varying stress conditions.

Personal loans show the largest absolute increase in ECL rate under the Severe scenario, because they start from the highest base PD and LGD. This is consistent with the higher vulnerability of unsecured retail exposures, which typically respond more adversely to rising unemployment or income shocks.

Home loans show the smallest absolute movement across scenarios, reflecting the lower base PD and LGD associated with stronger collateral coverage.

Auto loans sit in between, consistent with partial collateralization but greater dependency on consumer confidence and interest rate shifts.

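The visualization shows that stress effects compound rather than add: because PD and LGD are shocked together, ECL rises faster than either factor alone. In this example the multipliers are uniform across products, so a production framework would need product-specific, macro-driven sensitivities to capture the non-linear scenario impacts that regulatory stress testing programs (e.g., ICAAP or CCAR) require.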

Loan-Level Distribution

[Figure: Distribution of loan-level ECL under the Adverse scenario]

The histogram illustrates the distribution of loan-level ECLs under an adverse macroeconomic scenario. The ECLs are centered around the $800–$900 range, showing a moderately concentrated distribution with a near-symmetric shape. Most loans exhibit moderate credit deterioration, while only a small portion falls in the higher-loss tail, indicating limited extreme risk.
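The code behind this histogram is not shown in the post. A minimal sketch that would produce a comparable chart, assuming the stressed_front_book table from Step 4 exists, is given below; the axis labels and title are illustrative. Changing the WHERE clause to "Severe" gives the second chart.

```sas
/* Loan-level ECL distribution for one scenario, with a normal curve overlay */
proc sgplot data=stressed_front_book;
    where scenario = "Adverse";
    histogram ECL;
    density ECL / type=normal;
    xaxis label="Loan-Level ECL";
    yaxis label="Percent of Loans";
    title "Distribution of Loan-Level ECL: Adverse Scenario";
run;
```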

The overlay of the normal curve suggests that, under this scenario, portfolio losses remain broadly aligned with statistical expectations, without significant skewness. This implies that while the adverse environment exerts upward pressure on credit losses, the portfolio’s credit quality remains contained, and broader systemic risks have not yet materialized to cause large scale instability.

The adverse scenario acts as an early stress checkpoint. The contained dispersion of ECLs indicates adequate borrower resilience and suggests that existing credit buffers and provisioning levels remain sufficient. However, the concentration near the mean highlights the importance of continued monitoring, as even mild macro shocks could quickly shift exposures into higher loss brackets if adverse trends persist.

[Figure: Distribution of loan-level ECL under the Severe scenario]

The distribution of ECLs under a severe stress environment indicates that most loans cluster around moderate ECL values (roughly $1,000–$1,500), with relatively few exposures at the tails. The normal curve overlay suggests a near-symmetric but slightly right-skewed distribution – implying that while most borrowers experience similar levels of credit deterioration, a few loans contribute disproportionately higher losses.

The emergence of a heavier right tail under severe stress warrants closer monitoring of high-risk segments. Strengthening early warning frameworks and targeted loan restructuring for vulnerable borrowers could mitigate potential tail losses before they escalate into portfolio-wide instability.

Comparative Summary

Comparing the two scenarios, the progression from adverse to severe shows a clear stress transmission mechanism: as the scenario multipliers grow, loan-level losses shift upward and spread out, and the loans with the highest base ECL move furthest in absolute terms. This widening of the right tail reinforces the importance of front book monitoring, synthetic scenario generation, and model recalibration within the stress testing governance framework.

Final Thoughts

The above process demonstrates how we can replicate stress testing logic even without access to real data. By constructing a synthetic front book, it becomes possible to build a credible framework for quantifying risk exposure in the absence of historical evidence.

A key takeaway from this approach is the importance of forward-looking calibration. Even without historical data, analysts can infer plausible PD and LGD ranges using peer benchmarks or back-book proxies.

Another insight is scenario sensitivity – using scenario factors such as 1.0, 1.5, or 2.0 helps illustrate risk elasticity, or how sensitive the front book is to changes in macroeconomic conditions.

The concept of portfolio diversification also emerges clearly, as different loan products react differently under stress: secured products like home loans start from a lower base LGD and therefore show a smaller absolute LGD increase than unsecured products like personal loans, even when the same scenario factor is applied.

Finally, this framework is highly scalable to real-world portfolios. In SAS environments, the same logic can extend to thousands of loans, with macroeconomic inputs drawn directly from forecast models or central bank projections.
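One possible way to draw scenario factors from a macro forecast, rather than fixing them by hand, is to shift PD on the logit scale in proportion to a forecast variable. The sketch below is only illustrative: the sensitivity beta, the base PD, and the unemployment changes per scenario are hypothetical placeholders, not estimates from any model.

```sas
/* Hypothetical macro link: logit(PD) shifts with the change in unemployment */
data macro_stressed_pd;
    base_PD = 0.03;     /* assumed base PD */
    beta    = 0.25;     /* assumed logit shift per 1 percentage point of unemployment */
    length scenario $8;
    input scenario $ unemp_change;

    logit_base  = log(base_PD / (1 - base_PD));
    /* Inverse logit keeps the stressed PD strictly between 0 and 1 */
    stressed_PD = 1 / (1 + exp(-(logit_base + beta * unemp_change)));
    PD_Factor   = stressed_PD / base_PD;
    datalines;
Base 0
Adverse 2
Severe 4
;
run;

proc print data=macro_stressed_pd noobs;
    title "PD Factors Implied by a Hypothetical Macro Link";
run;
```

With these placeholder inputs the implied PD factors come out at roughly 1.6 and 2.6, and unlike a plain multiplier the stressed PD can never exceed 1.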

Additional Information

For more information on the SAS Stress Testing solution, visit the software information page here. For more information on curated learning paths on SAS Solutions and SAS Viya, visit the SAS Training page. You can also browse the catalog of SAS courses here.

Find more articles from SAS Global Enablement and Learning here.