To skip the long article, watch the presentation video here:
Using R for Regulatory Stress Testing Modeling (21’11”)
I. Project Overview
- 2008 financial meltdown caused by asset securitization and overleveraging
- Response came in the form of regulatory requirements for Bank Holding Companies (BHCs)
- Dodd-Frank Act supervisory stress testing (DFAST): BHCs with $10-$50 BN in total assets must produce forward-looking, in-house stress tests of their capital structure
- Comprehensive Capital Analysis and Review (CCAR): beyond the DFAST requirements, BHCs with more than $50 BN in total assets are also subject to Fed-conducted stress tests whose results must be publicly disclosed
Both programs assess whether:
- BHCs possess adequate capital to sustain macro and market shocks while still meeting lending needs, without requiring government capital injections
- Capital positions fall below ratio thresholds under three hypothetical scenarios: Baseline, Adverse, and Severely Adverse
S&P Global Market Intelligence, in collaboration with the Columbia University MS&E program, has undertaken the development of a solution to model these risks.
II. Methodology
The project followed multiple iterations of exploratory data analysis, data transformation and imputation, modeling and analysis, testing, and business fine-tuning.

Figure 1 - Methodology
III. Data Description
In light of the Capital and Loss Assessment under Stress Scenarios (CLASS) model, prepared by Federal Reserve Bank of New York staff, the following proxies were chosen for modeling:
Ratios Required by Federal Reserve | Proxies
---|---
Probability of Default (PD) | Probability of Default Model
Loss Given Default (LGD) | Loss Given Default Model
Exposure at Default (EAD) | Growth Rate Model
Pre-Provision Net Revenue (PPNR) |
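
For context (a textbook credit-risk identity, not something specific to this project): the three Fed-required parameters combine into an expected-loss estimate, EL = PD × LGD × EAD, so forecasting each proxy under a scenario yields a forward-looking loss figure for that scenario.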
The data was transformed into panel form by aggregating data points across all banks together with the macro-economic data. With the panel data, the goal is to forecast the ratios for the pre-defined proxies from 2016 to 2020.
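
A minimal sketch of this panel construction in R, assuming hypothetical inputs: `bank_df` (one row per bank per quarter of financial-statement ratios) and `macro_df` (one row per quarter of macro series). All column names are illustrative placeholders, not the project's actual schema.

```r
library(dplyr)

# bank_df: one row per bank per quarter; macro_df: one row per quarter.
panel <- bank_df %>%
  group_by(quarter) %>%                                   # aggregate across banks
  summarise(across(c(pd_ratio, lgd_ratio, growth_rate),
                   ~ mean(.x, na.rm = TRUE)),
            .groups = "drop") %>%
  left_join(macro_df, by = "quarter")                     # attach macro predictors
```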

Figure 2 - Metadata of the Data Set
IV. Exploratory Data Analysis & Data Transformation
As mentioned in the Data Description, the predictors are derived from the macro-economic data, while the “y” values are derived from the financial statement data. Note that some key assumptions were made in the data transformation.
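
As an illustration of the kind of predictor derivation involved (the variables `gdp`, `hpi`, and `unemployment` are hypothetical, not the project's actual selection), macro levels can be converted to quarter-over-quarter changes to move the series toward stationarity:

```r
# Convert macro levels to quarter-over-quarter changes; diff() shortens the
# series by one, so an NA is prepended to keep the rows aligned.
macro_df <- within(macro_df, {
  gdp_qoq   <- c(NA, diff(log(gdp)))        # GDP log growth
  hpi_qoq   <- c(NA, diff(log(hpi)))        # house price index log growth
  unemp_chg <- c(NA, diff(unemployment))    # change in unemployment rate
})
```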

Figure 3 - Variable Selection from Macro Data

Figure 4 - Data Transformation of Historical Bank Data
V. Model Selection
ARIMA was chosen as the champion model: the competing models violated the i.i.d. principle (independent and identically distributed errors), as their error terms showed strong autocorrelation.
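
A minimal sketch of the kind of residual diagnostic behind this choice, reusing the hypothetical `panel` and column names from the earlier sketches: if a plain regression leaves autocorrelated errors, the i.i.d. assumption fails and an ARIMA error structure is warranted.

```r
# Fit a plain linear regression, then test its residuals for autocorrelation.
fit <- lm(pd_ratio ~ gdp_qoq + unemp_chg, data = panel)

acf(resid(fit))                                   # visual autocorrelation check
Box.test(resid(fit), lag = 8, type = "Ljung-Box") # formal test
# A small p-value rejects "no autocorrelation": the errors are not i.i.d.,
# so a pure regression model is inadequate.
```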

Figure 5 - ARIMA as the Champion Model
VI. Model & Results
To dive deeper into the model: a linear regression was fit first to capture the trend; an ARIMA model was then fit on the residuals; finally, forecasts were produced via the Kalman filter.
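
A minimal sketch of this two-stage fit, again on the hypothetical `panel` from above. The forecast package's `auto.arima` and `forecast` functions are an assumption made for brevity; R evaluates ARIMA models in state-space form with a Kalman filter, which is one plausible reading of the forecasting step described.

```r
library(forecast)

# Illustrative quarterly series; the start date and target column are placeholders.
y     <- ts(panel$pd_ratio, start = c(2000, 1), frequency = 4)
t_idx <- seq_along(y)

trend_fit <- lm(y ~ t_idx)                   # stage 1: linear trend
res_fit   <- auto.arima(resid(trend_fit))    # stage 2: ARIMA on the residuals

h        <- 20                               # 20 quarters: 2016 through 2020
trend_fc <- predict(trend_fit,
                    newdata = data.frame(t_idx = max(t_idx) + 1:h))
res_fc   <- forecast(res_fit, h = h)         # Kalman-filter-based prediction

combined <- trend_fc + as.numeric(res_fc$mean)  # recombine trend + residual paths
```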

Figure 6 - Linear Model with ARIMA
Sample prediction results (Multifamily Loans):

Figure 7 - Sample Results for Multifamily Loans
VII. Challenges & Learnings
1. Challenges
- Working with a small data set
  - Lack of complete historical data and a small number of data points
  - Prediction results might not be statistically significant
- Making key assumptions
  - Choosing proxies for modeling
  - Enforcing a seasonal structure on the PD & LGD models
- Evolving regulatory landscape
  - New efforts to deregulate banks could change modeling requirements and needs
2. Learnings
- Data science topics
  - Data Transformation & Imputation
  - Variable Selection Framework
  - Exploratory Data Analysis
  - Time Series Analysis
- Credit risk management topics
  - Stress-testing general knowledge
  - Corporate credit risk analysis