Credit Risk Modeling with Machine Learning

Cost of risk is one of the biggest components in banks’ cost structure. Thus, even a slight improvement in credit risk modelling can translate in huge savings. That’s why machine learning is often implemented in this area.

We would like to share with you some insights from one of our projects, where we applied machine learning to increase credit scoring performance.To illustrate our insights we selected a random pool of 10 000 applications.

How the regular process works

Loan applications are usually assessed through a credit score model, which is most often based on a logistic regression (LR). It is trained on historical data, such as credit history. The model assesses the importance of every attribute provided and translates them into a prediction.

The main limitation of such a model is that it can take into account only linear dependencies between input variables and the predicted variable. On the other hand, it is this very property that makes logistic regression so interpretable. LR is in widespread used in credit risk modelling.

Credit scoring from a logistic regression model

What machine learning brings to the table

Machine learning enables the utilization of more advanced modeling techniques, such as decision trees and neural networks. This introduces non-linearities to the model and allows to detect more complex dependencies between the attributes. We decided to use an XGBoost model fed with features selected with the use of a method called permutation importance.

Credit scoring from tree-based model

However, ML models are usually so sophisticated that they are hard to interpret. Since a lack of interpretability would be a serious issue in such a highly regulated field as credit risk assessment, we opted to combine XGBoost and logistic regression.

Combining the models

We used both scoring engines – logistic regression and the ML based one – to assess all of the loan applications

With a clear correlation between the two assessment approaches, a high score in one model would likely mean a high score in the other.

Loan applications assessed by 2 models

In the original approach, logistic regression was used to assess applications. The acceptance level was set around 60% and the risk resulted at 1%

Initial credit application split (acceptance to portfolio risk)

If we decrease the threshold by a couple of points, the acceptance level hits 70% while the risk jumps to 1,5%

Credit applications’ split after lowering the threshold

We next applied a threshold for an ML model, allowing us to get an acceptance percentage to the original level (60%) while bringing the risk down to 0,75% that is by 25% lower than the risk level resulting from only traditional approach.

Credit applications’ split after applying Machine Learning

Summary

Machine learning is often seen as difficult to apply in banking due to the sheer amount of regulation the industry faces. The facts don’t necessarily back this up. ML is successfully used in numerous, heavily regulated industries. The example above is just one more example of how. Thanks to this innovative approach it is possible to increase the sustainability of the loans sector and make loans even more affordable to bank customers. There’s nothing artificial about that kind of intelligence.

Contact us

I consent to receive commercial information about deepsense.ai sp. z o.o., including details about its products and offer, through email communication.*

The administrator of the personal data provided by you in the registration form is deepsense.ai sp. z o.o., headquartered at al. Jerozolimskie 44, 00-024 Warsaw, Poland. Your personal data will be processed for the purpose of directing marketing content to you.

Detailed information about the processing of your personal data, including your rights, can be found in our privacy policy.

* This consent is required to receive email communication from deepsense.ai sp. z o.o. regarding the company and its offerings.

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Locations

United States of America

deepsense.ai, Inc.
2100 Geng Road, Suite 210
Palo Alto, CA 94303
United States of America

Poland

Let us know how we can help

Our service offerings
contact@deepsense.ai

Media relations
media@deepsense.ai

Using machine learning in credit risk modelling

How the regular process works

What machine learning brings to the table

Combining the models

Summary

Contact us

Locations

Let us know how we can help

Services

Resources

About us

Support

Join our community