
Bank Customer Churn Prediction

A classification system that identifies banking customers at risk of churning, providing relationship managers with early warning signals and the feature-level insight needed to design effective retention interventions.

Type: Classification
Domain: Retail Banking
Methods: Ensemble, SHAP
Status: Completed

Figure: Churn prediction (SHAP + ensemble). Stages shown: all customers, engagement declining, high churn risk, predicted churn.

The Challenge

Customer acquisition in retail banking is significantly more expensive than retention, yet many institutions lack systematic early-warning capability for identifying at-risk customers. By the time a customer closes their account, the retention window has passed.

The challenge is not just predicting who will leave, but understanding why, so that relationship managers can tailor their intervention to the specific drivers of each customer's dissatisfaction.

Approach

Data Exploration

Analysed customer demographic, transactional, and product holding data across 10,000+ records. Identified key features correlated with churn and examined class imbalance characteristics.
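As a rough illustration of this exploration step, the sketch below checks class balance and ranks features by their correlation with the churn label. The toy table and its column names (`tenure_years`, `num_products`, `monthly_txns`, `churned`) are assumptions made for the example, not the project's actual schema.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative, not from the project.
customers = pd.DataFrame({
    "tenure_years": [1, 8, 3, 10, 2, 7, 1, 9],
    "num_products": [1, 3, 1, 2, 1, 3, 1, 2],
    "monthly_txns": [4, 30, 6, 25, 3, 28, 5, 22],
    "churned":      [1, 0, 1, 0, 0, 0, 1, 0],
})

# Class balance: share of customers labelled as churned.
churn_rate = customers["churned"].mean()

# Correlation of each numeric feature with the binary churn label
# (Pearson correlation against a 0/1 target is the point-biserial correlation).
correlations = (
    customers.drop(columns="churned")
    .corrwith(customers["churned"])
    .sort_values()
)

print(f"Churn rate: {churn_rate:.1%}")
print(correlations)
```

A churn rate well below 50% is the signal that imbalance handling will matter later in modelling; the correlations are only a first screen, since they miss non-linear effects and interactions.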

Feature Engineering

Created behavioural features including product usage trends, transaction frequency changes, and tenure-adjusted engagement metrics.
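The kind of behavioural features described here might be derived along the following lines. The monthly transaction columns (`txn_m1` to `txn_m6`), the three-month comparison windows, and the feature definitions are all assumptions chosen for illustration; the project's exact definitions are not stated.

```python
import pandas as pd

# Hypothetical schema: one row per customer, monthly transaction counts
# for the last six months (txn_m1 = oldest, txn_m6 = most recent).
df = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "tenure_months": [24, 60, 6],
    "num_products": [2, 3, 1],
    "txn_m1": [20, 15, 8], "txn_m2": [22, 14, 9], "txn_m3": [18, 16, 7],
    "txn_m4": [10, 15, 8], "txn_m5": [8, 17, 9], "txn_m6": [6, 16, 10],
})

earlier = df[["txn_m1", "txn_m2", "txn_m3"]].mean(axis=1)
recent = df[["txn_m4", "txn_m5", "txn_m6"]].mean(axis=1)

# Transaction frequency change: relative change of recent vs. earlier activity.
# Negative values indicate declining engagement.
df["txn_freq_change"] = (recent - earlier) / earlier

# Tenure-adjusted engagement: products held per year of tenure.
df["products_per_tenure_year"] = df["num_products"] / (df["tenure_months"] / 12)

print(df[["customer_id", "txn_freq_change", "products_per_tenure_year"]])
```

On real data the ratio features would need guards for customers with no earlier transactions or zero tenure, which this sketch omits.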

Model Development

Built and compared multiple classifiers with particular attention to handling class imbalance through SMOTE, class weighting, and threshold optimisation.
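Two of the imbalance techniques named above, class weighting and threshold optimisation, can be sketched with scikit-learn alone. The synthetic data, the logistic regression model, and the choice of F1 as the selection criterion are assumptions for the example; the project's classifiers and objective may differ. SMOTE is not shown because it is provided by a separate package (imbalanced-learn).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for the customer table (~20% positives).
X, y = make_classification(
    n_samples=2000, n_features=8, n_informative=4,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Class weighting: errors on the minority (churn) class are penalised more.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X_train, y_train)

# Threshold optimisation: pick the cut-off that maximises F1 on validation data
# instead of defaulting to 0.5.
probs = model.predict_proba(X_val)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_val, probs)
# precision and recall have one more entry than thresholds; drop the final point.
f1 = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-12)
best_threshold = thresholds[np.argmax(f1)]

predicted_churn = (probs >= best_threshold).astype(int)
print(f"Chosen threshold: {best_threshold:.3f}, F1: {f1.max():.3f}")
```

In a retention setting the criterion would more plausibly weigh the cost of a missed churner against the cost of an unnecessary outreach; F1 is used here only as a neutral default.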

Interpretability

Applied SHAP analysis to provide feature-level explanations for churn predictions, enabling targeted retention strategies rather than generic interventions.
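To make the idea of a per-customer explanation concrete, the sketch below computes SHAP-style attributions by hand for a simplified stand-in model. It assumes a logistic regression with features treated as independent, where the SHAP value of a feature reduces to its coefficient times the feature's deviation from the mean. This is not the project's ensemble model or its SHAP explainer, and the feature names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data; feature names are hypothetical.
feature_names = ["tenure", "num_products", "txn_freq_change", "balance"]
X, y = make_classification(
    n_samples=500, n_features=4, n_informative=3, n_redundant=1, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X, y)

# For a linear model with features treated as independent, the SHAP value of
# feature j for customer i is coef_j * (x_ij - mean_j), in log-odds units.
background_mean = X.mean(axis=0)
attributions = model.coef_[0] * (X - background_mean)

# Base value: the model's log-odds output at the average customer.
base_value = model.intercept_[0] + model.coef_[0] @ background_mean

# Additivity check: base value + attributions reproduces each prediction.
assert np.allclose(base_value + attributions.sum(axis=1), model.decision_function(X))

# Explain one customer: features ranked by absolute contribution.
i = 0
ranked = sorted(zip(feature_names, attributions[i]), key=lambda p: -abs(p[1]))
for name, value in ranked:
    print(f"{name:>16}: {value:+.3f}")
```

The ranked list is the kind of output a relationship manager would see: which features push this customer's score up or down, and by how much. For tree ensembles the attributions would come from a tree-specific explainer rather than this closed form.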

Results

10,000+: Customer records analysed across demographic, transactional, and product data

SHAP: Per-customer feature importance revealing individual churn drivers

Segment-Specific: Churn drivers varied significantly across customer segments

Analysed over 10,000 customer records spanning demographics, transaction history, and product holdings, building classification models that identified customers at elevated churn risk with sufficient lead time for intervention.

SHAP analysis showed that the primary churn drivers differed markedly across customer segments, indicating that a one-size-fits-all retention approach would be poorly targeted. For some segments, product engagement was the dominant signal; for others, tenure and changes in transaction frequency were more predictive. This supported personalised intervention strategies tailored to each segment's specific risk profile.
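A segment-level comparison of this kind can be sketched by averaging absolute per-customer attributions within each segment. The segment labels, feature names, and attribution values below are invented for illustration and are not results from the project.

```python
import pandas as pd

# Hypothetical per-customer attributions (e.g. SHAP values) with a segment label.
attributions = pd.DataFrame({
    "segment":         ["young", "young", "young", "established", "established"],
    "num_products":    [0.80, -0.60, 0.70, 0.10, -0.05],
    "tenure":          [0.10, 0.05, -0.10, 0.50, -0.40],
    "txn_freq_change": [0.20, -0.10, 0.15, 0.45, 0.60],
})

# Mean absolute attribution per feature within each segment.
# Absolute values are used so positive and negative contributions do not cancel.
features = attributions.columns.drop("segment")
segment_importance = (
    attributions[features].abs().groupby(attributions["segment"]).mean()
)

# Dominant driver per segment: the feature with the largest mean |attribution|.
dominant_driver = segment_importance.idxmax(axis=1)

print(segment_importance.round(3))
print(dominant_driver)
```

When the dominant driver differs between rows of this table, a single retention playbook is unlikely to address every segment equally well.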

The model gives relationship managers something they did not have before: an evidence-based early warning with a clear explanation of why each customer is flagged, enabling targeted outreach before the customer has mentally disengaged.

Technology Stack

Python, Scikit-learn, XGBoost, SHAP, SMOTE, Pandas, Matplotlib
