Financial Services / ML

Bank Customer Churn Prediction

A classification system that identifies banking customers at risk of churning, providing relationship managers with early warning signals and the feature-level insight needed to design effective retention interventions.

Type: Classification
Domain: Retail Banking
Methods: Ensemble, SHAP
Status: Completed
[Figure: Churn prediction funnel (SHAP + ensemble). Stages: all customers, engagement declining, high churn risk, predicted churn.]

The Challenge

Customer acquisition in retail banking is significantly more expensive than retention, yet many institutions lack systematic early-warning capability for identifying at-risk customers. By the time a customer closes their account, the retention window has passed.

The challenge is not just predicting who will leave, but understanding why, so that relationship managers can tailor their intervention to the specific drivers of each customer's dissatisfaction.

Approach

01
Data Exploration
Analysed customer demographic, transactional, and product holding data across 10,000+ records. Identified key features correlated with churn and examined class imbalance characteristics.
02
Feature Engineering
Created behavioural features including product usage trends, transaction frequency changes, and tenure-adjusted engagement metrics.
03
Model Development
Built and compared multiple classifiers with particular attention to handling class imbalance through SMOTE, class weighting, and threshold optimisation.
04
Interpretability
Applied SHAP analysis to provide feature-level explanations for churn predictions, enabling targeted retention strategies rather than generic interventions.
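Step 02 can be illustrated with a minimal sketch that derives two behavioural features from monthly transaction counts. The field names (`tenure_months`, `products_held`, `monthly_txn_counts`), the three-month comparison window, and both formulas are assumptions made for illustration, not the project's actual feature definitions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Customer:
    # Hypothetical fields; the real schema is not described in this write-up.
    customer_id: str
    tenure_months: int
    products_held: int
    monthly_txn_counts: list  # transactions per month, oldest month first


def txn_frequency_change(counts, window=3):
    """Relative change in mean monthly transactions: latest window vs. the one before it."""
    if len(counts) < 2 * window:
        return 0.0  # not enough history to compare two windows
    recent = mean(counts[-window:])
    previous = mean(counts[-2 * window:-window])
    if previous == 0:
        return 0.0  # previously inactive customer: avoid dividing by zero
    return (recent - previous) / previous


def tenure_adjusted_engagement(customer):
    """Products held per year of tenure, with tenure floored at one year."""
    years = max(customer.tenure_months / 12, 1.0)
    return customer.products_held / years


def build_features(customer):
    """Collect the engineered features for one customer into a flat dict."""
    return {
        "txn_frequency_change": txn_frequency_change(customer.monthly_txn_counts),
        "tenure_adjusted_engagement": tenure_adjusted_engagement(customer),
    }


# Toy customer whose activity drops from 10 to 4 transactions per month.
example = Customer("c1", 48, 2, [10, 10, 10, 4, 4, 4])
print(build_features(example))
```

A negative `txn_frequency_change` marks declining activity, which is the kind of early signal a churn model can pick up before the account is closed.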

Results

10,000+
Customer records analysed across demographic, transactional, and product data
SHAP
Per-customer feature importance revealing individual churn drivers
Segment-Specific
Churn drivers varied significantly across customer segments

The project analysed over 10,000 customer records spanning demographics, transaction history, and product holdings, and built classification models that flagged customers at elevated churn risk early enough for relationship managers to intervene.
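Threshold optimisation, listed under Model Development, is the step that turns model scores into a list of flagged customers. The sketch below scans candidate thresholds and keeps the one with the highest F1 score. The choice of F1 as the objective, the function names, and the toy labels and scores are all assumptions for illustration; the write-up does not state which metric the project optimised.

```python
def f1_at_threshold(y_true, probs, threshold):
    """F1 score when customers with probability >= threshold are flagged as churners."""
    pairs = list(zip(y_true, probs))
    tp = sum(1 for y, p in pairs if y == 1 and p >= threshold)
    fp = sum(1 for y, p in pairs if y == 0 and p >= threshold)
    fn = sum(1 for y, p in pairs if y == 1 and p < threshold)
    if tp == 0:
        return 0.0  # no true positives: F1 is zero by convention
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, probs):
    """Return the observed score that maximises F1 (lowest such score on ties)."""
    candidates = sorted(set(probs))
    return max(candidates, key=lambda t: f1_at_threshold(y_true, probs, t))


# Toy, imbalanced data: 3 churners (label 1) among 10 customers.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
probs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.60, 0.80]

chosen = best_threshold(y_true, probs)
print("chosen threshold:", chosen)
print("F1 at chosen threshold:", f1_at_threshold(y_true, probs, chosen))
print("F1 at default 0.5:", f1_at_threshold(y_true, probs, 0.5))
```

With imbalanced classes the best-scoring threshold often sits below the default 0.5, because catching more of the rare churners can outweigh a few extra false alarms. In a real project the threshold should be tuned on a validation split, not on the data used for final evaluation.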

SHAP analysis revealed that the primary churn drivers varied significantly across customer segments, confirming that a one-size-fits-all retention approach would be ineffective. For some segments, product engagement was the dominant signal; for others, tenure and transaction frequency changes were more predictive. This justified personalised intervention strategies tailored to each segment's specific risk profile.
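One way to arrive at segment-level drivers like these is to average the absolute per-customer SHAP values within each segment. The sketch below does that in plain Python. The segment labels, feature names, and attribution values are made-up placeholders, not project outputs; in practice the attributions would come from the SHAP library and not be typed in by hand.

```python
from collections import defaultdict


def mean_abs_attribution_by_segment(rows):
    """rows: iterable of (segment, {feature: attribution}) pairs, one per customer.

    Returns {segment: {feature: mean absolute attribution}}.
    A feature missing for a customer is treated as contributing zero.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for segment, attributions in rows:
        counts[segment] += 1
        for feature, value in attributions.items():
            totals[segment][feature] += abs(value)
    return {
        segment: {f: total / counts[segment] for f, total in feats.items()}
        for segment, feats in totals.items()
    }


def top_driver(summary):
    """Feature with the largest mean absolute attribution in each segment."""
    return {segment: max(feats, key=feats.get) for segment, feats in summary.items()}


# Placeholder attributions for four hypothetical customers in two segments.
rows = [
    ("segment_a", {"product_engagement": -0.40, "tenure": 0.05, "txn_frequency_change": 0.10}),
    ("segment_a", {"product_engagement": 0.30, "tenure": -0.10, "txn_frequency_change": 0.05}),
    ("segment_b", {"product_engagement": 0.05, "tenure": 0.25, "txn_frequency_change": -0.35}),
    ("segment_b", {"product_engagement": -0.10, "tenure": -0.20, "txn_frequency_change": 0.30}),
]

summary = mean_abs_attribution_by_segment(rows)
print(top_driver(summary))
```

Absolute values are used so that contributions pushing towards and away from churn do not cancel out when averaged; the sign of each individual value still matters when explaining a single customer's flag.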

The model gives relationship managers something they did not have before: an evidence-based early warning with a clear explanation of why each customer is flagged, enabling targeted outreach before the customer has mentally disengaged.

Technology Stack

Python, Scikit-learn, XGBoost, SHAP, SMOTE, Pandas, Matplotlib