Business Analytics / ML

Customer Behavioural Segmentation System

A clustering-based system that reveals hidden customer segments from transactional and behavioural data, enabling targeted marketing strategies that replace guesswork with data-driven precision.

Type
Unsupervised Learning
Domain
Marketing Analytics
Methods
K-Means, PCA, RFM
Status
Completed

The Challenge

Businesses often treat their entire customer base as a single group, deploying generic campaigns that resonate with some segments while alienating others. The underlying data to differentiate these groups typically exists within transactional systems, but without systematic analysis it remains unexploited.

The real cost is invisible: wasted marketing spend on customers unlikely to convert, missed opportunities with high-value segments, and a lack of insight into what drives engagement across different customer types.

Approach

01
Exploratory Data Analysis and Feature Engineering
Conducted EDA to understand distributions, correlations, and anomalies in the data. Engineered RFM (Recency, Frequency, Monetary) features and extended them with additional behavioural metrics such as customer lifetime value, average unit cost, and customer age.
02
Optimal Cluster Determination
Applied the elbow method and silhouette analysis to determine the optimal number of clusters, balancing granularity with interpretability.
03
Clustering and Dimensionality Reduction
Applied K-Means clustering to the engineered features and used PCA to reduce dimensionality, which supported the clustering and made the resulting segments easy to visualise.
04
Segment Profiling and Recommendations
Profiled each cluster by its defining characteristics and wrote an actionable description for each segment, paired with specific marketing strategy recommendations.
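The first three steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the transaction log is synthetic, the column names (customer_id, order_id, order_date, amount) and helper names are hypothetical, and only the three core RFM features are built. It also assumes K-Means runs on the scaled features, with PCA used only for a 2-D view.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def make_demo_transactions(seed=0):
    """Synthetic transaction log (made up purely for this sketch)."""
    rng = np.random.default_rng(seed)
    snapshot = pd.Timestamp("2024-01-01")
    rows, customer_id, order_id = [], 0, 0
    # (customers, orders each, typical order value, oldest order in days)
    groups = [(40, 12, 80.0, 30), (40, 2, 20.0, 300)]
    for n_customers, n_orders, typical_amount, max_days in groups:
        for _ in range(n_customers):
            customer_id += 1
            for days_ago in rng.integers(1, max_days + 1, size=n_orders):
                order_id += 1
                rows.append({
                    "customer_id": customer_id,
                    "order_id": order_id,
                    "order_date": snapshot - pd.Timedelta(days=int(days_ago)),
                    "amount": float(rng.gamma(2.0, typical_amount / 2)),
                })
    return pd.DataFrame(rows), snapshot


def build_rfm(tx, snapshot):
    """Aggregate the log into one Recency/Frequency/Monetary row per customer."""
    return tx.groupby("customer_id").agg(
        recency=("order_date", lambda d: (snapshot - d.max()).days),
        frequency=("order_id", "nunique"),
        monetary=("amount", "sum"),
    )


def choose_k(X, k_values=range(2, 9), seed=0):
    """Fit K-Means for each candidate k; keep inertia and silhouette scores."""
    inertias, silhouettes = {}, {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        inertias[k] = model.inertia_  # plot against k for the elbow method
        silhouettes[k] = silhouette_score(X, model.labels_)
    best_k = max(silhouettes, key=silhouettes.get)
    return best_k, inertias, silhouettes


transactions, snapshot = make_demo_transactions()
rfm = build_rfm(transactions, snapshot)

# log1p tames right-skewed counts and amounts; scaling puts the features on
# a comparable footing, which matters for distance-based K-Means.
X = StandardScaler().fit_transform(np.log1p(rfm))

best_k, inertias, silhouettes = choose_k(X)
model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X)
rfm["segment"] = model.labels_

# PCA is used here only to project the clusters to 2-D for plotting.
coords = PCA(n_components=2).fit_transform(X)
```

Picking the k with the highest silhouette score is only one possible rule. In practice the elbow plot and the interpretability of the resulting segments would also inform the final choice.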

Results

Distinct
Behavioural clusters with clear separation
RFM+
Extended feature engineering beyond standard metrics
Actionable
Per-segment marketing strategies delivered

Identified five distinct behavioural segments using extended RFM features (recency, frequency, customer lifetime value, average unit cost, customer age), revealing groups that generic marketing treats identically but that behave in fundamentally different ways.

The analysis surfaced segments that call for different treatment: loyal customers generating a disproportionate share of revenue, at-risk churners showing declining engagement, and price-sensitive browsers who convert under different conditions than the core base.

Each segment was delivered with a profile of its defining characteristics and recommended engagement approaches. Together these provide the analytical foundation for targeted campaigns that allocate budget by segment value rather than spreading spend uniformly across an undifferentiated customer base.
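As a rough illustration of profiling segments and allocating budget by segment value, the sketch below summarises a small made-up customer table and splits a budget in proportion to each segment's revenue share. The data, the function names, and the proportional rule itself are all assumptions for illustration, not the allocation method used in the project.

```python
import pandas as pd


def profile_segments(features, label_col="segment"):
    """Per-segment feature means, plus segment size and share of revenue."""
    grouped = features.groupby(label_col)
    profile = grouped.mean(numeric_only=True)
    profile["customers"] = grouped.size()
    revenue = grouped["monetary"].sum()
    profile["revenue_share"] = revenue / revenue.sum()
    return profile


def allocate_budget(profile, total_budget):
    """Hypothetical rule: split the budget in proportion to revenue share."""
    return (profile["revenue_share"] * total_budget).round(2)


# Made-up customers that already carry a segment label from clustering.
customers = pd.DataFrame({
    "recency":   [5, 8, 200, 250, 40, 60],
    "frequency": [20, 18, 2, 1, 6, 5],
    "monetary":  [900.0, 1100.0, 50.0, 50.0, 400.0, 500.0],
    "segment":   [0, 0, 1, 1, 2, 2],
})

profile = profile_segments(customers)
budget = allocate_budget(profile, total_budget=10_000)
```

A purely revenue-proportional split would underfund an at-risk segment whose current spend is low, so a real allocation would likely weight in retention goals as well.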

Technology Stack

Python, Scikit-learn, K-Means, PCA, Pandas, Matplotlib, Seaborn