Natural Language Processing

Customer Sentiment Classification Engine

A production-ready NLP system that transforms unstructured customer feedback into categorised sentiment signals, enabling a wellness centre to identify service strengths and weaknesses at scale.

Type: Applied NLP
Domain: Customer Analytics
Accuracy: 92%
Status: Completed

The Challenge

Wellness centres accumulate large volumes of customer feedback through reviews, surveys, and informal comments. This feedback contains valuable signals about what clients value and where service quality is inconsistent.

Without systematic analysis, these signals remain locked inside unstructured text. Feedback is read sporadically, patterns are missed, and service improvement decisions are based on anecdote rather than data.

Approach

01 Text Preprocessing Pipeline
Built a pipeline that handles tokenisation, stopword removal, lemmatisation, and domain-specific normalisation of customer-written text, including spelling variations and informal language.
02 BERT Fine-Tuning
Fine-tuned a pre-trained BERT model on the labelled customer review dataset. Tuned the learning rate schedule, batch size, and number of training epochs to suit the domain's vocabulary.
03 Model Evaluation
Evaluated the model using precision, recall, F1-score, and confusion-matrix analysis. Ensured strong performance on negative reviews, the highest-value category for service improvement.
04 Insight Extraction
Beyond binary sentiment, extracted thematic patterns from classified reviews to identify the specific service areas driving positive and negative feedback.
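
The evaluation step (03) can be illustrated with a minimal sketch of per-class precision, recall, and F1 computed from a confusion matrix. The label set, function names, and toy predictions are assumptions for illustration, not the project's data or code. Scikit-learn, listed in the stack, offers `classification_report` and `confusion_matrix` for the same purpose; plain Python is used here only to keep the arithmetic visible.

```python
# Assumed binary label set; the project's actual labels are not stated.
LABELS = ["negative", "positive"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return matrix[i][j] = number of items with true label i predicted as j."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def per_class_report(y_true, y_pred, labels=LABELS):
    """Compute precision, recall, and F1 for each label."""
    matrix = confusion_matrix(y_true, y_pred, labels)
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fp = sum(matrix[r][i] for r in range(len(labels))) - tp  # column minus diagonal
        fn = sum(matrix[i]) - tp                                  # row minus diagonal
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Toy predictions, invented for illustration only.
y_true = ["negative"] * 3 + ["positive"] * 5
y_pred = ["negative", "negative", "positive",
          "positive", "positive", "positive", "positive", "negative"]

for label, scores in per_class_report(y_true, y_pred).items():
    print(label, {name: round(value, 3) for name, value in scores.items()})
```

Reporting recall for the negative class separately matters because overall accuracy can look strong even when a model misses many negative reviews, particularly if they are a minority of the data.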
Figure: Sentiment classification (92% accuracy), illustrated with sample review terms: excellent, recommend, poor, outstanding, disappointing, average, wonderful, waste, relaxing, professional, rude, okay, amazing.

Results

92% classification accuracy on customer reviews
BERT transformer model fine-tuned for the domain
Actionable thematic insights for service improvement

The system demonstrated that transformer-based models can reach high accuracy (92% here) on real-world customer feedback even with a modest amount of training data. The thematic analysis layer added business value beyond raw classification by identifying recurring pain points that the centre could act on immediately.
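
One simple way to picture a thematic layer on top of classified reviews is keyword-based theme counting. The sketch below is an assumption for illustration: the source does not say how themes were extracted, and the theme lexicon, function names, and example reviews are all hypothetical. It applies a small normalisation step (lowercasing and squeezing elongated letters), then counts how many reviews with a given predicted sentiment mention each theme.

```python
import re
from collections import Counter

# Hypothetical lexicon mapping keywords to service themes.
THEME_KEYWORDS = {
    "wait": "waiting time",
    "waited": "waiting time",
    "waiting": "waiting time",
    "rude": "staff attitude",
    "unfriendly": "staff attitude",
    "dirty": "cleanliness",
    "price": "pricing",
    "expensive": "pricing",
}


def normalise(text):
    """Lowercase, squeeze runs of 3+ repeated characters to 2, return word tokens."""
    text = text.lower()
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)  # e.g. "sooo" -> "soo"
    return re.findall(r"[a-z']+", text)


def theme_counts(reviews, sentiment):
    """Count reviews with the given predicted sentiment that mention each theme."""
    counts = Counter()
    for text, label in reviews:
        if label != sentiment:
            continue
        # A set ensures each theme is counted at most once per review.
        themes = {THEME_KEYWORDS[tok] for tok in normalise(text)
                  if tok in THEME_KEYWORDS}
        counts.update(themes)
    return counts


# Invented (text, predicted_sentiment) pairs, for illustration only.
reviews = [
    ("Waited 40 minutes and the receptionist was rude", "negative"),
    ("Sooo expensive for what you get", "negative"),
    ("Rude staff, long wait", "negative"),
    ("Relaxing massage, very professional", "positive"),
]

for theme, n in theme_counts(reviews, "negative").most_common():
    print(f"{theme}: {n}")
```

A fixed lexicon is easy to audit but only finds themes someone has already thought of; a production system might instead derive themes from the data, for example through clustering or topic modelling.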

Technology Stack

Python, BERT, Hugging Face Transformers, PyTorch, NLTK, Pandas, Scikit-learn, Matplotlib