My Data Voyage - Visualisation Gallery

Exploring the intersection of machine learning, data science, and visual storytelling through comprehensive analytical projects

Project Categories

Diverse analytical approaches across multiple domains

🤖

Machine Learning

Supervised and unsupervised learning applications

🧠

Deep Learning

Neural networks and advanced architectures

📊

Statistical Analysis

Time series and statistical thinking

🎓

Educational Visuals

Interactive learning demonstrations

Customer Segmentation with Clustering

Machine Learning
t-SNE Cluster Visualisation

Analysis Process

1

Data Processing

Processed 951,668 e-commerce orders from 2012-2016 across five continents

2

Feature Engineering

Created RFM-style features: Recency, Frequency, CLV (customer lifetime value), Average Unit Cost, Customer Age

3

Clustering Analysis

Applied K-means (k=5) and hierarchical clustering with PCA/t-SNE visualisation

4

Segment Profiling

Identified 5 distinct segments with tailored marketing strategies
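
The feature-engineering step could look roughly like the sketch below. It is a minimal illustration, not the project's actual pipeline: the order record layout, the sample rows, and the snapshot date are all hypothetical, CLV is approximated as total revenue per customer, and Customer Age is omitted because it would need a birth-date field.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records: (customer_id, order_date, unit_cost, revenue).
orders = [
    ("c1", date(2016, 1, 10), 20.0, 50.0),
    ("c1", date(2016, 6, 1), 30.0, 70.0),
    ("c2", date(2015, 3, 15), 10.0, 25.0),
]


def build_features(orders, snapshot):
    """Aggregate per-customer features as of the snapshot date."""
    grouped = defaultdict(list)
    for customer_id, order_date, unit_cost, revenue in orders:
        grouped[customer_id].append((order_date, unit_cost, revenue))

    features = {}
    for customer_id, rows in grouped.items():
        last_order = max(row[0] for row in rows)
        features[customer_id] = {
            "frequency": len(rows),                          # number of orders
            "recency_days": (snapshot - last_order).days,    # days since last order
            "clv": sum(row[2] for row in rows),              # simplification: total revenue
            "avg_unit_cost": sum(row[1] for row in rows) / len(rows),
        }
    return features


features = build_features(orders, snapshot=date(2016, 12, 31))
```

Features like these are usually standardised before K-means, since the algorithm is distance-based and would otherwise be dominated by the largest-scale column.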

Key Findings

t-SNE outperformed PCA for visualising non-linear clusters
Cluster 3: young, high-CLV customers, ideal targets for retention

Business Impact

Segments Identified: 5 Distinct
Silhouette Score: 0.35
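
For readers unfamiliar with the silhouette score, the sketch below shows one way to compute it from scratch. It assumes Euclidean distance and at least two clusters; the four sample points are hypothetical and unrelated to the project data. Scores range from -1 to 1, with values near 1 indicating compact, well-separated clusters.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean distance, needs >= 2 clusters)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so it drops out of the sum)
        a = sum(math.dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denominator = max(a, b)
        scores.append(0.0 if denominator == 0 else (b - a) / denominator)
    return sum(scores) / len(scores)


# Hypothetical, clearly separated toy data.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
score = silhouette_score(points, labels)
```

The toy data scores close to 1 because the two groups are far apart; a value such as 0.35 suggests segments that overlap to some degree.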

Predicting Student Dropout

Supervised Learning
Confusion Matrix Visualisation

Analysis Process

1

Three-Stage Analysis

Analysed student data across application, engagement, and academic stages

2

Model Development

Implemented XGBoost and Neural Networks with stratified sampling

3

Feature Selection

Identified UnauthorisedAbsenceCount as top predictor in Stage 2

4

Performance Optimisation

Tuned models achieving 95% recall for engagement-based predictions
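
Stratified sampling, mentioned in the model development step, keeps the share of each class the same in the training and test sets, which matters when dropouts are a minority. The sketch below is one simple way to do it; the 10% dropout rate, the 20% test fraction, and the function name are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), preserving each class's share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class only
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


labels = [1] * 10 + [0] * 90  # hypothetical data: 10% dropouts
train_idx, test_idx = stratified_split(labels)
```

Both splits keep the 10% dropout share, so recall measured on the test set is not distorted by an unlucky draw.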

Key Findings

Stage 2 (engagement) achieved F1: 0.24, Recall: 0.95
Unauthorised absences were the strongest dropout indicator
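
The Stage 2 figures above imply a precision value that is not stated directly. Because F1 is the harmonic mean of precision and recall, precision can be recovered from the other two. The sketch below does that arithmetic; the inputs are the rounded figures quoted above, so the result is approximate.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for P."""
    return f1 * recall / (2 * recall - f1)


precision = implied_precision(f1=0.24, recall=0.95)
print(f"Implied precision: {precision:.3f}")
```

The result is roughly 0.14, meaning about one in seven flagged students would actually drop out. That trade-off can be reasonable for an early-warning system, where missing an at-risk student costs more than a false alarm.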

Business Impact

Recall Rate: 94.88%
Early Warning: Mid-course

Ship Engine Anomaly Detection

Anomaly Detection

Analysis Process

1

Data Exploration

Analysed 19,535 samples of engine functionality metrics

2

Statistical Detection

Applied IQR method identifying 2.16% outliers across features

3

ML Approaches

Implemented One-Class SVM and Isolation Forest algorithms

4

Model Selection

Selected Isolation Forest (5% contamination) for real-time monitoring
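
The statistical detection step can be illustrated with a short IQR sketch. The 1.5 multiplier is the conventional choice and is an assumption here, as are the sample RPM readings; the project's exact settings are not stated.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical engine RPM readings with one obvious spike.
rpm = [700, 720, 750, 760, 780, 800, 820, 2100]
outliers = iqr_outliers(rpm)
```

The method treats each feature independently, which is why a model-based approach such as Isolation Forest can pick up anomalies that only appear when features are considered together.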

Key Findings

Engine RPM had 2,668 outliers, the highest count of any feature
Isolation Forest captured broader anomalies effectively
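
A contamination setting such as the 5% used here typically works as a threshold on anomaly scores: the model flags roughly that share of samples regardless of how many true faults exist. The sketch below illustrates the idea with made-up scores, assuming a scikit-learn-style contamination parameter and a convention where a higher score means more anomalous (scikit-learn itself uses the opposite sign). The function name is hypothetical.

```python
import random


def flag_by_contamination(scores, contamination=0.05):
    """Flag the `contamination` share of samples with the highest scores."""
    n_flagged = round(len(scores) * contamination)
    if n_flagged == 0:
        return [False] * len(scores)
    threshold = sorted(scores, reverse=True)[n_flagged - 1]
    return [score >= threshold for score in scores]


rng = random.Random(0)
scores = [rng.random() for _ in range(1000)]  # hypothetical anomaly scores
flags = flag_by_contamination(scores)
print(sum(flags), "of", len(flags), "samples flagged")
```

Seen this way, a 5% flag rate is a consequence of the chosen setting rather than a measured accuracy, so the threshold is best tuned against known fault records where they exist.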

Business Impact

Detection Rate: 5%
Real-time Capable: Yes

Neural Network for Spam Detection

Deep Learning
Neural Network Architecture

Analysis Process

1

Data Preparation

Processed 4,601 emails with 57 features from Spambase dataset

2

Architecture Design

Built 64-32-1 sequential model with ReLU and Sigmoid activations

3

Model Training

Trained with the Adam optimiser using binary cross-entropy loss

4

Performance Evaluation

Achieved 92.3% accuracy on the held-out test set
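
To make the 64-32-1 architecture concrete, the sketch below runs a single forward pass and computes the binary cross-entropy loss in plain Python. It is an untrained illustration only: the weight initialisation, the random input vector, and all function names are assumptions, and the Adam training loop is not shown.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """Fully connected layer: activation(W x + b), one row of W per unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Loss for one example; eps keeps log() away from zero."""
    y_prob = min(max(y_prob, eps), 1 - eps)
    return -(y_true * math.log(y_prob) + (1 - y_true) * math.log(1 - y_prob))


rng = random.Random(42)
layers = [
    (*init_layer(rng, 57, 64), relu),     # hidden layer 1
    (*init_layer(rng, 64, 32), relu),     # hidden layer 2
    (*init_layer(rng, 32, 1), sigmoid),   # output: spam probability
]


def predict(features):
    """Return the spam probability for one 57-feature email vector."""
    activations = features
    for weights, biases, activation in layers:
        activations = dense(activations, weights, biases, activation)
    return activations[0]


email = [rng.random() for _ in range(57)]  # hypothetical feature vector
prob = predict(email)
loss = binary_cross_entropy(1, prob)       # loss if the email is truly spam
```

The sigmoid output keeps the prediction between 0 and 1, which is what allows binary cross-entropy to be used as the training loss.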

Key Findings

5,921 trainable parameters effectively captured patterns
Simple architecture achieved high accuracy

Business Impact

Accuracy: 92.3%
Training Time: 10 Epochs

Hyperparameter Tuning Visualisation

Educational Visual
Model Comparison Dashboard

Visualisation Features

1

Interactive Controls

Real-time adjustment of learning rate, batch size, and epochs

2

Performance Metrics

Dynamic visualisation of loss curves and accuracy trends

3

Comparison Views

Side-by-side comparison of different parameter configurations

4

Educational Insights

Clear explanations of parameter effects on model performance
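
The kind of comparison the dashboard supports can be reproduced in miniature. The sketch below runs gradient descent on a toy one-parameter loss, (w - 3)^2, for three learning rates and records a loss curve for each. The loss function, starting point, and rates are assumptions chosen for illustration, and batch size is not modelled.

```python
def loss_curve(learning_rate, epochs=20, start=0.0, target=3.0):
    """Run gradient descent on (w - target)^2 and record the loss per epoch."""
    w = start
    losses = []
    for _ in range(epochs):
        losses.append((w - target) ** 2)
        gradient = 2 * (w - target)
        w -= learning_rate * gradient
    return losses


# One curve per configuration, ready for side-by-side plotting.
curves = {lr: loss_curve(lr) for lr in (0.01, 0.1, 1.1)}
for lr, losses in curves.items():
    print(f"lr={lr}: first loss {losses[0]:.2f}, last loss {losses[-1]:.4f}")
```

On this toy problem the smallest rate converges slowly, the middle rate converges quickly, and the largest overshoots and diverges, which is the pattern a side-by-side loss-curve view is meant to make visible.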

Key Findings

Interactive learning enhances parameter understanding
Visual feedback accelerates optimal configuration discovery

Educational Impact

Learning Mode: Interactive
Parameters: 3 Key

Statistical Thinking Visuals

Educational Visual
Statistical Distributions

Visual Components

1

Probability Distributions

Interactive visualisations of common statistical distributions

2

Hypothesis Testing

Visual demonstrations of p-values and confidence intervals

3

Sampling Concepts

Animated explanations of sampling distributions and CLT

4

Bayesian Thinking

Interactive Bayesian updating and prior/posterior visualisations
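
Bayesian updating can be shown with the Beta-Binomial model, where a Beta prior over a success probability combines with observed successes and failures to give a Beta posterior. The sketch below is a minimal example of that conjugate update; the uniform prior and the 7-of-10 data are hypothetical and not taken from the visualisation itself.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate update: Beta prior + binomial data -> Beta posterior."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


prior = (1, 1)  # uniform prior over the success probability
posterior = update_beta(*prior, successes=7, failures=3)
print("Prior mean:", beta_mean(*prior))
print("Posterior mean:", round(beta_mean(*posterior), 3))
```

With more data the posterior concentrates around the observed rate and the prior matters less, which is the behaviour an interactive prior/posterior plot lets a learner explore.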

Key Features

Complex concepts made accessible through interaction
Real-time parameter adjustment for deeper understanding

Educational Value

Concepts Covered: 10+
Interaction Level: High