My Data Voyage | Data Science Portfolio

My Data Voyage

Data Scientist | Machine Learning & NLP Enthusiast | Insight Generator

Project Portfolio

Customer Segmentation

Explored and preprocessed a customer dataset, engineered features, determined the optimal number of clusters (k), and applied clustering algorithms to segment customers.

Technologies: Python, Scikit-learn, Pandas, Clustering Algorithms
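
One common way to choose k is to compare silhouette scores across candidate values. The sketch below is illustrative only: it uses synthetic blobs as a stand-in for customer features, and the helper name `best_k_by_silhouette` is hypothetical. It is not the project's actual code, and the project may have used a different criterion such as the elbow method.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for customer features (assumption: not the project's data).
centers = [[0, 0], [10, 10], [-10, 10], [10, -10]]
X, _ = make_blobs(n_samples=400, centers=centers, cluster_std=0.5, random_state=0)

# K-means is distance based, so features are standardised first.
X_scaled = StandardScaler().fit_transform(X)


def best_k_by_silhouette(features, k_values):
    """Fit K-means for each k and return the k with the highest silhouette score."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    return max(scores, key=scores.get), scores


best_k, scores = best_k_by_silhouette(X_scaled, range(2, 8))
print(f"Best k by silhouette score: {best_k}")
```

On real customer data the silhouette curve is rarely this clean, so it is usually read alongside business context rather than taken as the final answer.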

Student Dropout Prediction

Worked in phases through data exploration, preprocessing, and feature engineering, then built and compared XGBoost and neural network models to predict student dropout with high accuracy.

Technologies: Python, XGBoost, TensorFlow, Pandas

Statistical Hypothesis Testing

Applied statistical hypothesis testing to evaluate organisational data scenarios. Explored the differences between correlation and causation in data analysis.

Technologies: Python, Statistical Methods
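
To make the idea of a hypothesis test concrete, here is a minimal sketch of a two-sample permutation test using only the standard library. The function name `permutation_test` and the two groups of values are hypothetical and chosen purely for illustration. They do not come from the project.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns an approximate p-value: the share of random relabellings whose
    mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical measurements for two groups (illustrative values only).
group_a = [52.1, 49.8, 51.3, 50.6, 48.9, 50.2, 51.7, 49.5]
group_b = [55.4, 56.1, 54.8, 57.0, 55.9, 56.4, 54.6, 55.2]

p_value = permutation_test(group_a, group_b)
print(f"Approximate p-value: {p_value:.4f}")
```

A small p-value here only indicates that the two groups differ by more than chance relabelling would explain. It says nothing about why they differ, which is where the distinction between correlation and causation comes in.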

Anomaly Detection

Explored a dataset to identify patterns, preprocessed data, and performed feature engineering. Applied statistical techniques and machine learning algorithms to detect anomalies, followed by a detailed report summarising findings and recommendations.

Technologies: Python, Pandas, Scikit-learn, Statistical Methods
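
As a small illustration of the statistical side of anomaly detection, the sketch below flags outliers with a modified z-score based on the median and the median absolute deviation (MAD). The function name `mad_outliers`, the 3.5 threshold, and the sample readings are assumptions made for this example. The project itself may have used other techniques.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold.

    The modified z-score uses the median and MAD, so a single extreme value
    does not distort the baseline the way it would with mean and standard deviation.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # No spread around the median, so nothing can be ranked as unusual.
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


# Hypothetical sensor readings with one obvious spike (illustrative values only).
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2, 9.7, 10.1, 10.0]
anomalies = mad_outliers(readings)
print(f"Anomalous indices: {anomalies}")
```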

Time Series Forecasting

Analysed historical sales data using time series decomposition, feature engineering, and ARIMA modelling to forecast future demand. Achieved a 15% improvement in forecast accuracy over baseline methods.

Technologies: Python, Statsmodels, Prophet, Pandas

Neural Network Project

Designed and implemented a deep neural network architecture from scratch. Applied forward and backward propagation algorithms, optimised hyperparameters, and achieved state-of-the-art performance on classification tasks.

Technologies: Python, TensorFlow, Keras, NumPy, Matplotlib
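
For readers unfamiliar with forward and backward propagation, here is a minimal NumPy sketch of a one-hidden-layer network trained on the XOR problem. It is a toy example rather than the project's architecture: the function name `train_xor`, the layer size, the learning rate, and the epoch count are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_xor(hidden_units=4, learning_rate=0.5, epochs=2000, seed=0):
    """Train a tiny network on XOR with full-batch gradient descent.

    Returns the per-epoch loss history and the final predicted probabilities.
    """
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    W1 = rng.normal(0.0, 1.0, size=(2, hidden_units))
    b1 = np.zeros((1, hidden_units))
    W2 = rng.normal(0.0, 1.0, size=(hidden_units, 1))
    b2 = np.zeros((1, 1))

    eps = 1e-12  # Guards against log(0).
    losses = []
    for _ in range(epochs):
        # Forward pass: tanh hidden layer, sigmoid output.
        a1 = np.tanh(X @ W1 + b1)
        y_hat = sigmoid(a1 @ W2 + b2)
        loss = -np.mean(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))
        losses.append(float(loss))

        # Backward pass: gradients of the mean binary cross-entropy loss.
        dz2 = (y_hat - y) / len(X)
        dW2 = a1.T @ dz2
        db2 = dz2.sum(axis=0, keepdims=True)
        dz1 = (dz2 @ W2.T) * (1 - a1 ** 2)  # Derivative of tanh is 1 - tanh^2.
        dW1 = X.T @ dz1
        db1 = dz1.sum(axis=0, keepdims=True)

        # Gradient descent update.
        W2 -= learning_rate * dW2
        b2 -= learning_rate * db2
        W1 -= learning_rate * dW1
        b1 -= learning_rate * db1

    return losses, y_hat


losses, predictions = train_xor()
print(f"Initial loss: {losses[0]:.4f}, final loss: {losses[-1]:.4f}")
```

Frameworks such as TensorFlow and Keras compute these gradients automatically, but writing them out by hand shows what each training step does.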

Technical Skills

Deep Learning & Neural Networks

Custom neural network architectures • Forward/backward propagation • Gradient descent optimisation • TensorFlow, Keras, PyTorch

Natural Language Processing

Sentiment analysis (92%+ accuracy) • BERT implementation • Text classification • NLTK, spaCy, Hugging Face Transformers

Machine Learning Engineering

Supervised/unsupervised learning • XGBoost, Random Forests, SVM • K-means, DBSCAN, Hierarchical clustering

Model Optimisation & Evaluation

Hyperparameter tuning • Grid/Random/Bayesian search • ROC-AUC, precision-recall • Custom business metrics

Time Series Analysis & Forecasting

ARIMA, SARIMA, Prophet, LSTM • Decomposition techniques • 15%+ accuracy improvement • Demand forecasting

Anomaly Detection Systems

Real-time detection (95%+ accuracy) • Statistical methods, Isolation Forests • Autoencoders • Maritime systems

Data Visualisation & Analytics

Interactive dashboards • Matplotlib, Seaborn, Plotly, D3.js • SHAP values • Business intelligence reporting

Feature Engineering & Dimensionality Reduction

Feature creation/selection • PCA, t-SNE, UMAP • Autoencoders • High-dimensional data processing

Python & Data Science Stack

Production-ready ML pipelines • NumPy, Pandas, Scikit-learn • Automated workflows • Scalable systems

Statistical Analysis & Hypothesis Testing

Parametric/non-parametric tests • A/B testing • Correlation analysis • Causal inference • Model validation

MLOps & Model Deployment

Model versioning • Experiment tracking • Deployment pipelines • Drift detection • Automated retraining

Customer Analytics & Segmentation

Behavioural segmentation • RFM analysis • Cohort analysis • Targeted marketing • Retention optimisation

Visualisation Gallery

A selection of my data visualisation techniques

Contact Me

Interested in working together? Fill out the form below, and I'll get back to you promptly.

Location

Based in London, UK