Data Scientist | Machine Learning & NLP Enthusiast | Insight Generator
Developed an NLP solution to analyse customer feedback sentiment. Implemented text preprocessing techniques and trained a BERT-based model to classify sentiment with 92% accuracy.
Technologies: Python, NLTK, Transformers, PyTorch
Explored and preprocessed a customer dataset, engineered features, determined the optimal number of clusters (k), and applied clustering models to segment customers.
Technologies: Python, Scikit-learn, Pandas, Clustering Algorithms
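As an illustration of how the number of clusters (k) can be chosen, here is a minimal sketch that compares silhouette scores across candidate values with scikit-learn. The synthetic data from `make_blobs`, the candidate range, and the helper name `choose_k` are assumptions made for the example; the project itself may have used a different criterion, such as the elbow method.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def choose_k(X, k_values=range(2, 9), random_state=0):
    """Return the k with the highest silhouette score, plus all scores."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(X)
        # Silhouette is in [-1, 1]; higher means tighter, better-separated clusters.
        scores[k] = silhouette_score(X, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic stand-in for scaled customer features (hypothetical data).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=42)
X = StandardScaler().fit_transform(X)

best_k, scores = choose_k(X)
print(f"best k by silhouette: {best_k}")
```

The silhouette score is only one heuristic; in practice the chosen k would also be checked against how interpretable the resulting segments are.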
Conducted phased data exploration, preprocessing, and feature engineering. Built and compared predictive models using XGBoost and a neural network to forecast student dropout rates with high accuracy.
Technologies: Python, XGBoost, TensorFlow, Pandas
Applied statistical hypothesis testing to evaluate organisational data scenarios. Explored the differences between correlation and causation in data analysis.
Technologies: Python, Statistical Methods
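To give a flavour of the hypothesis testing involved, the sketch below runs a two-sided permutation test for a difference in means using only the standard library. The two groups, the function name `permutation_test`, and the number of permutations are hypothetical choices for the example, not the tests actually used in the project.

```python
import random
from statistics import mean


def permutation_test(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns an estimated p-value: the share of random relabellings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical measurements for two groups.
group_a = [12.1, 11.8, 12.6, 13.0, 12.4, 11.9, 12.7, 12.2]
group_b = [12.9, 13.4, 13.1, 12.8, 13.6, 13.2, 13.0, 13.5]

p_value = permutation_test(group_a, group_b)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value here only indicates that the difference is unlikely under random labelling. It does not show that group membership caused the difference, which is where the distinction between correlation and causation comes in.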
Explored a dataset to identify patterns, preprocessed data, and performed feature engineering. Applied statistical techniques and machine learning algorithms to detect anomalies, and produced a detailed report summarising findings and recommendations.
Technologies: Python, Pandas, Scikit-learn, Statistical Methods
Analysed historical sales data using time series decomposition, feature engineering, and ARIMA modelling to forecast future demand. Achieved a 15% improvement in forecast accuracy over baseline methods.
Technologies: Python, Statsmodels, Prophet, Pandas
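For context on what an improvement over a baseline means, the sketch below shows one common way to compute it: the relative reduction in mean absolute percentage error (MAPE). The choice of MAPE, the function names, and all the numbers are assumptions for illustration; they are not the project's data or necessarily its metric.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to the baseline."""
    return 100 * (baseline_error - model_error) / baseline_error


# Made-up demand figures and forecasts, for illustration only.
actual = [100, 200, 400]
baseline_forecast = [110, 180, 440]  # e.g. a naive baseline
model_forecast = [105, 190, 420]     # e.g. a fitted forecasting model

baseline_mape = mape(actual, baseline_forecast)
model_mape = mape(actual, model_forecast)
improvement = relative_improvement(baseline_mape, model_mape)
print(f"baseline MAPE {baseline_mape:.1f}%, model MAPE {model_mape:.1f}%, "
      f"improvement {improvement:.1f}%")
```

Both forecasts should be scored on the same held-out period, so the comparison reflects out-of-sample accuracy rather than fit to the training data.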
Designed and implemented a deep neural network architecture from scratch. Applied forward and backward propagation algorithms, optimised hyperparameters, and achieved state-of-the-art performance on classification tasks.
Technologies: Python, TensorFlow, Keras, NumPy, Matplotlib
Custom neural network architectures • Forward/backward propagation • Gradient descent optimisation • TensorFlow, Keras, PyTorch
Sentiment analysis (92%+ accuracy) • BERT implementation • Text classification • NLTK, spaCy, Hugging Face Transformers
Supervised/unsupervised learning • XGBoost, Random Forests, SVM • K-means, DBSCAN, Hierarchical clustering
Hyperparameter tuning • Grid/Random/Bayesian search • ROC-AUC, precision-recall • Custom business metrics
ARIMA, SARIMA, Prophet, LSTM • Decomposition techniques • 15%+ accuracy improvement • Demand forecasting
Real-time detection (95%+ accuracy) • Statistical methods, Isolation Forests • Autoencoders • Maritime systems
Interactive dashboards • Matplotlib, Seaborn, Plotly, D3.js • SHAP values • Business intelligence reporting
Feature creation/selection • PCA, t-SNE, UMAP • Autoencoders • High-dimensional data processing
Production-ready ML pipelines • NumPy, Pandas, Scikit-learn • Automated workflows • Scalable systems
Parametric/non-parametric tests • A/B testing • Correlation analysis • Causal inference • Model validation
Model versioning • Experiment tracking • Deployment pipelines • Drift detection • Automated retraining
Behavioural segmentation • RFM analysis • Cohort analysis • Targeted marketing • Retention optimisation
A selection of my data visualisation techniques
Interested in working together? Fill out the form below, and I'll get back to you promptly.
Based in London, UK