Problem Statement: Poorly maintained ship engines increase fuel use, risks, and delays in the supply chain. This project uses a real dataset (19,535 samples) to detect anomalies in engine functionality (e.g., rpm, pressures, temperatures) to reduce downtime, enhance safety, and improve delivery efficiency.
Approach: The process involves EDA, statistical anomaly detection with IQR, ML methods (One-Class SVM, Isolation Forest), feature scaling, PCA visualisation, and parameter tuning to target 1-5% anomalies. A report summarizes findings for stakeholders.
EDA: Imported the data, confirmed there were no missing or duplicate values, generated summary statistics (mean, median, 95th percentile), and plotted the distributions, which showed right-skewed features.
IQR: Applied the IQR rule to flag outliers per feature, identified samples with outliers in two or more features (2.16%), and noted that the method works well on skewed data.
ML: Scaled the features, fitted One-Class SVM (nu=0.02, about 2% flagged) and Isolation Forest (contamination=0.05, 5% flagged), tuned both to the 1-5% anomaly target, and visualised the results with PCA.
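The exploratory checks listed above (missing values, duplicates, mean, median, 95th percentile, skew) can be sketched roughly as follows. This is a minimal illustration, not the project's code: the data is synthetic, and the column names and the `summarize` helper are hypothetical.

```python
import numpy as np
import pandas as pd

def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Per-feature mean, median, 95th percentile and skewness."""
    return pd.DataFrame({
        "mean": df.mean(),
        "median": df.median(),
        "p95": df.quantile(0.95),
        "skew": df.skew(),  # positive values indicate a right-skewed feature
    })

# Synthetic stand-in data (NOT the project dataset); column names are hypothetical.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "engine_rpm": rng.lognormal(mean=6.6, sigma=0.3, size=1000),
    "coolant_pressure": rng.lognormal(mean=0.8, sigma=0.4, size=1000),
})

n_missing = int(demo.isna().sum().sum())      # total missing cells
n_duplicates = int(demo.duplicated().sum())   # fully duplicated rows
stats = summarize(demo)

print(f"missing={n_missing}, duplicates={n_duplicates}")
print(stats.round(2))
```

With the real data, `demo` would be replaced by the loaded engine table; a positive `skew` column would correspond to the right-skewed distributions reported above.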
IQR results: Flagged outliers per feature (e.g., 2,668 for Engine rpm); samples with outliers in two or more features totaled 422 of 19,535 (2.16%), which fits the 1-5% target.
The IQR rule is effective for skewed data but, because it checks one feature at a time, may miss subtle anomalies; Engine rpm and the pressure features showed the most outliers, which marks them as key monitoring areas.
IQR outliers per feature:
Engine RPM: 2,668
Coolant Pressure: 1,872
Oil Pressure: 1,314
Temperature: 896
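The per-feature IQR flagging and the "two or more outlying features" rule can be sketched as below. The fence multiplier `k=1.5` is the conventional choice and is an assumption here, since the summary does not state which multiplier was used; the `iqr_outlier_flags` helper and the toy data are illustrative only.

```python
import pandas as pd

def iqr_outlier_flags(df: pd.DataFrame, k: float = 1.5) -> pd.DataFrame:
    """Boolean frame: True where a value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = df.quantile(0.25)
    q3 = df.quantile(0.75)
    iqr = q3 - q1
    below = df.lt(q1 - k * iqr, axis="columns")
    above = df.gt(q3 + k * iqr, axis="columns")
    return below | above

# Toy data (NOT the project dataset): the last row is extreme in "a" and "b".
demo = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 100],
    "b": [1, 2, 3, 4, 5, 6, 7, 8, 9, 100],
    "c": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
})

flags = iqr_outlier_flags(demo)
per_feature = flags.sum()           # outlier count per feature
multi = flags.sum(axis=1) >= 2      # samples with outliers in 2+ features
share = multi.mean()                # fraction of samples flagged

print(per_feature.to_dict())
print(f"{int(multi.sum())} samples flagged ({share:.2%})")
```

Requiring two or more outlying features is what keeps the flagged share low (2.16% in this project) even though individual features produce thousands of single-feature outliers.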
One-Class SVM: Tuned to nu=0.02 (about 2%, 392 outliers); detected anomalies in rpm and pressures; the PCA plot showed mostly clear separation between flagged and normal points, with some overlap.
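A minimal sketch of the scaling plus One-Class SVM step, using scikit-learn. Only `nu=0.02` comes from the results above; the RBF kernel and `gamma="scale"` are scikit-learn defaults assumed for illustration, and the data is synthetic.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Synthetic stand-in for the six engine features (NOT the project data).
rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 6))

# nu=0.02 is the reported setting; kernel and gamma are assumed defaults.
model = make_pipeline(
    StandardScaler(),
    OneClassSVM(nu=0.02, kernel="rbf", gamma="scale"),
)
labels = model.fit_predict(X)  # -1 = flagged as anomaly, +1 = normal

n_flagged = int((labels == -1).sum())
print(f"{n_flagged} of {len(X)} flagged ({n_flagged / len(X):.2%})")
```

In practice the flagged share lands close to `nu` rather than exactly on it, which is consistent with 392 of 19,535 samples (about 2.0%) being reported for nu=0.02.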
Isolation Forest: Tuned to contamination=0.05 (5%, 977 outliers); captured a broader set of anomalies; the PCA plot showed dense clusters of normal points with dispersed outliers.
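The Isolation Forest step can be sketched in the same way. `contamination=0.05` is the reported setting; `random_state=0` and the synthetic data are assumptions added so the example is reproducible.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for the six engine features (NOT the project data).
rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 6))

# contamination sets the share of training samples that will be flagged.
forest = IsolationForest(contamination=0.05, random_state=0)
labels = forest.fit_predict(X)          # -1 = flagged as anomaly, +1 = normal
scores = forest.decision_function(X)    # negative scores correspond to -1 labels

n_flagged = int((labels == -1).sum())
print(f"{n_flagged} of {len(X)} flagged ({n_flagged / len(X):.2%})")
```

Because `contamination` fixes the decision threshold as a percentile of the training scores, the flagged share tracks the parameter closely: 5% of 19,535 is about 977 samples, matching the count above.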
PCA: Reduced the 6 features to 2 dimensions; in the SVM (nu=0.02) and IF (contamination=0.05) plots, outliers appear in red and normal points in blue; IF flagged a broader set of points.
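The 2D projection used for these plots can be sketched as follows. The data is synthetic and the choice to project Isolation Forest labels is illustrative; the same coordinates could be coloured by the One-Class SVM labels.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the six engine features (NOT the project data).
rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 6))

X_scaled = StandardScaler().fit_transform(X)
labels = IsolationForest(contamination=0.05, random_state=0).fit_predict(X_scaled)

# Project the 6 scaled features onto the first 2 principal components.
pca = PCA(n_components=2)
coords = pca.fit_transform(X_scaled)

outlier_xy = coords[labels == -1]   # points to draw in red
normal_xy = coords[labels == 1]     # points to draw in blue
explained = float(pca.explained_variance_ratio_.sum())

print(f"{len(outlier_xy)} outliers, {len(normal_xy)} normal points")
print(f"variance explained by 2 components: {explained:.1%}")
```

The two arrays can then be passed to a scatter plot (for example matplotlib's `plt.scatter`) with red and blue markers. The explained-variance figure is worth reporting alongside the plot, since overlap in 2D may simply reflect information lost in the projection.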
Isolation Forest (contamination=0.05) stood out for its speed and broader anomaly capture, which makes it well suited to real-time engine monitoring.
Isolation Forest (contamination=0.05, 5% flagged) was preferred over IQR (2.16%) and One-Class SVM (nu=0.02, 2%) for its efficiency and its broader anomaly coverage across engine features (e.g., rpm, pressures). Key insights: monitor high rpm and coolant pressure for maintenance. The PCA visualisations supported IF's suitability for operational use.
Deploy Isolation Forest for real-time anomaly alerts in ship engine monitoring systems
Prioritize engine rpm and coolant pressure for preventative maintenance checks
Refine detection thresholds with additional operational data to enhance safety and efficiency
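As a rough illustration of the first recommendation, a fitted Isolation Forest can score each new reading as it arrives. This is a sketch under stated assumptions, not a deployment design: the training data is synthetic, and the `is_anomalous` helper and the example readings are hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic "historical" readings for six features (NOT the project data).
rng = np.random.default_rng(7)
history = rng.normal(loc=0.0, scale=1.0, size=(2000, 6))

# Fit once on historical data; contamination=0.05 as in the results above.
forest = IsolationForest(contamination=0.05, random_state=0).fit(history)

def is_anomalous(reading) -> bool:
    """Return True if a single 6-feature reading is flagged by the fitted model."""
    sample = np.asarray(reading, dtype=float).reshape(1, -1)
    return bool(forest.predict(sample)[0] == -1)

typical = np.zeros(6)        # a reading at the centre of the training data
extreme = np.full(6, 8.0)    # a reading far outside the training range

print("typical flagged:", is_anomalous(typical))
print("extreme flagged:", is_anomalous(extreme))
```

If features are scaled before training, the same fitted scaler would need to be applied to each incoming reading before it is scored.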