Optimising Neural Networks for Titanic Survival Prediction

Optimising Neural Networks for Titanic Survival Prediction

Overview

Problem Statement: A neural network is needed to predict Titanic survivors using a dataset with 891 records and 12 features, comparing Adam and RMSprop optimisers with regularisation and early stopping.

Approach: I preprocessed the data, built base models, added L2 regularisation and dropout, implemented early stopping, and evaluated performance with cross-validation and metrics.

Project Overview

891
Records
12→10
Features
2
Optimisers
3
Techniques

Data Preparation

1

Loading & Cleaning

Loaded 891 records, dropped 'PassengerId', 'Name', 'Ticket', 'Cabin', filled 'Age'/'Embarked' with mode.

2

Encoding

Converted 'Sex' and 'Embarked' to binary via one-hot encoding, yielding 10 features.

3

Splitting & Scaling

Split 80/20 (10% validation), stratified by 'Survived'; standardised features with StandardScaler.

Feature Processing Pipeline

Original Features (12)
  • PassengerId ❌
  • Survived ✅
  • Pclass ✅
  • Name ❌
  • Sex ✅ (encoded)
  • Age ✅ (filled)
  • SibSp ✅
  • Parch ✅
  • Ticket ❌
  • Fare ✅
  • Cabin ❌
  • Embarked ✅ (encoded)
Processed Features (10)
  • Pclass ✅
  • Sex_female ✅
  • Sex_male ✅
  • Age ✅
  • SibSp ✅
  • Parch ✅
  • Fare ✅
  • Embarked_C ✅
  • Embarked_Q ✅
  • Embarked_S ✅
Data Splits
Training (70%)
Validation (10%)
Test (20%)

Workflow

1

Start: Titanic Prediction

Initialise project for predicting survival on the Titanic using neural networks.

2

Load & Preprocess Data

Import dataset, handle missing values, drop unnecessary columns, and encode categorical features.

3

Split & Scale Data

Divide into training (70%), validation (10%), and test (20%) sets with stratification. Standardise features.

4

Build Base Models

Create 10-64-32-1 neural network architecture with Adam and RMSprop optimisers for comparison.

5

Add Regularisation

Implement L2 regularisation (0.01) and dropout (0.1) to prevent overfitting.

6

Implement Early Stopping

Configure early stopping with patience=1 to halt training when validation loss plateaus.

7

Evaluate Models

Perform 5-fold cross-validation and calculate performance metrics (accuracy, precision, recall, F1).

8

End Activity

Conclude with model recommendations based on evaluation results.

Base Model Creation

Input Layer
1
2
...
10
Hidden Layer 1
1
2
...
64
Hidden Layer 2
1
2
...
32
Output Layer
1
Base Model
Adam

Built 10-64-32-1 model with ReLU for hidden layers and Sigmoid for output. Binary cross-entropy loss.

Trained for 10 epochs with batch size of 32, validated on 10% split. Showed quicker loss drop but slight overfitting in training.

Loss
0.4892
Accuracy
0.7978
Val Loss
0.5125
Val Acc
0.7865
Base Model
RMSprop

Same architecture as Adam model. RMSprop showed more stable validation loss throughout training, with slightly lower final accuracy.

Training appeared to converge more slowly but with better generalisation properties.

Loss
0.5231
Accuracy
0.7809
Val Loss
0.4994
Val Acc
0.7753

Regularisation Techniques

L2

L2 Regularisation

Added L2 regularisation with lambda=0.01 to penalise large weights and reduce the risk of overfitting.

Implementation added to both models by including kernel_regularizer in dense layers.

D

Dropout

Introduced dropout with rate=0.1 after hidden layers to enhance model robustness.

Random neuron deactivation during training forces the network to learn redundant representations.

📊

Outcomes

Both optimisers showed higher training loss but better validation stability, indicating reduced overfitting.

Adam slightly outperformed RMSprop in terms of final validation metrics.

Regularised Model
Adam

10-64-32-1 architecture with L2=0.01 and Dropout=0.1. Trained for 10 epochs with batch size of 32.

Loss
0.5102
Accuracy
0.7865
Val Loss
0.4932
Val Acc
0.7978
Regularised Model
RMSprop

10-64-32-1 architecture with L2=0.01 and Dropout=0.1. Trained for 10 epochs with batch size of 32.

Loss
0.5309
Accuracy
0.7753
Val Loss
0.5021
Val Acc
0.7865

Early Stopping

Implementation

Added early stopping callback with patience=1 and learning rate=0.001 to halt training when validation loss plateaus.

Models configured to train for up to 50 epochs, stopping automatically when no improvement is detected.

Impact

RMSprop model stopped earlier but achieved more stable validation accuracy (0.7989).

Adam model trained longer but achieved lower final loss (0.4641 vs. 0.6414 for RMSprop).

Epochs
Loss
Early Stopping Model
Adam

Same architecture with early stopping. Trained with patience=1, lr=0.001, L2=0.01.

Final Loss
0.4641
Final Acc
0.7978
Epochs
12
Early Stopping Model
RMSprop

Same architecture with early stopping. Trained with patience=1, lr=0.001, L2=0.01.

Final Loss
0.6414
Final Acc
0.7989
Epochs
8

Evaluation & Insights

Cross-Validation

Adam CV Accuracy
0.8283 ± 0.0478
RMSprop CV Accuracy
0.8193 ± 0.0430

5-fold cross-validation shows Adam slightly outperforming RMSprop in accuracy, but with slightly higher variance.

Metrics

Adam Recall
0.6843
RMSprop Precision
0.8428
Adam F1
0.7500
RMSprop F1
0.7400

Model Choice

Adam 4.3.3 (early stopping) favoured for high recall (0.6843), important for survivor detection.

RMSprop 4.2.4 (regularised) preferred for precision (0.8428), minimising false positives.

Choice depends on whether identifying all survivors (recall) or minimising false survivor predictions (precision) is prioritised.

Model Accuracy Precision Recall F1 Score Training Time
Base Adam 0.7978 0.7826 0.6327 0.7000 10 epochs
Base RMSprop 0.7753 0.7391 0.6122 0.6700 10 epochs
Regularised Adam 0.7978 0.7609 0.6531 0.7027 10 epochs
Regularised RMSprop 0.7865 0.8428 0.5918 0.6957 10 epochs
Early Stopping Adam 0.8283 0.8200 0.6843 0.7500 ~12 epochs
Early Stopping RMSprop 0.8193 0.8113 0.6775 0.7400 ~8 epochs

Conclusion

The optimised neural networks effectively predicted Titanic survivors, with Adam excelling in recall (0.6843) and RMSprop in precision (0.8428). Early stopping and regularisation techniques significantly improved model generalisation, with Adam 4.3.3 providing the best overall balance of performance (accuracy=0.8283).

Model Selection Recommendations

For Maximum Survivor Identification

Use Adam with Early Stopping when the priority is identifying as many survivors as possible, even at the cost of some false positives.

For Minimum False Positives

Use RMSprop with Regularisation when the priority is ensuring high confidence in survivor predictions, minimising false positives.

Optimisation Impact

Early stopping proved most effective for both optimisers, preventing overfitting whilst improving performance. L2 regularisation and dropout provided valuable stability improvements, particularly for the RMSprop optimiser.

These findings demonstrate how neural network optimisation techniques can significantly impact model performance on binary classification tasks, even with relatively small datasets.