Problem Statement: A neural network is needed to predict Titanic survivors using a dataset with 891 records and 12 features, comparing Adam and RMSprop optimisers with regularisation and early stopping.
Approach: I preprocessed the data, built base models, added L2 regularisation and dropout, implemented early stopping, and evaluated performance with cross-validation and metrics.
Loaded 891 records, dropped 'PassengerId', 'Name', 'Ticket', 'Cabin', filled 'Age'/'Embarked' with mode.
One-hot encoded 'Sex' and 'Embarked' into indicator columns, yielding 10 input features.
Split 80/20 into train and test with a further 10% held out for validation (≈70/10/20 overall), stratified by 'Survived'; standardised features with StandardScaler.
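The preprocessing above could be implemented roughly as follows; this is a minimal sketch assuming the standard Kaggle `train.csv` column names, with the filename, random seed, and pandas/scikit-learn usage as illustrative assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("train.csv")  # 891 records, 12 columns (assumed Kaggle file)

# Drop identifier/high-cardinality columns and fill missing values with the mode
df = df.drop(columns=["PassengerId", "Name", "Ticket", "Cabin"])
df["Age"] = df["Age"].fillna(df["Age"].mode()[0])
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# One-hot encode the categorical columns, leaving 10 input features
df = pd.get_dummies(df, columns=["Sex", "Embarked"])

X = df.drop(columns=["Survived"]).values.astype("float32")
y = df["Survived"].values

# 80/20 train/test split, stratified on the target; scaler fitted on training data only
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```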
Initialise project for predicting survival on the Titanic using neural networks.
Import dataset, handle missing values, drop unnecessary columns, and encode categorical features.
Divide into training (70%), validation (10%), and test (20%) sets with stratification. Standardise features.
Create 10-64-32-1 neural network architecture with Adam and RMSprop optimisers for comparison.
Implement L2 regularisation (0.01) and dropout (0.1) to prevent overfitting.
Configure early stopping with patience=1 to halt training when validation loss plateaus.
Perform 5-fold cross-validation and calculate performance metrics (accuracy, precision, recall, F1).
Conclude with model recommendations based on evaluation results.
Built 10-64-32-1 model with ReLU for hidden layers and Sigmoid for output. Binary cross-entropy loss.
Trained for 10 epochs with batch size 32, validated on a 10% split. Loss dropped more quickly than with RMSprop, but with slight overfitting to the training data.
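A minimal Keras sketch of this base model follows (layer sizes, activations, loss, epochs, and batch size are from the text; the accuracy metric, the 0.001 learning rate, and variable names such as `X_train` from the preprocessing sketch are assumptions).

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_base_model(optimizer):
    """10-64-32-1 network: ReLU hidden layers, sigmoid output, binary cross-entropy."""
    model = keras.Sequential([
        layers.Input(shape=(10,)),            # 10 standardised input features
        layers.Dense(64, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=optimizer,
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

adam_model = build_base_model(keras.optimizers.Adam(learning_rate=0.001))
history = adam_model.fit(X_train, y_train, epochs=10, batch_size=32,
                         validation_split=0.1, verbose=0)
```

The RMSprop baseline described next would reuse `build_base_model` with `keras.optimizers.RMSprop(learning_rate=0.001)`.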
Same architecture as Adam model. RMSprop showed more stable validation loss throughout training, with slightly lower final accuracy.
Training appeared to converge more slowly but with better generalisation properties.
Added L2 regularisation with lambda=0.01 to penalise large weights and reduce the risk of overfitting.
Applied to both models by passing kernel_regularizer to the dense layers.
Introduced dropout with rate=0.1 after hidden layers to enhance model robustness.
Random neuron deactivation during training forces the network to learn redundant representations.
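A sketch of how the regularised variant might be defined, reusing the Keras imports from the base-model sketch; whether L2 is also applied to the output layer is an assumption (omitted here).

```python
from tensorflow.keras import regularizers

def build_regularised_model(optimizer, l2_lambda=0.01, dropout_rate=0.1):
    """10-64-32-1 network with L2 on the hidden dense layers and dropout after each."""
    model = keras.Sequential([
        layers.Input(shape=(10,)),
        layers.Dense(64, activation="relu",
                     kernel_regularizer=regularizers.l2(l2_lambda)),
        layers.Dropout(dropout_rate),         # randomly deactivate 10% of units per step
        layers.Dense(32, activation="relu",
                     kernel_regularizer=regularizers.l2(l2_lambda)),
        layers.Dropout(dropout_rate),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=optimizer,
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```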
Both optimisers showed higher training loss but better validation stability, indicating reduced overfitting.
Adam slightly outperformed RMSprop in terms of final validation metrics.
Regularised Adam: 10-64-32-1 architecture with L2=0.01 and Dropout=0.1. Trained for 10 epochs with batch size of 32.
Regularised RMSprop: 10-64-32-1 architecture with L2=0.01 and Dropout=0.1. Trained for 10 epochs with batch size of 32.
Added an early stopping callback with patience=1 (optimiser learning rate 0.001) to halt training when validation loss plateaus.
Models configured to train for up to 50 epochs, stopping automatically when no improvement is detected.
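A sketch of the early-stopping setup; patience, learning rate, and the 50-epoch cap are from the text, while monitoring val_loss, restoring the best weights, and reusing the `build_regularised_model` helper from the earlier sketch are assumptions.

```python
# Stop as soon as validation loss fails to improve for one epoch (patience=1)
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=1,
                                            restore_best_weights=True)

model = build_regularised_model(keras.optimizers.Adam(learning_rate=0.001))
history = model.fit(X_train, y_train, epochs=50, batch_size=32,
                    validation_split=0.1, callbacks=[early_stop], verbose=0)
```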
RMSprop model stopped earlier but achieved more stable validation accuracy (0.7989).
Adam model trained longer but achieved lower final loss (0.4641 vs. 0.6414 for RMSprop).
Early-stopping Adam: same architecture with early stopping. Trained with patience=1, lr=0.001, L2=0.01.
Early-stopping RMSprop: same architecture with early stopping. Trained with patience=1, lr=0.001, L2=0.01.
5-fold cross-validation shows Adam slightly outperforming RMSprop in accuracy, but with slightly higher variance.
Adam 4.3.3 (early stopping) favoured for high recall (0.6843), important for survivor detection.
RMSprop 4.2.4 (regularised) preferred for precision (0.8428), minimising false positives.
Choice depends on whether identifying all survivors (recall) or minimising false survivor predictions (precision) is prioritised.
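The 5-fold cross-validation and metric computation could look roughly like the sketch below; the fold count and metrics are from the text, while StratifiedKFold, per-fold scaling, the 0.5 decision threshold, and the `build_fn` interface are assumptions.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def cross_validate(build_fn, X, y, n_splits=5):
    """Return (mean, std) of each metric over stratified folds."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    scores = {"accuracy": [], "precision": [], "recall": [], "f1": []}
    for train_idx, val_idx in skf.split(X, y):
        fold_scaler = StandardScaler()
        X_tr = fold_scaler.fit_transform(X[train_idx])
        X_va = fold_scaler.transform(X[val_idx])
        model = build_fn()
        model.fit(X_tr, y[train_idx], epochs=10, batch_size=32, verbose=0)
        y_pred = (model.predict(X_va, verbose=0) > 0.5).astype(int).ravel()
        scores["accuracy"].append(accuracy_score(y[val_idx], y_pred))
        scores["precision"].append(precision_score(y[val_idx], y_pred))
        scores["recall"].append(recall_score(y[val_idx], y_pred))
        scores["f1"].append(f1_score(y[val_idx], y_pred))
    return {name: (np.mean(vals), np.std(vals)) for name, vals in scores.items()}

# e.g. comparing the two optimisers on the full feature matrix X, y
adam_cv = cross_validate(lambda: build_regularised_model(keras.optimizers.Adam(0.001)), X, y)
rms_cv = cross_validate(lambda: build_regularised_model(keras.optimizers.RMSprop(0.001)), X, y)
```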
| Model | Accuracy | Precision | Recall | F1 Score | Training Time |
|---|---|---|---|---|---|
| Base Adam | 0.7978 | 0.7826 | 0.6327 | 0.7000 | 10 epochs |
| Base RMSprop | 0.7753 | 0.7391 | 0.6122 | 0.6700 | 10 epochs |
| Regularised Adam | 0.7978 | 0.7609 | 0.6531 | 0.7027 | 10 epochs |
| Regularised RMSprop | 0.7865 | 0.8428 | 0.5918 | 0.6957 | 10 epochs |
| Early Stopping Adam | 0.8283 | 0.8200 | 0.6843 | 0.7500 | ~12 epochs |
| Early Stopping RMSprop | 0.8193 | 0.8113 | 0.6775 | 0.7400 | ~8 epochs |
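As a quick consistency check, the F1 values follow from F1 = 2 × precision × recall / (precision + recall); for the base Adam row, 2 × 0.7826 × 0.6327 / (0.7826 + 0.6327) ≈ 0.700, matching the reported score.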
The optimised neural networks effectively predicted Titanic survivors, with Adam excelling in recall (0.6843) and RMSprop in precision (0.8428). Early stopping and regularisation techniques significantly improved model generalisation, with Adam 4.3.3 providing the best overall balance of performance (accuracy=0.8283).
Use Adam with Early Stopping when the priority is identifying as many survivors as possible, even at the cost of some false positives.
Use RMSprop with Regularisation when the priority is ensuring high confidence in survivor predictions, minimising false positives.
Early stopping proved most effective for both optimisers, preventing overfitting whilst improving performance. L2 regularisation and dropout provided valuable stability improvements, particularly for the RMSprop optimiser.
These findings demonstrate how neural network optimisation techniques can significantly impact model performance on binary classification tasks, even with relatively small datasets.