Experimenting with Hyperparameter Tuning

Overview

Problem Statement: A company aims to enhance a spam detection neural network through hyperparameter tuning, using the Spambase dataset (4,601 emails, 57 features).

Approach: I extended the base model with additional layers, tuned batch sizes and epochs using grid search, and analyzed performance to optimize accuracy.

Hyperparameter Tuning Objectives

  • 🧠 Enhance Model: add hidden layers
  • 📊 Optimize Batch Size: test different batch sizes
  • 🔄 Tune Epochs: find the optimal training duration
  • 🎯 Improve Accuracy: target >95% accuracy

Data Setup

1. Loading & Splitting

Loaded 4,601 emails with 57 features and binary spam/non-spam labels. Split the dataset into 80% training (with 10% used for validation) and 20% test sets to ensure proper model evaluation.

  • X: 57 numerical features (word frequencies, character frequencies)
  • y: Binary spam label (1 = spam, 0 = non-spam)
  • Train: 3,680 samples
  • Test: 921 samples
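A minimal sketch of this step, assuming the dataset is available locally as spambase.csv (an illustrative file name) with the 57 feature columns followed by the label:

    import pandas as pd
    from sklearn.model_selection import train_test_split

    # Load the Spambase data; assumes a headerless CSV with 57 feature
    # columns followed by the binary spam label.
    data = pd.read_csv("spambase.csv", header=None)
    X = data.iloc[:, :57].values   # word/character frequency features
    y = data.iloc[:, 57].values    # 1 = spam, 0 = non-spam

    # 80/20 train/test split; stratify keeps the spam ratio consistent.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )
    # 10% of the training set is later held out for validation via
    # validation_split=0.1 when calling model.fit().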
2. Standardization

Applied StandardScaler to normalize features to mean=0 and standard deviation=1. This ensures consistent scaling across features for optimal neural network training.

The scaler was fit on the training data only; the same transformation was then applied to the test data to prevent data leakage.

Why Standardize?
  • Ensures gradient descent converges more quickly
  • Prevents features with larger scales from dominating the model
  • Improves numerical stability during training
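A sketch of the scaling step, fitting the scaler on the training split only:

    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    # Fit on training data only, then reuse the same transform on the
    # test set so no test-set statistics leak into training.
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)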

Workflow

1. Start: Hyperparameter Tuning

Begin the project to optimize neural network performance through hyperparameter tuning.

2. Import Libraries & Data

Load TensorFlow, Keras, scikit-learn, NumPy, pandas, and the Spambase dataset.

3. Prepare & Split Data

Split into training and test sets and apply standardization to normalize features.

4. Enhance Model with Layers

Extend the base neural network architecture with additional hidden layers.

5. Tune Batch Sizes

Test different batch sizes to find the best balance of training efficiency and performance.

6. Tune Epochs

Determine the optimal number of training epochs to balance fit and overfitting.

7. Evaluate Performance

Analyze the results to identify the best hyperparameter configuration.

8. End Activity

Conclude with optimized hyperparameters and recommendations for implementation.

Model Enhancement

Layer Addition

Extended the base neural network model by adding multiple hidden layers to increase model complexity and capture more intricate patterns in the data.

[Architecture diagram: input (57 features) → 64 → 32 → 16 → 16 → 16 → 16 → output (1)]
Architecture Details:
  • Input Layer: 57 features
  • Hidden Layer 1: 64 neurons (ReLU)
  • Hidden Layer 2: 32 neurons (ReLU)
  • Hidden Layers 3-6: 16 neurons each (ReLU)
  • Output Layer: 1 neuron (Sigmoid for binary classification)
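A sketch of this architecture in Keras (layer sizes follow the details above; the rest is illustrative):

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        layers.Input(shape=(57,)),              # 57 standardized features
        layers.Dense(64, activation="relu"),    # Hidden layer 1
        layers.Dense(32, activation="relu"),    # Hidden layer 2
        layers.Dense(16, activation="relu"),    # Hidden layers 3-6
        layers.Dense(16, activation="relu"),
        layers.Dense(16, activation="relu"),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # Binary spam probability
    ])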
Model Compilation

Configured the neural network with appropriate loss function, optimizer, and metrics for binary classification.

Compilation Settings:
  • Optimizer: Adam (adaptive learning rate)
  • Loss Function: Binary Cross-Entropy
  • Metrics: Accuracy
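The matching compile call, mirroring the settings listed above:

    model.compile(
        optimizer="adam",               # adaptive learning rate
        loss="binary_crossentropy",     # binary classification loss
        metrics=["accuracy"],
    )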
Why This Architecture?

The progressively narrowing structure (64→32→16→16→16→16→1) allows the network to:

  • Extract high-level features in initial wide layers
  • Refine abstractions in middle layers
  • Generate focused predictions in final layers
  • Balance complexity and overfitting risk

Hyperparameter Tuning

1. Batch Size Tuning

Tested batch sizes [16, 32, 64] with the number of epochs fixed at 10 to determine which size offered the best training efficiency and accuracy.

  • Batch size 16: 0.947
  • Batch size 32: 0.934
  • Batch size 64: 0.925

Finding: Smaller batch sizes (16) produced slightly better accuracy, likely due to more frequent weight updates.
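A sketch of the batch-size sweep, retraining a fresh model for each candidate (build_model is an illustrative helper that rebuilds the architecture above; X_train, y_train, X_test, y_test come from the data-setup sketches):

    from tensorflow import keras
    from tensorflow.keras import layers

    def build_model():
        """Rebuild and compile the network so each run starts from scratch."""
        m = keras.Sequential([
            layers.Input(shape=(57,)),
            layers.Dense(64, activation="relu"),
            layers.Dense(32, activation="relu"),
            *[layers.Dense(16, activation="relu") for _ in range(4)],
            layers.Dense(1, activation="sigmoid"),
        ])
        m.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
        return m

    for batch_size in [16, 32, 64]:
        m = build_model()
        m.fit(X_train, y_train, epochs=10, batch_size=batch_size,
              validation_split=0.1, verbose=0)
        _, acc = m.evaluate(X_test, y_test, verbose=0)
        print(f"batch_size={batch_size}: test accuracy {acc:.3f}")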

2. Epoch Tuning

Tested different numbers of epochs [10, 20, 30] with a fixed batch size of 64 to determine optimal training duration.

  • 10 epochs: 0.925
  • 20 epochs: 0.946
  • 30 epochs: 0.942

Finding: Among the tested values, accuracy peaked at 20 epochs; the subsequent grid search placed the optimum around 14-15 epochs, with diminishing returns and potential overfitting beyond that point.
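The epoch experiment follows the same pattern with batch_size fixed at 64 and epochs drawn from [10, 20, 30]; alternatively, a single 30-epoch run's training history shows where validation accuracy peaks (a sketch, reusing build_model and the prepared data from the previous block):

    m = build_model()
    history = m.fit(X_train, y_train, epochs=30, batch_size=64,
                    validation_split=0.1, verbose=0)

    # Epoch (1-indexed) with the highest validation accuracy.
    val_acc = history.history["val_accuracy"]
    best_epoch = val_acc.index(max(val_acc)) + 1
    print(f"Validation accuracy peaked at epoch {best_epoch}")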

3. Grid Search

Conducted comprehensive grid search across 100 model runs to systematically identify optimal hyperparameter combinations.

Average results over 100 runs:
  • Accuracy: 0.945
  • Optimal epochs: 14.5
  • Optimal batch size: 17.44

Selected Configuration: 14 epochs, batch size of 16
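One way to implement this kind of search is a nested loop over candidate epochs and batch sizes, each combination repeated several times and averaged (a sketch under those assumptions, not the exact protocol behind the reported 100 runs; build_model is reused from above):

    import itertools
    import numpy as np

    epoch_grid = [10, 15, 20]
    batch_grid = [16, 32, 64]
    results = {}

    for epochs, batch_size in itertools.product(epoch_grid, batch_grid):
        accs = []
        for _ in range(5):          # repeat to smooth out run-to-run variance
            m = build_model()
            m.fit(X_train, y_train, epochs=epochs, batch_size=batch_size,
                  validation_split=0.1, verbose=0)
            accs.append(m.evaluate(X_test, y_test, verbose=0)[1])
        results[(epochs, batch_size)] = np.mean(accs)

    best = max(results, key=results.get)
    print(f"Best combination: epochs={best[0]}, batch_size={best[1]}, "
          f"mean accuracy={results[best]:.3f}")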


Results Analysis

Performance Metrics

The best configuration (14 epochs, batch size=16) achieved superior performance on the spam detection task.

  • Accuracy: 0.950
  • Precision: 0.942
  • Recall: 0.929
  • F1 Score: 0.935
  • Training Time: ~2 minutes
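These figures can be reproduced from the final model's test-set predictions (a sketch; the model is retrained here with the selected 14 epochs and batch size 16, reusing build_model and the prepared data from earlier):

    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

    final_model = build_model()
    final_model.fit(X_train, y_train, epochs=14, batch_size=16,
                    validation_split=0.1, verbose=0)

    # Threshold the sigmoid outputs at 0.5 to get hard spam/non-spam labels.
    y_pred = (final_model.predict(X_test, verbose=0) > 0.5).astype(int).ravel()

    print("Accuracy :", accuracy_score(y_test, y_pred))
    print("Precision:", precision_score(y_test, y_pred))
    print("Recall   :", recall_score(y_test, y_pred))
    print("F1 score :", f1_score(y_test, y_pred))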

Optimization Rationale

Several factors contributed to the performance improvements observed through hyperparameter tuning:

Architecture Benefits:
  • Additional layers captured more complex patterns in the email features
  • Progressive narrowing structure provided effective feature abstraction
  • Multiple 16-neuron layers enabled specialized feature learning
Hyperparameter Benefits:
  • Small batch size (16) provided more frequent model updates
  • Moderate epochs (14) prevented overfitting while ensuring convergence
  • Combined tuning enabled balanced fit and generalization

Conclusion

Successfully enhanced the Spambase neural network by adding four additional 16-neuron hidden layers (giving a 64-32-16-16-16-16-1 architecture) and conducting systematic hyperparameter tuning. The grid search across 100 runs identified optimal hyperparameters (14 epochs, batch size=16), achieving an accuracy of ~0.95 and effectively balancing model fit and generalization.

Business Implications

Implementation Benefits

  • Enhanced Accuracy: ~5% improvement over baseline models
  • Reduced False Positives: Fewer legitimate emails misclassified as spam
  • Improved User Experience: More effective filtering for retail customers

Deployment Considerations

  • Regularization: Consider adding dropout for further overfitting prevention
  • Monitoring: Implement performance tracking to detect concept drift
  • Retraining: Schedule periodic retraining with new email samples

Future Improvements

While the current model achieves excellent performance, future enhancements could explore regularization techniques like dropout, learning rate scheduling, or advanced architectures like attention mechanisms to further boost spam detection capabilities.
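As an illustration of the dropout and learning-rate scheduling ideas mentioned above, one possible (untested) extension of the tuned model might look like this:

    from tensorflow import keras
    from tensorflow.keras import layers

    regularized = keras.Sequential([
        layers.Input(shape=(57,)),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.3),                 # randomly drop 30% of activations
        layers.Dense(32, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    regularized.compile(optimizer="adam", loss="binary_crossentropy",
                        metrics=["accuracy"])

    # Halve the learning rate whenever validation loss stops improving.
    lr_schedule = keras.callbacks.ReduceLROnPlateau(monitor="val_loss",
                                                    factor=0.5, patience=2)
    regularized.fit(X_train, y_train, epochs=14, batch_size=16,
                    validation_split=0.1, callbacks=[lr_schedule], verbose=0)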