Building a Basic Neural Network

Overview

Problem Statement: A company needs a TensorFlow neural network to classify 4,601 emails from the Spambase dataset as spam or non-spam using 57 features.

Approach: I preprocessed the data, built a sequential model with two hidden layers, trained it with Adam optimizer, and evaluated its performance for spam detection.

Neural Network Goals

  • 📧 Email Classification: spam vs. non-spam
  • 🏆 High Accuracy: minimize misclassifications
  • 💻 TensorFlow: using the Keras API
  • Efficient Model: simple but effective

Data Preparation

1. Loading & Splitting

Loaded the Spambase dataset containing 4,601 email records with 57 features, then split the data into training and test sets to ensure proper model evaluation (a code sketch follows the list below).

  • Features (X): 57 attributes including word frequencies, character frequencies, and capital letter statistics
  • Target (y): Binary label indicating spam (1) or non-spam (0)
  • Train/Test Split: 80% train (3,680 samples), 20% test (921 samples)
  • Validation: 10% of the training data is used for validation during training
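
A minimal sketch of this step in Python, assuming the raw UCI spambase.data file (comma-separated, 57 feature columns with the class label last); the file path, random seed, and stratified split are illustrative choices, not details from the original write-up:

    import pandas as pd
    from sklearn.model_selection import train_test_split

    # Load the raw Spambase file; header=None because the UCI file has no header row
    data = pd.read_csv("spambase.data", header=None)  # hypothetical local path
    X = data.iloc[:, :-1].values  # 57 feature columns
    y = data.iloc[:, -1].values   # 1 = spam, 0 = non-spam

    # 80/20 split; stratify keeps the spam ratio similar in both sets (an assumption)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    print(X_train.shape, X_test.shape)  # (3680, 57) (921, 57)
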
2. Standardization

Applied StandardScaler to normalize all features to a common scale (mean = 0, standard deviation = 1). This prevents features with larger scales from dominating model training.

Standardization formula: z = (x − μ) / σ, where z is the standardized value, x the original value, μ the mean of the feature, and σ its standard deviation.
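Continuing the sketch above, the scaling step with scikit-learn's StandardScaler; fitting on the training set only, then reusing the same statistics for the test set, avoids information leakage:

    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)  # per-feature z = (x - mu) / sigma
    X_test_scaled = scaler.transform(X_test)        # reuse training-set mu and sigma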

Data Distribution

[Figure: histograms of feature values (frequency vs. feature value) before and after standardization.]

Workflow

1. Start: Neural Network Build. Begin building a neural network for email spam classification.
2. Import Libraries & Data. Import TensorFlow, Keras, scikit-learn, NumPy, and pandas, and load the Spambase dataset.
3. Prepare & Split Data. Split into training and test sets and standardize the features.
4. Define Sequential Model. Create a Keras Sequential model with two hidden layers (64 and 32 neurons) and an output layer.
5. Compile Model. Configure the model with the Adam optimizer, binary cross-entropy loss, and an accuracy metric.
6. Train Model. Train for 10 epochs with a batch size of 64, using validation data to monitor progress.
7. Evaluate Performance. Test the model on holdout data and compute loss and accuracy.
8. End Activity. Conclude the development process with performance insights.

Model Architecture

Model Structure

Created a Sequential neural network with two hidden layers and one output layer for binary classification.

[Diagram: fully connected network with a 57-unit input layer, hidden layers of 64 and 32 units, and a single output unit.]
Architecture Details:
  • Input Layer: 57 features (email characteristics)
  • Hidden Layer 1: 64 neurons with ReLU activation
  • Hidden Layer 2: 32 neurons with ReLU activation
  • Output Layer: 1 neuron with Sigmoid activation
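
A sketch of this architecture in Keras, using the layer sizes and activations listed above (everything else left at defaults):

    from tensorflow import keras

    # 57 -> 64 -> 32 -> 1, matching the architecture details above
    model = keras.Sequential([
        keras.Input(shape=(57,)),                     # 57 email features
        keras.layers.Dense(64, activation="relu"),    # hidden layer 1
        keras.layers.Dense(32, activation="relu"),    # hidden layer 2
        keras.layers.Dense(1, activation="sigmoid"),  # spam probability
    ])
    model.summary()
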
Model Compilation

Configured the model with appropriate loss function, optimizer, and metrics for binary classification.

Compilation Settings:
  • Optimizer: Adam (adaptive learning rate optimizer)
  • Loss Function: Binary Cross-Entropy (the standard loss for binary classification)
  • Metrics: Accuracy (percentage of correctly classified emails)
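
The corresponding compile call, continuing from the model sketch above:

    model.compile(
        optimizer="adam",            # adaptive per-parameter learning rates
        loss="binary_crossentropy",  # pairs with the sigmoid output
        metrics=["accuracy"],
    )
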
Activation Functions:
  • ReLU (hidden layers): f(x) = max(0, x)
  • Sigmoid (output layer): f(x) = 1 / (1 + e^-x)
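The two formulas evaluated directly in NumPy as a quick sanity check (the sample inputs are illustrative):

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    x = np.array([-2.0, 0.0, 3.0])
    print(relu(x))     # [0. 0. 3.]
    print(sigmoid(x))  # approx. [0.119 0.5 0.953]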

Training Process

Training Configuration:
  • Epochs: 10 (full passes over the training data)
  • Batch Size: 64 (samples per weight update)
  • Validation Split: 10% of the training data
  • Optimizer: Adam (adaptive learning rates)
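
The matching training call, continuing from the compiled model above:

    history = model.fit(
        X_train_scaled, y_train,
        epochs=10,
        batch_size=64,
        validation_split=0.1,  # hold out 10% of the training data each epoch
        verbose=1,
    )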

Training Progress

[Figure: training accuracy and loss plotted against epoch (1–10), values on a 0.0–1.0 axis.]

Evaluation Insights

Performance Metrics

The model was evaluated on the test set to assess its effectiveness in classifying spam and non-spam emails.

  • Test Accuracy: 92.3% (correctly classified emails)
  • Test Loss: 0.247 (binary cross-entropy)
  • Precision: 90.9% (true positives / predicted positives)
  • Recall: 89.2% (true positives / actual positives)
Confusion Matrix (test set, n = 921):
  • True Negatives (non-spam correctly classified): 529
  • False Positives (non-spam flagged as spam): 32
  • False Negatives (spam reaching the inbox): 39
  • True Positives (spam correctly flagged): 321
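
An evaluation sketch that derives these metrics from the trained model; the 0.5 decision threshold is the usual default and an assumption here:

    from sklearn.metrics import confusion_matrix, precision_score, recall_score

    # Keras-reported loss and accuracy on the holdout set
    test_loss, test_acc = model.evaluate(X_test_scaled, y_test, verbose=0)
    print(f"loss={test_loss:.3f}  accuracy={test_acc:.3f}")

    # Threshold the sigmoid outputs at 0.5 to get hard spam/non-spam labels
    y_pred = (model.predict(X_test_scaled) > 0.5).astype(int).ravel()
    print(confusion_matrix(y_test, y_pred))  # rows: actual [non-spam, spam]
    print(precision_score(y_test, y_pred), recall_score(y_test, y_pred))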

Model Rationale

The architecture and training choices were based on proven practices for binary classification problems.

1. ReLU Activation: mitigates the vanishing-gradient problem in deep networks, allowing faster and more effective training.
2. Sigmoid Output: well suited to binary classification, as it outputs a probability between 0 and 1 for the positive class.
3. Adam Optimizer: combines the benefits of RMSProp and momentum, adapting the learning rate for each parameter.
4. Standardization: ensures all features contribute on a comparable scale, improving convergence.
5. Layer Sizes: the progressively narrowing architecture (57→64→32→1) supports effective feature extraction and transformation.

Conclusion

Successfully built a TensorFlow neural network with a 64-32-1 architecture that achieved high accuracy on the Spambase dataset. The model effectively distinguishes between spam and non-spam emails based on 57 features, with over 92% accuracy on the test set.

Business Implications

Practical Benefits

  • Email Filtering: Reduces spam reaching customer inboxes
  • User Experience: Enhances email system usability
  • Resource Efficiency: Minimizes storage and processing of unwanted messages

Future Enhancements

  • Regularization: Add dropout or L2 regularization to prevent overfitting (see the sketch after this list)
  • Hyperparameter Tuning: Optimize epochs, batch size, and layer sizes
  • Advanced Features: Incorporate text embedding techniques for better feature representation
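
A minimal sketch of the regularization enhancement, reusing the same architecture; the dropout rate (0.3) and L2 penalty (1e-4) are illustrative starting points, not tuned values:

    from tensorflow import keras
    from tensorflow.keras import regularizers

    regularized = keras.Sequential([
        keras.Input(shape=(57,)),
        keras.layers.Dense(64, activation="relu",
                           kernel_regularizer=regularizers.l2(1e-4)),
        keras.layers.Dropout(0.3),  # randomly zero 30% of activations during training
        keras.layers.Dense(32, activation="relu",
                           kernel_regularizer=regularizers.l2(1e-4)),
        keras.layers.Dropout(0.3),
        keras.layers.Dense(1, activation="sigmoid"),
    ])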

Final Assessment

The neural network provides a robust foundation for spam detection with minimal preprocessing requirements. Its sequential architecture balances simplicity with effectiveness, making it suitable for deployment in retail email systems. The model achieved an excellent balance of precision and recall, ensuring both minimal false positives (legitimate emails classified as spam) and false negatives (spam emails reaching the inbox).