Building a Basic Neural Network

Overview

Problem Statement: A company needs a TensorFlow neural network to classify 4,601 emails from the Spambase dataset as spam or non-spam using 57 features.

Approach: I preprocessed the data, built a sequential model with two hidden layers, trained it with Adam optimizer, and evaluated its performance for spam detection.

Neural Network Goals

  • 📧 Email Classification: spam vs. non-spam
  • 🏆 High Accuracy: minimize misclassifications
  • 💻 TensorFlow: using the Keras API
  • Efficient Model: simple but effective

Data Preparation

1. Loading & Splitting

Loaded the Spambase dataset containing 4,601 email records with 57 features, then split the data into training and test sets to ensure proper model evaluation (a code sketch follows the list below).

  • Features (X): 57 attributes including word frequencies, character frequencies, and capital letter statistics
  • Target (y): Binary label indicating spam (1) or non-spam (0)
  • Train/Test Split: 80% train (3,680 samples), 20% test (921 samples)
  • Validation: 10% of the training data is used for validation during training
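
A minimal sketch of this step in Python, assuming the raw UCI spambase.data file (comma-separated, 57 feature columns with the class label last); the file path, random seed, and stratified split are illustrative choices, not details from the original write-up:

    import pandas as pd
    from sklearn.model_selection import train_test_split

    # Load the raw Spambase file; header=None because the UCI file has no header row
    data = pd.read_csv("spambase.data", header=None)  # hypothetical local path
    X = data.iloc[:, :-1].values  # 57 feature columns
    y = data.iloc[:, -1].values   # 1 = spam, 0 = non-spam

    # 80/20 split; stratify keeps the spam ratio similar in both sets (an assumption)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    print(X_train.shape, X_test.shape)  # (3680, 57) (921, 57)
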
2. Standardization

Applied StandardScaler to normalize all features to a common scale (mean = 0, standard deviation = 1). This prevents features with larger scales from dominating model training.

Standardization formula: z = (x − μ) / σ, where z is the standardized value, x the original value, μ the mean of the feature, and σ its standard deviation.
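Continuing the sketch above, the scaling step with scikit-learn's StandardScaler; fitting on the training set only, then reusing the same statistics for the test set, avoids information leakage:

    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)  # per-feature z = (x - mu) / sigma
    X_test_scaled = scaler.transform(X_test)        # reuse training-set mu and sigma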

Data Distribution

[Figure: histograms of feature values (frequency vs. feature value) before and after standardization.]

Workflow

1. Start: Neural Network Build. Begin building a neural network for email spam classification.
2. Import Libraries & Data. Import TensorFlow, Keras, scikit-learn, NumPy, and pandas, and load the Spambase dataset.
3. Prepare & Split Data. Split into training and test sets and standardize the features.
4. Define Sequential Model. Create a Keras Sequential model with two hidden layers (64 and 32 neurons) and an output layer.
5. Compile Model. Configure the model with the Adam optimizer, binary cross-entropy loss, and an accuracy metric.
6. Train Model. Train for 10 epochs with a batch size of 64, using validation data to monitor progress.
7. Evaluate Performance. Test the model on holdout data and compute loss and accuracy.
8. End Activity. Conclude the development process with performance insights.

Model Architecture

Model Structure

Created a Sequential neural network with two hidden layers and one output layer for binary classification.

[Diagram: fully connected network with a 57-unit input layer, hidden layers of 64 and 32 units, and a single output unit.]
Architecture Details:
  • Input Layer: 57 features (email characteristics)
  • Hidden Layer 1: 64 neurons with ReLU activation
  • Hidden Layer 2: 32 neurons with ReLU activation
  • Output Layer: 1 neuron with Sigmoid activation
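
A sketch of this architecture in Keras, using the layer sizes and activations listed above (everything else left at defaults):

    from tensorflow import keras

    # 57 -> 64 -> 32 -> 1, matching the architecture details above
    model = keras.Sequential([
        keras.Input(shape=(57,)),                     # 57 email features
        keras.layers.Dense(64, activation="relu"),    # hidden layer 1
        keras.layers.Dense(32, activation="relu"),    # hidden layer 2
        keras.layers.Dense(1, activation="sigmoid"),  # spam probability
    ])
    model.summary()
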
Model Compilation

Configured the model with appropriate loss function, optimizer, and metrics for binary classification.

Compilation Settings:
  • Optimizer: Adam (adaptive learning rate optimizer)
  • Loss Function: Binary Cross-Entropy (the standard loss for binary classification)
  • Metrics: Accuracy (percentage of correctly classified emails)
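
The corresponding compile call, continuing from the model sketch above:

    model.compile(
        optimizer="adam",            # adaptive per-parameter learning rates
        loss="binary_crossentropy",  # pairs with the sigmoid output
        metrics=["accuracy"],
    )
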
Activation Functions:
  • ReLU (hidden layers): f(x) = max(0, x)
  • Sigmoid (output layer): f(x) = 1 / (1 + e^-x)
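The two formulas evaluated directly in NumPy as a quick sanity check (the sample inputs are illustrative):

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    x = np.array([-2.0, 0.0, 3.0])
    print(relu(x))     # [0. 0. 3.]
    print(sigmoid(x))  # approx. [0.119 0.5 0.953]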

Training Process

Training Configuration:
  • Epochs: 10 (full passes over the training data)
  • Batch Size: 64 (samples per weight update)
  • Validation Split: 10% of the training data
  • Optimizer: Adam (adaptive learning rates)
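
The matching training call, continuing from the compiled model above:

    history = model.fit(
        X_train_scaled, y_train,
        epochs=10,
        batch_size=64,
        validation_split=0.1,  # hold out 10% of the training data each epoch
        verbose=1,
    )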

Training Progress

[Figure: training accuracy and loss plotted against epoch (1–10), values on a 0.0–1.0 axis.]

Evaluation Insights

Performance Metrics

The model was evaluated on the test set to assess its effectiveness in classifying spam and non-spam emails.

  • Test Accuracy: 92.3% (correctly classified emails)
  • Test Loss: 0.247 (binary cross-entropy)
  • Precision: 90.9% (true positives / predicted positives)
  • Recall: 89.2% (true positives / actual positives)
Confusion Matrix (test set, n = 921):
  • True Negatives (non-spam correctly classified): 529
  • False Positives (non-spam flagged as spam): 32
  • False Negatives (spam reaching the inbox): 39
  • True Positives (spam correctly flagged): 321
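
An evaluation sketch that derives these metrics from the trained model; the 0.5 decision threshold is the usual default and an assumption here:

    from sklearn.metrics import confusion_matrix, precision_score, recall_score

    # Keras-reported loss and accuracy on the holdout set
    test_loss, test_acc = model.evaluate(X_test_scaled, y_test, verbose=0)
    print(f"loss={test_loss:.3f}  accuracy={test_acc:.3f}")

    # Threshold the sigmoid outputs at 0.5 to get hard spam/non-spam labels
    y_pred = (model.predict(X_test_scaled) > 0.5).astype(int).ravel()
    print(confusion_matrix(y_test, y_pred))  # rows: actual [non-spam, spam]
    print(precision_score(y_test, y_pred), recall_score(y_test, y_pred))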

Model Rationale

The architecture and training choices were based on proven practices for binary classification problems.

1. ReLU Activation: mitigates the vanishing-gradient problem in deep networks, allowing faster and more effective training.
2. Sigmoid Output: well suited to binary classification, as it outputs a probability between 0 and 1 for the positive class.
3. Adam Optimizer: combines the benefits of RMSProp and momentum, adapting the learning rate for each parameter.
4. Standardization: ensures all features contribute on a comparable scale, improving convergence.
5. Layer Sizes: the progressively narrowing architecture (57→64→32→1) supports effective feature extraction and transformation.

Conclusion

Successfully built a TensorFlow neural network with a 64-32-1 architecture that achieved high accuracy on the Spambase dataset. The model effectively distinguishes between spam and non-spam emails based on 57 features, with over 92% accuracy on the test set.

Business Implications

Practical Benefits

  • Email Filtering: Reduces spam reaching customer inboxes
  • User Experience: Enhances email system usability
  • Resource Efficiency: Minimizes storage and processing of unwanted messages

Future Enhancements

  • Regularization: Add dropout or L2 regularization to prevent overfitting (see the sketch after this list)
  • Hyperparameter Tuning: Optimize epochs, batch size, and layer sizes
  • Advanced Features: Incorporate text embedding techniques for better feature representation
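
A minimal sketch of the regularization enhancement, reusing the same architecture; the dropout rate (0.3) and L2 penalty (1e-4) are illustrative starting points, not tuned values:

    from tensorflow import keras
    from tensorflow.keras import regularizers

    regularized = keras.Sequential([
        keras.Input(shape=(57,)),
        keras.layers.Dense(64, activation="relu",
                           kernel_regularizer=regularizers.l2(1e-4)),
        keras.layers.Dropout(0.3),  # randomly zero 30% of activations during training
        keras.layers.Dense(32, activation="relu",
                           kernel_regularizer=regularizers.l2(1e-4)),
        keras.layers.Dropout(0.3),
        keras.layers.Dense(1, activation="sigmoid"),
    ])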

Final Assessment

The neural network provides a robust foundation for spam detection with minimal preprocessing requirements. Its sequential architecture balances simplicity with effectiveness, making it suitable for deployment in retail email systems. The model achieved an excellent balance of precision and recall, ensuring both minimal false positives (legitimate emails classified as spam) and false negatives (spam emails reaching the inbox).