Problem Statement: A company needs a TensorFlow neural network to classify 4,601 emails from the Spambase dataset as spam or non-spam using 57 features.
Approach: I preprocessed the data, built a sequential model with two hidden layers, trained it with Adam optimizer, and evaluated its performance for spam detection.
Loaded 4,601 emails with 57 features and binary spam/non-spam labels. Split the dataset into 80% training (with 10% used for validation) and 20% test sets to ensure proper model evaluation.
Applied StandardScaler to normalize features to mean=0 and standard deviation=1. This ensures consistent scaling across features for optimal neural network training.
The standardization was applied only to the training data, and then the same transformation was applied to test data to prevent data leakage.
Begin the project to build a basic neural network for spam detection.
Load TensorFlow, Keras, scikit-learn, numpy, pandas, and the Spambase dataset.
Split into training and test sets, apply standardization to normalize features.
Create a Sequential model with two hidden layers (64 and 32 neurons) and an output layer.
Configure model with Adam optimizer, binary cross-entropy loss, and accuracy metric.
Train for 10 epochs with batch size of 64, using validation data to monitor performance.
Test the model's accuracy and loss on the held-out test dataset.
Complete the neural network development with performance insights.
Implemented a Sequential neural network with three layers: two hidden layers with ReLU activation and an output layer with Sigmoid activation for binary classification.
5,921 trainable parameters (includes weights and biases)
Configured the model with appropriate loss function, optimizer, and metrics for binary classification.
Adam combines the benefits of two other extensions of stochastic gradient descent: AdaGrad and RMSProp, making it well-suited for a wide range of problems with noisy data.
ReLU (Rectified Linear Unit) returns x for positive values and 0 for negative values.
Benefits: Prevents vanishing gradient problem, computationally efficient, produces sparse activations.
Sigmoid squashes input values to range between 0 and 1, ideal for binary classification.
Benefits: Smooth, differentiable function that outputs probabilities for binary classification.
The model was trained using the following parameters to optimize performance while balancing computational efficiency.
The model was evaluated on the 20% test set that was held out during training to assess its generalization capability.
The architectural choices were deliberately made to optimize performance for this binary classification task.
Successfully built a TensorFlow neural network with a 64-32-1 architecture for spam detection, achieving a 92.3% accuracy on the Spambase dataset. The model effectively leverages 57 email features to distinguish between spam and legitimate emails with high confidence.
This basic neural network demonstrates the power of even relatively simple deep learning architectures for classification tasks. With just two hidden layers, the model achieves high accuracy on email spam detection, providing a solid foundation for more sophisticated enhancements in future iterations.