Comprehensive RNN Model Comparison on SST-5 Dataset
Bidirectional LSTM: Excels by processing sequences in both directions, capturing full context for sentiment analysis. Shows overfitting with rising validation loss.
Vanilla RNN: Surprisingly outperforms more complex models on the test set. Limited by vanishing gradients but shows less overfitting.
LSTM: Underperforms expectations despite its gating mechanisms. May require more training epochs or regularisation techniques.
GRU: Performs identically to the LSTM. More efficient architecture but similar learning limitations in this configuration.
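The value of bidirectional processing can be illustrated with a minimal sketch. This is not the study's implementation: it uses a plain tanh RNN cell rather than gated LSTM cells, and all dimensions and weights are hypothetical stand-ins. It shows only the core idea, that running the sequence in both directions and concatenating the final hidden states gives the classifier context from the whole sentence.

```python
import numpy as np

# Hypothetical dimensions for illustration; a real Bidirectional LSTM
# would use gated cells and learned embeddings.
rng = np.random.default_rng(0)
emb_dim, hid_dim, seq_len = 8, 4, 5

W_x = rng.normal(scale=0.1, size=(hid_dim, emb_dim))  # input-to-hidden weights
W_h = rng.normal(scale=0.1, size=(hid_dim, hid_dim))  # hidden-to-hidden weights

def run_rnn(sequence):
    """Run a simple tanh RNN over a sequence, returning the final hidden state."""
    h = np.zeros(hid_dim)
    for x in sequence:
        h = np.tanh(W_x @ x + W_h @ h)
    return h

sentence = rng.normal(size=(seq_len, emb_dim))  # stand-in word embeddings

h_fwd = run_rnn(sentence)        # left-to-right pass
h_bwd = run_rnn(sentence[::-1])  # right-to-left pass

# Concatenating both directions gives the downstream classifier access to
# context from the entire sentence, not just the words seen so far.
h_bi = np.concatenate([h_fwd, h_bwd])
print(h_bi.shape)  # (8,)
```

A unidirectional model would feed only `h_fwd` to the classifier; the concatenated `h_bi` is what lets the bidirectional variant condition on both past and future tokens.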
Bidirectional LSTM significantly outperformed all other models with 38.51% accuracy, demonstrating the value of bidirectional context in sentiment classification.
Vanilla RNN unexpectedly achieved 26.43% accuracy, outperforming both LSTM (23.08%) and GRU (23.08%) on the test set despite its theoretical limitations.
Bidirectional LSTM showed clear overfitting with training accuracy reaching 85.82% while validation loss increased from 1.36 to 2.07.
LSTM and GRU models plateaued early with identical final accuracies, suggesting insufficient training epochs or architectural limitations for this task.
All models struggled with the 5-class sentiment classification task, with performance only modestly exceeding the random baseline of 20%.
SST-5's fine-grained sentiment classes (very negative to very positive) proved challenging, requiring more sophisticated architectures than binary sentiment tasks.
For immediate deployment, Bidirectional LSTM offers the best performance but requires overfitting mitigation through regularisation techniques.
Vanilla RNN provides surprising value for resource-constrained environments, offering decent performance with minimal computational requirements.
Consider task complexity: use Bidirectional LSTM for nuanced sentiment analysis, simpler models for binary classification tasks.