Week 2, Assignment 2 | Why was the LSTM model giving much lower test accuracy than the plain word averaging model?

saikirandiddi · January 14, 2025, 8:01pm

I thought the LSTM model, being more sophisticated, would achieve better test accuracy than the plain word vector averaging model that was trained first. Can someone shed some light on this?

SNaveenMathew · January 15, 2025, 12:01am

The two models are fundamentally different in terms of assumptions, parameters, hyperparameters, etc., and their bias-variance characteristics differ. So it's impossible to say which model will perform better without specifying the experimental setup.

That said, it's possible to discuss the expected behavior. Even if both models converge to their respective optima, their performance depends (at the least) on training set size and model complexity. A more complex model (here, the LSTM) needs more training data, with a lot of variety, to perform well in the real world. With a small training set, it can fit the training data closely and still generalize worse than the simpler model.
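To put a rough number on "model complexity", here is a minimal sketch that counts trainable parameters for the two kinds of model. The sizes are my assumptions, not values taken from the assignment: 50-dimensional word vectors, 5 output classes, two stacked LSTM layers with 128 units each, and a frozen embedding layer (so it contributes no trainable weights). The helper names are hypothetical.

```python
# Assumed sizes (illustrative only, not read from the assignment notebook).
EMB_DIM = 50     # word vector dimension
N_CLASSES = 5    # number of output classes
N_HIDDEN = 128   # units per LSTM layer


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def lstm_params(n_in, n_hidden):
    """Standard LSTM cell: four gate/candidate blocks, each with
    weights over the concatenated [hidden state, input] plus a bias."""
    return 4 * (n_hidden * (n_hidden + n_in) + n_hidden)


# Word averaging model: average the word vectors, then one softmax layer.
avg_model = dense_params(EMB_DIM, N_CLASSES)

# LSTM model: two stacked LSTM layers, then one softmax layer.
lstm_model = (
    lstm_params(EMB_DIM, N_HIDDEN)
    + lstm_params(N_HIDDEN, N_HIDDEN)
    + dense_params(N_HIDDEN, N_CLASSES)
)

print("averaging model parameters:", avg_model)
print("LSTM model parameters:", lstm_model)
print("ratio:", round(lstm_model / avg_model))
```

Under these assumptions the LSTM has several hundred times more trainable parameters than the averaging model, so with only a small training set it has far more room to overfit.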
