Improving training accuracy of LSTM in C3W4 assignment

Hello All,

I had a few questions regarding the final assignment:

(1) In my current notebook, the padded-sequence array is longer (11 elements) than the one shown in the example (5 elements). Could that be causing the low accuracy of the trained model, or is an additional truncation needed? Please see the grader output below:

All tests passed for n_gram_seqs!
Details of failed tests for pad_seqs

Failed test case: There was an error grading this function. Details shown in ‘got’ value below:.
Expected:
no exceptions,
but got:
operands could not be broadcast together with shapes (5,11) (5,5) .

All tests passed for features_and_labels!
All tests passed for create_model!
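For reference, that pad_seqs failure looks like NumPy's broadcast error: the grader compares the returned padded array (shape (5, 11) here) elementwise against its expected (5, 5) array. The mismatch can be reproduced with two arrays of those shapes:

```python
import numpy as np

# Shapes taken from the grader message: my padded output vs. the expected one.
got = np.zeros((5, 11))
expected = np.zeros((5, 5))

try:
    got - expected  # any elementwise op attempts broadcasting first
except ValueError as e:
    msg = str(e)

print(msg)
```

So the error is about the shape of the padded output, not the model itself.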

(2) I tried increasing the number of embedding dimensions and scaling up the number of LSTM units, but with this notebook the training accuracy stays pinned well below 80%. Please see the grader output below:

Failed test case: training accuracy under 80%.
Expected:
0.8 or above,
but got:
0.07560470700263977

What is the best way to proceed to reach a higher training accuracy? Is further pre-processing of the inputs needed to get there?

Best Regards, Priya

Here are a few hints for you:

  1. In the pad_seqs function, fix the maxlen parameter.
  2. The starter code allows you to use 50 epochs for training the model. Did you have a reason to use just 10 epochs?
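For the first hint, the idea is that every n-gram sequence must be pre-padded to the same maxlen. A minimal sketch of that behaviour in plain Python (the notebook itself uses Keras pad_sequences; the function name here is illustrative):

```python
def pad_pre(sequences, maxlen):
    # Left-pad (or left-truncate) each sequence to exactly maxlen tokens,
    # mirroring Keras pad_sequences(..., padding='pre').
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]  # keep at most the last maxlen tokens
        padded.append([0] * (maxlen - len(seq)) + seq)
    return padded

# n-gram sequences of varying length, all padded to maxlen=5
print(pad_pre([[2, 3], [2, 3, 4, 5, 6, 7]], maxlen=5))
# → [[0, 0, 0, 2, 3], [3, 4, 5, 6, 7]]
```

The key point is that maxlen must come from the function's argument, not a value baked in for one particular corpus.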

Thanks Balaji, will try that. I reduced the epochs to 10 because I seemed so far from optimal that I wanted to check whether the model was converging at all before training longer. I'll get back to you after fixing maxlen and setting the number of epochs back to 50.

Hi Balaji,

Those hints were very helpful. After fixing maxlen and increasing the number of epochs to 50, the model started to converge steadily, as shown below. This has been such an enriching course; thanks so much for your help in getting this and the other assignments finished. So excited to have been able to finish before the end of 2022. Best, Priya

Here’s what I saw for epochs 1 through 15:
Epoch 1/50
484/484 [==============================] - 15s 10ms/step - loss: 6.8994 - accuracy: 0.0226
Epoch 2/50
484/484 [==============================] - 5s 10ms/step - loss: 6.4586 - accuracy: 0.0279
Epoch 3/50
484/484 [==============================] - 5s 10ms/step - loss: 6.2517 - accuracy: 0.0343
Epoch 4/50
484/484 [==============================] - 5s 10ms/step - loss: 6.0734 - accuracy: 0.0418
Epoch 5/50
484/484 [==============================] - 5s 10ms/step - loss: 5.8801 - accuracy: 0.0507
Epoch 6/50
484/484 [==============================] - 5s 10ms/step - loss: 5.6829 - accuracy: 0.0598
Epoch 7/50
484/484 [==============================] - 5s 11ms/step - loss: 5.4754 - accuracy: 0.0698
Epoch 8/50
484/484 [==============================] - 5s 10ms/step - loss: 5.2595 - accuracy: 0.0847
Epoch 9/50
484/484 [==============================] - 5s 10ms/step - loss: 5.0364 - accuracy: 0.0992
Epoch 10/50
484/484 [==============================] - 5s 10ms/step - loss: 4.7981 - accuracy: 0.1209
Epoch 11/50
484/484 [==============================] - 5s 10ms/step - loss: 4.5654 - accuracy: 0.1390
Epoch 12/50
484/484 [==============================] - 5s 10ms/step - loss: 4.3210 - accuracy: 0.1645
Epoch 13/50
484/484 [==============================] - 5s 10ms/step - loss: 4.0730 - accuracy: 0.1924
Epoch 14/50
484/484 [==============================] - 5s 10ms/step - loss: 3.8206 - accuracy: 0.2231
Epoch 15/50
484/484 [==============================] - 5s 10ms/step - loss: 3.5657 - accuracy: 0.2562

And for epochs 40 through 50:

Epoch 40/50
484/484 [==============================] - 5s 10ms/step - loss: 0.7504 - accuracy: 0.8173
Epoch 41/50
484/484 [==============================] - 5s 10ms/step - loss: 0.7276 - accuracy: 0.8233
Epoch 42/50
484/484 [==============================] - 5s 10ms/step - loss: 0.7028 - accuracy: 0.8271
Epoch 43/50
484/484 [==============================] - 5s 10ms/step - loss: 0.7060 - accuracy: 0.8232
Epoch 44/50
484/484 [==============================] - 5s 10ms/step - loss: 0.7047 - accuracy: 0.8220
Epoch 45/50
484/484 [==============================] - 5s 10ms/step - loss: 0.7039 - accuracy: 0.8213
Epoch 46/50
484/484 [==============================] - 5s 10ms/step - loss: 0.6965 - accuracy: 0.8240
Epoch 47/50
484/484 [==============================] - 5s 10ms/step - loss: 0.6689 - accuracy: 0.8299
Epoch 48/50
484/484 [==============================] - 5s 10ms/step - loss: 0.6524 - accuracy: 0.8333
Epoch 49/50
484/484 [==============================] - 5s 10ms/step - loss: 0.6345 - accuracy: 0.8389
Epoch 50/50
484/484 [==============================] - 5s 10ms/step - loss: 0.6347 - accuracy: 0.8368

Hmm, hard-coding maxlen causes issues during grading. I'm going to give it another go to get a more stable solution: the model convergence part is now fine, but I'm running into expected-shape issues in the graded function. It feels like the right answer is very close, though!

Hi Balaji, I was able to resolve the issue around setting maxlen and achieve a graded accuracy greater than 80%. Thanks for your assistance!
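For anyone hitting the same thing, the fix amounts to deriving maxlen from the data rather than hard-coding it, so it works for both the notebook corpus and the grader's test inputs. A minimal sketch (function name is illustrative):

```python
def get_maxlen(input_sequences):
    # Derive the padding length from the longest n-gram sequence,
    # instead of baking in a number seen in one example.
    return max(len(seq) for seq in input_sequences)

print(get_maxlen([[2, 3], [2, 3, 4], [2, 3, 4, 5, 6]]))  # → 5
```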


I first thought the comments were saying we should leave the output_dim of the embedding layer at 100, and it was impossible to achieve the required accuracy that way. In case anyone else is having this problem: it seems you need to increase this number.
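For reference, a hedged sketch of what that change looks like. The layer sizes here (200-dim embedding, 150 LSTM units) are my own assumptions for illustration, not the reference solution:

```python
import tensorflow as tf

def create_model(total_words, max_sequence_len):
    # Sketch only: 200 and 150 are illustrative capacity choices.
    model = tf.keras.Sequential([
        # Inputs are the padded n-gram sequences minus the label token.
        tf.keras.layers.Input(shape=(max_sequence_len - 1,)),
        # output_dim below is the embedding dimension this comment refers to;
        # leaving it too small can cap the achievable training accuracy.
        tf.keras.layers.Embedding(input_dim=total_words, output_dim=200),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(150)),
        # One softmax unit per word in the vocabulary (next-word prediction).
        tf.keras.layers.Dense(total_words, activation='softmax'),
    ])
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    return model

model = create_model(total_words=3211, max_sequence_len=11)
print(model.output_shape)
```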
