I had a few questions regarding the final assignment:
(1) In my current notebook, the padded sequences array is longer [11 elements] than the one shown in the example [5 elements]. Could that be causing the low accuracy of the trained model, and is additional truncation needed? Please see the grader output below:
All tests passed for n_gram_seqs!
Details of failed tests for pad_seqs
Failed test case: There was an error grading this function. Details shown in 'got' value below:
Expected:
no exceptions,
but got:
operands could not be broadcast together with shapes (5,11) (5,5).
All tests passed for features_and_labels!
All tests passed for create_model!
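From the broadcast error, it looks as if the grader pads the example sequences to length 5 while my function pads them to 11, so arrays of shapes (5,11) and (5,5) cannot be compared. My understanding is that pad_seqs should simply respect the maxlen it is given; here is a minimal sketch of that, using names I am assuming from the notebook:

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

def pad_seqs(input_sequences, maxlen):
    # Pad (and, if needed, truncate) every sequence to the maxlen the caller
    # passes in, rather than to the longest sequence found in this batch.
    # 'pre' padding keeps the word to be predicted at the end of each row.
    return pad_sequences(input_sequences, maxlen=maxlen, padding='pre')
```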
(2) I tried increasing the number of embedding dimensions and scaling up the number of LSTM units, but with this notebook I remain stuck well below the required 80% training accuracy. Please see the grader output below:
Failed test case: training accuracy under 80%.
Expected:
0.8 or above,
but got:
0.07560470700263977
What is the best way to proceed in order to reach a higher training accuracy? Is further preprocessing of the inputs needed to get there?
Thanks Balaji, will try that. I updated the epoch count to 10, since I seemed to be so far from optimal that I wanted to check whether the model was converging at all before increasing the number of epochs. Will report back after fixing Maxlen and setting the total number of epochs back to 50.
Those hints were very helpful. After fixing Maxlen and increasing the number of epochs to 50, the model started to converge steadily. This has been such an enriching course; thanks so much for your help in getting this and the other assignments finished. So excited to have been able to finish before the end of 2022. Best, Priya
Hmm, hard-coding Maxlen causes issues during grading, so I am going to give it another go to get a more stable solution. The model convergence part is now perfect, but I am running into expected-shape issues in the graded-function portion. I feel like the right answer is very close, though!
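In case it helps anyone else, the fix I am trying is to derive the length from the data instead of hard-coding a literal, roughly like this (a sketch, with toy data standing in for the notebook's n-gram sequences):

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Toy stand-in for the n-gram token sequences built earlier in the notebook.
input_sequences = [[4, 2], [4, 2, 66], [4, 2, 66, 8], [5, 9]]

# Derive the length from the data instead of hard-coding a value like 11.
max_sequence_len = max(len(seq) for seq in input_sequences)

# Pre-padding keeps the word to be predicted at the end of each row.
padded = pad_sequences(input_sequences, maxlen=max_sequence_len, padding='pre')
print(padded.shape)  # (4, 4) here; (num_sequences, max_sequence_len) in general
```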
I first thought the comments were saying that we should leave the output_dim of the embedding layer at 100, and it was impossible to achieve the required accuracy that way. In case anyone else is having this problem: it seems you do need to change this number.
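For anyone hitting the same wall, here is a rough sketch of a create_model along those lines; the layer sizes are illustrative guesses on my part, not the official solution, and total_words and max_sequence_len come from earlier in the notebook:

```python
import tensorflow as tf

def create_model(total_words, max_sequence_len, output_dim=128):
    # output_dim is a tunable hyperparameter: leaving it at the value
    # mentioned in the comments (100) is not required, and raising it
    # can make the >= 0.8 training-accuracy target reachable.
    model = tf.keras.Sequential([
        # Features are the sequences minus their last word (the label),
        # hence an input length of max_sequence_len - 1.
        tf.keras.layers.Embedding(total_words, output_dim,
                                  input_length=max_sequence_len - 1),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(150)),
        tf.keras.layers.Dense(total_words, activation='softmax'),
    ])
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model
```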