The comment at the end of the assignment claims that "the model generates text that makes sense capturing dependencies between words and without any input", but I wouldn't say that was true based on my output from two runs:
“ay maysem of way and rojerpudy i”
“in the love liber of his pariola”
“usis’ my lookly. for by a lucin-”
“timer’s. so know you take,”
“would indirate it pleaser difley”
“thet is glat: tentuin,”
“[exeunt]”
So the last item is a common stage direction in Shakespeare, and some of the prepositional phrases are reasonable, but that's the only sign I see of any word dependency being captured, and even then the dependency spans at most 3 words, which a word-level trigram could handle at least as well.
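For concreteness, this is roughly the kind of word-level trigram sampler I have in mind. It's just a minimal sketch; the corpus file name ("shakespeare.txt") and the uniform random sampling are my own assumptions, not anything from the assignment:

```python
import random
from collections import defaultdict

def build_trigrams(path):
    """Collect every word observed after each pair of consecutive words."""
    words = open(path, encoding="utf-8").read().split()
    table = defaultdict(list)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        table[(w1, w2)].append(w3)
    return table

def generate(table, length=20):
    """Sample a word sequence by repeatedly picking a continuation of the last two words."""
    w1, w2 = random.choice(list(table.keys()))  # random starting bigram
    out = [w1, w2]
    for _ in range(length):
        candidates = table.get((w1, w2))
        if not candidates:  # dead end: this bigram never continued in the corpus
            break
        w1, w2 = w2, random.choice(candidates)
        out.append(w2)
    return " ".join(out)

if __name__ == "__main__":
    print(generate(build_trigrams("shakespeare.txt")))
```

I'd expect something like this to produce locally plausible two- or three-word stretches, which is about all I can see in the GRU samples above.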
In fact, just looking at the output (without the unit tests or 100% result from the grader), I would have guessed that there was a bug in my code.
What was the point of doing character prediction if the results are so poor? I get that the computations with such a small vocabulary are more tractable, but since the actual results used a pre-trained model anyway, why not use a decent pre-trained model?
If I hadn't gone into this week already knowing that GRUs were useful, I certainly wouldn't have gotten that impression from the assignment.
That is not a fair comparison. Why not compare sentence-level, paragraph-level or document-level trigrams?
Learning. Not every class should be about sub-word-level modelling (which in most cases is the optimal choice). Character-level modelling is one of the most obvious ways to model language, but it has its limitations.
If you're not impressed by the results of RNNs, try achieving the same with regular NNs or even classical techniques (all at the character level).
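To make "classical techniques" concrete, here is a minimal sketch of an order-n character Markov chain; the file name, the order of 5, and the frequency-weighted sampling are assumptions on my part, not part of the assignment:

```python
import random
from collections import Counter, defaultdict

def build_char_model(path, order=5):
    """Count which character follows each context of `order` characters."""
    text = open(path, encoding="utf-8").read()
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        counts[text[i:i + order]][text[i + order]] += 1
    return counts

def sample_text(counts, order=5, length=200):
    """Generate text by sampling the next character in proportion to its observed frequency."""
    context = random.choice(list(counts.keys()))  # random seed context
    out = context
    for _ in range(length):
        dist = counts.get(out[-order:])
        if not dist:  # unseen context: stop generating
            break
        chars, weights = zip(*dist.items())
        out += random.choices(chars, weights=weights)[0]
    return out

if __name__ == "__main__":
    print(sample_text(build_char_model("shakespeare.txt")))
```

Its context is hard-capped at `order` characters, which is exactly the kind of limitation a recurrent hidden state is meant to relax.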
I understand that people might have unreasonable expectations (up to AGI) of relatively simple architectures trained on simple datasets, but I think the goal of the course is to show a range of NLP techniques (and I personally am happy that at least one class was character-level).
I generally agree with @David_Fox's sentiment here. I think the assignment would benefit from setting clearer expectations about what kind of output we should see at the end. Even though I've completed the exercises correctly, and despite the assignment text claiming:
you can see that the model generates text that makes sense…
the output is fairly nonsensical and underwhelming. Earlier, the assignment mentions:
The model was only trained for 1 step due to the constraints of this environment. Even on a GPU accelerated environment it will take many hours for it to achieve a good level of accuracy.
Perhaps it would have been helpful to mention this point again at the end of the assignment to set clearer expectations for the model output?
I understand your points and I agree with you: expectations could have been managed better, in the sense that the "positivity/expectations" could have been geared towards learning rather than towards performance.
But I would also argue that nothing is perfect (at least, something is imperfect in every single person's eyes) and there are always ways to improve. The questions we ask when approaching the course should be more like "What can I learn from this?", "Why did people do things this way back then, and what would I have done differently?", and "What are the similarities and differences between these approaches?", rather than focusing only on "This model is going to be great!" (though having some of that is good too).