The comment at the end of the assignment claims that "the model generates text that makes sense capturing dependencies between words and without any input", but I wouldn't say that was true based on my output from two runs:
“ay maysem of way and rojerpudy i”
“in the love liber of his pariola”
“usis’ my lookly. for by a lucin-”
“timer’s. so know you take,”
“would indirate it pleaser difley”
“thet is glat: tentuin,”
“[exeunt]”
So the last item is a common stage direction in Shakespeare, and some of the prepositional phrases are reasonable, but that's the only sign I see of any word dependency being captured, and even then the dependency spans at most 3 words, which a word-level trigram could handle at least as well.
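For concreteness, this is roughly the kind of word-level trigram sampler I have in mind. It's just a minimal sketch; the corpus file name ("shakespeare.txt") and the uniform random sampling are my own assumptions, not anything from the assignment:

```python
import random
from collections import defaultdict

def build_trigrams(path):
    """Collect every word observed after each pair of consecutive words."""
    words = open(path, encoding="utf-8").read().split()
    table = defaultdict(list)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        table[(w1, w2)].append(w3)
    return table

def generate(table, length=20):
    """Sample a word sequence by repeatedly picking a continuation of the last two words."""
    w1, w2 = random.choice(list(table.keys()))  # random starting bigram
    out = [w1, w2]
    for _ in range(length):
        candidates = table.get((w1, w2))
        if not candidates:  # dead end: this bigram never continued in the corpus
            break
        w1, w2 = w2, random.choice(candidates)
        out.append(w2)
    return " ".join(out)

if __name__ == "__main__":
    print(generate(build_trigrams("shakespeare.txt")))
```

I'd expect something like this to produce locally plausible two- or three-word stretches, which is about all I can see in the GRU samples above.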
In fact, just looking at the output (without the unit tests or 100% result from the grader), I would have guessed that there was a bug in my code.
What was the point of doing character prediction if the results are so poor? I get that the computations with such a small vocabulary are more tractable, but since the actual results used a pre-trained model anyway, why not use a decent pre-trained model?
If I hadn't gone into this week already knowing that GRUs were useful, I certainly wouldn't have gotten that impression from the assignment.
That is not a fair comparison. Why not compare sentence-level, paragraph-level or document-level trigrams?
Learning. Not every class should be about sub-word-level modelling (which in most cases is the optimal choice). Character-level modelling is one of the most obvious ways to model language, but it has its limitations.
If you're not impressed by the results of RNNs, try achieving the same with regular NNs or even classical techniques (all at the character level).
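To make "classical techniques" concrete, here is a minimal sketch of an order-n character Markov chain; the file name, the order of 5, and the frequency-weighted sampling are assumptions on my part, not part of the assignment:

```python
import random
from collections import Counter, defaultdict

def build_char_model(path, order=5):
    """Count which character follows each context of `order` characters."""
    text = open(path, encoding="utf-8").read()
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        counts[text[i:i + order]][text[i + order]] += 1
    return counts

def sample_text(counts, order=5, length=200):
    """Generate text by sampling the next character in proportion to its observed frequency."""
    context = random.choice(list(counts.keys()))  # random seed context
    out = context
    for _ in range(length):
        dist = counts.get(out[-order:])
        if not dist:  # unseen context: stop generating
            break
        chars, weights = zip(*dist.items())
        out += random.choices(chars, weights=weights)[0]
    return out

if __name__ == "__main__":
    print(sample_text(build_char_model("shakespeare.txt")))
```

Its context is hard-capped at `order` characters, which is exactly the kind of limitation a recurrent hidden state is meant to relax.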
I understand that people might have unreasonable expectations (up to AGI) of relatively simple architectures trained on simple datasets, but I think the goal of the course is to show a range of NLP techniques (and I personally am happy that at least one class was character-level).
I generally agree with @David_Fox's sentiment here. I think the assignment would benefit from setting clearer expectations about what kind of output we should see at the end. Even though I've completed the exercises correctly, and despite the assignment text claiming:
you can see that the model generates text that makes sense…
the output is fairly nonsensical and underwhelming. Earlier, the assignment mentions:
The model was only trained for 1 step due to the constraints of this environment. Even on a GPU accelerated environment it will take many hours for it to achieve a good level of accuracy.
Perhaps it would have been helpful to mention this point again at the end of the assignment to set clearer expectations for the model output?
I understand your points and I agree with you: expectations could have been managed better, in the sense that the "positivity/expectations" could have been geared towards learning rather than towards performance.
But I would also argue that nothing is perfect (at least, something is imperfect in every single person's eyes) and there are always ways to improve. The questions we ask when approaching the course should be more like "What can I learn from this?", "Why did people do things this way back then, and what would I have done differently?", and "What are the similarities and differences between these approaches?", rather than focusing only on "This model is going to be great!" (though having some of that is good too).