NLP C1_W1_Assignment : Frequency Dictionary missing a few entries?

lkj · February 23, 2025, 2:50am

Week 1, NLP C1_W1_Assignment
Main issue:
In the “Prepare the Data” section of the assignment, I get the following mismatch (also highlighted in the screenshot):

len(freqs) = 11406
does not match the Expected output:
len(freqs) = 11436

Please note that this happens before any of the exercises and before any changes on my part. I followed the instructions to request a fresh copy to be sure.

I was hoping it was a typo so I went ahead.

I passed all the tests until Exercise 4, where I passed 11 tests but failed 5.
Also in Section 3 - Training your Model, my cost and trained weights were always off by a little:
My cost after training is 0.22525459.
My resulting vector of weights is [6e-08, 0.00053785, -0.00055884]

Expected Output:
The cost after training is 0.22522315.
The resulting vector of weights is [6e-08, 0.00053818, -0.0005583]
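
To see how far apart the two sets of numbers are, the gap can be quantified rather than eyeballed. This is only a sketch using the values quoted above; the `rel_tol` threshold is an arbitrary illustration and is not taken from the notebook's unit tests or the grader.

```python
import math

# Values copied from the post above.
my_cost, expected_cost = 0.22525459, 0.22522315
my_theta = [6e-08, 0.00053785, -0.00055884]
expected_theta = [6e-08, 0.00053818, -0.0005583]


def rel_diff(a, b):
    """Relative difference between a and b, measured against b."""
    return abs(a - b) / abs(b)


cost_gap = rel_diff(my_cost, expected_cost)
theta_gaps = [rel_diff(m, e) for m, e in zip(my_theta, expected_theta)]

# rel_tol is an arbitrary illustrative threshold, not the grader's tolerance.
cost_matches = math.isclose(my_cost, expected_cost, rel_tol=1e-6)

print(f"cost relative gap: {cost_gap:.2e}, matches: {cost_matches}")
for i, gap in enumerate(theta_gaps):
    print(f"theta[{i}] relative gap: {gap:.2e}")
```

A gap in the fourth significant digit is larger than what rounding alone would normally produce, which is consistent with the training inputs (here, the frequency dictionary) being different.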

I’d really appreciate it if someone could clarify the first issue of the missing frequency dictionary entries.

paulinpaloalto · February 23, 2025, 4:25am

Yes, I just reran my notebook that used to pass all the tests and the grader and I get the same error that you show. In addition to that mismatching expected length, several of the unit tests in the notebook now fail for me.

This must be another side effect of the nltk data being changed. Here’s a recent thread about similar problems in the NLP C1 W2 assignment. Mentor @Deepti_Prasad reported that problem to the course staff, but we have not yet heard back from them.

Deepti, is it possible to add a note to your previous report suggesting that they scan all the NLP courses to see which assignments depend on imported NLTK data?

Deepti_Prasad · February 23, 2025, 5:47am

hi @paulinpaloalto

I will make sure to add this week too.

The nltk data issue actually started with the Course 3 Week 1 assignment, which was addressed a few months ago. I will make sure to report this assignment and, more generally, the nltk data version (it probably needs to be updated, or pinned to the version the assignment was created with).

@lkj thank you for reporting this.

Regards
DP

lkj · February 23, 2025, 2:02pm

thanks @paulinpaloalto @Deepti_Prasad for the quick response.

paulinpaloalto · February 23, 2025, 3:16pm

Hi, Deepti.

Thanks very much for working with the course staff to get this fixed. I’m not familiar with the nltk website, but maybe we should encourage them to find a “permanent” solution to this issue: either a way for the assignments to request a particular version of the data or, if the data cannot be guaranteed to be stable, importing copies of the data so that it stays fixed.
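
One way such a check could look, as a rough sketch only: record a fingerprint of the word list when the assignment is authored, then compare it when the notebook runs. The function name `fingerprint_words` and both word lists below are hypothetical stand-ins, not the course's code or the real nltk stopwords.

```python
import hashlib


def fingerprint_words(words):
    """Return a SHA-256 hex digest of a word list, ignoring order and duplicates."""
    canonical = "\n".join(sorted(set(words)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical stand-ins for the stopword list at authoring time and today.
stopwords_at_authoring = ["a", "the", "is", "not"]
stopwords_today = ["a", "the", "is", "not", "he'd"]

pinned = fingerprint_words(stopwords_at_authoring)
current = fingerprint_words(stopwords_today)

if current != pinned:
    print("Stopword list differs from the pinned copy; expected outputs may be stale.")
```

A check like this would not fix the drift, but it would turn a silent change in the data into an explicit warning next to the expected outputs.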

Regards,
Paul

Deepti_Prasad · February 23, 2025, 8:15pm

hi @paulinpaloalto

I suggested the same solution, pinning the version of the nltk data, when I informed the l.t. of the course.

Regards
DP

Deepti_Prasad · February 24, 2025, 3:05pm

hi @lkj

As per the recent update by @lucas.coutinho, the problem with process tweet has been addressed through a correction to the stopword metadata. Please close and reopen the lab to see the changes.

Regards
DP

lkj · February 24, 2025, 4:08pm

Hi @Deepti_Prasad
I ran the updated lab but still get the mismatch (see screenshot below).
Should I still try logging out and getting a fresh lab? Or is my output now the actual “expected output”?

Deepti_Prasad · February 24, 2025, 4:11pm

@lucas.coutinho

Can you please check the Week 1 assignment once more as well?

Regards
DP

Deepti_Prasad · February 24, 2025, 4:13pm

Try refreshing your classroom page and then opening the lab. Does it still give the same issue?

lkj · February 24, 2025, 4:28pm

yes.
(just to confirm: I did indeed get the pop up that said unittest had been changed when I opened the updated file this morning)

lucas.coutinho · February 24, 2025, 5:06pm

I ended up patching the notebook update as a silent update, so if you want to get the updated copy of it, you may refresh your workspace as usual to get a new copy, but it shouldn’t impact the unittests or the grader.

lkj · February 24, 2025, 6:12pm

The freqs dictionary is still off. Is that OK?

type(freqs) = <class 'dict'>
len(freqs) = 11406

Expected output

type(freqs) = <class 'dict'>
len(freqs) = 11436

Deepti_Prasad · February 24, 2025, 6:19pm

@lkj did the subsequent unit test fail again? Can you run the notebook down to the unit test and confirm?

lkj · February 25, 2025, 12:44am

no, it passed - thanks!

It was just this markdown part that was misleading:

Deepti_Prasad · February 25, 2025, 1:54am

hi @lkj

According to the recent update, the length of freqs differs because of the stopwords used by process tweet. This is currently addressed through changes made to the metadata of the assignment notebook.
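
To illustrate why a change in the stopword list changes the number of entries in the frequency dictionary, here is a simplified sketch. It is not the assignment's code: `build_freqs_sketch`, the whitespace tokenizer, the example tweets, and both stopword sets are assumptions made up for illustration.

```python
def build_freqs_sketch(tweets, labels, stopwords):
    """Count (word, label) pairs, skipping any word found in `stopwords`."""
    freqs = {}
    for tweet, label in zip(tweets, labels):
        for word in tweet.lower().split():
            if word in stopwords:
                continue
            pair = (word, label)
            freqs[pair] = freqs.get(pair, 0) + 1
    return freqs


# Made-up data: label 1 is positive, label 0 is negative.
tweets = ["i am happy today", "i am not happy", "she is not sad"]
labels = [1, 0, 1]

# Hypothetical stopword sets: the "new" one has one extra word.
old_stopwords = {"i", "am", "is", "she"}
new_stopwords = old_stopwords | {"not"}

freqs_old = build_freqs_sketch(tweets, labels, old_stopwords)
freqs_new = build_freqs_sketch(tweets, labels, new_stopwords)

# Keys that disappear once the extra stopword is filtered out.
missing = set(freqs_old) - set(freqs_new)
print(len(freqs_old), len(freqs_new), sorted(missing))
```

Each word added to the stopword list can remove up to two keys (one per label), so a modest change in the list is enough to account for a difference of a few dozen entries.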

This was done as a temporary fix, because changing the process data would also require autograder changes, which might take some more time.

So the l.t. @lucas.coutinho resolved the current issue this way, since changing both the autograder and process tweet in the assignment might take a little longer.

Please go ahead for now; you should not have any problem with submission or the unit test cells. If you encounter any such issue, please let us know. Thank you again for reporting this and for being patient while the staff addressed the issue.

Regards
DP

lkj · February 25, 2025, 2:43am

thanks very much @Deepti_Prasad @paulinpaloalto @lucas.coutinho for your help
