Main issue:
In the NLP C1_W1_Assignment, in the “Prepare the Data” section, I get the following mismatch (as highlighted in the screenshot):
len(freqs) = 11406
does not match the Expected output:
len(freqs) = 11436
Please note: this is before any of the exercises or any changes on my part. I followed the instructions to request a fresh copy to be sure.
I was hoping it was a typo, so I went ahead.
I passed all the tests until Exercise 4, where I passed 11 tests but failed 5.
Also, in Section 3 (Training Your Model), my cost and trained weights were consistently off by a little:
My cost after training is 0.22525459.
My resulting vector of weights is [6e-08, 0.00053785, -0.00055884]
Expected Output:
The cost after training is 0.22522315.
The resulting vector of weights is [6e-08, 0.00053818, -0.0005583]
I’d really appreciate it if someone could clarify the first issue of the missing frequency dictionary entries.
Yes, I just reran my notebook, which used to pass all the tests and the grader, and I get the same error that you show. In addition to the mismatched expected length, several of the unit tests in the notebook now fail for me.
This must be another side effect of the nltk data being changed. Here’s a recent thread about similar problems in the NLP C1 W2 assignment. Mentor @Deepti_Prasad reported that other problem to the course staff, but we have not yet heard back from them.
Deepti, is it possible to add a note to your previous report suggesting that they scan all the NLP courses to see which assignments depend on imported NLTK data?
The nltk data issue actually started with the Course 3 Week 1 assignment, which was addressed a few months ago. I will make sure to report this assignment and, more generally, the nltk data version (it probably needs to be updated, or pinned to the version the assignment was created with).
Thanks very much for working with the course staff to get this fixed. I’m not familiar with the nltk website, but maybe we should encourage them to find a “permanent” solution to this issue. That is, either a way for the assignments to request a particular version of the data or, if the data cannot be guaranteed to be stable, importing fixed copies of it into the assignments.
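To make the "fixed copies" idea concrete, here is a minimal sketch of loading a word list from a local file that ships with the assignment and checking it against a recorded checksum. The function name `load_pinned_wordlist`, the file layout (one word per line), and the checksum check are all assumptions for illustration; this is not how the course notebooks or nltk actually load their data.

```python
import hashlib
from pathlib import Path


def load_pinned_wordlist(path, expected_sha256=None):
    """Load a newline-separated word list from a local, fixed copy.

    Hypothetical helper: if expected_sha256 is given, raise when the file
    content no longer matches the checksum recorded with the assignment.
    Returns the list of words and the SHA-256 hex digest of the file.
    """
    raw = Path(path).read_bytes()
    digest = hashlib.sha256(raw).hexdigest()
    if expected_sha256 is not None and digest != expected_sha256:
        raise ValueError(
            f"{path} has changed: sha256 {digest} != {expected_sha256}"
        )
    # Strip whitespace and skip blank lines.
    words = [line.strip() for line in raw.decode("utf-8").splitlines()]
    return [w for w in words if w], digest
```

The first call can be made without a checksum to obtain the digest; storing that digest next to the expected outputs would make any later change to the data fail loudly instead of silently shifting numbers such as len(freqs).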
As per a recent update by @lucas.coutinho, the problem with process tweet has been addressed with a correction to the stopword metadata. Please close and reopen the lab to see the changes.
Hi @Deepti_Prasad
I ran the updated lab but still get the mismatch (see screenshot below).
Should I still try logging out and getting a fresh lab? Or is my output now the actual “expected output”?
I ended up pushing the notebook patch as a silent update. If you want the updated copy, you can refresh your workspace as usual to get it, but the patch shouldn’t affect the unit tests or the grader.
According to the recent update, the freqs length differs because of the stopwords used by process tweet, which has now been addressed by changes to the metadata of the assignment notebook.
This is a temporary fix, since changing the data processing would also require autograder changes, which might take more time.
So the L.T. @lucas.coutinho resolved the current issue this way, because changing both the autograder and process tweet in the assignment would take longer.
Please go ahead for now; you should not have any problem with submission or the unit test cells. If you do encounter any such issue, please let us know. Thank you again for reporting this and for being patient while the staff addressed the issue.
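For anyone wondering why a stopword change alone can shift len(freqs), here is a small self-contained sketch. It is a simplified stand-in, not the assignment's code: tokenization is plain lowercase whitespace splitting with no stemming, and the tweets and both stopword sets are made up for illustration.

```python
from collections import Counter


def build_freqs(tweets, labels, stopwords):
    """Count (word, label) pairs, skipping any word in the stopword set."""
    freqs = Counter()
    for tweet, label in zip(tweets, labels):
        for word in tweet.lower().split():
            if word not in stopwords:
                freqs[(word, label)] += 1
    return freqs


# Toy data (hypothetical): label 1 = positive, 0 = negative.
tweets = ["I am not happy", "I am so happy", "not good"]
labels = [0, 1, 0]

# Two versions of a stopword list; the second one also contains "not".
old_stopwords = {"i", "am", "so"}
new_stopwords = {"i", "am", "so", "not"}

freqs_old = build_freqs(tweets, labels, old_stopwords)
freqs_new = build_freqs(tweets, labels, new_stopwords)

print(len(freqs_old))  # 4
print(len(freqs_new))  # 3
```

With the larger stopword set, the ("not", 0) entry disappears, so the dictionary has one fewer key. Any features computed from freqs change as well, which would be consistent with small differences in the trained cost and weights.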