C3W1 Assignment Section 2.1 - df_all_breeds.pkl dataset

I do not believe my work is incorrect even though I got this error when running this section:

The dataset generated from the generator functions is not identical to the expect one

Adding the line below to the cell prints True, which suggests the mismatch is due to ordinary floating-point precision error rather than a mistake in my code.

print(np.allclose(df_all_breeds, pre_loaded_df))

Did others have this issue? And if so, is there a reason this hasn’t been addressed? I wasted a bit of time on this.

Are you referring to the inverse_cdf_gaussian() function and its tests?

Please post a screen capture image of your results.

I believe the tests in Section 2.1 only consist of some print statements. I don’t think any of them can generate the text “The dataset generated from the generator functions…”.

Note that this lab was updated on Aug 31. Are you using the newest version?

I added the print statement that starts with “Mine:”, the rest of it is not my code:

    if not df_all_breeds.equals(pre_loaded_df):
        print("The dataset generated from the generator functions is not identical to the expect one.\n\nFalling back to the pre-loaded one.")
        print("Mine: Is this due to floating point error: ", np.allclose(df_all_breeds, pre_loaded_df))
        df_all_breeds = pre_loaded_df

It prints:

The dataset generated from the generator functions is not identical to the expect one.

Falling back to the pre-loaded one.
Mine: Is this due to floating point error:  True

I just did an “update lab from latest version”, and it still shows a date of “Instructor update: May 17, 2023 6:31 PM EDT”. Note that I opened the lab for the first time either yesterday or the day before. Doing an equals on floating-point values is generally considered very bad practice, so I am assuming that is the issue.

I was incorrect about there being an update on Aug. 31. I was looking at the wrong repo. There are lots of assignments named C3 W1.

I have not previously heard of the issue you’re reporting.

I agree, using an equality operator on floating point values isn’t a great idea.
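To illustrate why exact equality is fragile here, this is a minimal sketch with made-up data (the column name "x" and the values are illustrative, not taken from the assignment). Two frames that differ only by rounding fail DataFrame.equals but pass np.allclose:

```python
import numpy as np
import pandas as pd

# Illustrative data only: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
a = pd.DataFrame({"x": [0.1 + 0.2, 1.0]})
b = pd.DataFrame({"x": [0.3, 1.0]})

# Exact, element-by-element comparison: fails on the tiny rounding difference.
exact = a.equals(b)

# Tolerance-based comparison (default rtol=1e-05, atol=1e-08): passes.
close = np.allclose(a, b)

print("equals:", exact)
print("allclose:", close)
```

This does not show what actually differs in the lab's dataset. It only shows that the two checks can disagree on values that are equal for practical purposes.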

Looks to me like the block of code you posted is an additional check on whether your code for these functions works as expected.

  • gaussian_generator
  • binomial_generator
  • uniform_generator

Can you post screen capture images of the results for those unit tests (from earlier in the notebook)?

Also, when you did “update from latest version”, did you also re-name your existing notebook file first?

No, I didn’t want to redo my work. That said, everything passed other than that error message, and I got 100% when I submitted my work (before I submitted the original post). I only posted because I was hoping to save others the pain I faced. Or better yet, someone who can fix the issue sees it and uses isclose instead of equals. Thanks for all your help! Much appreciated.
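As a sketch of what a tolerance-based check could look like, the helper below wraps pandas.testing.assert_frame_equal with check_exact=False. The name frames_match, the tolerances, and the toy columns are assumptions made for illustration. This is not the notebook's code or a confirmed fix from the course staff.

```python
import pandas as pd


def frames_match(left, right, rtol=1e-5, atol=1e-8):
    """Hypothetical helper: True if the frames agree within a tolerance."""
    try:
        # check_exact=False compares numeric values using rtol/atol;
        # labels, shapes, and (by default) dtypes must still match.
        pd.testing.assert_frame_equal(
            left, right, check_exact=False, rtol=rtol, atol=atol
        )
    except AssertionError:
        return False
    return True


# Toy stand-ins for df_all_breeds and pre_loaded_df (illustrative values only).
generated = pd.DataFrame({"height": [0.1 + 0.2, 35.0], "breed": [0, 1]})
expected = pd.DataFrame({"height": [0.3, 35.0], "breed": [0, 1]})
different = pd.DataFrame({"height": [0.31, 35.0], "breed": [0, 1]})

print(generated.equals(expected))         # exact comparison
print(frames_match(generated, expected))  # tolerant comparison
print(frames_match(generated, different))
```

Unlike np.allclose, this approach also copes with non-numeric columns and reports a mismatch if the column labels or dtypes differ.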

I would like to understand the pain you faced, so I can provide some data for the course staff. I’d like to give them a concrete example of when that message is triggered. It didn’t happen in my notebook, and I haven’t seen any reports of it.

Seeing your results from those functions might be useful.

Can I just privately message you a pdf of my assignment? Is that considered “cheating” since you are a super mentor? Or can you just view my assignment?

I have the same problem. When I run the provided code in C3_W1_Assignment Section 2.1:

# Read the pre-loaded dataset
pre_loaded_df = pd.read_pickle("df_all_breeds.pkl")

try:
    # Generate the dataset using the graded functions from section 1
    df_all_breeds = utils.generate_data(gaussian_generator, binomial_generator, uniform_generator)
except:
    # In case of an error
    print("There was an error when generating the dataset using the generator functions.\n\nFalling back to the pre-loaded one.")
    df_all_breeds = pre_loaded_df
else:
    # In case that the generated dataset does not match the pre-loaded one
    if not df_all_breeds.equals(pre_loaded_df):
        print("The dataset generated from the generator functions is not identical to the expected one.\n\nFalling back to the pre-loaded one.")
        df_all_breeds = pre_loaded_df

        
# Print the first 10 rows of the dataframe
df_all_breeds.head(10)

I got the following message:

The dataset generated from the generator functions is not identical to the expected one.

Falling back to the pre-loaded one.

I passed all the tests before this. What is possibly wrong with my code?

I previously submitted a ticket to address this issue in the notebook, but it is still pending completion.

Don’t worry about it, most likely there is no problem in your code.
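For anyone who wants to see what actually differs, a rough diagnostic along these lines could be run on the generated frame before it is overwritten by the fallback. The helper name describe_mismatch and the toy data are assumptions. It also assumes both frames have the same columns and row order. It reports columns whose dtypes differ (DataFrame.equals requires matching dtypes, so a dtype mismatch is one possible cause) and the largest absolute difference per numeric column:

```python
import numpy as np
import pandas as pd


def describe_mismatch(generated, expected):
    """Hypothetical helper: summarize how two same-shaped frames differ."""
    # Columns whose dtypes differ; DataFrame.equals treats these as unequal.
    dtype_diff = [
        col for col in expected.columns
        if generated[col].dtype != expected[col].dtype
    ]
    # Largest absolute difference in each numeric column.
    numeric = expected.select_dtypes(include="number").columns
    max_abs_diff = (
        generated[numeric].astype(float) - expected[numeric].astype(float)
    ).abs().max()
    return dtype_diff, max_abs_diff


# Toy stand-ins for df_all_breeds and pre_loaded_df (illustrative values only).
generated = pd.DataFrame({
    "height": [0.1 + 0.2, 35.0],
    "breed": np.array([0, 1], dtype="int32"),
})
expected = pd.DataFrame({
    "height": [0.3, 35.0],
    "breed": np.array([0, 1], dtype="int64"),
})

dtype_diff, max_abs_diff = describe_mismatch(generated, expected)
print("Columns with different dtypes:", dtype_diff)
print(max_abs_diff)
```

If the differences are on the order of floating-point rounding, or only the dtypes differ, that supports the view that the generator functions themselves are fine.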