C3W3_Assignment (outdated and confusing copy/paste comments in notebook)

Week 3

Exercise 4.2

Instructions

  • Loop through the incoming data in batch_size chunks; you will again define a tensorflow.data.Dataset to do so. This time you don’t need the labels, so you can just replace them with None,
  • compute v1, v2 using the model output,
  • for each element of the batch:
      - compute the cosine similarity of each pair of entries, v1[j], v2[j]
      - determine if d > threshold
      - increment accuracy if that result matches the expected results (y_test[j])
    Instead of running a for loop, you will vectorize all these operations to make things more efficient,
  • compute the final accuracy and confusion matrix and return. For the confusion matrix you can use the tf.math.confusion_matrix function.
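
For readers trying to picture what "vectorize all these operations" means in the instructions above, here is a minimal sketch on toy data. It is not the notebook's solution: it uses NumPy instead of TensorFlow, and the function name evaluate_pairs, the toy arrays, and the threshold value are all made up for illustration. The confusion matrix follows the same layout as tf.math.confusion_matrix (rows are true labels, columns are predicted labels).

```python
import numpy as np


def evaluate_pairs(v1, v2, y_true, threshold):
    """Vectorized accuracy and 2x2 confusion matrix for paired embeddings.

    Hypothetical helper for illustration only; not the assignment's function.
    """
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)
    y_true = np.asarray(y_true, dtype=int)

    # Cosine similarity of every row pair (v1[j], v2[j]) at once, no Python loop.
    dots = np.sum(v1 * v2, axis=1)
    norms = np.linalg.norm(v1, axis=1) * np.linalg.norm(v2, axis=1)
    d = dots / norms

    # Threshold the whole batch, then compare against the expected labels.
    y_pred = (d > threshold).astype(int)
    accuracy = float(np.mean(y_pred == y_true))

    # Rows index the true label, columns index the predicted label.
    cm = np.zeros((2, 2), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return accuracy, cm


# Toy data (assumed values, chosen so the result is easy to check by hand).
v1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]])
v2 = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [-1.0, 0.0]])
y_true = np.array([1, 0, 1, 1])

accuracy, cm = evaluate_pairs(v1, v2, y_true, threshold=0.7)
print(accuracy)
print(cm)
```

In the assignment itself, the data still arrives in batch_size chunks, so the outer loop over batches remains; only the per-element work inside each batch is vectorized.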

The instructions seem partially incorrect and confusing, and all test cases fail. How can I get help?

Exercise 5

def predict(question1, question2, threshold, model, verbose=False):
    """Function for predicting if two questions are duplicates.

    Args:
        question1 (str): First question.
        question2 (str): Second question.
        threshold (float): Desired threshold.
        model (tensorflow.keras.Model): The Siamese model.
        data_generator (function): Data generator function. Defaults to data_generator.
        verbose (bool, optional): If the results should be printed out. Defaults to False.

The function signature has no data_generator argument, even though the docstring documents one.

The code comment says to call the predict method:

# Call the predict method of your model and save the output into v1v2

v1v2 = model((question1, question2))

Hi @sugi205

You’re right, the docstring is not quite accurate.

That is true, you should use model.predict() (which unfortunately is very similar to the name of the function you’re implementing). Note that model.predict should take in the generator and verbose as arguments (since question1 and question2 are wrapped inside the generator).

Cheers

Thanks. How about exercise 4.2?

First of all, thanks. Second, a lot of the instructions and code comments contradict each other. It seems this particular notebook was not vetted properly and has a lot of last-minute copy/paste issues.

There was no instruction to split the concatenated outputs of the Siamese model. Some dead code was left in, which was confusing. The reason why the L2 norm does not need to be computed for the cosine similarity was noted in a different cell.
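
To make the two points above concrete, here is a small sketch rather than the notebook's code. It assumes the model returns both embeddings concatenated along the last axis in equal halves, and that each embedding is already L2-normalized. Neither assumption is spelled out in this thread. Under those assumptions the plain dot product already equals the cosine similarity, so no extra norm is needed. The helper names split_pairs and l2_normalize are hypothetical, and NumPy stands in for TensorFlow.

```python
import numpy as np


def split_pairs(v1v2):
    """Split a [batch, 2 * d] array into two [batch, d] halves.

    Assumes the two embeddings were concatenated along the last axis.
    """
    v1v2 = np.asarray(v1v2, dtype=float)
    d_model = v1v2.shape[1] // 2
    return v1v2[:, :d_model], v1v2[:, d_model:]


def l2_normalize(x):
    """Scale each row to unit length (stand-in for a normalization layer)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


# Fake "model output": two unit-length embeddings per row, concatenated.
rng = np.random.default_rng(0)
a = l2_normalize(rng.normal(size=(3, 4)))
b = l2_normalize(rng.normal(size=(3, 4)))
v1v2 = np.concatenate([a, b], axis=1)  # shape (3, 8)

v1, v2 = split_pairs(v1v2)

# For unit-length rows, the dot product is the cosine similarity.
dot = np.sum(v1 * v2, axis=1)
cosine = dot / (np.linalg.norm(v1, axis=1) * np.linalg.norm(v2, axis=1))
print(np.allclose(dot, cosine))
```

If the embeddings were not normalized, the shortcut would no longer hold and the division by the norms would be required.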

Overall, finishing this notebook was a painful process, especially the part with the confusion matrix and accuracy. I didn’t get the accuracy right.

Being asked to vectorize and also to loop confused me as well. The cell should either have been left blank or have provided a correct code template to complete. I hope this helps other folks.

Remember that some of the code comments should be ignored while others matter, and read the instructions in the other cells too. Don’t forget to split the concatenated output when evaluating with the pre-trained model.