# C3W1 Assignment - perplexity score

Hi,

I have a question regarding the last graded exercise in C3W1 (GRU models).
There, we are supposed to compute the perplexity given a batch of predictions and a batch of targets.

My code was failing on the last unit test, and when I investigated the cause, I found that the failing case had a different preds shape than expected.
To be precise: I'd expect an input tensor of shape (batch_size, seq_length, vocab_size), but the problematic preds is 4-dimensional. Here are the preds shapes for all test examples:
(1, 5, 3)
(1, 5, 3)
(1, 5, 3)
(1, 5, 3)
(1, 8, 5)
(1, 8, 5)
(1, 8, 5)
(1, 7, 3)
(2, 1, 7, 3)

I figured that if I removed the 2nd dimension, the shape would match the expected shape (w.r.t. the targets), but my log-perplexity is slightly off.

Am I doing something wrong, or are the test inputs incorrect?
Thanks

Can you share your outputs along with the expected outputs?
You do not need to share your code.

Regards
DP


This is it:

This is the only test I failed in the assignment; everything else passed.
And for context, I treated the suspect preds with tf.squeeze to remove the extra dimension.
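
Roughly along these lines (an illustrative sketch, not my actual graded code; shapes chosen to match the failing case):

```python
import tensorflow as tf

# The problematic preds tensor carries an extra singleton dimension at axis 1,
# e.g. (2, 1, 7, 3) instead of the expected (batch_size, seq_length, vocab_size).
preds = tf.random.uniform((2, 1, 7, 3))

# Squeeze out the singleton axis so the shape lines up with the targets again.
preds = tf.squeeze(preds, axis=1)
print(preds.shape)  # (2, 7, 3)
```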

Hi @q3nius

You can check your intermediate values by printing them and comparing them with these. That way you can catch where your solution deviates from the intended one.

Cheers

Hello @q3nius

In addition to the link the other mentor has provided:

Also notice the hints given in the assignment instructions regarding preds.shape (a sketch of how they fit together follows the list):

  • To convert the target into the same dimension as the predictions tensor use tf.one_hot with target and preds.shape[-1].
  • You will also need the np.equal function in order to unpad the data and properly compute perplexity.
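
A minimal sketch of how those two hints fit together, assuming preds has shape (batch, seq_len, vocab) and already holds log probabilities, target has shape (batch, seq_len), and the padding id is 1 (the names below are illustrative, not the assignment's exact code):

```python
import numpy as np
import tensorflow as tf

def log_perplexity(preds, target, padding_id=1):
    # One-hot the targets so they line up with the predictions tensor.
    one_hot = tf.one_hot(target, preds.shape[-1])            # (batch, seq_len, vocab)

    # Keep only the log probability of the correct token at each position.
    log_p = tf.reduce_sum(preds * one_hot, axis=-1)          # (batch, seq_len)

    # Mask: 1.0 for real tokens, 0.0 for padding positions.
    non_pad = 1.0 - np.equal(target, padding_id).astype(np.float32)

    # Average the log probabilities over the non-padded positions only,
    # then take the negative mean over the batch.
    log_ppx = tf.reduce_sum(log_p * non_pad, axis=-1) / tf.reduce_sum(non_pad, axis=-1)
    return -tf.reduce_mean(log_ppx)
```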

Also refer to this to choose the correct axis value for the perplexity score.

If the input indices is rank N, the output will have rank N+1. The new axis is created at dimension axis (default: the new axis is appended at the end). So choosing axis=1 is incorrect in this scenario.

https://www.tensorflow.org/api_docs/python/tf/one_hot

This link provides details on the correct value to assign to axis.
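
A quick illustration of that axis behaviour (toy shapes, not from the assignment):

```python
import tensorflow as tf

# Targets of shape (batch, seq_len) with a vocabulary of 5 token ids.
target = tf.constant([[0, 2, 1], [3, 4, 1]])          # shape (2, 3)
vocab_size = 5

# Default axis=-1: the new vocabulary axis is appended at the end,
# giving (batch, seq_len, vocab) - the same layout as preds.
print(tf.one_hot(target, vocab_size).shape)           # (2, 3, 5)

# axis=1 would insert the vocabulary axis in the middle instead,
# giving (batch, vocab, seq_len), which no longer matches preds.
print(tf.one_hot(target, vocab_size, axis=1).shape)   # (2, 5, 3)
```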

Next, make sure you identify the non-padding elements in the target: you should check whether the target equals PADDING_ID, which is 1.
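
Putting those pieces together, the quantity being computed is (as I understand the exercise) the padding-masked log-perplexity, where $p_{b,t}(y_{b,t})$ is the predicted probability of the correct token at position $t$ of example $b$:

$$
\log PP = -\,\frac{1}{B}\sum_{b=1}^{B}
\frac{\sum_{t} m_{b,t}\,\log p_{b,t}(y_{b,t})}{\sum_{t} m_{b,t}},
\qquad
m_{b,t} =
\begin{cases}
0 & \text{if } y_{b,t} = \text{PADDING\_ID} \\
1 & \text{otherwise}
\end{cases}
$$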

Regards
DP

Thank you for answering. My problem is identical to the one in your post (I used the same solution as Cawnpore_Charlie and arrived at the same perplexity value; see the screenshot above).
So, is there a way to pass the last unit test (just out of curiosity; I've already submitted my assignment)? I don't seem to get that from your post.

As I already mentioned in my previous comment, your code is incorrect exactly where I suspected.

You are using an incorrect preds.shape, and you are also not using the one-hot label with tf.one_hot.
The instructions in the exercise clearly mention:
Calculate log probabilities for predictions using one-hot label

Next, in non_pad, kindly use the padding id value 1 instead of using padding_id.

Next, in the sum of probabilities, your axis choices are incorrect.
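
For the axis point specifically, a toy illustration (shapes made up for the example, not the assignment's values):

```python
import tensorflow as tf

# Toy shapes: batch of 2, sequence length 3, vocabulary of 4.
log_preds = tf.math.log(tf.fill((2, 3, 4), 0.25))         # (batch, seq_len, vocab)
one_hot_target = tf.one_hot([[0, 1, 2], [3, 0, 1]], 4)    # (batch, seq_len, vocab)

# The first sum collapses the vocabulary axis (the last one, axis=-1),
# leaving one log probability per token position.
log_p = tf.reduce_sum(log_preds * one_hot_target, axis=-1)
print(log_p.shape)  # (2, 3) - later sums over the masked positions run over this last axis
```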

Go back to the instructions section and go through it point by point; you will find the solution.

Also, please make sure not to edit or add any code outside of the "start code here" and "end code here" markers.

Regards
DP
