Emoji_v3a model() must give perfect accuracy

I can’t find the bug in this part of the code:

# Optimization loop
    for t in range(num_iterations): # Loop over the number of iterations
        for i in range(m):          # Loop over the training examples
            
            ### START CODE HERE ### (≈ 4 lines of code)
            {mentor edit: code removed}

The failing assertion is ‘‘model() must give perfect accuracy’’, and I get pred = [0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 1.] instead of Y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1].
In the training cell below there also seems to be a mismatch between the dimensions of avg and b, even though avg in sentence_to_avg is initialized correctly with np.zeros(word_to_vec_map[any_word].shape).

Help, I’ve been staring at these 4 lines of code for an hour.

Best,
Hannes

“forward propagate the avg” means you multiply by W and add b.
The instructions for Exercise 2 give you the equation.

Also, you should check your code for computing the cost.
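Concretely, in NumPy terms those two steps look roughly like the sketch below. This is only an illustration, not the graded template: I’m assuming softmax is already defined and that W, b, avg, and Y_oh[i] have the shapes the notebook sets up.

    # Forward propagate avg: multiply by W, add b, then apply softmax
    z = np.dot(W, avg) + b    # W is (n_y, n_h), avg is (n_h,), so z is (n_y,)
    a = softmax(z)            # predicted class probabilities, shape (n_y,)

    # Cross-entropy cost for the i-th example, with Y_oh[i] the one-hot label
    cost = -np.sum(Y_oh[i] * np.log(a))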

Oh dear, I didn’t see the “.” in the formula. But the cost should be fine; it’s the vectorized version of the sum ;).

Does your code work correctly now?

Thanks, yes! After adding the dot product, everything worked out!!

A note about using np.average would be useful in this problem as well.
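Something like this hypothetical helper shows the idea; it assumes word_to_vec_map maps lowercase words to NumPy vectors, as in the notebook, and is only a sketch, not the graded solution.

    import numpy as np

    def average_word_vectors(sentence, word_to_vec_map):
        # Split the sentence into lowercase words and collect their embedding vectors
        words = sentence.lower().split()
        vectors = [word_to_vec_map[w] for w in words if w in word_to_vec_map]
        if not vectors:
            # No known words: return a zero vector with the embedding dimension
            any_word = next(iter(word_to_vec_map))
            return np.zeros(word_to_vec_map[any_word].shape)
        # np.average over axis 0 takes the element-wise mean of the stacked vectors
        return np.average(vectors, axis=0)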

I’m completely stuck on this section. Unlike the OP, the shapes of avg, W, a, and z all seem to be correct.

The problem seems to be the calculation of the cost function. I have tried different variations, using np.dot and *, but the result is always the same: “AssertionError: Model must give a perfect accuracy”.

The accuracy of the model decreases as well:
Epoch: 0 — cost = 2.664198098365268
Accuracy: 0.9166666666666666
Epoch: 100 — cost = 96.19998254000362
Accuracy: 0.5

Any suggestions would be appreciated.
Matt

Well, notice that your cost is going up rather than down with more iterations. Maybe the problem is not how you compute the cost, but your gradients and how you are applying them. E.g. are you subtracting the gradient terms (times learning rate of course) or maybe adding them? :scream_cat:

Also note that this thread is more than a year old, so there is no guarantee that any of the participants are still listening. I just happened to notice because I had set “Watching” on this thread back when it first happened.

Hi Paul, thanks for answering.

So the gradients are computed for us; they are outside of the area we are supposed to code. I did look at them, but I don’t see any glaring issues, and I assume someone else would have asked the question by now if that were the problem.

Here are the gradient computation and updating sections we are given:

        # Compute gradients
        dz = a - Y_oh[i]
        dW += -np.dot(dz.reshape(n_y,1), avg.reshape(1, n_h))
        db += dz

        # Update parameters with Stochastic Gradient Descent
        W = W - learning_rate * dW
        b = b - learning_rate * db

Good point. Sorry, I forgot to notice which parts were the template. So it must just be your cost code itself.

Ok, looking again more closely at the code and thinking \epsilon harder, notice that both the cost and the gradients depend on the value of a, which is computed by your code. Nothing depends on the cost. Maybe your logic for computing a is incorrect, but it should be pretty simple. The math formula is:

a = softmax(W \cdot avg + b)

Not much that could go off the rails there. If that suggestion doesn’t pan out, it’s time to just look at your code. We aren’t supposed to do that on a public thread, but I’ll send you a DM about how we can proceed with that.
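For reference, a numerically stable softmax can be sketched in plain NumPy like this. The notebook supplies its own softmax helper, so treat this only as an illustration of what the formula above expects.

    import numpy as np

    def softmax(z):
        # Shift by the max before exponentiating to avoid overflow
        e = np.exp(z - np.max(z))
        return e / np.sum(e)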

Ok, to close the loop on the public thread: we had a follow-up conversation by DM, and it turns out the problem is simple and you can see it in the fragment of the template code that Matt shows above. Notice that an errant minus sign got added to the code that computes dW. Sorry that I didn’t spot that earlier; it took a direct, line-by-line comparison with my code to finally spot it.