Course 1 - Week 2 - Exercise 5 - propagate

I’ve been struggling with this exercise for a long time and can’t figure out what I’m doing wrong. I’ve tried implementing the equations for the cost, dw and db both with and without numpy, but the results are not as expected:

  • dw.shape is (2,3) not (2,1)
  • type(grads["db"]) is numpy.ndarray not np.float64
  • for the assertion test I got this message
    "Wrong shape for grads[‘dw’]. (3, 4) != (3, 1) "

Any guidance on how to overcome this …
Your help is highly appreciated…

The first step is to figure out why your dw values are the wrong shape. My guess is that you are using “elementwise” multiply instead of dot product in the formula for dw. Here’s the math formula that you need to write the code for:

dw = \displaystyle \frac {1}{m} X \cdot (A - Y)^T

The key point is that the operation between X and (A - Y)^T is a dot product style matrix multiply, not an “elementwise” multiply.

Look at the dimensions of the inputs to the first test case for propagate:

w is 2 x 1
X is 2 x 3
Y is 1 x 3

That means A will also be 1 x 3. So look at the dimensions on that dot product:

X is 2 x 3 and (A - Y)^T will be 3 x 1. So the result will be 2 x 1, which is the same as the shape of w. But if you use * instead of a dot product and leave out the transpose, then you will end up with a 2 x 3 output.
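To see the shape difference concretely, here is a small sketch using made-up arrays with the same shapes as the first test case. The names and values (X, R) are illustrative only, with R standing in for A - Y; this is not the assignment code and it leaves out the 1/m factor.

```python
import numpy as np

# Hypothetical arrays that only mimic the shapes from the test case.
X = np.arange(6, dtype=float).reshape(2, 3)   # shape (2, 3)
R = np.array([[0.5, -1.0, 2.0]])              # shape (1, 3), stands in for A - Y

# Elementwise multiply: R is broadcast across both rows of X.
elementwise = X * R

# Dot product with the transpose: (2, 3) dot (3, 1).
matmul = np.dot(X, R.T)

print(elementwise.shape)   # (2, 3)
print(matmul.shape)        # (2, 1)
```

The elementwise result keeps one entry per sample, while the dot product also sums over the samples, which is why it collapses to a single column.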

Here’s a thread about how to figure out when to use elementwise multiply and when to use dot product. Please have a look and see if that helps.

Thank you for this explanation.

I’ve updated my code, but it still doesn’t pass the assertion tests.

  • db is float not float64
    here is my formula for db
    {moderator edit - solution code removed}

  • Also I got this assertion error message for cost
    AssertionError: Wrong values for cost. [3.75577540e-04 5.08619180e-05 4.19465073e-02 2.00008385e+00] != 2.0424567983978403
    my cost formula is :
    {moderator edit - solution code removed}

As @paulinpaloalto guessed correctly, you are using element-wise multiplication (the Python * operator) instead of a dot product. Calculus is not essential for the specialization, but a rudimentary knowledge of linear algebra, matrix operations in particular, is an implicit requirement.

The NumPy function for dot products and matrix multiplication is np.dot. The NumPy documentation is found here. I recommend taking some time to digest all of this before moving on.

For the db problem, I suggest you not use np.subtract and just take np.sum of A - Y. There is no need to specify the axis parameter in that case.
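As a rough illustration of why the axis parameter matters for the type check, the sketch below compares np.sum with and without axis on a made-up 1 x 3 array. The name D and its values are hypothetical, standing in for A - Y; this is not the assignment code.

```python
import numpy as np

# Hypothetical (1, 3) array standing in for A - Y.
D = np.array([[0.2, -0.5, 0.9]])

# No axis: sums over every element and returns a NumPy scalar.
total = np.sum(D)

# With axis and keepdims: the result stays a 2-D array.
per_row = np.sum(D, axis=1, keepdims=True)

print(type(total))                   # <class 'numpy.float64'>
print(type(per_row), per_row.shape)  # <class 'numpy.ndarray'> (1, 1)
```

Both hold the same number, but only the first is an np.float64, which is the type the test is checking for.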

The problem with the cost is that your answer is a vector, not a scalar. There are two ways to get a scalar:

  1. With a dot product, which does the multiply and the sum in one shot, but you’ll need a transpose to get the dimensions to work.
  2. Use *, but in that case, you need to follow that with a sum. You’ve only done the first step.
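The two options above can be sketched on generic row vectors. The arrays u and v are made up for illustration and are not the terms of the cost formula; the point is only how each route gets from two 1 x 3 vectors to a single number.

```python
import numpy as np

# Hypothetical row vectors, each of shape (1, 3).
u = np.array([[1.0, 2.0, 3.0]])
v = np.array([[4.0, 5.0, 6.0]])

# Elementwise multiply alone still leaves a vector of shape (1, 3).
products = u * v

# Option 2: elementwise multiply followed by a sum gives a scalar.
via_sum = np.sum(u * v)

# Option 1: dot product with a transpose multiplies and sums in one step,
# but the result is a (1, 1) array rather than a bare scalar.
via_dot = np.dot(u, v.T)
via_dot_scalar = np.squeeze(via_dot)   # 0-d array holding the same value

print(products.shape)    # (1, 3)
print(via_sum)           # 32.0
print(via_dot.shape)     # (1, 1)
```

Either route gives the same number; the dot product version just needs the extra step to drop the leftover dimensions.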