Week 2 - Logistic Regression Assignment - Exercise 5 - Forward and Backward propagation

I just computed:

  • A using the previously created sigmoid() function, using np.dot to multiply w.T by X and summing X
  • cost, multiplying -1/m by the sum of the multiplication of Y.T by log(A) with the multiplication of (1-Y).T by log(1-A)
  • then I computed dw by multiplying (1/m) by the multiplication of X by (A-Y).T
  • db was computed using the same formula as dw, only replacing m for b

I got this error message which doesn’t help me understand the problem:
[image: screenshot of the error message]

Can someone help me?

db does not require any multiplication by X.

But note that the assert your code triggered has to do with the data type, not the value.
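For reference, here are the standard logistic regression formulas the thread is talking about, written out as a sketch. The assignment notebook is not shown here, so the shapes are an assumption: X is n x m, w is n x 1, b is a scalar, and A and Y are 1 x m.

$$A = \sigma(w^T X + b)$$

$$J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log a^{(i)} + \left(1-y^{(i)}\right)\log\left(1-a^{(i)}\right)\right]$$

$$dw = \frac{1}{m}\,X\,(A - Y)^T, \qquad db = \frac{1}{m}\sum_{i=1}^{m}\left(a^{(i)} - y^{(i)}\right)$$

Note that X appears in dw but not in db, and that db is a sum over the m examples, so it is a single number.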

In general I recommend you use some print(.shape) commands, to inspect the shapes of the results.

Numpy can do some non-obvious things with vectors and the shapes of the results, and none of them will throw a runtime error.
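A small sketch of the kind of silent shape surprise this refers to. The arrays and sizes are made up for illustration and are not taken from the assignment.

```python
import numpy as np

m = 4
A = np.linspace(0.1, 0.9, m).reshape(1, m)   # row vector, shape (1, m)
Y = np.array([[1, 0, 1, 0]])                 # row vector, shape (1, m)

diff_ok = A - Y            # same shapes: elementwise result, shape (1, m)
diff_broadcast = A - Y.T   # (1, m) minus (m, 1) broadcasts to (m, m), no error

flat = np.arange(m)        # 1-D array, shape (m,)
flat_T = flat.T            # transposing a 1-D array changes nothing

print(diff_ok.shape)
print(diff_broadcast.shape)
print(flat.shape, flat_T.shape)
```

Printing the shapes right after each step is the quickest way to catch the second and third cases.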

Note that b is a scalar, as we discussed earlier. So that means that db is also a scalar. So what type is it? Add this statement to find out:

print(type(db))

Also note Tom’s point that X is not involved in the formula for db.
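To see what the type check reveals, here is a minimal sketch with made-up data. The sizes n and m and the random values are assumptions for illustration only; `db_wrong` reuses the dw formula and `db_right` leaves X out.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 5                                   # hypothetical sizes
X = rng.normal(size=(n, m))                   # features, shape (n, m)
A = rng.uniform(0.1, 0.9, size=(1, m))        # activations, shape (1, m)
Y = rng.integers(0, 2, size=(1, m))           # labels, shape (1, m)

db_wrong = (1 / m) * np.dot(X, (A - Y).T)     # dw formula reused: an (n, 1) array
db_right = (1 / m) * np.sum(A - Y)            # no X involved: a single number

print(type(db_wrong), db_wrong.shape)
print(type(db_right), db_right.shape)
```

The first line reports a numpy.ndarray, the second a numpy.float64, which is the kind of value a scalar b calls for.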

I understand that the formula to be applied is “(1/m)*np.sum(A,-Y)”, right?

I tried “print(db.shape)” just below the second line of code referring to db, but it didn’t print anything. The same for “print(type(db))”.

Now the assert has changed, but it seems to me to be the same type problem: “TypeError: only integer scalar arrays can be converted to a scalar index”

I also tried applying [0] to db, but it didn’t change anything. And using astype to “force” the type of db. But still getting the same errors.

What is that “comma” doing in your np.sum call? np.sum takes a vector or matrix as its first argument. There are some optional arguments after it, like axis and keepdims, but you don’t want to supply those here. So putting that comma there is probably what is causing the problem: it makes np.sum treat -Y as a second argument. The second positional argument of np.sum is axis, which is why it complains that the value is not an integer.
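A minimal sketch of the difference, using small made-up arrays rather than the assignment data:

```python
import numpy as np

A = np.array([[0.8, 0.3, 0.6]])   # shape (1, 3)
Y = np.array([[1, 0, 1]])         # shape (1, 3)

try:
    np.sum(A, -Y)                 # -Y lands in the second positional slot: axis
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
    print("np.sum(A, -Y) raised:", exc)

total = np.sum(A - Y)             # one argument: the elementwise difference
print(total)
```

With the minus sign instead of the comma, np.sum receives a single array and returns one number.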

Programming is a game of details: a single wrong character or character in the wrong place can ruin everything. :nerd_face:

Also note that if you add a print statement and nothing happens, that probably means that you didn’t click “Shift-Enter” on the actual function cell to get that new statement “compiled”. So when you ran the test again, you were still running the old code. Just calling the function without “Shift-Enter” on the actual function runs the old code again without the print statement.

You can easily demonstrate to yourself how that works, now that I’ve explained it. Try it and watch what happens:

Add a new print statement and then call the function again. Nothing.

Click on the function cell and then type “Shift-Enter” and then run the test again. Bingo.
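The same effect can be imitated outside a notebook. This is only a rough simulation: the dictionary `kernel` stands in for the notebook's memory, `exec` stands in for “Shift-Enter”, and `propagate_demo` is a hypothetical function name, not the one from the assignment.

```python
# Two versions of the same cell, as plain source text.
cell_v1 = "def propagate_demo():\n    return 'no print yet'\n"
cell_v2 = "def propagate_demo():\n    return 'now with print'\n"

kernel = {}                        # stands in for the notebook kernel's namespace

exec(cell_v1, kernel)              # "Shift-Enter" on the original cell
before_edit = kernel["propagate_demo"]()

# Editing the cell only changes its text; the kernel still holds version 1.
after_edit_not_run = kernel["propagate_demo"]()

exec(cell_v2, kernel)              # "Shift-Enter" on the edited cell
after_rerun = kernel["propagate_demo"]()

print(before_edit, after_edit_not_run, after_rerun, sep=" | ")
```

Only the last call sees the edited function, because that is the first call made after the new definition was actually executed.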

But the other reason is that you don’t get to the print statement because the np.sum call is throwing an exception. The print statement didn’t get executed, because it never got there as a result of the exception.

Got it! Almost there!
I changed the db formula to “(1/m)*np.sum(A-Y)”, and now I get a value.
And, with this precious shift+enter tip, I was able to run the prints. I received: shape = () and type = <class ‘numpy.float64’>

And now the error is that the cost (“(-1/m) * np.sum(np.dot(Y.T, np.log(A)) + np.dot((1-Y).T, np.log(1-A)))”) is not what was expected.
I believe this is another problem, right? So db is ok now?

If you do $Y^T \cdot \log(A)$, that is an m x 1 vector dotted with a 1 x m vector, which gives an m x m result. That is why you needed the np.sum there. If you did the np.dot correctly, the sum would not be necessary. To see why your method is wrong and to understand the correct method, please see this thread.
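Here is a small sketch of the shape argument, with made-up values for A and Y (both assumed to be 1 x m row vectors, as in the discussion above). It only covers the first term of the cost.

```python
import numpy as np

A = np.array([[0.8, 0.3, 0.6]])       # activations, shape (1, m)
Y = np.array([[1, 0, 1]])             # labels, shape (1, m)

outer = np.dot(Y.T, np.log(A))        # (m, 1) dot (1, m) -> (m, m)
inner = np.dot(Y, np.log(A).T)        # (1, m) dot (m, 1) -> (1, 1)
elementwise = np.sum(Y * np.log(A))   # multiply entry by entry, then add up

print(outer.shape, inner.shape)
print(np.sum(outer), inner.item(), elementwise)
```

The (m, m) product pairs every label with every activation, so summing it gives a different number from the other two, which agree with each other.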

That indicates a problem.

Ok, I understood that to get the right shape, I had to transpose the other element of the formula (the logs related to A) and I got all tests passed!

But the shape of db is still (). Why? And what would be the problem here?

If all the tests passed, then you are ok. I think Tom was concerned about whether your db value was still a numpy array instead of a scalar, but it looks like everything is correct now. Onward! :nerd_face:
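On the question of why the shape is (): a short sketch with made-up numbers shows that an empty shape tuple is simply how numpy reports a scalar.

```python
import numpy as np

db = (1 / 3) * np.sum(np.array([[0.2, -0.3, 0.4]]))

print(db.shape)                     # ()
print(np.ndim(db))                  # 0
print(isinstance(db, np.ndarray))   # False
print(isinstance(db, float))        # True
```

A numpy.float64 has zero dimensions, so its shape is the empty tuple, and it can be used anywhere a plain Python float is expected.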

I hope some good lessons were learned on this assignment, because this is just the beginning. Paying attention to what the math formulas say and knowing the workings of the various numpy and python constructs here will be critical in all the upcoming assignments. Things will only get more complex from here :scream_cat:

As you can see, I faced some basic difficulties. I feel like they are more related to concepts than to Python itself, which is relatively easy if you have some background in programming.
What did you identify as my points of difficulty and what would you suggest?

Yes, there were a number of conceptual errors like the hard-coding of the dimension in “initialize” and not understanding why the way you did the transposes would not implement the same thing that the math formula for the cost specified. We always need to start by making sure we understand what the math formula says.

And then there was not understanding the syntax of the arguments to np.sum. You may think you are “saving time” by not reading the documentation, but you probably ended up wasting way more time than you “saved”. We are going to be using lots of numpy functions here and will eventually graduate to TensorFlow which is even more complex. It’s never a good idea to just assume you know what some function does and skip reading the documentation. Well, in some cases, you get lucky and Prof Ng will show us examples or they will give you sample code in the notebooks. But it’s still a good practice to just read the documentation yourself. You may also learn about other useful features that Prof Ng didn’t show you.

I’m curious how much programming you have done and in what sort of “solution space”. E.g. were you implementing webpages or iPhone apps or … And in what languages?

I recommended reading the documentation, but forgot to mention that it’s easy to find the numpy documentation. E.g. if you want to read the docpage for the np.sum function, just google “numpy sum”.

I’m getting into this subject now and I think I haven’t dealt with the subject of matrices and logs since school. But I find it very interesting. If there are other support courses that you think would be useful, you can recommend them! I really want to work in the area.
I thought that understanding the concepts would be more important than programming in Python for this specialization. I didn’t think I should delve into the language. Therefore, I ended up working more “shallowly” on the code. From what you tell me, we’re going to delve even deeper into the language, right? I’m thinking about taking an introduction to Python course to have a better foundation.
I programmed in dBase (decades ago), creating systems for video stores (a business that also no longer exists) and, more recently, in PHP and HTML, on projects with Moodle and WordPress, but always preferring “low-code” plugins. This gave me ease with logic.

Machine learning is a combination of understanding the methods and being able to implement them in software.

You need both skills.
