What is that “comma” doing in your np.sum call? np.sum takes the array to be summed as its first argument. It also accepts optional arguments, like axis and keepdims, but you don’t want to supply those here. So that comma is probably what is causing the problem: it makes np.sum treat -Y as a separate second argument instead of as part of the expression being summed. The second positional argument of np.sum is axis, which is why it complains that the value is not an integer.
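To make that concrete, here is a minimal sketch with small made-up arrays (the values of A and Y are illustrative, not the assignment's data). It compares the intended call with the one containing the stray comma:

```python
import numpy as np

# Made-up activations and labels, both of shape (1, m) with m = 3.
A = np.array([[0.9, 0.2, 0.7]])
Y = np.array([[1.0, 0.0, 1.0]])
m = Y.shape[1]

# Intended: subtract first, then sum every element of the difference.
db = (1 / m) * np.sum(A - Y)
print("db =", db)

# With the stray comma, -Y becomes the second positional argument,
# which np.sum interprets as axis. A float array is not a valid axis.
try:
    np.sum(A, -Y)
    comma_call_failed = False
except (TypeError, ValueError) as err:
    comma_call_failed = True
    print("np.sum(A, -Y) failed:", err)
```

The exact wording of the error message may differ between NumPy versions, but the cause is the same: the second argument is being read as axis.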
Programming is a game of details: a single wrong character or character in the wrong place can ruin everything.
Also note that if you add a print statement and nothing happens, that probably means that you didn’t press “Shift-Enter” on the actual function cell to get that new statement “compiled”. So when you ran the test again, you were still running the old code. Just calling the function without re-running the cell that defines it runs the old code again, without the print statement.
You can easily demonstrate to yourself how that works, now that I’ve explained it. Try it and watch what happens:
Add a new print statement and then call the function again. Nothing.
Click on the function cell and then type “Shift-Enter” and then run the test again. Bingo.
But the other reason you don’t see the print output is that the np.sum call is throwing an exception. Execution stops at the line that raises, so the print statement after it never runs.
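A small sketch of that behaviour, using a hypothetical function that has nothing to do with the assignment. A deliberate division by zero stands in for the failing np.sum call:

```python
log = []

def buggy_step():
    log.append("before")
    value = 1 / 0            # raises ZeroDivisionError right here
    log.append("after")      # never reached
    print("value =", value)  # never reached either

try:
    buggy_step()
except ZeroDivisionError as err:
    print("exception raised:", err)

# Only the line before the failing statement was recorded.
print(log)
```

Moving a print statement above the line that fails is a quick way to confirm which statement is raising.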
Got it! Almost there!
I changed the db formula to “(1/m)*np.sum(A-Y)”, so now I get a value.
And, with this precious shift+enter tip, I was able to run the prints. I received: shape =() and type =<class ‘numpy.float64’>
And now the error is that the cost (“(-1/m) * np.sum(np.dot(Y.T, np.log(A)) + np.dot((1-Y).T, np.log(1-A)))”) is not what was expected.
I believe this is another problem, right? So db is ok now?
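If you do Y^T \cdot \log(A), that is an m x 1 vector dotted with a 1 x m vector, which gives an m x m result. That is why you needed the np.sum there, but summing that matrix also adds in all the cross terms that pair one example’s label with another example’s activation, so the value comes out wrong. If you did the np.dot correctly, you would get a 1 x 1 result and the sum would not be necessary. To see why your method is wrong and to understand the correct method, please see this thread.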
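The shape argument can be checked directly. This is only an illustrative sketch with made-up values for A and Y (both assumed to have shape (1, m), here m = 3); it shows the shapes of the two products and that they do not sum to the same number:

```python
import numpy as np

# Made-up labels and activations, shape (1, m) with m = 3.
Y = np.array([[1.0, 0.0, 1.0]])
A = np.array([[0.9, 0.2, 0.7]])

# (m x 1) dot (1 x m): an m x m matrix of every y_i * log(a_j) pairing.
outer = np.dot(Y.T, np.log(A))

# (1 x m) dot (m x 1): a 1 x 1 result holding only the matching pairs.
inner = np.dot(Y, np.log(A).T)

print(outer.shape, inner.shape)
print(np.sum(outer), inner.item())
```

Summing the m x m matrix gives (sum of Y) times (sum of log A), which is generally not the same as the sum of the element-by-element products that the cost formula calls for.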
If all the tests passed, then you are ok. I think Tom was concerned about whether your db value was still a numpy array instead of a scalar, but it looks like everything is correct now. Onward!
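For reference, here is a quick way to see the difference between a scalar result and an array result, again with made-up A and Y. The keepdims variant is included only as a contrast, not as something the assignment asks for:

```python
import numpy as np

A = np.array([[0.9, 0.2, 0.7]])
Y = np.array([[1.0, 0.0, 1.0]])
m = Y.shape[1]

# Summing over all elements gives a NumPy scalar with shape ().
db_scalar = (1 / m) * np.sum(A - Y)

# Keeping the dimensions gives a 2-D array of shape (1, 1) instead.
db_array = (1 / m) * np.sum(A - Y, axis=1, keepdims=True)

print(np.shape(db_scalar), type(db_scalar))
print(db_array.shape, type(db_array))
```

A shape of () together with type numpy.float64, as reported earlier in the thread, is what the scalar case looks like.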
I hope some good lessons were learned on this assignment, because this is just the beginning. Paying attention to what the math formulas say and knowing the workings of the various numpy and python constructs here will be critical in all the upcoming assignments. Things will only get more complex from here …
As you could see, I faced some basic difficulties. I feel like they are more related to concepts than Python itself, which is relatively easy, if you have some background in programming.
What did you identify as my points of difficulty and what would you suggest?
Yes, there were a number of conceptual errors like the hard-coding of the dimension in “initialize” and not understanding why the way you did the transposes would not implement the same thing that the math formula for the cost specified. We always need to start by making sure we understand what the math formula says.
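On the hard-coding point, the general idea can be sketched like this. The function name, signature, and return values below are hypothetical and are not the notebook's actual code; the point is only that the size comes in as a parameter instead of being typed in as a literal:

```python
import numpy as np

def initialize(dim):
    """Hypothetical helper: zero weights of shape (dim, 1) and a scalar bias."""
    w = np.zeros((dim, 1))
    b = 0.0
    return w, b

# Made-up data: 4 features, 5 examples. The size is read from the data,
# so the same call works unchanged for inputs of any other size.
X = np.ones((4, 5))
w, b = initialize(X.shape[0])
print(w.shape, b)
```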
And then there was not understanding the syntax of the arguments to np.sum. You may think you are “saving time” by not reading the documentation, but you probably ended up wasting way more time than you “saved”. We are going to be using lots of numpy functions here and will eventually graduate to TensorFlow which is even more complex. It’s never a good idea to just assume you know what some function does and skip reading the documentation. Well, in some cases, you get lucky and Prof Ng will show us examples or they will give you sample code in the notebooks. But it’s still a good practice to just read the documentation yourself. You may also learn about other useful features that Prof Ng didn’t show you.
I’m curious how much programming you have done and in what sort of “solution space”. E.g. were you implementing webpages or iPhone apps or … And in what languages?
I’m getting into this subject now and I think I haven’t dealt with the subject of matrices and logs since school. But I find it very interesting. If there are other support courses that you think would be useful, you can recommend them! I really want to work in the area.
I thought that understanding the concepts would be more important than programming in Python for this specialization. I didn’t think I should delve into the language. Therefore, I ended up working more “shallowly” on the code. From what you tell me, we’re going to delve even deeper into the language, right? I’m thinking about taking an introduction to Python course to have a better foundation.
I programmed in dBase (decades ago), creating systems for video stores (a business that also no longer exists) and, more recently, PHP and HTML, in projects with Moodle and WordPress. But always preferring “low-code” plugins. This gave me ease with logic.