Hi everyone. I’m in the first course of the Neural Networks and Deep Learning Specialization. In the second week’s programming assignment titled “Logistic_Regression_with_a_Neural_Network_mindset” I’m running into some errors in my code that I can’t solve…
In Exercise 4.2 - Initializing parameters I get an assertion error when I run the following code found in the first screenshot:
After getting this error I moved on because my parameters w and b were still being initialized. However, I get stuck again in the next exercise. In Exercise 4.3 - Forward and Backward propagation I’m having trouble coding the cost function.
Any help would be much appreciated, even if it’s just a pointer to where else I can find some instruction. Thanks and good luck to you all!
I don’t think that is what you want for b, and it seems to be the root of your troubles… You need a scalar there, not an array, even if it only has a single element.
Maybe take a look at Python’s built-in float(): “Return a floating point number constructed from a number or string x.”
You’re right, it does need to be scalar. I’m pretty sure np.zeros((1)) is a scalar array. Is this incorrect?
To satisfy what I think is this course’s way of grading, I cast my variable “b” to a float, satisfying the code in the second code block: assert type(b) == float
It worked! However for the next step, I got a type error. It doesn’t like that the variable ‘b’ is a float. I think it wants a float64. I looked around but I couldn’t figure it out. How can I convert a float to a numpy float64?
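On the literal question of converting a Python float to a NumPy float64, here is a minimal sketch. It assumes only that NumPy is imported as np; the name b64 is made up for illustration. Whether the assignment actually wants a float64 is a separate matter, since the notebook's assert compares the type against float exactly.

```python
import numpy as np

b = 0.0                  # plain Python float
b64 = np.float64(b)      # NumPy array scalar holding the same value

print(type(b))           # <class 'float'>
print(type(b64))         # <class 'numpy.float64'>

# np.float64 inherits from Python float, so isinstance() accepts it...
print(isinstance(b64, float))   # True
# ...but an exact type comparison such as `type(b) == float` does not.
print(type(b64) == float)       # False
```

So a float64 would fail an exact `type(b) == float` check even though it behaves like a float in arithmetic.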
Do you think this is the right way to solve it or would you recommend something else?
I also notice that you are moving forward, without passing the previous tests. This is not recommended. You have to pass all the previous exercises before moving to the next one as it calls the previous code.
Yes I’m aware, you’re right I should have tried to solve the initial problem. My thought process was skewed.
Why is b=0 the right implementation? In the lecture videos, the code Andrew presented was b = np.zeros((dimensions)). In the future, for example, if I needed to implement the parameter b as a 3x1 column vector, would I use np.zeros((3, 1))? If this isn’t the case, how would I do it? What differentiates this example from the one in the programming assignment? I guess I’m confused why np.zeros(()) isn’t used.
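A rough sketch of the difference, relying on NumPy broadcasting. The sizes n and m and the 3-unit layer (W3, b3) are hypothetical and not taken from the assignment. With a single output unit one bias value is enough, and a plain scalar is broadcast across every example; a layer with several units would need one bias per unit, which is where a column vector like np.zeros((3, 1)) would come in.

```python
import numpy as np

n, m = 4, 5                  # hypothetical sizes: 4 features, 5 examples
X = np.ones((n, m))

# Logistic regression: one output unit, so a single scalar bias suffices.
w = np.zeros((n, 1))
b = 0.0
Z = np.dot(w.T, X) + b       # the scalar b is broadcast over all m columns
print(Z.shape)               # (1, 5)

# A hypothetical layer with 3 units: one bias per unit, stored as a column.
W3 = np.zeros((3, n))
b3 = np.zeros((3, 1))        # 3x1 column vector
Z3 = np.dot(W3, X) + b3      # each row gets its own bias, broadcast over columns
print(Z3.shape)              # (3, 5)
```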
As @saifkhanengr replies above, this is not correct. Firstly, numpy.zeros() returns an array, not what you call a scalar array. An array with a single element is still an array.
Secondly, from the type system perspective, there is a special type for numpy that refers to an extracted element of an array, not the array itself, again even if the array has only a single element. In your case, when initializing with np.zeros(), b and b[0] are different types.
And notice that the extracted element is an array scalar, not scalar array, which is an oxymoron. A numpy float array scalar corresponds to Python float, but isn’t exactly the same. Here is what the linked numpy doc says: “Some of the scalar types are essentially equivalent to fundamental Python types”
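The distinction between the array, the array scalar, and the Python float can be checked directly. This is a small sketch assuming NumPy is imported as np and b is created the way the original post describes.

```python
import numpy as np

b = np.zeros((1))            # same as np.zeros(1): a 1-D array with one element

print(type(b))               # <class 'numpy.ndarray'>
print(b.shape)               # (1,)
print(type(b[0]))            # <class 'numpy.float64'>  (an array scalar)
print(type(float(b[0])))     # <class 'float'>          (a plain Python float)

# The array itself is not a float, so an exact type check against float fails.
print(type(b) == float)      # False
```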
I hope you now understand where you were going wrong. One more request: please remove the code, as sharing it here is against the code of conduct.
Great that’s helpful, thank you. Why should “b” be a scalar python float for this assignment? Is that what’s creating this error?
When A=sigmoid(np.dot(w.T, X)+b) I get the following error. Is this because dZ is being computed from X, w.T, and b, where X and w.T are arrays and b is a scalar float? Would converting “b” to an array be the correct way of solving this issue?
I think there are several issues going on in your propagate()
The one that is causing the TypeError is that you implemented the equation in the notes too literally. There is no implied mathematical operation in Python; you have to explicitly provide one. In your current line 49, Python thinks (\frac{1}{m})(X * dZ) is a function call on \frac{1}{m} because there is no math operator between the two sets of parens ()().
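The same failure can be reproduced in isolation. In this minimal sketch the names m, v, message, and good are made up for illustration and are not the assignment's variables; the point is only that two adjacent sets of parentheses are parsed as a call.

```python
import numpy as np

m = 2
v = np.array([1.0, 3.0])

try:
    bad = (1 / m)(v)             # no operator: Python tries to call the float (1 / m)
except TypeError as err:
    message = str(err)
    print(message)               # 'float' object is not callable

good = (1 / m) * np.sum(v)       # explicit multiplication operator
print(good)
```

Note that the error has nothing to do with the type of b; it comes purely from the missing operator.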
You might also want to review the lecture videos for guidance on appropriate choice of multiplication, dot product, and transpose in this context.
As far as I know, this level of screen capture is acceptable. It’s the implementation of the graded functions you should avoid sharing. The parts that leak through in the error traceback are allowed.
Oh, oops, I’ve been using element-wise multiplication rather than matrix multiplication, and I forgot a few np.sum()'s. Geez, it seems so obvious now that it works. Thanks for your help!
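For anyone landing here with the same mix-up, a small sketch of how the two operations differ in shape. The arrays A and r are arbitrary examples and deliberately not the assignment's variables.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])      # shape (2, 3)
r = np.array([[1.0, 0.0, -1.0]])     # shape (1, 3)

elementwise = A * r                  # r is broadcast over the rows of A
matrix = np.dot(A, r.T)              # products are summed over the shared axis

print(elementwise.shape)             # (2, 3)
print(matrix.shape)                  # (2, 1)

# The dot product equals the element-wise product followed by a sum over columns.
print(np.allclose(matrix, np.sum(elementwise, axis=1, keepdims=True)))  # True
```

Printing the shapes of intermediate results like this is usually the quickest way to spot which operator was intended.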