I am unable to fix this issue. Any suggestions?
I tried using np.subtract and np.multiply, but that didn’t solve it either.
W1 and dW1 should be numpy arrays. learning_rate should be a float. Does this hint help?
I’m sorry, but not really. Aren’t they already np arrays? As I said, I have tried np.multiply, but it didn’t work.
Please click my name and message me your notebook as an attachment.
The point is your W1 value is not correct. You must have set it equal to the parameters dictionary instead of extracting the relevant entry from the dictionary.
The error message is pretty clear, right? Print the type of your W1:
print(type(W1))
Does it say numpy array or python dictionary? Ok, why?
Note that the type of your dW1 is also probably incorrect. If multiplying it by a float gives a float, that must mean it is not a numpy array either. You need to extract it from the grads dictionary, right? There’s a consistent pattern here.
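To make the pattern concrete, here is a minimal sketch of the difference between binding W1 to the whole dictionary and extracting the entry from it. The dictionary contents, shapes, and the learning rate are made-up stand-ins, not the assignment's actual values.

```python
import numpy as np

# Hypothetical stand-ins for the notebook's dictionaries (shapes are assumptions).
parameters = {"W1": np.ones((2, 3)), "b1": np.zeros((2, 1))}
grads = {"dW1": np.full((2, 3), 0.5), "db1": np.zeros((2, 1))}
learning_rate = 0.1

# Wrong: W1 is bound to the whole dictionary, so the subtraction raises TypeError.
W1_wrong = parameters
try:
    W1_wrong - learning_rate * grads["dW1"]
    error = None
except TypeError as exc:
    error = exc
print(type(W1_wrong), "->", type(error).__name__)

# Right: extract the relevant entries, then apply the update rule.
W1 = parameters["W1"]
dW1 = grads["dW1"]
print(type(W1), type(dW1))

W1_updated = W1 - learning_rate * dW1
print(W1_updated.shape)
```

Printing the types right before the failing line is usually the quickest way to see which operand is not what you expected.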
If dW1 was a float, you can still subtract a float from W1 via vectorization.
Sure, but that is not what should be happening, right? dW1 should be a numpy array of the same shape as W1, right?
The error message is pretty clear: the problem is that the first operand to the subtract is a dictionary, which is wrong.
But my point is that the second operand is also the wrong type.
You’re right. I was just pointing out that if dW1 was a float, you would fail the test due to a value mismatch, but the function would still run to completion. I was reading into “If multiplying it by a float gives a float, that must mean it is not a numpy array either.”
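As a small illustration of that point, the sketch below compares an update with an array-valued dW1 against one with a scalar dW1. All values are invented for the example. Both versions run without error and produce the same shape, but the numbers differ, which is why a test would fail on values rather than crash.

```python
import numpy as np

# Made-up values for illustration only.
W1 = np.array([[1.0, 2.0], [3.0, 4.0]])
learning_rate = 0.1

dW1_array = np.array([[1.0, 0.0], [0.0, 1.0]])  # correct: same shape as W1
dW1_scalar = 1.0                                # wrong type, but still broadcasts

update_array = W1 - learning_rate * dW1_array
update_scalar = W1 - learning_rate * dW1_scalar  # scalar is applied to every element

same_shape = update_array.shape == update_scalar.shape
same_values = np.allclose(update_array, update_scalar)
print(same_shape)   # True
print(same_values)  # False
```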
Thank you for your assistance, sir. I made a mistake (I no longer remember exactly what it was) while making a copy of W1 using W1 = copy.deepcopy(parameters["W1"]). After I corrected my copy.deepcopy call, everything suddenly started working fine. My only question is about the significance of making “deep” copies.
If you are copying the parameters individually as you are doing, the deepcopy is not necessary: a simple copy will work. But if you want to copy the whole dictionary in one shot (it’s less code), then you need the deepcopy. This is explained on this thread. Please read all the way through. The deepcopy is explained later in the thread.
The summary is that “deep” copy is necessary when the object you are copying itself contains object references. The “deep” copy goes through depthwise and makes new copies of everything. If you do “simple” copy, it just copies the top level object. In your example above, W1 is just a numpy array, so it’s a python object, but it doesn’t reference any other objects. That’s why a simple copy is all that is required in that case.
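Here is a rough sketch of the difference, using a made-up one-entry parameters dictionary rather than the one from the notebook. It shows that a simple copy of the dictionary still shares the underlying array, a deep copy does not, and a simple copy of a single array is already independent.

```python
import copy
import numpy as np

# Hypothetical dictionary; the real one has more entries.
parameters = {"W1": np.ones((2, 2))}

shallow = copy.copy(parameters)      # new dict, same array objects inside
deep = copy.deepcopy(parameters)     # new dict and new arrays

shallow_shares = shallow["W1"] is parameters["W1"]
deep_shares = deep["W1"] is parameters["W1"]
print(shallow_shares, deep_shares)

# An in-place update through the shallow copy also changes the original.
shallow["W1"] -= 0.5
original_changed = np.allclose(parameters["W1"], 0.5)
deep_unchanged = np.allclose(deep["W1"], 1.0)
print(original_changed, deep_unchanged)

# Copying one array at a time: a simple copy is enough, because the
# array holds numbers, not references to other objects.
W1 = copy.copy(parameters["W1"])
W1 -= 0.25
single_copy_independent = np.allclose(parameters["W1"], 0.5)
print(single_copy_independent)
```

This is why in-place updates on a shallow copy of the whole dictionary can silently modify the caller's parameters.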