Hi, I’ve been trying to debug the following error for the past 1.5 hours, and it looks like something in the backend is causing an incorrect value of w to be returned.
The error reads:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in
----> 1 model_test(model)
~/work/release/W2A2/public_tests.py in model_test(target)
109 y_test = np.array([1, 0, 1])
110
--> 111 d = target(X, Y, x_test, y_test, num_iterations=50, learning_rate=1e-4)
112
113 assert type(d['costs']) == list, f"Wrong type for d['costs']. {type(d['costs'])} != list"
<ipython-input-36-b9a9ca57a444> in model(X_train, Y_train, X_test, Y_test, num_iterations, learning_rate, print_cost)
42 b = params['b']
43
---> 44 Y_prediction_train = predict(w,b,X_train)
45 Y_prediction_test = predict(w,b,X_test)
46
<ipython-input-16-b1ae5c93c959> in predict(w, b, X)
16 m = X.shape[1]
17 Y_prediction = np.zeros((1, m))
---> 18 w = w.reshape(X.shape[0], 1) # I tested this multiple times and w is returned correctly
19
20 # Compute vector "A" predicting the probabilities of a cat being present in the picture
ValueError: cannot reshape array of size 2 into shape (4,1)
Using %debug, I was able to peek deeper into what was happening:
ipdb> params
{'w': array([[-0.08608643],
[ 0.10971233]]), 'b': -0.1442742664803268}
ipdb> X.shape
(4, 3)
ipdb> print(initialize_with_zeros(X.shape[0]))
(array([[0.],
[0.],
[0.],
[0.]]), 0.0)
ipdb> grads
{'dw': array([[0.12311093],
[0.13629247]]), 'db': -0.14923915884638042}
The expectation is that X_train has shape (4, 3), so w should be an array of shape (4, 1). initialize_with_zeros() creates this correctly when I call it in the debugger, as you can see above. However, whenever my model() runs, w does not come out at the right size: it is stuck at 2, as if initialize_with_zeros() had returned the wrong size there. The gradients in grads are computed with that same size of 2, so the wrong number of parameters is being learned. This is why Python complains that it cannot reshape a (2, 1) array into the (4, 1) array that predict() requires.
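To make the mismatch concrete, here is a minimal sketch that reproduces the same ValueError outside the notebook. It assumes only NumPy. The helper reshape_weights() and the arrays w_stale and w_fresh are made up for illustration and are not part of the assignment code; they just mimic the reshape on line 18 of predict().

```python
import numpy as np


def reshape_weights(w, X):
    """Hypothetical helper mimicking the failing line in predict()."""
    return w.reshape(X.shape[0], 1)


# Same shape as X in the debugger: 4 features, 3 examples.
X = np.zeros((4, 3))

# A w with 2 entries, like the one I see in params.
w_stale = np.zeros((2, 1))

# A w sized from X itself, like initialize_with_zeros(X.shape[0]) returns.
w_fresh = np.zeros((X.shape[0], 1))

try:
    reshape_weights(w_stale, X)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
print(reshape_weights(w_fresh, X).shape)
```

So the reshape itself behaves as expected. The only open question is where the 2-entry w comes from before it reaches predict().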
I have looked through my code many times and cannot find a problem in it. Everything ran fine until exercise 8. I noticed two unexplained lines in optimize():
#w = copy.deepcopy(w)
#b = copy.deepcopy(b)
I commented them out since they reassign w and b, but that did not solve the problem.
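My best guess at what those two lines are for is sketched below. This is an assumption on my part: I don't know how the assignment's optimize() actually updates w. The functions step_in_place() and step_with_copy() are hypothetical stand-ins that I wrote for this post. If the guess is right, the deep copy only protects the caller's array from in-place updates and never changes its shape, which would explain why commenting the lines out made no difference to the size of w.

```python
import copy

import numpy as np


def step_in_place(w, dw, learning_rate):
    """Hypothetical update without the copy: mutates the caller's array."""
    w -= learning_rate * dw
    return w


def step_with_copy(w, dw, learning_rate):
    """Hypothetical update with the copy: the caller's array is untouched."""
    w = copy.deepcopy(w)
    w -= learning_rate * dw
    return w


dw = np.ones((4, 1))

w_a = np.ones((4, 1))
out_a = step_in_place(w_a, dw, 0.5)

w_b = np.ones((4, 1))
out_b = step_with_copy(w_b, dw, 0.5)

print(w_a.ravel())   # caller's array after the in-place step
print(w_b.ravel())   # caller's array after the copying step
print(out_a.shape, out_b.shape)
```

In both cases the returned array has the same shape as the w that was passed in.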
I suspect there’s something going on in the backend that keeps w stuck at 2. Or maybe I’m wrong, but either way can you look into this and see what’s going on? Thanks.