W2_A2_Ex-8 Value error due to shape of the matrix

Hi All,

I have passed all the tests, but there is an error when putting it all together (the model function). I am getting a ValueError where the shapes of the matrices are the cause of the error, but I am unable to figure out which variable is the source of the problem. It would be helpful if anyone could direct me toward it.

Hello @vishu_bandari,

It’s important to analyze the error traceback because it can give us a lot of information when debugging code. It traces from the function we call in the Jupyter notebook down to the last function call that actually triggers the error, and in your case, we need to focus here:

The error happened in the propagate function, which is Exercise 5. The arrow tells us which line triggered the error, and combining that with the error message about misaligned shapes, we can deduce that the problem is at np.dot.

A dot product multiplies two matrices together, and the rule for a valid dot product is, as stated in the error message, that the first matrix’s last dimension must equal the second matrix’s first dimension. Your Y had a shape of (1, 7), whereas your np.log(A) had a shape of (1, 4), which violates the rule.
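To see the rule concretely, here is a small numpy sketch using the shapes from the error message (the data itself is made up for illustration):

```python
import numpy as np

# Shapes taken from the error message: Y is (1, 7), np.log(A) is (1, 4).
Y = np.zeros((1, 7))
log_A = np.zeros((1, 4))

try:
    np.dot(Y, log_A)  # inner dimensions 7 and 1 do not match
except ValueError as e:
    print("ValueError:", e)

# When A has the correct shape (1, 7), the cost term dots Y with a transpose:
A = np.full((1, 7), 0.5)
cost_term = np.dot(Y, np.log(A).T)  # (1, 7) @ (7, 1) -> (1, 1)
print(cost_term.shape)
```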

Now the question is: even though you passed many tests, probably including the ones for propagate, was your implementation of it actually wrong?

If you also read the docstring for propagate:

    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b
    
    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """

Y is supposed to have a shape of (1, number of examples), which is consistent with the shape shown in the error message. A isn’t listed in the docstring, but since we know how A is calculated, we can judge whether the shape revealed in the error for A is reasonable.

With the expected shapes of Y and A in mind, and the rule for a valid dot product violated, I believe that is where you need to make some corrections.

Above is how I would analyze your bug report, and I hope this can be an example for your future debugging work!

Cheers,
Raymond

The worrying thing there is that it looks like you are transposing the input X matrix for some reason. If you look carefully at all the formulas that we are using, there should be no reason to do that. We need to transpose w in the propagate function, but there is nowhere that we transpose X, right?

If you look at the dimensions of the test case here, the X value is 4 x 7, right? That means there are 4 “features” in each input sample and there are 7 total input samples in the training set. So w should be 4 x 1, right?
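As a quick sanity check on those shapes, here is a sketch with random stand-in data (the values are illustrative, only the shapes matter):

```python
import numpy as np

n_x, m = 4, 7                 # 4 features, 7 training samples, as in the test case
X = np.random.randn(n_x, m)   # (4, 7) -- no transpose needed
w = np.zeros((n_x, 1))        # (4, 1): one weight per feature
b = 0.0

Z = np.dot(w.T, X) + b        # (1, 4) @ (4, 7) -> (1, 7), the same shape as Y
print(Z.shape)
```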


Hi @paulinpaloalto
Thanks for the reply

I’m transposing X to flatten the X input matrix into a 1-dim array; for that I’m using this:
X_tr = X_train.reshape(X_train.shape[0], -1).T

even if I change this to
X_tr = X_train.reshape(X_train.shape[0], -1)

I’m getting the same error

Should dim(w) = (1, 7) and dim(Y) = (4, 7), or am I missing something?

I have tried changing Y to Y.T in the propagate function, but that’s leading me to another error

You don’t need to “flatten” the X that is input to this test case. The only time we need to do the flattening is to convert the original 4D images to a 2D matrix.

So the dimensions of X are n_x (the number of “features”) x m (the number of samples). In this particular example that is 4 x 7. So there is no further conversion you need to do on X in this case.
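For contrast, here is a sketch of the only place flattening is needed, using a made-up 4D image array (the sizes are illustrative, not the assignment’s actual dataset):

```python
import numpy as np

m, num_px = 5, 8                                # 5 images of 8 x 8 pixels, RGB
images = np.random.randn(m, num_px, num_px, 3)  # 4D, as loaded from the dataset

# Flatten each image into one column: shape becomes (num_px * num_px * 3, m).
images_flat = images.reshape(images.shape[0], -1).T
print(images_flat.shape)                        # (192, 5)

# A test-case X that is already 2D, e.g. (4, 7), needs no reshape at all.
```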

Also note that the problem is that your w is the wrong shape. You have initialized it to have the number of elements that matches the number of input samples. That is incorrect: it should be the number of “features”, right? So how do you need to change the code in order to fix that?
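In other words, w should be built from the feature dimension X.shape[0], not the sample dimension. The function name below follows the assignment, but the body is just a plausible sketch:

```python
import numpy as np

def initialize_with_zeros(dim):
    # One weight per feature, plus a scalar bias.
    w = np.zeros((dim, 1))
    b = 0.0
    return w, b

X = np.random.randn(4, 7)                # 4 features, 7 samples
w, b = initialize_with_zeros(X.shape[0])
print(w.shape)                           # (4, 1), not (7, 1)
```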

@paulinpaloalto
I have changed the w shape to (4x1) but still getting the same error

It looks like you must have transposed X. The actual input is 4 x 7, so how did it end up being 7 x 4? I thought we already agreed that you don’t need to transpose X here.


Thanks for the reply; this solved the problem.

I am currently getting an incorrect value of dw.

That assertion is failing for the value of w, not dw. The usual cause of that type of error is not calling the optimize function correctly. Make sure you are not “hard-coding” any of the parameters, like the learning rate, and that you pass the values of all of them. There should be no “equal signs” in the parameter list you pass to optimize, because that would mean you are overriding the actual values being requested by the test case.
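Here is a minimal sketch of the difference; the signatures and bodies are illustrative, not the assignment’s exact code (this toy optimize just echoes its settings back so the forwarding is visible):

```python
def optimize(w, b, X, Y, num_iterations=100, learning_rate=0.009):
    # ... gradient descent would run here; we just echo the settings back ...
    return {"w": w, "b": b,
            "num_iterations": num_iterations, "learning_rate": learning_rate}

def model(X_train, Y_train, num_iterations=2000, learning_rate=0.5):
    w, b = 0.0, 0.0
    # Wrong: optimize(w, b, X_train, Y_train, num_iterations=100, learning_rate=0.009)
    # hard-codes values and ignores whatever the test case passed into model.
    # Right: forward the values that model itself received.
    params = optimize(w, b, X_train, Y_train, num_iterations, learning_rate)
    return params

print(model([], [], num_iterations=500, learning_rate=0.1))
```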

How do I get a fresh copy of the notebook? I have made too many changes and just want to run a clean notebook.

There is a topic about that on the DLS FAQ Thread.

I have run a completely new file and am still getting the same error.

Ok, so that means there’s a bug in your code. Now you need to find it. Did you check the point I made before about how you call the optimize function from model? Did you specify the learning rate and number of iterations without any equal signs? In other words, did you pass the actual values from the test case that were passed in at the top level call to model?

I am first initializing w and b from the function initialize_with_zeros with the parameter X_train.shape[0],
then using the optimize function with w, b, X_train, Y_train and getting "params",
where I am assigning w = params['w'] and the same for b.
Finally, I call the predict function with w, b, and X.

Apart from this, the rest of the code is working fine, as all the other tests pass.

I am not specifying the learning rate and number of iterations in model, as I am simply running the previously written function, and I have not passed any values to it.

Ok, that’s wrong. What that means is that you will be using the default values for learning rate and number of iterations that were declared in the definition of the optimize function. That gives the same answer every time, regardless of what values are actually passed into the model function, right? So what if the test case passes different values? Those values matter, right? A different learning rate will give you a different solution. So how are you going to handle that?

It sounds like maybe you are new to the concept of “named” optional parameters in python. It might be worth googling that and reading some tutorials about the meaning of optional parameters.
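For reference, a tiny example of how default (“named optional”) parameters behave in Python:

```python
def greet(name, greeting="Hello"):
    # greeting is an optional parameter with a default value.
    return f"{greeting}, {name}!"

print(greet("Ada"))                 # default used      -> "Hello, Ada!"
print(greet("Ada", greeting="Hi"))  # named override    -> "Hi, Ada!"
print(greet("Ada", "Hey"))          # positional override -> "Hey, Ada!"
```

If a caller omits the argument, the default declared in the def line is silently used, which is exactly why hard-wiring defaults inside model hides the values the test case passes in.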
