Need Help Resolving IndexError in Neural Network Backpropagation Code

Week #4 deep-learning-specialization:dls-course-1

Hello everyone,

I’m currently working on a deep learning assignment and have hit a roadblock that I’m struggling to overcome. I’m trying to implement the backpropagation algorithm for a deep neural network, but I keep encountering an IndexError when running my code.

The error message states: IndexError: boolean index did not match indexed array along dimension 0; dimension is 3 but corresponding boolean dimension is 1

I’ve attached screenshots of my Jupyter notebook where the error occurs. The issue arises when I attempt to run the L_model_backward function, which is supposed to compute the gradients for backpropagation.

I’ve reviewed the dimensions of the arrays and the indexing, but the error persists, and I’m not sure how to proceed. I would be very grateful for any guidance on how to debug this issue. I suspect the problem might be with how I’m indexing into the arrays or maybe a misunderstanding of how the backpropagation should be implemented.

If anyone has experienced a similar issue or has insights into how I might resolve this, your input would be highly appreciated!

Thank you in advance for your assistance!

Best,
Kim

{moderator edit - solution code removed}

Hi @Watcharakorn_Osathan,

Backward propagation starts from the last layer, and the activation function for that layer is sigmoid. It looks like you are calling ReLU there instead. Please check.
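To see why the choice of activation matters in the backward pass, here is a small standalone sketch. It is not the assignment code: the helper names `sigmoid_grad` and `relu_grad` and the sample values are made up for illustration. It only shows that the same incoming gradient produces a different dZ depending on which activation derivative is applied.

```python
import numpy as np

# Hypothetical helpers for illustration only (not the assignment's functions).
def sigmoid_grad(Z):
    """Derivative of sigmoid with respect to Z."""
    s = 1.0 / (1.0 + np.exp(-Z))
    return s * (1.0 - s)

def relu_grad(Z):
    """Derivative of ReLU with respect to Z (taken as 0 at Z == 0)."""
    return (Z > 0).astype(float)

# Made-up pre-activations for an output layer with 1 unit and 3 examples.
Z_last = np.array([[-2.0, 0.0, 3.0]])
dA_last = np.ones_like(Z_last)  # placeholder incoming gradient

dZ_sigmoid = dA_last * sigmoid_grad(Z_last)  # what a sigmoid output layer needs
dZ_relu = dA_last * relu_grad(Z_last)        # what you get if ReLU is used by mistake

print(dZ_sigmoid)
print(dZ_relu)
```

The two results differ, so every gradient computed further back from the ReLU version would also be wrong, even when the shapes happen to match.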


In addition to the important point that @kic made, also notice how each layer of back propagation works: at each layer l, the input is dA^{[l]} and the outputs are dW^{[l]}, db^{[l]} and dA^{[l-1]}. That is not what your code does in the case of the hidden layers. That is probably what is causing your issues with wrong shapes.
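The input/output pattern described above can be sketched for a single ReLU layer as follows. This is a minimal illustration under assumptions, not the assignment's solution: the function name `layer_backward_sketch`, the layer sizes and the random values are all hypothetical. The second half reproduces the kind of IndexError in the question, assuming the ReLU backward step zeroes entries with a boolean mask built from the cached Z.

```python
import numpy as np

def layer_backward_sketch(dA, Z, A_prev, W):
    """Hypothetical ReLU layer l: input dA^[l], outputs dA^[l-1], dW^[l], db^[l]."""
    # dA and the cached Z must belong to the same layer, so shapes must agree.
    assert dA.shape == Z.shape, f"dA {dA.shape} does not match Z {Z.shape}"
    m = A_prev.shape[1]                      # number of examples
    dZ = dA * (Z > 0)                        # ReLU derivative
    dW = dZ @ A_prev.T / m                   # same shape as W
    db = dZ.sum(axis=1, keepdims=True) / m   # same shape as b
    dA_prev = W.T @ dZ                       # same shape as A_prev
    return dA_prev, dW, db

rng = np.random.default_rng(0)
# Made-up layer: 4 units in layer l-1, 3 units in layer l, 2 examples.
A_prev = rng.standard_normal((4, 2))
W = rng.standard_normal((3, 4))
Z = rng.standard_normal((3, 2))
dA = rng.standard_normal((3, 2))

dA_prev, dW, db = layer_backward_sketch(dA, Z, A_prev, W)
print(dA_prev.shape, dW.shape, db.shape)

# Pairing a gradient from one layer with the cached Z of another layer:
dA_wrong_pair = np.ones((3, 2))    # 3 rows, like a hidden layer
Z_other_layer = np.zeros((1, 2))   # 1 row, like the output layer
try:
    dA_wrong_pair[Z_other_layer <= 0] = 0   # boolean mask has the wrong shape
    raised = False
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

Printing `dA.shape` and the cached `Z.shape` just before each backward call is usually enough to find the layer where the two stop matching.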

The reason it works that way is that it’s the mirror image of forward propagation. In forward propagation at layer l, you take A^{[l-1]}, W^{[l]} and b^{[l]} as the inputs and then you produce A^{[l]} as the output.
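For comparison, a forward step for one ReLU layer might look like the sketch below. Again, the name `layer_forward_sketch` and the sizes are assumptions for illustration. The point is only that A^{[l]} has one row per unit in layer l, which is the shape that dA^{[l]} must have when it comes back in the backward pass.

```python
import numpy as np

def layer_forward_sketch(A_prev, W, b):
    """Hypothetical ReLU layer l: inputs A^[l-1], W^[l], b^[l]; output A^[l] (plus Z for the cache)."""
    Z = W @ A_prev + b        # b broadcasts across the examples
    A = np.maximum(0, Z)      # ReLU activation
    return A, Z

rng = np.random.default_rng(1)
# Made-up layer: 4 units in layer l-1, 3 units in layer l, 2 examples.
A_prev = rng.standard_normal((4, 2))
W = rng.standard_normal((3, 4))
b = np.zeros((3, 1))

A, Z = layer_forward_sketch(A_prev, W, b)
print(A.shape, Z.shape)
```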