Hi,
So I’ve been struggling with the implementation of this function for a few hours now, and I’m at the point of just repeating things I’ve already tried numerous times in the hope that they’ll somehow miraculously work, hah.
I keep getting a shape-mismatch error for the matrices, and while I fully understand the rules of matrix multiplication and which dimensions must line up for the matmul to go through, I am clearly misunderstanding something.
I also understand that during the Linear->ReLU loop that runs for L-1 iterations, the input to each new layer is the activation output of the previous layer.
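To illustrate what I mean, here’s a minimal sketch of that structure (not my actual code; the 'W1'/'b1' key naming follows the assignment’s parameters dict, and I’ve left out the caches and the final sigmoid layer):

```python
import numpy as np

def hidden_forward_sketch(X, parameters):
    """Sketch of the [LINEAR -> RELU] x (L-1) part of L_model_forward.

    `parameters` is assumed to hold 'W1', 'b1', ..., 'WL', 'bL';
    the caches and the final LINEAR -> SIGMOID layer are omitted.
    """
    A = X                               # the input counts as layer 0's activation
    L = len(parameters) // 2            # one (W, b) pair per layer
    for l in range(1, L):
        A_prev = A                      # input to layer l = activation of layer l-1
        W = parameters['W' + str(l)]    # shape (n_l, n_{l-1})
        b = parameters['b' + str(l)]    # shape (n_l, 1), broadcast across examples
        Z = W @ A_prev + b              # valid only if W.shape[1] == A_prev.shape[0]
        A = np.maximum(0, Z)            # ReLU
    return A                            # this feeds the final LINEAR -> SIGMOID step
```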
I’m not looking for the answer, as I appreciate that this is an assignment; I am, however, seeking some guidance on where my misunderstanding might lie.
I can provide my code for the L_model_forward(…) function.
Thanks.
Edit 1: Currently working through this lovely post.
Edit 2: All good now. After finding out where I could access the test case’s input, I was able to sketch out what the architecture should look like, along with all the respective shapes at various points. I very quickly realised I was not passing the correct W and b values.
I was only passing W1 and b1 for the entire loop; not sure why I never noticed it sooner, but it’s very late here in the UK!
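For anyone hitting the same error, here’s a self-contained sketch of the failure mode, with made-up layer sizes rather than the grader’s actual test case:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up layer sizes just to demonstrate the failure: 5 -> 4 -> 3 -> 1.
parameters = {
    'W1': rng.standard_normal((4, 5)), 'b1': np.zeros((4, 1)),
    'W2': rng.standard_normal((3, 4)), 'b2': np.zeros((3, 1)),
    'W3': rng.standard_normal((1, 3)), 'b3': np.zeros((1, 1)),
}
X = rng.standard_normal((5, 2))         # 2 training examples
A = X
L = len(parameters) // 2                # L = 3

for l in range(1, L):
    A_prev = A
    # My bug: using parameters['W1'] / parameters['b1'] here on every
    # iteration. That blows up at l = 2, because W1 has shape (4, 5) but
    # A_prev is now (4, 2), so the inner dimensions no longer match.
    W = parameters['W' + str(l)]        # the fix: index by the current layer
    b = parameters['b' + str(l)]
    A = np.maximum(0, W @ A_prev + b)   # LINEAR -> RELU

print(A.shape)                          # (3, 2), ready for LINEAR -> SIGMOID
```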
Hopefully, this helps some future weary travellers.