I’m stuck on exercise 5 and can’t find the problem in my code. I know we are not supposed to post any code, so I’m wondering how I can request help. I have spent a couple of hours trying to figure out what’s wrong and need some help. I do not know whom to reach out to.
Please click my name and send me a message with your notebook as an attachment.
Please also update your post with the stack trace or cell output (without sharing the actual code). Without that, it’s impossible for anyone to know what the problem is.
I recommend you post some screen capture images showing any error messages or asserts or results that appear when you run your code.
Typically, the way to fix shape errors is to transpose one of the arguments (either Y or A).
Thanks, that fixed the dw and db values, but the value of cost is incorrect (7.41 instead of the expected 0.15) for some reason. Any ideas?
I am just passing by, and I think that an article on matrix multiplication might help.
Search for more until you at least understand why there is a requirement on what goes on the left-hand side of the multiplication and what goes on the right-hand side. For multiplication between two numbers, either order works, but for matrices there is a rule, and we can’t arrange them at will.
There are a total of 4 ways to arrange the multiplication and the transpose, and only 2 ways will pass that rule, and you have asked about only one of the two ways.
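To make the shape rule concrete without touching the assignment code, here is a minimal sketch of one reading of those four arrangements. The arrays `p` and `q` are made-up 1×3 row vectors, not the notebook's variables, and the sketch assumes the four arrangements are: no transpose, both transposed, right operand transposed, left operand transposed.

```python
import numpy as np

# Hypothetical 1x3 row vectors; the names p and q are illustrative only.
p = np.array([[1.0, 2.0, 3.0]])
q = np.array([[4.0, 5.0, 6.0]])

# Four ways to combine np.dot with a transpose (assumed reading of the post above).
arrangements = {
    "p . q": (p, q),          # (1,3) . (1,3): inner dimensions 3 and 1 differ
    "p.T . q.T": (p.T, q.T),  # (3,1) . (3,1): inner dimensions 1 and 3 differ
    "p . q.T": (p, q.T),      # (1,3) . (3,1): inner dimensions match
    "p.T . q": (p.T, q),      # (3,1) . (1,3): inner dimensions match
}

shapes = {}
for label, (left, right) in arrangements.items():
    try:
        shapes[label] = np.dot(left, right).shape
    except ValueError:
        # np.dot raises ValueError when the inner dimensions do not agree.
        shapes[label] = None
    print(label, "->", shapes[label])
```

Only the last two arrangements run, and they produce results of different shapes, so passing the shape rule alone does not tell you which one the formula calls for.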
This is all I can share and I will leave all the rest to you to figure out on your own.
I know we can’t multiply a 1 by 3 matrix with another 1 by 3 matrix, but I had failed to understand the error. I now have the correct values for db and dw but an incorrect value for the cost function. Any ideas on what could be wrong?
I will leave it to you to read and experiment yourself.
Please note that a solid knowledge of basic linear algebra is a prerequisite for success here. If you are not familiar with how matrix multiplication works, you should put this course on “pause” and take a look at some linear algebra courses, e.g. the one from Khan Academy.
Here is a thread which shows some examples of how to use np.dot and transpose that may be relevant to this computation.
You’ve sent me the notebook with the same error as shown in the stacktrace on this topic. Please read the links from Raymond & Paul to get a better idea on solving this error.
Moving forward without the knowledge of linear algebra and partial derivatives is not a good option.
@rmwkwok , @paulinpaloalto @balaji.ambresh Thanks for the guidance. I was taking the transpose of Y; the solution was correct after I transposed A instead. When I multiply a 3×1 matrix with a 1×3 matrix I get a 3×3 matrix, and I get a 1×1 matrix if I flip the order. Why is it OK to transpose the ln(A) and ln(1-A) matrices and not vice versa? What is the linear algebra equivalent if we transpose Y instead? In other words, why is it not OK to transpose Y?
Also, the output did not change even when I commented out this line: cost = np.squeeze(np.array(cost))
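As a rough illustration of why that line can appear to do nothing, here is a small sketch of what `np.squeeze` changes. The value 0.15 and the variable names are placeholders, and whether the notebook's cost is a 1×1 array or already a scalar at that point is an assumption, since the code is not shown.

```python
import numpy as np

# Placeholder values; which case applies depends on how cost was computed.
cost_2d = np.array([[0.15]])   # e.g. the result of a (1, m) by (m, 1) dot product
cost_0d = np.float64(0.15)     # e.g. the result of np.sum(...)

squeezed_2d = np.squeeze(np.array(cost_2d))
squeezed_0d = np.squeeze(np.array(cost_0d))

# squeeze removes axes of length 1; the numeric value is unchanged.
print(cost_2d.shape, "->", squeezed_2d.shape)
print(np.shape(cost_0d), "->", squeezed_0d.shape)
```

In both cases the number itself is the same before and after; only the shape of a 1×1 array changes, so a check that compares values may not notice the difference.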
That was all explained and demonstrated on the thread that I linked earlier. If you missed that the first time around, it is worth a look.
But it fundamentally comes down to understanding what the operations mean. It will be clear why $y^T \cdot \log(a)$ is not the same thing as $y \cdot \log(a)^T$ if you examine the examples given on that thread.
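For readers who cannot open that thread, a minimal sketch of the difference follows. The vectors `y` and `a` are made-up 1×3 examples rather than the assignment data, and this is not the notebook's solution, only the two products compared side by side.

```python
import numpy as np

# Hypothetical 1x3 row vectors standing in for labels and activations.
y = np.array([[1.0, 0.0, 1.0]])
a = np.array([[0.8, 0.3, 0.6]])

# y^T . log(a): (3,1) . (1,3) gives a 3x3 matrix of all pairwise products.
outer = np.dot(y.T, np.log(a))

# y . log(a)^T: (1,3) . (3,1) gives a 1x1 matrix, the sum of matching pairs.
inner = np.dot(y, np.log(a).T)

# The 1x1 result equals the elementwise product summed over the samples.
elementwise_sum = np.sum(y * np.log(a))

print(outer.shape, inner.shape)
print(inner[0, 0], elementwise_sum, np.sum(outer))
```

The 3×3 result pairs every entry of `y` with every entry of `log(a)`, including pairs that belong to different samples, so summing it gives a different number from the 1×1 result, which only pairs each sample with itself.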
Thanks, it made some sense when I read that when we get a 1×1 output we have the sum of squares.