It is a mistake to reshape X at all, and in particular to do it with hard-wired dimensions. We are trying to write general code here that works for inputs of any (matching) shapes. You’ll notice that the individual test cases for the various functions here use relatively small dimensions, but the actual image data we are using has 12288 features, right? So how is 3 x 2 going to work in that case?
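To make the point about general code concrete, here is a minimal sketch rather than the assignment solution. The function name `linear_activation`, the zero/one test values, and the choice of 5 examples are all assumptions made up for illustration. It shows the same code handling a small test-sized input and a 12288-feature input, and a hard-wired reshape failing on the latter.

```python
import numpy as np

def linear_activation(w, b, X):
    """Hypothetical helper: compute z = w^T X + b for any matching shapes.

    w: (n_features, 1), X: (n_features, m), b: scalar. Returns shape (1, m).
    """
    # No reshape of X: the shapes come from the inputs themselves.
    return np.dot(w.T, X) + b

# Small, test-case-sized input: 2 features, 3 examples.
z_small = linear_activation(np.zeros((2, 1)), 0.0, np.ones((2, 3)))

# Image-sized input: 12288 features, 5 examples (5 is an arbitrary choice).
z_large = linear_activation(np.zeros((12288, 1)), 0.0, np.ones((12288, 5)))

# A hard-wired reshape cannot handle the larger input.
try:
    np.ones((12288, 5)).reshape(3, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(z_small.shape, z_large.shape, reshape_failed)
```

Because nothing in the function mentions a specific dimension, it works for whatever the grader or the real dataset passes in, as long as the number of rows in w matches the number of rows in X.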

The other mistake here is that you have used an “elementwise” multiply between w^T and X, but that operation is a dot product style matrix multiply. You need to use *np.dot* there.
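A small sketch of the difference, using made-up values for w and X (assumptions for illustration only, not taken from the assignment). Depending on the shapes, the elementwise `*` either broadcasts silently to the wrong shape or raises an error, while *np.dot* gives one value per example.

```python
import numpy as np

w = np.array([[1.0], [2.0]])           # shape (2, 1): 2 features
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # shape (2, 3): 2 features, 3 examples

# Matrix multiply: (1, 2) dot (2, 3) gives (1, 3), one value per example.
z_dot = np.dot(w.T, X)

# Elementwise multiply with w broadcasts across columns and keeps shape (2, 3),
# so no sum over features ever happens.
elementwise = w * X

# Elementwise multiply with w.T does not even broadcast: (1, 2) vs (2, 3).
try:
    w.T * X
    transposed_elementwise_failed = False
except ValueError:
    transposed_elementwise_failed = True

print(z_dot.shape, elementwise.shape, transposed_elementwise_failed)
```

The silent broadcasting case is the more dangerous one, since the code runs without complaint and only the shape of the result reveals the problem.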

Here’s a thread which talks about the notational conventions that Prof Ng uses for matrix multiplications.