Hi,

I am a little confused by the first calculation in this exercise. Applying the sigmoid function in Course 1 Week 2 exercise 5

We are given (in the calling module) :

w = a 2 by 1 np.array and X, a 2 by 3 np.array.

We are to transpose w using the .T attribute, and this immediately makes the two arrays incompatible for broadcasting.

Am I reading this wrongly? When I run the code with w.T it fails.

When I run it without the .T it appears to work (I haven’t got to the second calc yet!)

Thanks

Ian

It sounds like the problem is that you are using “elementwise” multiply where you should be using a dot product. The formula for the “linear activation” (the first step before you apply *sigmoid*) is this:

Z = w^T \cdot X + b

So the operation between w^T and X is a dot product. w is n_x x 1 and X is n_x x m so that will work. 1 x n_x dotted with n_x x m gives a 1 x m result, right?
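Here’s a minimal NumPy sketch of that shape argument (the numbers are made up for illustration, not the values from the exercise’s test case). The dot product of w.T and X works, while elementwise multiply with w.T fails to broadcast, which matches the error you saw:

```python
import numpy as np

np.random.seed(0)
n_x, m = 2, 3                   # 2 features, 3 examples, as in the exercise
w = np.random.randn(n_x, 1)     # w is n_x x 1
X = np.random.randn(n_x, m)     # X is n_x x m
b = 0.5

# Dot product: (1 x n_x) dot (n_x x m) -> (1 x m)
Z = np.dot(w.T, X) + b
print(Z.shape)                  # (1, 3)

# Elementwise multiply: (1, 2) * (2, 3) cannot broadcast, so NumPy raises.
try:
    w.T * X
except ValueError as e:
    print("elementwise * fails:", e)
```

Note that `w * X` (without the transpose) happens to broadcast, which is why the code “appeared to work” without the .T, but it computes the wrong thing.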

It is important to be aware of the notational conventions that Prof Ng uses. Here’s a thread about when to use * versus the dot product.

Thank you Paul.

I spent a few hours trying to decipher the term ‘activation’ and assumed I was being asked to apply the sigmoid function. Clearing that up, plus my confusion over dot product vs. elementwise multiplication, should set me on the right track.

Regards

Ian

Great! Of course you do apply the sigmoid, but only *after* doing the “linear” activation. Logistic Regression is the model for how all layers in Neural Networks will work when we get to that in Week 3: you first do a linear transformation, followed by the application of the non-linear activation function (sigmoid in this case, but we’ll see other such functions later).
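Putting the two steps together, here is a small sketch of that forward computation (the weight and input values are invented for illustration, not taken from the assignment):

```python
import numpy as np

def sigmoid(z):
    """Non-linear activation applied elementwise."""
    return 1 / (1 + np.exp(-z))

w = np.array([[0.1], [0.2]])        # n_x x 1
b = -0.3
X = np.array([[1.0, 2.0, -1.0],
              [0.5, -0.5, 1.5]])    # n_x x m

Z = np.dot(w.T, X) + b              # step 1: linear transformation, 1 x m
A = sigmoid(Z)                      # step 2: non-linear activation, 1 x m
print(A.shape)                      # (1, 3)
```

Each column of A is the model’s prediction for the corresponding column (example) of X, squashed into (0, 1) by the sigmoid.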