How should we think when initializing w & b?

I seem to be having a problem initializing the parameters w and b in the model function. Could you give any tips on how to think about their shapes? Shouldn’t we just be considering the equation “transpose of x times w, plus b”?

Hi, @Kutay_Eroglu . Welcome to the specialization. When creating a new topic in Discourse, it is a great help to the community if you post the week, assignment number, and Exercise number in your topic heading. Example: W3, A1, Ex 3: How should we think … .

Also, please post a snapshot of the “traceback” (i.e. the error log) generated by your code. Do not post the code directly from the function that you are trying to complete. That would be a violation of the course honor code.



Here is an example of another learner’s post just today:

Week 4 Exercise 5 - L_model_forward shape challenges


Hi, @Kutay_Eroglu !

Yes, you are right for linear layers. For convolutional layers, check this post.

If you are using other types of layers or more complex ones, the output size will differ.

I’m guessing this question is about the C1 W2 Logistic Regression assignment. There the linear activation formula is:

Z = w^T \cdot X + b

In that case b is a scalar value and w is a column vector with dimensions n_x × 1, where n_x is the number of “features” or elements in each input sample. Each column of X is one input sample, so n_x is the number of rows in the X matrix, and Z has one entry per sample.
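
To make the shapes concrete, here is a minimal NumPy sketch of the formula above. It is not taken from the assignment: the sizes `n_x = 4` and `m = 3` and the random values are hypothetical and chosen only so the dimensions are easy to follow.

```python
import numpy as np

# Hypothetical sizes, for illustration only:
n_x, m = 4, 3  # 4 features per sample, 3 samples

rng = np.random.default_rng(0)
X = rng.standard_normal((n_x, m))  # each column of X is one input sample
w = rng.standard_normal((n_x, 1))  # column vector: one weight per feature
b = 0.5                            # scalar bias, broadcast over all samples

# (1, n_x) times (n_x, m) gives (1, m); adding the scalar b keeps that shape.
Z = np.dot(w.T, X) + b

print(w.shape, X.shape, Z.shape)  # (4, 1) (4, 3) (1, 3)
```

If the shape of w were (n_x,) or (1, n_x) instead, the product would either return a differently shaped result or fail with a dimension mismatch, which is a common source of the errors that show up in the traceback.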

All this was discussed in the lectures and also in the notebook. Please read the material in the notebook again carefully if you still have questions.
