What are the Logistic Regression parameters?

Hi,

Why are the parameters of Logistic Regression w and b? It's not clear in the Week 2, 2nd video what w and b represent. Why do we even need b?

Regards,
Jorge.

The w and b values are the parameters in the formula for the linear activation step of logistic regression:

z = \displaystyle \sum_{i = 1}^{n_x} w_i x_i + b

\hat{y} = \mathrm{sigmoid}(z)
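Here is a minimal NumPy sketch of those two formulas; all the values of x, w and b below are made up purely for illustration:

```python
import numpy as np

def sigmoid(z):
    # Sigmoid activation: maps any real z into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Made-up values for one sample with n_x = 3 features
x = np.array([[0.5], [-1.2], [3.0]])   # input, column vector of shape (3, 1)
w = np.array([[0.1], [0.4], [-0.2]])   # weights, column vector of shape (3, 1)
b = 0.3                                # bias, a scalar

z = np.dot(w.T, x) + b                 # = sum_i w_i * x_i + b
y_hat = sigmoid(z)                     # prediction, a value in (0, 1)
print(y_hat)                           # a (1, 1) array, here about 0.33
```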

So the w_i values are the coefficients in that linear formula. You can consider this to be the multidimensional generalization of the familiar formula for a line in the plane:

y = mx + b

If you eliminate the b term in logistic regression, it is analogous to eliminating b in the formula for a line: without b, you can only define lines (or, in higher dimensions, hyperplanes) through the origin. That is what mathematicians would call “a significant loss of generality”. :nerd_face:
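A quick way to see that numerically: the decision boundary of logistic regression is the set of points where z = 0 (i.e. \hat{y} = 0.5), and with b = 0 the origin always lies on that boundary, no matter what w is. A tiny sketch (the weights and the shift value here are arbitrary):

```python
import numpy as np

w = np.array([[2.0], [-1.0]])   # arbitrary 2-D weights
origin = np.zeros((2, 1))

# With b = 0, z at the origin is always 0, so the origin
# is always on the decision boundary:
print(np.dot(w.T, origin).item())        # 0.0, for any choice of w
# A nonzero b shifts the boundary away from the origin:
print(np.dot(w.T, origin).item() + 1.5)  # 1.5
```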

1 Like

When Prof Ng calls the w and b values the “parameters”, it means those are the things that we can change (vary) to find a better solution. The point is that you cannot change the x_i values, right? Those are the predefined inputs.
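To make that concrete, here is a minimal gradient descent sketch using the course's vectorized gradient formulas for the cross-entropy cost; the toy data and the learning rate are made up. Notice that only w and b are ever updated, while X and Y stay fixed:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy data: 4 samples with n_x = 2 features each,
# stacked as columns, following the course's X convention.
X = np.array([[0.0, 1.0, 2.0, 3.0],
              [1.0, 0.5, 0.0, -1.0]])     # shape (2, 4), fixed inputs
Y = np.array([[0, 0, 1, 1]])              # shape (1, 4), fixed labels
m = X.shape[1]

w = np.zeros((2, 1))                      # the parameters we vary...
b = 0.0
alpha = 0.1                               # learning rate (made-up value)

for _ in range(1000):
    A = sigmoid(np.dot(w.T, X) + b)       # forward pass on the fixed inputs
    dw = np.dot(X, (A - Y).T) / m         # gradient of the cost w.r.t. w
    db = np.sum(A - Y) / m                # gradient of the cost w.r.t. b
    w -= alpha * dw                       # ...by stepping downhill
    b -= alpha * db

print(w.ravel(), b)                       # the learned parameters
```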

Does w here stand for weight? As in, these coefficients are weights that are being calculated as part of the minimization of the loss? In the course materials, they’re referred to as w transpose. Why are they being transposed? I feel this is glossed over in the videos; maybe I missed some course materials.

Yes, the w vector represents the “weights” or coefficients in the linear transformation that is the first step in calculating the output of Logistic Regression. It is an arbitrary choice that Prof Ng makes to define all standalone vectors as column vectors. So if w is a column vector and x (one of the input “sample” vectors) is also a column vector, then you need to transpose w in order to compute the dot product between w and x, which is the vectorized way to express the mathematical formula I showed in my earlier response on this thread.
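Concretely, in NumPy terms (reusing the made-up values from the earlier sketch):

```python
import numpy as np

w = np.array([[0.1], [0.4], [-0.2]])  # weights as a column vector, shape (3, 1)
x = np.array([[0.5], [-1.2], [3.0]])  # one sample, also a column vector (3, 1)
b = 0.3

# np.dot(w, x) would raise an error: (3, 1) x (3, 1) shapes don't align.
# Transposing w gives (1, 3) x (3, 1) -> (1, 1), i.e. the dot product:
z = np.dot(w.T, x) + b
print(z.shape)  # (1, 1)
```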

Note that once we get to full Neural Networks in Week 3 of Course 1, the weights become matrices instead of vectors and then Prof Ng can choose to arrange them in such a way that the transpose will no longer be required.
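For reference, that Week 3 layout looks something like this in NumPy; the layer sizes here are made up:

```python
import numpy as np

n_x, n_h, m = 3, 4, 5              # made-up feature count, hidden units, samples

X = np.random.randn(n_x, m)        # m input samples stacked as columns
W1 = np.random.randn(n_h, n_x)     # row i of W1 holds the weights of hidden unit i
b1 = np.zeros((n_h, 1))

# With W1 laid out as (n_h, n_x), no transpose is needed:
Z1 = np.dot(W1, X) + b1            # shape (n_h, m); broadcasting adds b1 per column
```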