Hi.
I had a question about the assignment mentioned in the title.
Isn't the logistic regression in the assignment a kind of linear neural network?
I searched, and different sources all suggest that for image classification we need to add something else, like kernels, because this type of problem isn't linearly separable.
So how can this code run correctly?
It’s not really a neural network, because there is no hidden layer.
The hidden layer (and its non-linear activation function) is what makes neural networks more capable than simple regression.
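To make that concrete, here is a minimal sketch using XOR, a standard example of data that no single line can separate. The weights below are hand-picked for illustration (they are not learned and do not come from the assignment), and ReLU is just one possible choice of activation. With the activation, a two-unit hidden layer reproduces XOR. With the activation removed, the same weights collapse to a single affine function and get one point wrong.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# The four XOR inputs and their labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# Hand-picked (hypothetical, not learned) weights for a 2-unit hidden layer.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])      # each hidden unit starts from x1 + x2
b1 = np.array([0.0, -1.0])       # second unit only activates when x1 + x2 > 1
W2 = np.array([1.0, -2.0])
b2 = 0.0

# With the non-linear activation: output = relu(x1 + x2) - 2 * relu(x1 + x2 - 1)
with_activation = relu(X @ W1 + b1) @ W2 + b2
predictions = (with_activation > 0.5).astype(int)     # [0, 1, 1, 0], matches XOR

# Without it, the two layers reduce to one affine map: 2 - (x1 + x2)
without_activation = (X @ W1 + b1) @ W2 + b2
linear_predictions = (without_activation > 0.5).astype(int)  # [1, 1, 1, 0]

print(predictions, linear_predictions)
```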
Yes, Logistic Regression has two steps:
The first is a linear transformation of the inputs, using a set of learned weights and a bias value. (Strictly speaking, adding the bias term makes it an “affine transformation”.) The output of that first stage is the equivalent of Linear Regression. The second step is to pass that value through the sigmoid function, which maps it non-linearly to a number between 0 and 1. We interpret that number as the probability that the classification answer is “yes”, and we predict “yes” if the output is > 0.5. Of course, the key point is that it works because we train the weight and bias values using the labeled training data and back propagation.
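The two steps and the training loop can be sketched in a few lines of NumPy. This is a rough illustration, not the assignment's code: the function names, learning rate, step count, and the toy two-cluster data are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, w, b):
    z = X @ w + b        # step 1: affine transformation of the inputs
    return sigmoid(z)    # step 2: squash to a number between 0 and 1

def train(X, y, learning_rate=0.1, num_steps=2000):
    """Plain gradient descent on the cross-entropy loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(num_steps):
        a = predict_proba(X, w, b)
        dz = a - y                                  # gradient of the loss w.r.t. z
        w -= learning_rate * (X.T @ dz) / n_samples
        b -= learning_rate * dz.mean()
    return w, b

# Toy data (an assumption for this sketch): two well-separated 2-D clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 2)),
               rng.normal(2.0, 1.0, size=(50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])

w, b = train(X, y)
accuracy = ((predict_proba(X, w, b) > 0.5) == y).mean()
print(accuracy)
```

On data like this, where the two classes can be split by a line, the learned weights should classify nearly all points correctly.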
You’re right that the net result is that Logistic Regression creates a decision boundary that is a hyperplane (the higher dimensional equivalent of a line) separating the “yes” and “no” answers. The sigmoid is monotonic, so the output is > 0.5 exactly when the linear part is > 0, and the set of points where the linear part equals 0 is a hyperplane. That’s what they mean by saying that Logistic Regression can only do “linear separation”. Of course, how well that works depends on your data: a linear decision boundary may just not work in your particular case. If it doesn’t and you need a non-linear decision boundary, then you have to graduate to real neural networks with multiple layers. As Prof Ng mentions in the lectures, you can think of Logistic Regression as a “trivial” neural network with only one layer.
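A quick way to see the hyperplane is to check that thresholding the sigmoid output at 0.5 gives the same answers as asking which side of the hyperplane a point lies on. The weights, bias, and random test points below are made up for the sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and bias, chosen only for illustration.
w = np.array([1.5, -2.0])
b = 0.5

rng = np.random.default_rng(1)
points = rng.uniform(-3.0, 3.0, size=(1000, 2))

z = points @ w + b
by_probability = sigmoid(z) > 0.5    # "yes" if the predicted probability is > 0.5
by_side_of_hyperplane = z > 0        # "yes" if the point is on the positive side

same = np.array_equal(by_probability, by_side_of_hyperplane)
print(same)
```

The sigmoid changes how confident the output looks, but not where the boundary is.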
Thank you very much.