As written in the title, is there any intuition behind why logistic regression gives us a straight border between red and blue, whereas a shallow neural network can give us a more accurate border for planar data classification?
That is exactly what logistic regression does. Look at what happens: you learn a single weight vector w and a bias b, which is a scalar. Then you apply the following linear transformation to any input sample vector x:
z = w^T x + b
Then you apply the sigmoid function to that. If you think about the geometric interpretation of what is happening there, the combination of w and b defines a hyperplane in the input space: the set of points where w^T x + b = 0. The affine transformation above is just the dot product of the hyperplane's normal vector w with the input vector x, plus the offset b. The sign of the resulting z value tells you which side of the plane the input vector x is on. If the result is zero, then the input is right on the decision boundary. The sigmoid then just maps positive values to outputs > 0.5 and negative values to outputs < 0.5, so the predicted class can only change when you cross that straight boundary.
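Here is a minimal NumPy sketch of that geometry, using made-up values for w and b (not learned parameters, just an illustration): the sign of z decides which side of the line w^T x + b = 0 a point falls on, and the sigmoid only rescales that into a probability.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters for 2-D inputs (purely illustrative)
w = np.array([1.5, -2.0])   # normal vector to the decision boundary
b = 0.5                     # offset of the hyperplane from the origin

def predict(x):
    z = np.dot(w, x) + b    # positive on one side of the line, negative on the other
    return sigmoid(z)       # > 0.5 for z > 0, < 0.5 for z < 0

# Two points on opposite sides of the line w^T x + b = 0
print(predict(np.array([2.0, 0.0])))   # z = 3.5  -> probability ~0.97
print(predict(np.array([-2.0, 1.0])))  # z = -4.5 -> probability ~0.01
```

No matter how the training set looks, the 0.5 contour of this model is always the straight line w^T x + b = 0, which is why logistic regression can only separate the red and blue points with a line.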
When we graduate to a real neural network with more than one layer, the function it can represent gets much more complicated: each hidden unit defines its own hyperplane, the non-linear activation bends each of those half-space scores, and the output layer combines them, so the resulting decision boundary no longer has to be a straight line.
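As a rough sketch of that idea, here is the forward pass of a tiny one-hidden-layer network with random, untrained parameters (W1, b1, W2, b2 are just placeholders, not values from the assignment). Each row of W1 plays the role of one hyperplane, tanh bends its score, and the output layer mixes the pieces, so the 0.5 contour of the output is generally a curve rather than a line.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters: 2 inputs, 4 hidden units, 1 output (random, untrained)
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 2))   # each row is the normal vector of one hidden-unit hyperplane
b1 = rng.standard_normal((4, 1))
W2 = rng.standard_normal((1, 4))   # output layer combines the 4 bent half-space scores
b2 = rng.standard_normal((1, 1))

def forward(X):
    """X has shape (2, m): one column per 2-D sample."""
    A1 = np.tanh(W1 @ X + b1)      # non-linearity bends each hyperplane score
    A2 = sigmoid(W2 @ A1 + b2)     # weighted combination -> curved decision boundary
    return A2

# Evaluate on a small grid of points; the 0.5 level set of this output
# is generally not a straight line, unlike logistic regression.
xs, ys = np.meshgrid(np.linspace(-2, 2, 5), np.linspace(-2, 2, 5))
grid = np.vstack([xs.ravel(), ys.ravel()])
print(forward(grid).round(2))
```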
“The sign of the z value tells you which side of the plane the input vector x is on” is exactly the intuition I needed. Thank you, Paul!