hey everyone!
how come the decision boundaries in the solution are piecewise linear even though the tanh activation function is used in the hidden layer?
thanks!
Please post a screen capture image that shows exactly where your question applies.
Hi Nikola, welcome to the community!
If I understand your question correctly, the combination of a nonlinear activation in the hidden layer followed by a linear output layer is what produces the boundaries you see. Each neuron in the hidden layer computes a weighted sum of the input features and applies the nonlinear activation (in this case, tanh). While tanh makes each individual neuron's output nonlinear, the network's final output is a linear combination of those neuron outputs.
After the hidden layer, the network applies a linear transformation to produce the final output (logistic regression in the output layer). The decision boundary is the set of points where that linear combination of hidden activations crosses the classification threshold.
The nonlinearity introduced by tanh operates in the hidden layer's feature space. In the original 2D input space (where the decision boundaries are visualized), the boundary is technically a smooth curve, but tanh saturates quickly, so each hidden unit behaves almost like a hard threshold that flips along its own line in the input plane. As a result, each unit contributes what looks like a straight segment, and the network combines several such near-linear segments to approximate more complex shapes. That is why the plotted regions appear to be separated by piecewise linear boundaries.
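To make this concrete, here is a minimal sketch (not the course's actual network, just a hypothetical hand-built one) of a 2-input tanh network with a linear output layer. With moderately large weights, tanh saturates to values near ±1, so each hidden unit acts almost like a step function along its own line in the input plane:

```python
import numpy as np

# Hypothetical tiny network: 2 inputs -> 3 tanh hidden units -> linear output.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2)) * 4.0   # larger weights -> tanh saturates faster
b1 = rng.normal(size=3)
W2 = rng.normal(size=3)
b2 = 0.0

def decision_function(x):
    """Linear combination of tanh hidden activations (pre-sigmoid logit)."""
    h = np.tanh(W1 @ x + b1)         # each activation lies in (-1, 1)
    return W2 @ h + b2               # linear output layer

# Away from the lines W1[i] @ x + b1[i] = 0, the hidden activations sit
# near +/-1 (saturated), so the decision function is locally nearly constant
# or nearly linear, and the boundary looks like straight segments.
x = np.array([1.0, -0.5])
print(np.abs(np.tanh(W1 @ x + b1)))  # saturated entries are close to 1
print(decision_function(x))
```

Plotting `decision_function` on a grid (e.g. with `matplotlib.pyplot.contour` at level 0) would show the near-piecewise-linear boundary; shrinking the weight scale makes the corners visibly smoother.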
Hope this helps!
thank you so much!