I get the reason for activation functions. When the changes in slope are abrupt we need some more advanced transformation, which is why activation functions like relu are used.
But I am still confused about why polynomial regression is called linear regression. We do feature engineering here to multiply the feature X by itself two or three times, like X^2 or X^3 and so on. So why is that not an activation function but still linear regression?
If you define a variable x1 equal to x, x2 equal to x*x, and x3 equal to x*x*x, then y = w1*x1 + w2*x2 + w3*x3 + b, with w1, w2, w3, and b floats, is a linear equation in those variables, and determining w1, w2, w3, and b constitutes 'linear regression'. The model is linear in its parameters (the weights and bias), even though it is not linear in the original x, and that is the property the name refers to.
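To make this concrete, here is a minimal sketch (assuming numpy; the cubic toy data and variable names are just for illustration) that fits w1, w2, w3, and b with a single ordinary least-squares solve, which is only possible because y is linear in those parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=100)
# Toy data generated from a cubic: y = 2*x - x^2 + 0.5*x^3 + 1, plus noise.
y = 2 * x - x**2 + 0.5 * x**3 + 1 + rng.normal(scale=0.1, size=100)

# Feature engineering: x1 = x, x2 = x^2, x3 = x^3, plus a column of ones for b.
X = np.column_stack([x, x**2, x**3, np.ones_like(x)])

# One linear least-squares solve recovers all parameters at once.
w1, w2, w3, b = np.linalg.lstsq(X, y, rcond=None)[0]
print(w1, w2, w3, b)  # roughly 2, -1, 0.5, 1
```

The same idea is what sklearn's PolynomialFeatures followed by LinearRegression does: the nonlinearity lives entirely in the fixed feature transform, not in the fitted model.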
When an equation includes an activation function, this is no longer possible. Take relu as an example: it cuts off negative values while passing positive ones through. Once negative values are cut off, the output is no longer a linear function of the weights, so you cannot solve for them with a single linear fit. You can make a similar argument for sigmoid, tanh, leaky relu, selu and other activation functions.
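As a quick check (a sketch with arbitrary example values), relu fails both properties a linear map must satisfy, additivity and homogeneity, which is why wrapping w*x + b in relu takes you out of linear-regression territory:

```python
import numpy as np

def relu(z):
    # relu passes positive values through and clips negatives to zero.
    return np.maximum(z, 0.0)

a, b = -3.0, 5.0
print(relu(a + b), relu(a) + relu(b))   # 2.0 vs 5.0  -> additivity fails
print(relu(-2.0 * b), -2.0 * relu(b))   # 0.0 vs -10.0 -> homogeneity fails
```

Because of this, a model like y = relu(w*x + b) has to be fitted with iterative methods such as gradient descent rather than a closed-form linear solve.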