Why do we always calculate Z using a linear equation?

Mukul_Tewathia · April 19, 2023, 4:38am

In all the examples of neural network training and forward propagation, why do we always use a linear function to calculate Z?

Z = W*X + B          # Linear function of X
Output = Activation_Function(Z)

Can’t we use a non-linear function to calculate Z, like below?

Z = W*(X^2) + B     # Quadratic function: Notice X^2 instead of X
Output = Activation_Function(Z)

saifkhanengr · April 19, 2023, 4:45am

Hi @Mukul_Tewathia!

This is an interesting question. I’ve never thought about this before. I just pasted your question into ChatGPT and this is what it replied:

While it is technically possible to use a non-linear function like a quadratic function to calculate Z in a neural network, using linear functions like Z = WX + B is a common and effective approach for several reasons:

1) Linearity simplifies the learning process: A linear Z has simple derivatives with respect to W and B (X and 1, respectively), which keeps backpropagation through each layer straightforward. Note that the overall loss of a multi-layer network is still non-convex once non-linear activations are applied, so gradient descent can settle in a local optimum either way; a non-linear Z would only add another source of difficulty.

2) Non-linear activation functions already provide non-linearity: In neural networks, the non-linear activation functions like sigmoid, ReLU, or tanh already provide non-linearity to the model. Using non-linear activation functions allows the model to capture complex patterns and relationships between the input and output without needing to use non-linear functions to calculate Z.

3) Efficiency: Linear functions are computationally efficient, and their derivatives can be easily calculated. Using non-linear functions to calculate Z can significantly increase the computational complexity of the model, making it slower and more computationally expensive.
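Point 2 can be illustrated with a minimal sketch in plain Python. The weights, biases, and function names below are made-up values for illustration, not taken from any course material. Without an activation, two stacked linear layers collapse into a single linear function; with ReLU in between, they do not.

```python
def relu(z):
    """ReLU activation: passes positive values, clips negatives to zero."""
    return max(0.0, z)


def linear(w, b, x):
    """The usual pre-activation: Z = W*X + B (scalar case)."""
    return w * x + b


# Hypothetical parameters for two one-unit layers.
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5


def stacked_no_activation(x):
    """Layer 2 applied directly to layer 1, with no activation in between."""
    return linear(w2, b2, linear(w1, b1, x))


def collapsed(x):
    """Algebraically identical single layer:
    w2*(w1*x + b1) + b2 == (w2*w1)*x + (w2*b1 + b2)."""
    return (w2 * w1) * x + (w2 * b1 + b2)


def stacked_with_relu(x):
    """Same two layers, but with ReLU applied to layer 1's output."""
    return linear(w2, b2, relu(linear(w1, b1, x)))


for x in [-2.0, -1.0, 0.0, 1.0, 2.0]:
    print(x, stacked_no_activation(x), collapsed(x), stacked_with_relu(x))
```

The first two printed columns agree for every input, while the ReLU version bends where the first layer's output crosses zero, which is the non-linearity the activation supplies.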

Does that make sense?

Best,
Saif.

TMosh · April 19, 2023, 10:31pm

Non-linear models are created by adding non-linear combinations of the original features.

This allows us to use simple and easily-computed linear models but still get complex relationships.
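As a rough sketch of this idea (the data and the `fit_line` helper are invented for illustration), the quadratic form Z = W*(X^2) + B from the original question is still a linear model once X^2 is treated as an engineered feature. An ordinary least-squares line fit on the squared feature recovers the quadratic relationship:

```python
def fit_line(features, targets):
    """Closed-form least-squares fit of targets ~ w * feature + b."""
    n = len(features)
    mean_f = sum(features) / n
    mean_t = sum(targets) / n
    cov = sum((f - mean_f) * (t - mean_t) for f, t in zip(features, targets))
    var = sum((f - mean_f) ** 2 for f in features)
    w = cov / var
    b = mean_t - w * mean_f
    return w, b


# Synthetic data that follows y = 3*x^2 + 1 (an assumed toy relationship).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [3.0 * x ** 2 + 1.0 for x in xs]

# Engineered feature: the square of the original input.
squared = [x ** 2 for x in xs]

# The model is linear in the new feature: Z = w * (x^2) + b.
w, b = fit_line(squared, ys)
print("fit on x^2:", w, b)

# For comparison, a line fit on the raw feature cannot capture the curve.
w_raw, b_raw = fit_line(xs, ys)
print("fit on x:  ", w_raw, b_raw)
```

The model and its training procedure stay linear; only the input features change.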
