I am struggling to understand what the “LINEAR” part refers to in the Deep Neural Network - Application assignment (section 3.1). As far as I can tell, the first LINEAR layer takes the flattened input vector and the second takes the output of the first activation, but what does “LINEAR” itself mean?
INPUT → LINEAR → RELU → LINEAR → SIGMOID → OUTPUT
What would be the best way to conceptualize or understand this?
Thank you!
Prof Ng explained this in the lectures. The processing at each layer of a neural network consists of two steps:
1. The “linear activation”, expressed by this formula:

Z^{[l]} = W^{[l]} \cdot A^{[l-1]} + b^{[l]}

2. The “nonlinear activation”, which is the elementwise application of the layer’s non-linear activation function to the Z^{[l]} value:

A^{[l]} = g^{[l]}(Z^{[l]})
In the first layer of the diagram you show, g^{[1]}() is ReLU and in the second layer g^{[2]}() is sigmoid.
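The two-layer forward pass in your diagram can be sketched in a few lines of NumPy. This is just an illustrative sketch, not the assignment’s code: the variable names (A0, W1, b1, …) and the layer sizes are arbitrary choices for the example.

```python
import numpy as np

def relu(Z):
    # Elementwise ReLU: max(0, z)
    return np.maximum(0, Z)

def sigmoid(Z):
    # Elementwise sigmoid: 1 / (1 + e^{-z})
    return 1 / (1 + np.exp(-Z))

# Illustrative shapes: 4 input features, 3 hidden units, 1 output,
# batch of 2 examples (one example per column, as in the course).
rng = np.random.default_rng(0)
A0 = rng.standard_normal((4, 2))                 # flattened input X
W1 = rng.standard_normal((3, 4)); b1 = np.zeros((3, 1))
W2 = rng.standard_normal((1, 3)); b2 = np.zeros((1, 1))

# Layer 1: linear step, then ReLU activation
Z1 = W1 @ A0 + b1      # Z[1] = W[1] . A[0] + b[1]  <- the "LINEAR" part
A1 = relu(Z1)          # A[1] = g[1](Z[1])

# Layer 2: linear step, then sigmoid activation
Z2 = W2 @ A1 + b2      # Z[2] = W[2] . A[1] + b[2]  <- the "LINEAR" part
A2 = sigmoid(Z2)       # final output, squashed into (0, 1)
```

So “LINEAR” is not a property of the inputs themselves; it is the Z = WA + b computation that every layer performs before its activation function is applied.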
Note that step 1 is what is called a “linear transformation” in mathematical terms. Well, if you want to go “full terminology”, it is actually an “affine transformation”: a linear map (multiplication by W^{[l]}) followed by a translation (adding the bias b^{[l]}). Strictly speaking, an affine transformation is only linear when the bias term is zero.