How does the activation function between two layers in a CNN help? What's the point of adding an activation function there?
Activation functions are applied to a layer's output. In this case, the activation function is applied to the output of the first layer, before that output is fed into the second layer.
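For example, here is a minimal sketch (assuming tf.keras; the input shape and filter counts are just illustrative) of where the activation sits:

```python
# Minimal sketch (assuming tf.keras): the ReLU sits between the two conv layers,
# acting on the first layer's output before it reaches the second layer.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(8, kernel_size=3, padding="same"),    # first conv layer (linear)
    tf.keras.layers.Activation("relu"),                          # activation applied to its output
    tf.keras.layers.Conv2D(16, kernel_size=3, padding="same"),   # second conv layer
])
model.summary()
```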
With respect to the second question, activation functions (more precisely, non-linear activation functions) are one of the most important parts of a neural network. Without them, the network could only apply linear transformations to the input, which is very limiting; non-linear activations give it the ability to approximate non-linear functions, which is what makes neural networks so powerful.
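Here is a quick toy check in numpy (hypothetical shapes, just for illustration) of why that matters: without a non-linearity in between, two stacked linear layers collapse into a single linear layer.

```python
# Toy check (numpy, hypothetical shapes): two linear layers with no activation
# in between collapse into a single linear layer, so stacking gains nothing.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))        # input vector
W1 = rng.normal(size=(5, 4))     # first layer weights
W2 = rng.normal(size=(3, 5))     # second layer weights

two_layers = W2 @ (W1 @ x)       # layer 1 then layer 2, no non-linearity
one_layer = (W2 @ W1) @ x        # exactly the same as a single linear layer
print(np.allclose(two_layers, one_layer))  # True

# With a non-linearity (e.g. ReLU) in between, the composition is no longer linear:
relu = lambda z: np.maximum(z, 0)
with_relu = W2 @ relu(W1 @ x)
```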
But the convolution operation by itself is non-linear, isn't it? Why is there a need for another non-linear operator on its output?
Convolution is a linear operation (a sum of element-wise products, i.e. a weighted sum of the inputs); check this video for more details: Convolutional Layer
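You can verify that linearity yourself with a quick numpy check (toy 1-D signals, just for illustration): scaling and adding inputs before or after convolving gives the same result.

```python
# Quick numpy check of linearity: conv(a*x + b*y) == a*conv(x) + b*conv(y)
import numpy as np

rng = np.random.default_rng(1)
x, y = rng.normal(size=8), rng.normal(size=8)   # two toy signals
k = rng.normal(size=3)                          # a toy filter
a, b = 2.0, -0.5

lhs = np.convolve(a * x + b * y, k, mode="valid")
rhs = a * np.convolve(x, k, mode="valid") + b * np.convolve(y, k, mode="valid")
print(np.allclose(lhs, rhs))  # True: convolution is linear in its input
```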
@gent.spah thanks for sharing that video!
I've got one more question: does the number of filters always have to be the same as the number of channels in the conv layer's input image?
It's the other way around: each filter spans all of the input channels, and the number of filters determines the number of output channels after the convolutional layer.
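A small sketch (assuming tf.keras and an RGB input; the sizes are just for illustration) to make the shapes concrete:

```python
# Sketch (assuming tf.keras): 3 input channels, 16 filters -> 16 output channels.
# Each filter spans all input channels; the filter count sets the output depth.
import tensorflow as tf

x = tf.random.normal((1, 32, 32, 3))                        # batch of 1 RGB image
conv = tf.keras.layers.Conv2D(filters=16, kernel_size=3, padding="same")
y = conv(x)
print(y.shape)             # (1, 32, 32, 16)
print(conv.kernel.shape)   # (3, 3, 3, 16): (kh, kw, in_channels, filters)
```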
A Convolutional Neural Network is a class of deep neural networks. The convolutional layer makes use of a set of learnable filters.
Each filter is convolved (slid) across the width and height of the input volume, and a dot product is computed at each position to give an activation map. Different filters, which detect different features, are convolved over the input, and a set of activation maps is produced.
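A minimal numpy sketch of that sliding-window idea (technically cross-correlation, which is what most deep-learning frameworks actually compute); the filter values here are just an illustrative vertical-edge detector:

```python
# Slide one filter over the input and take a dot product at each position,
# producing one activation map.
import numpy as np

def conv2d_single(image, filt):
    H, W = image.shape
    fh, fw = filt.shape
    out = np.zeros((H - fh + 1, W - fw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + fh, j:j + fw]
            out[i, j] = np.sum(patch * filt)   # dot product of patch and filter
    return out

image = np.random.rand(6, 6)
filt = np.array([[1., 0., -1.],
                 [1., 0., -1.],
                 [1., 0., -1.]])               # simple vertical-edge filter
print(conv2d_single(image, filt).shape)        # (4, 4) activation map
```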
Activation Function (What's the point of adding an activation function there?)
It helps decide whether a neuron should fire or not.
The activation function is the non-linear transformation applied to the input signal; this transformed output is then sent to the next layer of neurons as input.
The ReLU function is the most widely used activation function in neural networks today. One of its greatest advantages over other activation functions is that it does not activate all neurons at the same time: it converts all negative inputs to zero, so those neurons do not get activated. Networks using ReLU have also been reported to converge up to six times faster than those using tanh or sigmoid activations.
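For reference, ReLU itself is just a one-line function (numpy sketch):

```python
# ReLU: negative inputs become zero, positive inputs pass through unchanged.
import numpy as np

def relu(x):
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]
```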