L_model_forward qns

Why is it that only the ReLU activation requires a loop to make it an L-layer model, but the sigmoid function requires no looping?

Hi @zheng_xiang1

First, we build a model with many layers, and we want our function to generalize over any number of layers. We decide that every hidden layer uses ReLU as its activation function, so we write a for loop over the hidden layers. For the output layer, we choose sigmoid as the activation function because we are building a binary classification model, like the image below.

Best Regards,

I see, so this is for a specific L-sized model with the output being a sigmoid, right?

thanks !

Yes, the network architecture we are implementing here can have any number of hidden layers and you can choose the sizes of all those layers, but all the hidden layers use ReLU as their activation function. The network is performing a binary classification, so there is one neuron in the output layer and the activation function needs to be sigmoid to convert that into the probability that the answer is “yes”.
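
To make the structure concrete, here is a minimal sketch of that forward pass in NumPy. It is not the assignment's reference solution: the function name `l_model_forward_sketch`, the `W1`/`b1` key naming in the `parameters` dictionary, and the example layer sizes are all assumptions made for illustration. It only shows why the ReLU step sits inside a loop while the sigmoid step is a single call after it.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def l_model_forward_sketch(X, parameters):
    """(L-1) x [LINEAR -> ReLU], followed by one LINEAR -> sigmoid."""
    L = len(parameters) // 2  # each layer contributes one W and one b
    A = X
    # Hidden layers 1..L-1 all share the same activation, so one loop covers them.
    for l in range(1, L):
        Z = parameters[f"W{l}"] @ A + parameters[f"b{l}"]
        A = relu(Z)
    # There is exactly one output layer, so no loop is needed here.
    ZL = parameters[f"W{L}"] @ A + parameters[f"b{L}"]
    return sigmoid(ZL)

# Hypothetical sizes: 4 input features, hidden layers of 5 and 3 units, 1 output unit.
rng = np.random.default_rng(0)
layer_dims = [4, 5, 3, 1]
parameters = {}
for l in range(1, len(layer_dims)):
    parameters[f"W{l}"] = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.01
    parameters[f"b{l}"] = np.zeros((layer_dims[l], 1))

X = rng.standard_normal((4, 6))  # 6 examples stored as columns
AL = l_model_forward_sketch(X, parameters)
print(AL.shape)  # (1, 6)
```

Changing `layer_dims` to add or remove hidden layers only changes how many times the loop runs; the final sigmoid call stays the same.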
