I know that ReLU converges faster, but the zero plateau seems counterintuitive. Wouldn't some neurons eventually end up with a zero output and zero gradient, effectively making them useless? That seems impossible with sigmoid, since its curve never has a perfectly flat, zero-gradient segment.
Well, it depends on the problem. For logistic (classification) problems, sigmoid is better, while for regression problems, ReLU is better. It also depends on where you are using them: in the hidden layers or the output layer. Moreover, there are variants of ReLU, like Leaky ReLU, which produce small negative outputs instead of zeroing them out; see the sketch below.
Which is better, a shark or a lion? It depends on the conditions, right? If the setting is the sea, the shark is better, and vice versa. Similarly, for binary classification problems, sigmoid is better for the output layer.
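To make the "dying ReLU" point concrete, here is a minimal NumPy sketch (my own illustration, not from any course material) comparing the gradients of the three activations. The `alpha=0.01` slope for Leaky ReLU is just the conventional default, and all function names are illustrative:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # Gradient is exactly 0 for z <= 0: a neuron stuck in this region
    # receives no weight updates and can stay "dead".
    return (z > 0).astype(float)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

def leaky_relu_grad(z, alpha=0.01):
    # Small but nonzero slope for z <= 0, so the neuron can still recover.
    return np.where(z > 0, 1.0, alpha)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    # Never exactly zero, but vanishingly small for large |z|.
    return s * (1.0 - s)

z = np.array([-5.0, -1.0, 0.5, 3.0])
print("ReLU grad:      ", relu_grad(z))        # [0.   0.   1.   1.  ]
print("Leaky ReLU grad:", leaky_relu_grad(z))  # [0.01 0.01 1.   1.  ]
print("sigmoid grad:   ", sigmoid_grad(z))     # small everywhere, ~0.007 at z = -5
```

Note that sigmoid's gradient is never exactly zero, which matches your intuition, but it still gets tiny for large |z| (the vanishing gradient problem), whereas Leaky ReLU keeps a fixed nonzero slope on the negative side.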
Best,
Saif.
Thanks, this really helped.