Has anyone tried using the ReLU activation function in the Week 3 programming assignment of Course 1?
I tried it and I am getting only 59% accuracy, and the decision boundary looks similar to the one from logistic regression. Can anyone tell me why that is?
Yes, you can get this to work, but it requires a couple of things:
First, check that you did the complete implementation of ReLU: it's not just forward propagation that is affected, right? You need to modify your back prop logic as well. The derivative of ReLU is different from the derivative of tanh.
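As a concrete illustration of that second point, here is a minimal numpy sketch of ReLU and its derivative, assuming the usual Z1 / A1 / W2 / dZ2 naming for a one-hidden-layer network (not the official assignment code):

```python
import numpy as np

def relu(Z):
    """ReLU forward pass: element-wise max(0, Z)."""
    return np.maximum(0, Z)

def relu_derivative(Z):
    """ReLU derivative: 1 where Z > 0, 0 elsewhere (a 0/1 mask)."""
    return (Z > 0).astype(float)

# With tanh, the hidden-layer gradient in back prop typically looks like:
#   dZ1 = np.dot(W2.T, dZ2) * (1 - np.power(A1, 2))
# With ReLU, the (1 - A1^2) factor is replaced by the mask applied to the
# cached pre-activation Z1:
#   dZ1 = np.dot(W2.T, dZ2) * relu_derivative(Z1)
```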
Then it just turns out that you need quite a few more neurons in the hidden layer and more iterations in order to get reasonable performance from ReLU on this particular task.
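On the hyperparameter side, here is a rough starting point to experiment with (the specific numbers are a guess, not official values, and the commented-out call only assumes the assignment exposes an nn_model-style training function):

```python
# Assumption, not the official settings: ReLU on the planar dataset tends to
# need a larger hidden layer and more iterations than the tanh baseline
# of n_h = 4 and 10,000 iterations.
hidden_layer_size = 20     # assumption: try somewhere in the 10-50 range
num_iterations = 20000     # assumption: roughly double the default

# Assuming an nn_model-style training function from the assignment:
# parameters = nn_model(X, Y, n_h=hidden_layer_size,
#                       num_iterations=num_iterations, print_cost=True)
```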
Here’s a thread from a while back that is on this same topic and goes into quite a bit of detail. And here’s one about the derivative of ReLU.
Here’s another good thread I found by searching for “planar data relu”.
Got it, thanks a lot!