W3_A1: ReLU as activation function

Has anyone tried using the ReLU activation function in the Week 3 programming assignment of Course 1?
I tried it and I am getting an accuracy of only 59%, and the decision boundary looks similar to that of logistic regression. Can anyone explain why that is?

Yes, you can get this to work, but it requires a couple of things:

First check that you did the complete implementation of ReLU: it’s not just forward propagation that is affected, right? You need to modify your back prop logic as well. The derivative of ReLU is different from the derivative of tanh.
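Here is a minimal sketch of what that change looks like, assuming the usual variable names from the assignment (`Z1`, `A1`, `W2`, `dZ2`); your own notebook may use different names, so treat this as an illustration rather than the exact code:

```python
import numpy as np

def relu(Z):
    # Forward pass: element-wise max(0, Z)
    return np.maximum(0, Z)

def relu_derivative(Z):
    # Derivative of ReLU: 1 where Z > 0, 0 elsewhere
    return (Z > 0).astype(float)

# In forward propagation, replace the tanh hidden activation:
#   A1 = np.tanh(Z1)           # original tanh version
#   A1 = relu(Z1)              # ReLU version

# In back propagation, the hidden-layer gradient must change to match:
#   dZ1 = np.dot(W2.T, dZ2) * (1 - np.power(A1, 2))   # tanh derivative
#   dZ1 = np.dot(W2.T, dZ2) * relu_derivative(Z1)     # ReLU derivative
```

If you only change the forward pass and keep the tanh term `(1 - A1**2)` in back prop, the gradients no longer match the function you are actually computing, which is one common way to end up with logistic-regression-like performance.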

Then it just turns out that you need quite a few more neurons in the hidden layer and more iterations in order to get reasonable performance from ReLU on this particular task.
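As a rough experiment, you could sweep a few hidden-layer sizes with a larger iteration budget and compare accuracies. This sketch assumes the notebook's `nn_model` and `predict` helpers and their signatures; adjust the names and arguments to match your own implementation:

```python
# Hypothetical sweep over hidden-layer sizes with more training iterations.
for n_h in (4, 10, 20, 50):
    parameters = nn_model(X, Y, n_h=n_h, num_iterations=20000)
    predictions = predict(parameters, X)
    accuracy = float(np.mean(predictions == Y)) * 100
    print(f"n_h = {n_h}: accuracy = {accuracy:.1f}%")
```

With only 4 hidden units, ReLU tends to underperform tanh on this dataset; increasing the hidden-layer size and the number of iterations usually closes the gap.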

Here’s a thread from a while back that is on this same topic and goes into quite a bit of detail. And here’s one about the derivative of ReLU.


Here’s another good thread I found by searching for “planar data relu”.


Got it, thanks a lot!