W3_A1_ReLU as Activation Function

Yes, you can get this to work, but it requires a couple of things:

First, check that you did the complete implementation of ReLU: it’s not just forward propagation that is affected, right? You need to modify your back prop logic as well, because the derivative of ReLU is different from the derivative of tanh.
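To make the difference concrete, here is a minimal sketch (function names are mine, not the assignment's). The key point is that tanh backprop uses `1 - A1**2` computed from the hidden *activation* `A1`, while ReLU's derivative is taken from the pre-activation `Z1`:

```python
import numpy as np

def relu(Z):
    # Forward pass: element-wise max(0, z)
    return np.maximum(0, Z)

def relu_derivative(Z):
    # 1 where Z > 0, else 0 (we take it to be 0 at Z == 0)
    return (Z > 0).astype(float)

def tanh_derivative(A1):
    # For comparison: tanh backprop uses 1 - A1**2, where A1 = tanh(Z1)
    return 1 - A1 ** 2

# In backprop, the hidden-layer gradient line changes from
#   dZ1 = W2.T @ dZ2 * (1 - A1**2)          # tanh version
# to
#   dZ1 = W2.T @ dZ2 * relu_derivative(Z1)  # ReLU version
```

So your backward pass needs access to `Z1` (or equivalently a mask of where `Z1 > 0`), not just `A1`.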

It also turns out that you need quite a few more neurons in the hidden layer, and more training iterations, to get reasonable performance from ReLU on this particular task.
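Here is a small self-contained experiment illustrating the idea (this is not the assignment's `nn_model`; the dataset, hyperparameters, and helper names are illustrative). It trains a one-hidden-layer ReLU network on an XOR-style toy problem, where the hidden layer capacity genuinely matters:

```python
import numpy as np

np.random.seed(1)

# Toy XOR-style dataset: not linearly separable, so the hidden layer matters.
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]], dtype=float)   # shape (2, 4)
Y = np.array([[0, 1, 1, 0]], dtype=float)   # shape (1, 4)

def train(n_h, num_iterations, lr=0.1):
    # One-hidden-layer net: ReLU hidden activation, sigmoid output.
    W1 = np.random.randn(n_h, 2) * 0.5
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(1, n_h) * 0.5
    b2 = np.zeros((1, 1))
    m = X.shape[1]
    for _ in range(num_iterations):
        # Forward pass
        Z1 = W1 @ X + b1
        A1 = np.maximum(0, Z1)           # ReLU
        Z2 = W2 @ A1 + b2
        A2 = 1 / (1 + np.exp(-Z2))       # sigmoid output
        # Backward pass (cross-entropy loss)
        dZ2 = A2 - Y
        dW2 = dZ2 @ A1.T / m
        db2 = dZ2.sum(axis=1, keepdims=True) / m
        dZ1 = (W2.T @ dZ2) * (Z1 > 0)    # ReLU derivative, not 1 - A1**2
        dW1 = dZ1 @ X.T / m
        db1 = dZ1.sum(axis=1, keepdims=True) / m
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    # Mean cross-entropy on the training points
    eps = 1e-8
    return float(-(Y * np.log(A2 + eps) + (1 - Y) * np.log(1 - A2 + eps)).mean())

loss = train(n_h=16, num_iterations=5000)
```

With a generous hidden layer and enough iterations the loss drives close to zero; with very few ReLU units the same setup can stall, since units that go inactive (output 0) pass no gradient back.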

Here’s a thread from a while back that is on this same topic and goes into quite a bit of detail. And here’s one about the derivative of ReLU.
