Course1 - Week3 Assignment - ReLU gave worse performance than tanh

synguyen · August 23, 2021, 5:02am

Hi,

I tried the ungraded exercises and implemented ReLU (as well as Leaky ReLU) as the activation function for the hidden layer. On this assignment, it turned out that ReLU performed far worse than tanh.

I did a quick Google search and it seems ReLU is recommended over tanh and sigmoid for simple neural networks for computational efficiency.

Has anyone else tried this, and what did you observe? Is ReLU supposed to be inferior to tanh in this particular case?

Thank you!
Sy

paulinpaloalto · August 26, 2021, 4:12am

You should be able to get pretty decent performance from ReLU on this exercise, but you may need to fiddle with the learning rate and number of iterations. It also doesn’t do so well with small numbers of neurons in the hidden layer. I was able to get 81% accuracy with n_h = 40, \alpha = 0.6 and 12k iterations.

One important thing to check is that you did the complete and correct implementation: note that you need to change more than just forward propagation, right? The derivative of the activation functions is part of back propagation, so that needs to change as well.

AYOUBKECHCHOUR · September 9, 2021, 3:07pm

Hey,
I guess we should modify the activation function and also g^{[1]'}(Z^{[1]}), but how can we implement its derivative in dZ^{[1]}?

paulinpaloalto · September 9, 2021, 3:37pm

Well, what is the formula for dZ^{[1]}?

dZ^{[1]} = \left ( W^{[2]T} \cdot dZ^{[2]} \right ) * g^{[1]'}(Z^{[1]})

So (as you say), you need the formula for the derivative of ReLU. You have this:

g(Z) = max(0, Z)

Which means Z if Z >= 0 and 0 if Z < 0. So what would the derivative of that look like? It would be 0 if Z < 0 and 1 if Z >= 0, right? I can write that in one easy line of python and numpy.
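
For anyone who wants to see the shape of it, here is a minimal sketch of how the pieces above could fit together in numpy. The helper names `relu` and `relu_derivative` and the random arrays are hypothetical and only for illustration; they are not taken from the assignment notebook. The last line is just the dZ^{[1]} formula above, with the ReLU derivative in place of the tanh one.

```python
import numpy as np

def relu(Z):
    """Elementwise max(0, Z), used in forward propagation."""
    return np.maximum(0, Z)

def relu_derivative(Z):
    """1 where Z >= 0 and 0 where Z < 0 (the convention used in this thread)."""
    return (Z >= 0).astype(float)

# Illustrative shapes only (assumed): 4 hidden units, 1 output unit, 3 examples.
rng = np.random.default_rng(0)
Z1 = rng.standard_normal((4, 3))   # hidden layer pre-activations
W2 = rng.standard_normal((1, 4))   # output layer weights
dZ2 = rng.standard_normal((1, 3))  # output layer error term

A1 = relu(Z1)                                   # forward propagation change
dZ1 = np.dot(W2.T, dZ2) * relu_derivative(Z1)   # back propagation change
```

Note that the derivative is evaluated at Z^{[1]}, so Z^{[1]} (or the mask A^{[1]} > 0) has to be available in back propagation. If the old tanh derivative term, 1 - (A^{[1]})^2, is left in place, the gradients no longer match the forward pass.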
