Loss function of RNN

qihua-william · July 13, 2024, 2:58am

I took image 1 from Week 1 of Course 5, Sequence Models, of the Deep Learning Specialization and image 2 from Week 1 of Course 3, Sequence Models, of the NLP Specialization.
(image 1)

(image2)

The cost function looks different in the two courses above. Which one is right? Thanks for your help.

paulinpaloalto · July 13, 2024, 4:09am

There is some flexibility in how you define the loss in cases like this, and there are a lot of different possible RNN architectures. I think the differences are pretty straightforward here. In the DLS C5 case, it looks like the classification is binary, so you have the binary cross entropy loss, and they take the sum across the time steps. In the NLP case, it is multiclass, so you have the softmax version of cross entropy loss, and they choose to take the average over the time steps instead of the sum. For a fixed sequence length T, the sum and the average differ only by the constant factor 1/T, so both are minimized by the same parameters; the scaling only affects the magnitude of the gradients. As long as you are consistent in how you do that in a given case, you can choose either method.
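To make the comparison concrete, here is a minimal NumPy sketch of the two conventions described above. It is not the code from either course: the function names `bce_sum_over_time` and `softmax_ce_mean_over_time`, the `eps` clipping, and the toy values are all assumptions made for illustration. It treats one sequence of T time steps and shows that, when the binary problem is rewritten as a two-class softmax problem, the two costs differ only by the factor T.

```python
import numpy as np

def bce_sum_over_time(y_hat, y, eps=1e-12):
    """Binary cross entropy per time step, summed over the T steps.

    y_hat: shape (T,), predicted probability of class 1 at each step.
    y:     shape (T,), true labels in {0, 1}.
    """
    y_hat = np.clip(y_hat, eps, 1 - eps)  # avoid log(0)
    per_step = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    return per_step.sum()

def softmax_ce_mean_over_time(y_hat, y_onehot, eps=1e-12):
    """Categorical cross entropy per time step, averaged over the T steps.

    y_hat:    shape (T, K), softmax outputs (each row sums to 1).
    y_onehot: shape (T, K), one-hot true labels.
    """
    y_hat = np.clip(y_hat, eps, 1.0)  # avoid log(0)
    per_step = -(y_onehot * np.log(y_hat)).sum(axis=1)
    return per_step.mean()

# Toy example (hypothetical values): T = 3 time steps, binary labels.
p = np.array([0.9, 0.2, 0.7])   # predicted P(class 1) at each step
y = np.array([1, 0, 1])         # true labels

# The same predictions written as a two-class softmax output.
p2 = np.stack([1 - p, p], axis=1)   # shape (3, 2)
y2 = np.eye(2)[y]                   # one-hot labels, shape (3, 2)

total = bce_sum_over_time(p, y)
average = softmax_ce_mean_over_time(p2, y2)
print(total, average, total / len(y))  # the last two values agree
```

With two classes the softmax cross entropy reduces to the binary cross entropy, so the only remaining difference between the two costs in this sketch is the division by T.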

qihua-william · July 13, 2024, 9:15am

Thank you for the explanation.
