Week5 Question about GRU

Alenn · August 2, 2022, 9:41pm

As shown in the picture, why does the softmax use c instead of c~ to predict the result? If the gamma in c is 0, then x will not be used to predict the result. In my opinion, shouldn't the prediction be affected by both x and c?

anon57530071 · August 3, 2022, 1:47am

Here is the flow in GRU.

As you see, the input to the softmax is c^{<t>}, which is built from two key terms, \tilde{c}^{<t>} and c^{<t-1>}. The point here is how to "balance" old information, c^{<t-1>}, and new information, \tilde{c}^{<t>}. The update gate has that responsibility, and generates \Gamma_u for that purpose.
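For reference, here is how I would write out the full GRU step in the lecture notation. Treat it as a sketch from memory rather than a copy of the slide; in particular, the output weights W_y and b_y are names I am assuming for the softmax layer.

```latex
\begin{aligned}
\Gamma_r &= \sigma(W_r[c^{<t-1>}, x^{<t>}] + b_r) \\
\tilde{c}^{<t>} &= \tanh(W_c[\Gamma_r * c^{<t-1>}, x^{<t>}] + b_c) \\
\Gamma_u &= \sigma(W_u[c^{<t-1>}, x^{<t>}] + b_u) \\
c^{<t>} &= \Gamma_u * \tilde{c}^{<t>} + (1 - \Gamma_u) * c^{<t-1>} \\
\hat{y}^{<t>} &= \mathrm{softmax}(W_y c^{<t>} + b_y)
\end{aligned}
```

Note that x^{<t>} enters in two places: through the candidate \tilde{c}^{<t>}, and through \Gamma_u itself. So even when \Gamma_u ends up near 0 and c^{<t>} stays close to c^{<t-1>}, it was the current input (together with c^{<t-1>}) that led the gate to that decision.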

Think about a sentence. To generate the next word, sometimes old information is important, and sometimes only the latest input should be used.

So \Gamma_u = 0 is a valid option. Since \Gamma_u is the output of a sigmoid, in practice it should be very close to 0 rather than exactly 0.

And you are very close.

> If the gamma in c is 0, then x will not be used to predict the result.

That is exactly what \Gamma_u is for. But the overall objective is as you wrote:

> In my opinion, shouldn't the prediction be affected by both x and c?

The GRU uses \Gamma_u to balance the two.
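To make the balancing concrete, here is a minimal sketch of one simplified GRU step on scalars, without the relevance gate. The function name `gru_step` and all weight and bias values are made up for illustration; they are not from the course.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gru_step(c_prev, x, w_cc=1.0, w_cx=1.0, b_c=0.0,
             w_uc=1.0, w_ux=1.0, b_u=0.0):
    """One step of a simplified scalar GRU (no relevance gate).

    All weights and biases are hypothetical illustration values.
    """
    # Candidate memory: the "new information" from c_prev and x.
    c_tilde = math.tanh(w_cc * c_prev + w_cx * x + b_c)
    # Update gate, a sigmoid output strictly between 0 and 1.
    gamma_u = sigmoid(w_uc * c_prev + w_ux * x + b_u)
    # Blend the candidate with the old memory.
    c_new = gamma_u * c_tilde + (1.0 - gamma_u) * c_prev
    return c_new, c_tilde, gamma_u


# A very negative gate bias pushes gamma_u toward 0: the old memory is kept.
keep = gru_step(c_prev=0.5, x=2.0, b_u=-20.0)
# A very positive gate bias pushes gamma_u toward 1: the candidate takes over.
overwrite = gru_step(c_prev=0.5, x=2.0, b_u=20.0)

print("keep:     ", keep)
print("overwrite:", overwrite)
```

In the first call c^{<t>} stays essentially equal to c^{<t-1>}, and in the second it is essentially \tilde{c}^{<t>}. Anything in between is a weighted mix, which is why the softmax reads c^{<t>} rather than \tilde{c}^{<t>}.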

Hope this helps.

Alenn · August 3, 2022, 7:17pm

That makes sense, thanks.
