{Moderator’s Edit: Quiz Solution Removed}
Can you kindly elaborate on this explanation?
Hey @vaibhavoutat,
Apologies for the delayed response. The answer to this lies mainly in 2 equations for the GRU network, written here in the course's notation:

\tilde{c}^{<t>} = \tanh(W_c [\Gamma_r * c^{<t-1>}, x^{<t>}] + b_c) ... (1)

c^{<t>} = \Gamma_u * \tilde{c}^{<t>} + (1 - \Gamma_u) * c^{<t-1>} ... (2)

Now, in order to mitigate the vanishing gradient problem for long sequences, we need to make sure that the gradients can back-propagate from c^{<t>} to c^{<t-1>} with as few bottlenecks as possible. Note that the question doesn't ask us about \Gamma_u but about \Gamma_r; hence, in equation 2, the only term we should concern ourselves with is \tilde{c}^{<t>}, since the remaining terms do not involve \Gamma_r at all.

Now, we want to make \tilde{c}^{<t>} depend on c^{<t-1>} as directly as possible, and from equation 1 we can do this by simply setting \Gamma_r = 1, so that the candidate \tilde{c}^{<t>} is computed from the full previous state; hence, the explanation. Let us know if this resolves your doubt.
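To make the gate's role concrete, here is a minimal NumPy sketch of a single GRU step following the two equations above. This is my own illustrative helper, not the course's assignment code; the function name, argument order, and the `force_gamma_r` switch are assumptions I made for the demo.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(c_prev, x, Wr, br, Wu, bu, Wc, bc, force_gamma_r=None):
    """One GRU step (illustrative sketch, not the assignment's code).

    c_prev: previous memory state c^<t-1>, x: current input x^<t>.
    Pass force_gamma_r=1.0 to pin the relevance gate open, as in the
    explanation above.
    """
    concat = np.concatenate([c_prev, x])
    # Relevance gate Gamma_r (or a pinned constant for the demo)
    if force_gamma_r is None:
        gamma_r = sigmoid(Wr @ concat + br)
    else:
        gamma_r = np.full_like(c_prev, force_gamma_r)
    # Eq. 1: candidate memory, with c^<t-1> gated by Gamma_r
    c_tilde = np.tanh(Wc @ np.concatenate([gamma_r * c_prev, x]) + bc)
    # Update gate Gamma_u, then Eq. 2: blend candidate with previous state
    gamma_u = sigmoid(Wu @ concat + bu)
    return gamma_u * c_tilde + (1.0 - gamma_u) * c_prev
```

With `force_gamma_r=1.0`, the candidate \tilde{c}^{<t>} sees the full previous state c^{<t-1>}, so the path from c^{<t-1>} to c^{<t>} is not attenuated by the relevance gate.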
P.S. - Posting solutions publicly is strictly against the community guidelines, so I will be removing the image.
Cheers,
Elemento