Sampling and RNNs

Well… I think you are correct if I interpret your sentences correctly:

Intuitively, the reset gate controls how much of the previous state we might still want to remember. Likewise, an update gate would allow us to control how much of the new state is just a copy of the old state.

Not quite. There are different sampling techniques (used during inference). "Greedy" sampling is as you said: always choosing the most probable one. Other sampling techniques have a parameter (usually called temperature) that lets you control how "greedy" you want to be. For example, given the probabilities [0.1, 0.2, 0.7], the greedy version would always choose the character at the third position (0.7), while the others, depending on the temperature (and other settings), might sample from the original distribution [0.1, 0.2, 0.7], from a sharpened one like [0.0001, 0.05, 0.9499], or something in between.
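A minimal NumPy sketch of what temperature does (the function name and the exact log-space scaling are my own choices; real libraries differ in details):

```python
import numpy as np

def sample_with_temperature(probs, temperature=1.0):
    """Rescale a probability distribution by a temperature, then sample.

    temperature -> 0 approaches greedy (argmax); temperature = 1 keeps
    the original distribution; temperature > 1 flattens it.
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Divide log-probabilities by the temperature, then renormalize.
    logits = np.log(probs) / temperature
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    scaled = exp / exp.sum()
    return np.random.choice(len(scaled), p=scaled), scaled

# Greedy would always pick index 2; a low temperature makes that near-certain
# but still leaves a small chance for the other characters.
idx, scaled = sample_with_temperature([0.1, 0.2, 0.7], temperature=0.5)
```

With `temperature=0.5` the distribution above sharpens to roughly [0.02, 0.08, 0.90]; with `temperature=1.0` it stays [0.1, 0.2, 0.7].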

And sampling usually has nothing to do with training the model on a dataset.

If I understand you correctly - no.

First, the update gate value is not c (c in your picture is the hidden state, or H_t in my previous illustration); the update gate is z (\Gamma_u in your picture, Z_t in my illustration). It influences the hidden state, but is not the same thing.

Second, the update gate value is calculated at every step (from the previous hidden state and the current input).
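To make that concrete, here is a tiny sketch of the per-step gate computation (the dimensions, weight names, and random inputs are made up purely for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical sizes and randomly initialized weights, just to show data flow.
n_inputs, n_hidden = 4, 3
rng = np.random.default_rng(0)
W_xz = rng.normal(size=(n_inputs, n_hidden))   # input  -> update gate
W_hz = rng.normal(size=(n_hidden, n_hidden))   # hidden -> update gate
b_z = np.zeros(n_hidden)

def update_gate(x_t, h_prev):
    """Z_t is recomputed at every time step from x_t and h_{t-1}."""
    return sigmoid(x_t @ W_xz + h_prev @ W_hz + b_z)

h = np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_inputs)):  # five time steps
    z_t = update_gate(x_t, h)               # a fresh gate value each step
```

The weights `W_xz`, `W_hz`, `b_z` are fixed after training; only `x_t` and `h_{t-1}` change from step to step, which is why the gate value changes too.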

Third, it depends on how you understand layers. I think you are confusing layers with (time) steps (check this post).

So, if that makes sense, then the update gate value controls how much of this step's hidden state (c_t) is a copy of the previous hidden state (c_{t-1}).
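In formula form that blending is H_t = Z_t * H_{t-1} + (1 - Z_t) * H~_t (elementwise); a tiny sketch, with names of my own choosing:

```python
import numpy as np

def next_hidden(z_t, h_prev, h_candidate):
    # Elementwise blend: z_t near 1 copies the old state,
    # z_t near 0 takes the freshly computed candidate state.
    return z_t * h_prev + (1.0 - z_t) * h_candidate

h_prev = np.array([1.0, 1.0])
h_cand = np.array([0.0, 0.0])
h_next = next_hidden(np.array([0.9, 0.1]), h_prev, h_cand)
print(h_next)  # → [0.9 0.1]
```

The first unit (z = 0.9) mostly keeps the old state; the second (z = 0.1) is mostly replaced by the candidate.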