Week 1 RNN Concepts

yeoh_zhewei · March 29, 2023, 1:10pm

In the basic RNN assignment, the input x is one-hot encoded. In the Dino Island assignment, the input x is a list of integers representing the characters. Is the one-hot encoding done inside helper functions such as rnn_forward?

Another question: in the basic RNN, forward propagation loops over the time steps, with each time step processing m examples. In Dino Island, it is done with one example at a time. Is this because stochastic gradient descent is used instead?

Kic · March 29, 2023, 2:49pm

Hi @yeoh_zhewei ,

The rnn_forward() function takes care of the one-hot encoding. You can see the source code by clicking:
File -> Open -> utils.py
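As a rough illustration of what that conversion looks like, here is a minimal sketch of turning integer character indices into one-hot column vectors. It is not copied from utils.py: the function name one_hot_sequence is hypothetical, and vocab_size=27 assumes a vocabulary of 26 letters plus a newline character.

```python
import numpy as np

def one_hot_sequence(indices, vocab_size=27):
    """Turn a list of character indices into (vocab_size, 1) one-hot column vectors.

    A None entry (assumed here to mark the first time step) stays an all-zero vector.
    """
    vectors = []
    for idx in indices:
        x_t = np.zeros((vocab_size, 1))
        if idx is not None:
            x_t[idx] = 1.0  # set the single "hot" position for this character
        vectors.append(x_t)
    return vectors

# Hypothetical input: a None placeholder followed by two character indices.
encoded = one_hot_sequence([None, 4, 9], vocab_size=27)
```

Each time step then receives one column vector, which is why the caller can pass plain integers.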

The model is given a collection of dino names as training examples, and the learning is done on one real dino name at a time. There is nothing in the model() function that uses stochastic gradient descent. Not sure how stochastic gradient descent would help the generation of dino names if we want the model to generate names as close to the real dino names as possible.

paulinpaloalto · March 29, 2023, 4:10pm

Yes, notice that there are two completely separate things going on in this exercise:

The gradient descent that trains the model (the optimize function), which works on the full batch of training data.

Then there is the sample function, which uses the trained model to generate one name at a time.

yeoh_zhewei · March 31, 2023, 3:22pm

In the optimize() code (gradient descent), it takes in X and Y, where:
X – list of integers, where each integer is a number that maps to a character in the vocabulary.
Y – list of integers, exactly the same as X but shifted one index to the left.
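To make the relationship between X and Y concrete, here is a small sketch under stated assumptions: the vocabulary layout (newline at index 0, then a to z), the None placeholder at the start of X, and the helper name make_xy are all illustrative choices, not taken from the assignment code.

```python
# Hypothetical vocabulary: newline at index 0, then the letters a-z at 1..26.
chars = ["\n"] + [chr(c) for c in range(ord("a"), ord("z") + 1)]
char_to_ix = {ch: i for i, ch in enumerate(chars)}

def make_xy(name):
    # X begins with None, standing in for "no previous character" at the first step.
    X = [None] + [char_to_ix[ch] for ch in name]
    # Y is X shifted one index to the left, with the newline index appended as an end marker.
    Y = X[1:] + [char_to_ix["\n"]]
    return X, Y

X, Y = make_xy("rex")
print(X)  # [None, 18, 5, 24]
print(Y)  # [18, 5, 24, 0]
```

At every time step t, Y[t] is the character the model should predict after seeing X[t].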

In the model() code:
x is a single example, 1 word.
y is the ground truth of the 1 example.

Using the for loop over j up to num_iterations, the optimize function is called with one example as its input. On each loop iteration, the parameters are updated by the optimize function. In this sense, isn’t it doing gradient descent on every single example? (I might have misunderstood stochastic gradient descent.)

Anyway, my question is: why is it updating the parameters after every example instead of on the whole batch of training data?

RISHABH_RAJ_PATEL · May 29, 2023, 6:06pm

Only reply here if:

You have additional details
The solution doesn’t work for you

If you have an unrelated issue, please start a new topic instead.

paulinpaloalto · May 29, 2023, 6:24pm

I think it’s what you said in your parenthetical comment: they are doing the RNN version of Stochastic GD and updating the parameters after each sample.
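The difference between updating after each sample and updating on the whole batch can be sketched on a toy problem. This is not the assignment's RNN: it fits a single weight w so that w * x matches y, and the function names, learning rate, and data are all assumptions chosen for illustration.

```python
import numpy as np

def grad(w, x, y):
    # Gradient of the per-example loss 0.5 * (w * x - y)**2 with respect to w.
    return (w * x - y) * x

def sgd(xs, ys, w=0.0, lr=0.1, num_iterations=200):
    # Stochastic GD: one parameter update per training example.
    for j in range(num_iterations):
        idx = j % len(xs)  # cycle through the examples in order
        w -= lr * grad(w, xs[idx], ys[idx])
    return w

def batch_gd(xs, ys, w=0.0, lr=0.1, num_iterations=200):
    # Batch GD: one parameter update per pass, using the gradient averaged over all examples.
    for _ in range(num_iterations):
        w -= lr * np.mean([grad(w, x, y) for x, y in zip(xs, ys)])
    return w

xs = np.array([1.0, 2.0, 3.0])
ys = 2.0 * xs  # the true weight is 2.0

w_sgd = sgd(xs, ys)
w_batch = batch_gd(xs, ys)
```

Both loops approach the same weight here; the only difference is how many examples contribute to each update.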
