Week 1 Coding Assignment 2

Hi guys, I am working on the Week 1 dinosaur name generator assignment, and I am having trouble grasping how the network learns anything from the training examples we create during the assignment. Here is an example of one:

single_example_chars ['t', 'u', 'r', 'i', 'a', 's', 'a', 'u', 'r', 'u', 's']
single_example_ix [20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19]
X = [None, 20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19]
Y = [20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19, 0]
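For reference, here is a minimal sketch of how such a pair might be built. The `char_to_ix` mapping here is an assumption (lowercase letters mapped to 1–26 and the newline end-of-name token to 0), chosen so the indices match the example above:

```python
# Assumed mapping: 'a'..'z' -> 1..26, '\n' -> 0 (newline marks end of name).
char_to_ix = {chr(ord('a') + i): i + 1 for i in range(26)}
char_to_ix['\n'] = 0

single_example = 'turiasaurus'
single_example_chars = list(single_example)
single_example_ix = [char_to_ix[c] for c in single_example_chars]

# X is the index list prefixed with None (no input at the first time-step);
# Y is the same index list followed by the newline index.
X = [None] + single_example_ix
Y = single_example_ix + [char_to_ix['\n']]

print(X)  # [None, 20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19]
print(Y)  # [20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19, 0]
```

Note that Y is not equal to X: it is X shifted left by one position, with the newline index appended as the final target.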

If the input sequence X and target sequence Y are the same then how is the network learning anything?

Does this text from the notebook help?

  • At each time-step, the RNN tries to predict what the next character is, given the previous characters.
  • \mathbf{X} = (x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, ..., x^{\langle T_x \rangle}) is a list of characters from the training set.
  • \mathbf{Y} = (y^{\langle 1 \rangle}, y^{\langle 2 \rangle}, ..., y^{\langle T_x \rangle}) is the same list of characters but shifted one character forward.
  • At every time-step t, y^{\langle t \rangle} = x^{\langle t+1 \rangle}. The prediction at time t is the same as the input at time t + 1.
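To make that shift concrete, here is a small sketch (not from the notebook) that checks the relation y⟨t⟩ = x⟨t+1⟩ on the example sequences above:

```python
X = [None, 20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19]
Y = [20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19, 0]

# At every time-step t, the target Y[t] is the next input X[t+1];
# the final target (index 0) is the newline that ends the name.
for t in range(len(X) - 1):
    assert Y[t] == X[t + 1]

pairs = list(zip(X, Y))
print(pairs[1])  # (20, 21): given 't' (and the hidden state), predict 'u'
```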

I think it does. So, if I am understanding correctly, for the second pair of numbers in the sequence, the network is learning how to produce 21 when 20 is the x input, while simultaneously taking the hidden activation a from the previous cell into account.

Your understanding is correct.
The RNN aims to predict the character at time t + 1, where x⟨t⟩ is the character at time t and the hidden state is an encoding of everything seen up to time t - 1.
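As a sketch of that single time-step (not the assignment's actual code; the sizes, parameter names, and random initialization here are assumptions, following the usual a⟨t⟩ = tanh(Wax·x + Waa·a + ba) cell):

```python
import numpy as np

vocab_size, n_a = 27, 50  # assumed: 26 letters + newline, 50 hidden units
rng = np.random.default_rng(0)

# Randomly initialized parameters, just to make the step runnable.
Wax = rng.standard_normal((n_a, vocab_size)) * 0.01
Waa = rng.standard_normal((n_a, n_a)) * 0.01
Wya = rng.standard_normal((vocab_size, n_a)) * 0.01
ba = np.zeros((n_a, 1))
by = np.zeros((vocab_size, 1))

def rnn_step(x_ix, a_prev):
    """One time-step: consume character index x_ix (None at t = 1),
    update the hidden state, and return a distribution over the next character."""
    x = np.zeros((vocab_size, 1))
    if x_ix is not None:
        x[x_ix] = 1  # one-hot encoding of the current character
    a = np.tanh(Wax @ x + Waa @ a_prev + ba)  # new hidden state: encodes all chars so far
    z = Wya @ a + by
    y_hat = np.exp(z) / np.sum(np.exp(z))     # softmax over the next character
    return a, y_hat

a = np.zeros((n_a, 1))
for x_ix in [None, 20, 21]:  # first three time-steps of the 'turiasaurus' example
    a, y_hat = rnn_step(x_ix, a)

print(y_hat.shape)  # (27, 1): probability of each possible next character
```

Training then pushes y_hat at each step toward the one-hot target Y[t], so the prediction is driven both by the current input and by the hidden state carried over from earlier characters.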


thank you!