C5, W1, A2 - dinosaurs character language modelling

I’m struggling to understand the instructions for the last exercise, on training the model, shown below:

When examples[index] contains one dinosaur name (string), to create an example (X, Y), you can use this:

Set the index idx into the list of examples
  • Using the for-loop, walk through the shuffled list of dinosaur names in the list “examples.”
  • For example, if there are n_e examples, and the for-loop increments the index to n_e onwards, think of how you would make the index cycle back to 0, so that you can continue feeding the examples into the model when j is n_e, n_e + 1, etc.
  • Hint: (n_e + 1) % n_e equals 1, which is otherwise the ‘remainder’ you get when you divide (n_e + 1) by n_e.
  • % is the modulo operator in python.
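
The cycling described in the hint can be sketched roughly as follows. This is a minimal illustration with a made-up list of names and a hypothetical helper `cycling_index`; it is not the assignment's template code.

```python
def cycling_index(j, n_e):
    """Map an ever-increasing loop counter j onto a valid index in [0, n_e)."""
    return j % n_e

# Hypothetical stand-in for the shuffled list of dinosaur names.
examples = ["trex", "raptor", "stego"]
n_e = len(examples)

# Run more iterations than there are examples; the index wraps back to 0
# once j reaches n_e, so the loop never runs off the end of the list.
visited = [examples[cycling_index(j, n_e)] for j in range(7)]
print(visited)
```

With three examples and seven iterations, the fourth iteration picks up the first name again, and so on.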

I have since figured this out. In case anyone else runs into the same difficulty, note the following:

  • The code expects a single training sample per iteration, not the whole dataset. This is unlike the typical mini-batch setup, where each iteration processes a batch of several samples.
  • However, over the full number of iterations (35K), the network ends up going through the samples more than once. Think of it as stochastic gradient descent with a batch size of 1, making more than one pass over the dataset.

I hope I am correct about this.

UPDATE: I have yet to obtain the correct answer. The model now outputs some interesting results, and the loss I get is actually lower than that of the grader. I cannot work out where the error is. I have some cool dinosaur names too.

The last lines of my output look something like this:

Iteration: 20000, Loss: 21.056823


Iteration: 22000, Loss: 20.578871


One common mistake is to use the direct inputs, as opposed to using the “shuffled” version that they generate for you in the template code.
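
To illustrate why this can matter, here is a small sketch with made-up names and an arbitrary seed; it is not the assignment's data or template code. Feeding the original list and feeding a shuffled copy present the same examples in a different order, so the loss printed at a given iteration would generally not match a reference run trained on the shuffled order.

```python
import random

# Hypothetical stand-in for the raw list of names read from the data file.
raw_names = ["trex", "raptor", "stego", "diplo", "ankylo"]

# Shuffle a copy with a fixed seed so the order is reproducible across runs.
rng = random.Random(0)  # arbitrary seed, chosen only for this illustration
shuffled = raw_names[:]
rng.shuffle(shuffled)

# Same contents either way; only the order of presentation differs.
print(sorted(shuffled) == sorted(raw_names))
print(shuffled)
```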

Thanks @paulinpaloalto

Just adjusted the code and it worked.