I’m struggling to understand the instructions given for the last question, about training the model, shown below:
When examples[index] contains one dinosaur name (string), to create an example (X, Y), you can use this:
Set the index idx into the list of examples
- Using the for-loop, walk through the shuffled list of dinosaur names in the list “examples.”
- For example, if there are n_e examples, and the for-loop increments the index to n_e onwards, think of how you would make the index cycle back to 0, so that you can continue feeding the examples into the model when j is n_e, n_e + 1, etc.
- Hint: (n_e + 1) % n_e equals 1, which is otherwise the ‘remainder’ you get when you divide (n_e + 1) by n_e.
- % is the modulo operator in Python.
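To make the modulo hint concrete, here is a minimal sketch of the index cycling. The list contents and the helper name example_index are made up for illustration and are not part of the assignment template.

```python
# Hypothetical stand-in for the shuffled list of dinosaur names.
examples = ["trex", "raptor", "stego"]
n_e = len(examples)


def example_index(j, n_examples):
    """Map an ever-increasing iteration counter j onto a valid list index."""
    return j % n_examples


# The counter keeps growing, but the index wraps back to 0 after n_e - 1.
visited = [example_index(j, n_e) for j in range(7)]
print(visited)  # [0, 1, 2, 0, 1, 2, 0]
```

With this mapping the loop counter can run for as many iterations as needed without ever indexing past the end of the list.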
I already figured this out. In case anyone had the same challenge at the beginning, kindly take note of the following:
- The code expects a single training sample per iteration, not the whole dataset. This is unlike a typical feed-forward setup, where each pass runs through the whole dataset in mini-batches.
- However, over the full number of iterations (35K), the network ends up going through the samples more than once. Think of it as making several passes over the dataset, one sample at a time.
I hope I am correct about this.
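As a rough illustration of the point above, the sketch below feeds exactly one sample per iteration and counts how often each sample is visited. The names and the tiny iteration count are placeholders, and the actual optimization step is left out.

```python
from collections import Counter

# Hypothetical toy data; the real list holds the dinosaur names.
examples = ["trex", "raptor", "stego", "ankylo"]
num_iterations = 10  # kept small here; the post mentions 35K iterations

visits = Counter()
for j in range(num_iterations):
    idx = j % len(examples)  # wrap around once the list is exhausted
    name = examples[idx]     # exactly one training sample this iteration
    visits[name] += 1        # a real loop would run one training step here

print(visits)
```

Because there are more iterations than samples, every sample is seen more than once, which is what happens at a larger scale in the assignment.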
UPDATE: I have yet to obtain the correct answer. The model now outputs some interesting results, and the loss I am getting is actually lower than the grader's. I cannot seem to find where the error is. I have some cool dinosaur names too.
The last lines of my output look something like this:
Iteration: 20000, Loss: 21.056823
Rixtstapnosaurus
Miceadsomabosaurus
Owutoosaurus
Rabaessaacitatornythaycerogavsaurus
Zuromibosaurus
Haadropcarus
Yuocheroptosaurus
Iteration: 22000, Loss: 20.578871
Hutusaurus
Euca
Eustrioppn
Hocamptopanceus
Xuspeodon
Elacropechus
Uspeodon
One common mistake is to use the direct inputs, as opposed to using the “shuffled” version that they generate for you in the template code.
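A small sketch of the difference, assuming hypothetical variable names (raw_names, shuffled) and a fixed seed; the template's own names and shuffling call may differ:

```python
import random

# Hypothetical raw inputs in their original file order.
raw_names = ["aachenosaurus", "aardonyx", "abelisaurus", "abrosaurus"]

# Shuffle a copy with a fixed seed so the order is reproducible.
shuffled = list(raw_names)
random.Random(0).shuffle(shuffled)

# Training should index into the shuffled list, not the raw one.
j = 5
sample = shuffled[j % len(shuffled)]
print(sample)
```

Both lists hold the same names, but a grader that compares against a fixed seed will only match if the samples are drawn in the shuffled order.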
Thanks @paulinpaloalto
Just adjusted the code and it worked.