Dinosaur Island model function - wrong output

My model function is successfully running through iterations, but the loss is not decreasing at all. I’m not sure what I’ve done wrong here.

As always, there are a number of details that all need to be right in order for things to work correctly. I don’t recognize that output as a syndrome that I’ve seen before. One common mistake here is to use the sorted names as the input instead of the randomly shuffled version that they create for you in the template code. But if you make that mistake, you still get sensible-looking names that are just different from the expected ones. In your output, the names are basically junk, so something more serious must be wrong.

Are you sure that all your previous functions pass their test cases in the notebook?

Actually I just checked my notebook and here’s what I am seeing after 0 iterations:

Notice that my output at iteration 0 is exactly the same as yours after 22000 iterations. That would indicate that whatever the bug is, it prevents the parameters from being updated at all, so nothing is actually “trained” over the course of all those iterations.

I hope that is a clue as to the nature of the error.

To close the loop on the public thread, this was a really interesting bug:

I had never noticed this before, but the way optimize works is that it treats the parameters dictionary as a global object and updates the values “in place”, meaning the changes are made to the same arrays that the caller holds. Notice that it does not return the parameters as a return value, so an update that is not made in place is simply lost.

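The problem in this case was that the optimize code was incorrect because it used “=-” as the operator to apply the gradients instead of “-=”. The former is not an operator at all: Python parses “x =- y” as “x = -y”, a plain assignment of the negated value that creates a new object, whereas “x -= y” subtracts y from x, and for numpy arrays does so in place in the existing memory.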
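Here is a minimal sketch of how that one-character difference can make training a no-op. The actual code from the notebook is not shown in this thread, so the function names, the "W" and "dW" keys, and the pattern of pulling the array into a local variable first are all assumptions made for illustration.

```python
import numpy as np

def step_buggy(parameters, gradients, lr):
    # Hypothetical buggy update: "=-" is parsed as W = -(lr * dW).
    # This rebinds the local name W to a new array; the array stored
    # in the parameters dictionary is never touched.
    W = parameters["W"]
    W =- lr * gradients["dW"]

def step_fixed(parameters, gradients, lr):
    # "-=" on a numpy array subtracts in place, so the array held by
    # the parameters dictionary is the one that changes.
    W = parameters["W"]
    W -= lr * gradients["dW"]

parameters = {"W": np.ones((2, 2))}
gradients = {"dW": np.full((2, 2), 0.5)}

before = parameters["W"].copy()
step_buggy(parameters, gradients, lr=0.1)
buggy_changed = not np.array_equal(before, parameters["W"])

step_fixed(parameters, gradients, lr=0.1)
fixed_changed = not np.array_equal(before, parameters["W"])

print(buggy_changed, fixed_changed)  # False True
```

With the buggy version the dictionary still holds the initial values after the call, which matches the symptom of the output at 22000 iterations being identical to the output at iteration 0.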

But this case also turned up a serious deficiency in the test cases for optimize: they don’t catch this bug, because they evidently don’t check that the parameters have actually changed after the call. I will file a bug about that.
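For anyone who wants to check this on their own code in the meantime, one way is to snapshot the parameters, run one update step, and see which entries changed. This is only a sketch and not the notebook's test code: changed_keys, noop_step and inplace_step are hypothetical names, and the two step functions are stand-ins for whatever your own update does.

```python
import copy
import numpy as np

def changed_keys(step_fn, parameters, gradients, lr):
    """Run step_fn once and return the parameter names whose values changed."""
    snapshot = copy.deepcopy(parameters)
    step_fn(parameters, gradients, lr)
    return sorted(
        key for key in parameters
        if not np.array_equal(snapshot[key], parameters[key])
    )

def noop_step(parameters, gradients, lr):
    # Stand-in for a broken update: only a local name is reassigned.
    b = parameters["b"]
    b = -lr * gradients["db"]

def inplace_step(parameters, gradients, lr):
    # Stand-in for a working update: every array is modified in place.
    for key in parameters:
        parameters[key] -= lr * gradients["d" + key]

def make_inputs():
    parameters = {"W": np.ones((2, 3)), "b": np.zeros((2, 1))}
    gradients = {"dW": np.full((2, 3), 0.5), "db": np.full((2, 1), 0.25)}
    return parameters, gradients

broken = changed_keys(noop_step, *make_inputs(), lr=0.01)
working = changed_keys(inplace_step, *make_inputs(), lr=0.01)
print(broken, working)
```

An empty list from a step that was given non-zero gradients is the red flag: the update ran but never reached the dictionary.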

Actually, now that I look a little bit harder at this, it turns out we don’t need to do the manual updating of the parameters with -= at all: they gave us update_parameters as a utility function and mention this in the instructions. The instructions also explain the point that the parameters dictionary is treated as a global object and is not returned by the optimize function.
