This is about Exercise 3 in Week 3 of Course 5 of the Deep Learning specialization.
I was just checking TensorFlow's docs for Adam, and decay isn't a parameter when creating an Adam optimizer. I believe this could be an error. There is a weight_decay parameter, though.
I also checked the base Optimizer class, and there isn't a decay parameter there either.
Is there a decay parameter? Which class implements it? Or was it perhaps present in a previous version of TensorFlow?
The weight_decay parameter in TF should be similar to the decay parameter in the course notebook.
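For context, in older Keras/TensorFlow versions the decay argument applied inverse time decay to the learning rate on every optimizer step. A minimal pure-Python sketch of that legacy behavior (the function name is illustrative; the formula matches the legacy Keras optimizers):

```python
def decayed_lr(initial_lr, decay, iteration):
    """Legacy Keras 'decay': inverse time decay of the learning rate.

    Each step used lr = initial_lr / (1 + decay * iterations), so the
    effective learning rate shrinks as training progresses.
    """
    return initial_lr / (1.0 + decay * iteration)

# Example: with decay=0.01, the learning rate halves by iteration 100.
print([decayed_lr(0.01, 0.01, t) for t in (0, 50, 100)])
```

So a notebook written against an older TF version could legitimately pass decay to Adam even though the current docs no longer list it.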
Ok. But why is it there if it doesn't appear in the docs? Is there an Adam class defined somewhere in the code that I'm not aware of?
The Keras documentation for Adam discusses “Learning rate decay/scheduling” and gives a link where you can learn more.
The Keras documentation often omits arguments that are inherited from a parent class. You have to go on an expedition backward through the class hierarchy to find them.
In a brief search, I was not able to locate where the “decay” argument is used. There is a “weight_decay” argument though.
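To see how weight_decay differs from the old learning-rate decay: in the current Keras optimizers, weight_decay is decoupled weight decay (as in AdamW), which shrinks the weights themselves at each step rather than shrinking the learning rate over time. A hedged pure-Python sketch (names are illustrative, not the actual TF implementation):

```python
def apply_weight_decay(weights, lr, weight_decay):
    """Decoupled (AdamW-style) weight decay: w <- w - lr * weight_decay * w.

    This regularizes the weights directly; it does not change the
    learning rate over iterations the way the legacy 'decay' argument did.
    """
    return [w - lr * weight_decay * w for w in weights]

# Each weight shrinks by a factor of (1 - lr * weight_decay).
print(apply_weight_decay([1.0, -2.0], lr=0.1, weight_decay=0.01))
```

So the two arguments serve related but distinct purposes, which may explain why only weight_decay survives in the current API.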