Source for Weight Decay technique

In the video “Train a CNN for Image Classification”, when the Weight Decay technique is explained, a reference is mentioned:
Mahmud (2021)

Does this refer to the following link, or is it a different source?

It does not appear in the Resources section.

Hi Jesus_Martos,

I am not sure which publication Mahmud (2021) refers to.

In a paper from 2021, Mahmud, Morshed, and Hasan (https://arxiv.org/pdf/2107.02543) refer to an article by Loshchilov and Hutter (2019) (https://arxiv.org/pdf/1711.05101). In turn, Loshchilov and Hutter cite Hanson and Pratt (1988) as their source for the idea of weight decay. Their reference is in fact incorrect; it should point to the proceedings of the 2nd International Conference on Neural Information Processing Systems:

Stephen José Hanson and Lorien Y Pratt. Comparing biases for minimal network construction with back-propagation. In Proceedings of the 2nd International Conference on Neural Information Processing Systems, pp. 177–185, 1988. (https://dl.acm.org/doi/10.5555/2969735.2969756)

In turn, Hanson and Pratt cite a personal communication from David Rumelhart (1987) as the source of the idea. David Rumelhart - Wikipedia

Rumelhart was the first author of the famous back-prop paper from 1986: Learning representations by back-propagating errors | Nature

In this context, the Wikipedia page on backpropagation may be of interest to you. Backpropagation - Wikipedia

Hopefully this provides some kind of answer to your question.
