Referring to this image,
Just curious: is there a specific reason for the variable names V and S in these equations? (I think they also come up in other optimisation algorithms, such as RMSprop and gradient descent with momentum.) Do they stand for something in particular?
As far as I remember now, I think they stand for velocity and speed, which is an analogy to particle movement in physics!
- V stands for Velocity (an exponentially decaying moving average of past gradients)
- S stands for Squared Gradient (accumulates the square of gradients to adjust the learning rate for each parameter individually)
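To make the two roles concrete, here is a minimal sketch of a single Adam-style update for one scalar parameter. It assumes the equations in the image are the usual Adam form with bias correction. The function name `adam_step`, the default hyperparameters, and the toy loss `w**2` are my own choices for illustration, not something taken from the image.

```python
import math

def adam_step(w, grad, v, s, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style update for a scalar parameter (illustrative sketch)."""
    # V: exponentially decaying average of past gradients (the "velocity")
    v = beta1 * v + (1 - beta1) * grad
    # S: exponentially decaying average of past squared gradients
    s = beta2 * s + (1 - beta2) * grad ** 2
    # Bias correction, since v and s start at zero (t counts from 1)
    v_hat = v / (1 - beta1 ** t)
    s_hat = s / (1 - beta2 ** t)
    # V sets the direction, sqrt(S) scales the step size for this parameter
    w = w - lr * v_hat / (math.sqrt(s_hat) + eps)
    return w, v, s

# Toy usage (assumed example): minimise w**2, whose gradient is 2*w
w, v, s = 1.0, 0.0, 0.0
for t in range(1, 4):
    grad = 2.0 * w
    w, v, s = adam_step(w, grad, v, s, t)
```

If you drop the V line and use the raw gradient in the update, you get something close to RMSprop. If you drop the S line and the division, you get gradient descent with momentum, which is why the same letters show up in all three.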
Hope it helps!