Is the momentum term scalar or vector?

In gradient descent with momentum, are v_{dW} and v_{db} scalars, or arrays with the same dimensions as dW and db?
I understand the concept and use of exponentially weighted averages (EWAs) pretty well, but I don't get this. Because Python (via NumPy) allows broadcasting, it could be a scalar that is converted to a vector when the gradients are updated.

Hi, Utkarsh.

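In physics, momentum is a vector quantity, since it has both magnitude and direction. The same holds here: v_{dW} and v_{db} are exponentially weighted averages of dW and db, so they have the same shapes as dW and db, with one entry per parameter. With gradient descent, we are trying to converge to a minimum of the cost function, and the momentum term smooths the direction of each step along the way.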
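To make the shapes concrete, here is a minimal NumPy sketch of one momentum update. The function name `momentum_step`, the layer sizes, and the values of `beta` and `lr` are assumptions chosen for illustration, not taken from any particular course or library. It also covers the broadcasting point: if you start the velocity as the scalar 0.0, the first update already turns it into an array shaped like dW, so it only looks scalar before the first step.

```python
import numpy as np

def momentum_step(W, b, dW, db, v_dW, v_db, beta=0.9, lr=0.01):
    """One gradient descent step with momentum (illustrative helper)."""
    # Exponentially weighted averages of the gradients, computed elementwise.
    v_dW = beta * v_dW + (1 - beta) * dW
    v_db = beta * v_db + (1 - beta) * db
    # Parameter update uses the averaged gradients instead of the raw ones.
    W = W - lr * v_dW
    b = b - lr * v_db
    return W, b, v_dW, v_db

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))   # hypothetical layer: 3 units, 4 inputs
b = np.zeros((3, 1))
dW = rng.standard_normal((3, 4))  # stand-in gradients for the demo
db = rng.standard_normal((3, 1))

# Usual initialization: velocities are zero arrays shaped like the parameters.
v_dW = np.zeros_like(W)
v_db = np.zeros_like(b)
W1, b1, v_dW, v_db = momentum_step(W, b, dW, db, v_dW, v_db)
print(v_dW.shape, v_db.shape)

# Scalar initialization: broadcasting promotes it to dW's shape on the first step.
_, _, v_dW_s, v_db_s = momentum_step(W, b, dW, db, 0.0, 0.0)
print(v_dW_s.shape, v_db_s.shape)
```

Either way, after the first step each parameter has its own velocity entry. Initializing with `np.zeros_like` just makes that explicit and avoids surprises if you inspect the shapes before training starts.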

Here’s also a good read that could bring in perspectives related to your query.
