Hi, @MalayAgr.
It’s the same $m$, the size of the mini-batch. Here’s an interesting discussion about this scaling factor.
When you add momentum, the update rule for W itself does not change. But V_{dw} depends on dw, which now has an additional term.
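In case it helps, here is a sketch of how the full update might be written out. It assumes the question is about L2 regularization, where the extra term carries the $\frac{\lambda}{m}$ scaling, and the usual momentum formulation. The symbols $\lambda$ (regularization strength), $\alpha$ (learning rate), $\beta$ (momentum coefficient) and $dw_{\text{backprop}}$ (the unregularized gradient) are my own labels and do not appear in the post above:

$$
\begin{aligned}
dw &= dw_{\text{backprop}} + \frac{\lambda}{m} W \\
V_{dw} &= \beta V_{dw} + (1 - \beta)\, dw \\
W &= W - \alpha V_{dw}
\end{aligned}
$$

Under these assumptions the last line is the same with or without regularization; only the first line changes, and that change reaches W through V_{dw}.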
Hope that helped!