Does vdw reset to 0 after every epoch, or does it carry over? If it carries over, how do you account for the bias correction at every iteration t if t resets to 0?
At each iteration, vdw is updated based on the derivatives from backprop, and this update is independent of the t you are referring to. The number of iterations does not play a role in the update formula. (Sorry if I'm misunderstanding.) In general, you want to carry over vdw from the previous iteration so that you keep the momentum built up so far.
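
To make this concrete, here is a minimal sketch in plain Python. It assumes the usual exponentially weighted average update with a hypothetical beta of 0.9, a made-up constant gradient of 1.0 for every mini-batch, and a step counter t that counts all updates since initialization rather than restarting each epoch. The function name momentum_step is only for illustration. Note that t appears only in the bias-correction line, not in the update of vdw itself.

```python
def momentum_step(vdw, dw, t, beta=0.9):
    """One update of the running average vdw, plus its bias-corrected value."""
    # The update itself does not involve t.
    vdw = beta * vdw + (1.0 - beta) * dw
    # t only enters here, in the bias correction (assumed to be a global step count).
    vdw_corrected = vdw / (1.0 - beta ** t)
    return vdw, vdw_corrected


vdw = 0.0  # initialized once, before training starts
t = 0      # assumption: counts every update, never reset between epochs
history = []

for epoch in range(3):
    for dw in [1.0, 1.0, 1.0, 1.0]:  # toy gradients, one per mini-batch
        t += 1
        vdw, vdw_corrected = momentum_step(vdw, dw, t)
        history.append((epoch, t, vdw, vdw_corrected))

for epoch, step, raw, corrected in history:
    print(f"epoch={epoch} t={step} vdw={raw:.4f} corrected={corrected:.4f}")
```

In this toy run vdw keeps growing across epoch boundaries instead of dropping back to 0, and the corrected value stays at the true gradient because t keeps counting alongside it.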
Does it help?