Minibatch Reinforcement Learning

Hi Sir,

Assume we have 10 training examples and a mini-batch size of 2, so there will be five mini-batches. Our doubt is:

  1. Do we set Q = QNew only once, after the model has been trained on all 5 mini-batches of the dataset,

or

  2. Do we need to set Q = QNew after processing each mini-batch?

Hopefully, this would become clearer once you complete the programming exercise.
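
In the meantime, here is a minimal sketch of how the two schedules in the question differ. It does not say which one the programming exercise uses. Everything in it is a hypothetical stand-in: the function name `target_versions`, the flag `sync_every_batch`, and the integer "versions" that replace real network weights are assumptions made only for illustration.

```python
def target_versions(num_examples=10, batch_size=2, sync_every_batch=True):
    """Return which version of Q computed the targets for each mini-batch.

    Integers stand in for network weights (an assumption for illustration):
    q is the copy used to compute targets, q_new is the copy being trained.
    """
    q = 0
    q_new = 0
    used = []
    for _ in range(0, num_examples, batch_size):
        used.append(q)          # targets for this mini-batch come from Q
        q_new += 1              # placeholder for one gradient step on QNew
        if sync_every_batch:
            q = q_new           # option 2: set Q = QNew after each mini-batch
    if not sync_every_batch:
        q = q_new               # option 1: set Q = QNew once, after all mini-batches
    return used, q


# Option 1: every mini-batch sees the same, older Q.
print(target_versions(sync_every_batch=False))
# Option 2: every mini-batch sees the Q left by the previous mini-batch.
print(target_versions(sync_every_batch=True))
```

Both schedules finish with Q equal to QNew. The difference is how fresh Q is while the targets for each mini-batch are being computed.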