Minibatch Reinforcement Learning

Assume we have 10 training examples and a mini-batch size of 2, so there will be five mini-batches. Our doubt is:

  1. Do we set Q=QNew only after training the model on all 5 mini-batches of the dataset?

  2. Or do we need to set Q=QNew after processing each mini-batch?
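
To make the two options concrete, here is a toy sketch of both schedules. Everything in it is an assumption made for illustration: `run_epoch`, the scalar stand-ins for Q and QNew, the placeholder update rule, and the values of `gamma` and `lr` are hypothetical and not taken from the course code. It only shows where the assignment Q=QNew would sit in the loop under each option, not which option is correct.

```python
def run_epoch(rewards, batch_size, sync_every_batch, gamma=0.9, lr=0.5):
    """Toy sketch: one pass over the data, returning
    (number of mini-batches, number of Q=QNew assignments, final Q)."""
    q = 0.0      # hypothetical scalar stand-in for the Q parameters
    q_new = 0.0  # hypothetical scalar stand-in for the QNew parameters
    syncs = 0
    batches = [rewards[i:i + batch_size]
               for i in range(0, len(rewards), batch_size)]
    for batch in batches:
        # Placeholder training step: the target is built from the current Q,
        # and QNew is nudged toward the mean target of the mini-batch.
        targets = [r + gamma * q for r in batch]
        q_new += lr * (sum(targets) / len(targets) - q_new)
        if sync_every_batch:
            q = q_new  # option 2: Q=QNew after each mini-batch
            syncs += 1
    if not sync_every_batch:
        q = q_new      # option 1: Q=QNew only after all mini-batches
        syncs += 1
    return len(batches), syncs, q


rewards = [float(i) for i in range(10)]  # 10 made-up training examples
print(run_epoch(rewards, 2, sync_every_batch=False)[:2])  # (5, 1)
print(run_epoch(rewards, 2, sync_every_batch=True)[:2])   # (5, 5)
```

In this sketch the final Q differs between the two schedules, because under option 2 the targets for later mini-batches are computed from an already updated Q.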

Hopefully, this will become clearer once you complete the programming exercise.