With batch gradient descent, one epoch lets you take only one GD step, whereas with mini-batch, a single epoch lets you take t gradient descent steps (one per mini-batch). So does this mean you need fewer epochs with mini-batch to achieve the same results as batch?
That is the intent and the hope. As with everything here, it is not a guarantee, since it also depends on the hyperparameters being well chosen (e.g. learning rate, mini-batch size and so forth).
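To make the step count concrete, here is a minimal sketch (my own illustration, not from the course; it uses linear regression with arbitrary values for m, the mini-batch size and the learning rate) showing that one epoch of full-batch GD performs a single update, while one epoch of mini-batch GD over 1000 examples with batch size 64 performs 16 updates:

```python
import numpy as np

np.random.seed(0)
m, n = 1000, 5                       # m examples, n features (illustrative values)
X = np.random.randn(m, n)
y = X @ np.random.randn(n) + 0.1 * np.random.randn(m)

def grad(w, Xb, yb):
    """Gradient of mean squared error on a (mini-)batch."""
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

def run_epoch(w, batch_size, lr=0.01):
    """Run one epoch; return updated weights and the number of GD steps taken."""
    steps = 0
    for start in range(0, m, batch_size):
        Xb, yb = X[start:start + batch_size], y[start:start + batch_size]
        w = w - lr * grad(w, Xb, yb)   # one gradient descent step per (mini-)batch
        steps += 1
    return w, steps

w0 = np.zeros(n)
_, steps_batch = run_epoch(w0, batch_size=m)   # full batch: 1 step per epoch
_, steps_mini  = run_epoch(w0, batch_size=64)  # mini-batch: ceil(1000/64) = 16 steps
print(steps_batch, steps_mini)                 # -> 1 16
```

More updates per epoch usually means faster progress per epoch, but each mini-batch update is noisier, so whether you actually need fewer epochs still depends on the hyperparameters mentioned above.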