Hello, I have a question about how mini-batch gradient descent works. If I understand it correctly, in each epoch I end up with several cost functions J{t} as well as several weight matrices W[l]{t} and biases b[l]{t}. How do I combine them?
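You don't need to combine different cost functions, weight matrices, or biases. There is only one set of weights W[l] and biases b[l], and you update it more often: once per mini-batch instead of once per epoch. J{t} is just the cost evaluated on mini-batch t, and its gradient gives the update for that step. The next mini-batch then starts from the parameters that step produced.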
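To make this concrete, here is a minimal sketch in plain Python, assuming a toy linear model y ≈ w*x + b instead of a full network. The function name `minibatch_gd`, the data, and the hyperparameters are all made up for illustration. The point is only that `w` and `b` exist once and are overwritten after every mini-batch, so nothing is left to combine at the end of an epoch.

```python
import random


def minibatch_gd(xs, ys, batch_size=4, epochs=200, lr=0.3, seed=0):
    """Fit y ~ w*x + b with mini-batch gradient descent (illustrative only)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0              # a single shared set of parameters
    num_updates = 0
    indices = list(range(len(xs)))

    for _ in range(epochs):
        rng.shuffle(indices)     # new mini-batch split every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            m = len(batch)

            # Gradient of the mini-batch cost J{t} = 1/(2m) * sum(error^2),
            # computed on this mini-batch only.
            errors = [w * xs[i] + b - ys[i] for i in batch]
            dw = sum(e * xs[i] for e, i in zip(errors, batch)) / m
            db = sum(errors) / m

            # Update the same w and b in place; the next mini-batch
            # continues from these values.
            w -= lr * dw
            b -= lr * db
            num_updates += 1

    return w, b, num_updates


# Toy data generated from y = 2x + 1 (no noise).
xs = [i / 10 for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]

w, b, num_updates = minibatch_gd(xs, ys)
print(w, b, num_updates)
```

With 20 examples and a batch size of 4, each epoch performs 5 parameter updates, whereas batch gradient descent would perform only 1 per epoch.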
I hope I understood your question correctly and that my answer is helpful.