Weights are independent of the samples. The dimensions of W^{[l]} and b^{[l]} are determined only by the number of input features and the number of output neurons in each layer, right?
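Yes, exactly. To make that concrete, here is a minimal sketch (with made-up layer sizes, not anything from the course code) showing that the parameter shapes depend only on the layer dimensions, while the number of samples m only shows up in the activations:

```python
import numpy as np

# Hypothetical layer sizes: 4 input features, a hidden layer with 3 units, 1 output unit.
layer_dims = [4, 3, 1]

params = {}
for l in range(1, len(layer_dims)):
    # W[l] has shape (n[l], n[l-1]); b[l] has shape (n[l], 1).
    params[f"W{l}"] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    params[f"b{l}"] = np.zeros((layer_dims[l], 1))

# The sample count m never appears in the parameter shapes, only in X and Z.
for m in (1, 32, 1024):
    X = np.random.randn(layer_dims[0], m)     # shape (n[0], m)
    Z1 = params["W1"] @ X + params["b1"]      # shape (n[1], m)
    print(m, params["W1"].shape, params["b1"].shape, Z1.shape)
```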
The key point about minibatch gradient descent is that all the weights are updated after each minibatch, as opposed to "full batch" gradient descent, in which the weights are updated only after processing the entire set of samples. The terminology is that one "epoch" means one training pass through all the training samples. So in "full batch" gradient descent the weights get updated once per epoch. With minibatch, one epoch means iterating over all the minibatches that make up the full training set, so the weights get updated multiple times per epoch.

That is the primary advantage of minibatch: when it works well, you can reach the same level of convergence with fewer total epochs of training. Of course, as with everything here, there is no guarantee that always happens, since the gradients can also be statistically noisier in the minibatch case, particularly if the minibatch size is very small. So you may also need to apply momentum or other techniques to mitigate that.
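If it helps to see the difference concretely, here is a minimal sketch (a plain least-squares model with made-up data and arbitrary sizes, not the course code) contrasting one weight update per epoch in the full-batch loop with many updates per epoch in the minibatch loop:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                       # 1000 samples, 5 features
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=1000)

def grad(w, Xb, yb):
    # Gradient of mean squared error for the (mini)batch (Xb, yb).
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

w_full = np.zeros(5)
w_mini = np.zeros(5)
lr, batch_size, epochs = 0.05, 64, 10

for epoch in range(epochs):
    # Full batch: one update per epoch, using all samples at once.
    w_full -= lr * grad(w_full, X, y)

    # Minibatch: shuffle, then one update per minibatch -> many updates per epoch.
    perm = rng.permutation(len(y))
    for start in range(0, len(y), batch_size):
        idx = perm[start:start + batch_size]
        w_mini -= lr * grad(w_mini, X[idx], y[idx])

    print(epoch,
          round(float(np.mean((X @ w_full - y) ** 2)), 4),
          round(float(np.mean((X @ w_mini - y) ** 2)), 4))
```

With these (arbitrary) settings the minibatch version typically drives the loss down in fewer epochs, simply because it makes many more updates per pass over the data.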