What I don’t get from the following video is why there are 1 to m values of z in one hidden layer. Are they the indexes of the hidden neurons in a layer? Can someone please explain this to me? Thanks in advance.
Hi, @James_Nathan_O.
m is the size of the mini-batch. For each activation you have m values in the mini-batch, and you’re computing their mean and variance.
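In case it helps to see the shapes, here is a rough NumPy sketch. It assumes the pre-activations of one layer are stored as a matrix Z with one row per hidden unit and one column per example, so the index from 1 to m runs over the columns (examples), not over the neurons. The function name `batch_norm_stats`, the layer size, and the `eps` value are made up for illustration only.

```python
import numpy as np

def batch_norm_stats(Z, eps=1e-8):
    """Per-unit mean/variance over the mini-batch (hypothetical helper).

    Z has shape (n_units, m): one row per hidden unit, one column per example.
    """
    # Average over axis=1, i.e. over the m examples, separately for each unit.
    mu = Z.mean(axis=1, keepdims=True)    # shape (n_units, 1)
    var = Z.var(axis=1, keepdims=True)    # shape (n_units, 1)
    # Normalize each unit's m values; eps is an assumed small constant
    # that keeps the division stable when the variance is near zero.
    Z_norm = (Z - mu) / np.sqrt(var + eps)
    return mu, var, Z_norm

rng = np.random.default_rng(0)
n_units, m = 3, 5                          # 3 hidden units, mini-batch of 5
Z = rng.normal(size=(n_units, m))
mu, var, Z_norm = batch_norm_stats(Z)
print(mu.shape, var.shape, Z_norm.shape)   # (3, 1) (3, 1) (3, 5)
```

So each hidden unit gets its own mean and variance, and each of those is computed from the m values that unit takes across the mini-batch.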
I think the explanation in section 3 of the paper is very clear, in case you want to take a look at it.
Hope you’re enjoying the course