Happy model Assignment 2, Course 4

For our dataset here, axis 0 is the batch, axes 1 and 2 are the spatial dimensions (height and width), and axis 3 is the channel. So applying batch normalisation on axis 3 normalises each channel separately, with the mean and variance computed over the batch and spatial axes. As for why it's normalised along the channel, you can read this thread.
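To make the axis point concrete, here is a rough NumPy sketch of what normalising along axis 3 does in training mode. The input shape `(8, 64, 64, 3)`, the random data, and the `eps` value are assumptions for illustration only. It also leaves out the learned scale and shift (gamma and beta) and the moving averages that a real batch norm layer keeps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch: 8 images, 64x64 pixels, 3 channels,
# laid out as (batch, height, width, channels).
x = rng.normal(loc=5.0, scale=2.0, size=(8, 64, 64, 3))

# "Normalising along axis 3" keeps axis 3 and reduces over the others,
# so we get one mean and one variance per channel.
eps = 1e-3  # small constant for numerical stability (assumed value)
mean = x.mean(axis=(0, 1, 2), keepdims=True)  # shape (1, 1, 1, 3)
var = x.var(axis=(0, 1, 2), keepdims=True)    # shape (1, 1, 1, 3)

# Broadcasting applies each channel's statistics to every pixel in that channel.
x_norm = (x - mean) / np.sqrt(var + eps)

print(mean.shape)                     # one statistic per channel
print(x_norm.mean(axis=(0, 1, 2)))    # close to 0 for each channel
print(x_norm.var(axis=(0, 1, 2)))     # close to 1 for each channel
```

After this step every channel has roughly zero mean and unit variance across the whole batch, which is the per-channel behaviour you get when the layer is told the channel axis is 3.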

