I was working on programming assignment 2 of week 1, “Convolution_model_Application”. In exercise 1, while building the Sequential model, we apply batch normalization along axis 3. As I understand it, axis 3 is the channels axis.
Q1) Why is batch normalization being used specifically for axis 3?
Also, after running happy_model.summary(), the parameter count shown for the batch normalization layer was 128.
Q2) How did batch normalization result in creating parameters?
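
To make both questions concrete, here is a minimal NumPy sketch of what normalizing along axis 3 would compute. It is not the assignment's code. The channel count of 32 is my assumption (it is simply the value for which 4 × 32 = 128), and `batch_norm_axis3` is a hypothetical helper name. As far as I know, Keras `BatchNormalization` keeps four vectors per normalized axis: trainable `gamma` and `beta`, plus non-trainable `moving_mean` and `moving_variance`.

```python
import numpy as np

def batch_norm_axis3(x, gamma, beta, eps=1e-3):
    """Normalize a (batch, height, width, channels) tensor per channel."""
    # Statistics are taken over every axis except axis 3,
    # so there is one mean and one variance per channel.
    mean = x.mean(axis=(0, 1, 2))
    var = x.var(axis=(0, 1, 2))
    x_hat = (x - mean) / np.sqrt(var + eps)
    # gamma and beta are learned scale and shift, one value per channel.
    return gamma * x_hat + beta, mean, var

channels = 32  # assumption: not stated in the post, chosen so 4 * channels == 128
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8, channels))

gamma = np.ones(channels)   # trainable scale
beta = np.zeros(channels)   # trainable shift
y, batch_mean, batch_var = batch_norm_axis3(x, gamma, beta)

# Four per-channel vectors: gamma, beta, and the two running statistics.
param_count = gamma.size + beta.size + batch_mean.size + batch_var.size
print(param_count)
```

Under that assumption, the count splits into 64 trainable values (`gamma`, `beta`) and 64 non-trainable ones (the running mean and variance), which would account for the 128 shown in the summary.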