During the fit call, I see that the loss value decreases at each step. Is it correct to assume that backprop happens after every step (one batch, usually 32 samples)? If so, isn't that inefficient? I would have expected the update to happen once at the end of the epoch, using the average of the losses from all the steps.
The loss value not only changes, it actually starts decreasing as the step counter increases within the same epoch.
There is a backprop step for every mini-batch. However, the mini-batch arrangement and the per-step backprop are NOT there to compute many loss values and then average them at the end. The loss shown at each step is only a monitoring output; it is not the reason mini-batches are used.
Here are two videos on mini-batch gradient descent (Video 1, Video 2) from the Deep Learning Specialization, Course 2 Week 2, that discuss why we train mini-batch-wise.
Calculating many losses in one epoch is an effect of mini-batch training, not the reason for it.
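To make the per-step behaviour concrete, here is a minimal sketch of what a mini-batch training loop does, written as a custom TensorFlow/Keras loop over a toy random dataset (the model, data shapes, and hyperparameters are assumptions for illustration, not what `fit()` literally runs internally): gradients are computed and the weights are updated on every batch, and the loss is only accumulated into a running mean for display.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy regression data, just to make the loop runnable.
x_train = np.random.rand(256, 20).astype("float32")
y_train = np.random.rand(256, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.MeanSquaredError()
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32)

for epoch in range(2):
    running_sum, steps = 0.0, 0
    for x_batch, y_batch in dataset:
        with tf.GradientTape() as tape:
            y_pred = model(x_batch, training=True)
            loss = loss_fn(y_batch, y_pred)            # loss of THIS mini-batch only
        grads = tape.gradient(loss, model.trainable_variables)
        # Backprop + weight update happen on every step, not once per epoch.
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        running_sum += float(loss)
        steps += 1
        # The per-step number fit() prints is essentially a running mean like this,
        # kept purely for monitoring; it does not drive the updates.
        print(f"epoch {epoch} step {steps}: running mean loss = {running_sum / steps:.4f}")
```

Because the weights change after every batch, later batches in the same epoch are evaluated by an already-improved model, which is why the displayed loss tends to fall within a single epoch.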