I didn’t understand why we divide a3 by keep_prob, and later on z3 by keep_prob. If we want to scale up, shouldn’t we multiply instead?
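
To make the arithmetic behind the question concrete, here is a minimal sketch, assuming the usual "inverted dropout" setup where keep_prob is a probability between 0 and 1. Under that assumption, dividing by keep_prob is the same as multiplying by 1 / keep_prob, which is greater than 1, so the division is what scales the surviving activations up. The function name `inverted_dropout`, the value 0.8, and the all-ones a3 are illustrative choices, not taken from the course code.

```python
import random


def inverted_dropout(activations, keep_prob, rng):
    """Zero each unit with probability 1 - keep_prob, then divide survivors by keep_prob."""
    out = []
    for a in activations:
        keep = rng.random() < keep_prob
        # Dividing by keep_prob (< 1) makes each surviving activation larger.
        out.append(a / keep_prob if keep else 0.0)
    return out


keep_prob = 0.8                      # assumed value, for illustration only
rng = random.Random(0)               # fixed seed so runs are repeatable
a3 = [1.0] * 100_000                 # hypothetical layer-3 activations

dropped = inverted_dropout(a3, keep_prob, rng)

scale = 1.0 / keep_prob              # the factor applied to each kept unit
mean_before = sum(a3) / len(a3)
mean_after = sum(dropped) / len(dropped)

print("scale factor for kept units:", scale)
print("mean before dropout:", mean_before)
print("mean after dropout + division:", mean_after)
```

About a fraction keep_prob of the units survive, so without the division the average activation would shrink by roughly that factor. Dividing the survivors by keep_prob brings the average back close to its original value, while multiplying by keep_prob would shrink it a second time.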