Gradient Descent Query

In Batch GD:
The params are updated simultaneously after considering the entire dataset for each iteration of GD.

In Stochastic GD:
Params are updated one at a time after considering each training example.
Hence while calculating gradient for (say) b, the value for parameter w is the new value that
was got after training using the previous example.

Is this a correct understanding? :thinking:

Yes, your understanding is correct. In Batch Gradient Descent, the parameters are updated simultaneously, once per iteration, after the entire dataset has been considered. Concretely, the algorithm computes the gradient of the cost function averaged over all training examples, using the current values of every parameter, and then moves all parameters one step in the direction opposite to that gradient.
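
To make the batch update concrete, here is a minimal sketch. It assumes a one-feature linear model y_hat = w * x + b trained with mean squared error; the function name batch_gd_step, the toy data, and the learning rate are illustrative choices, not something taken from your question.

```python
def batch_gd_step(w, b, xs, ys, lr):
    """One batch gradient descent step for the model y_hat = w * x + b."""
    n = len(xs)
    # Both gradients are averaged over the whole dataset and are
    # computed from the same (old) values of w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    # Simultaneous update: neither parameter sees the other's new value.
    return w - lr * grad_w, b - lr * grad_b


# Toy data generated from y = 2x + 1 (an assumption for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0
for _ in range(5000):
    # One parameter update per full pass over the data.
    w, b = batch_gd_step(w, b, xs, ys, lr=0.05)
```

With this noiseless toy data, w and b should approach 2 and 1. The point to notice is that there is exactly one update per pass over the dataset.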

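On the other hand, in Stochastic Gradient Descent (SGD), the parameters are updated after each individual training example rather than after a full pass over the dataset. The algorithm randomly selects a training example, computes the gradient of the cost function for that example alone, and moves the parameters in the direction opposite to that gradient. Note that "one at a time" refers to the examples, not to the parameters: within a single step, the gradients for w and b are both computed from the current values and the two parameters are still updated together. So, as you mentioned, while calculating the gradient for one parameter (e.g., b) on the current example, the value of the other parameter (e.g., w) is the one obtained from the update on the previous example.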
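
The SGD behaviour described above can be sketched as follows. This again assumes a one-feature linear model y_hat = w * x + b with squared error; the names sgd_step and sgd, the toy data, the learning rate, and the fixed seed are all hypothetical choices made for illustration.

```python
import random


def sgd_step(w, b, x, y, lr):
    """One SGD step on a single example for the model y_hat = w * x + b."""
    error = w * x + b - y
    # Both gradients use the w and b left behind by the previous example.
    grad_w = 2 * error * x
    grad_b = 2 * error
    # w and b are still updated together within the step.
    return w - lr * grad_w, b - lr * grad_b


def sgd(xs, ys, lr=0.01, epochs=2000, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in random order
        for i in indices:
            # One parameter update per training example.
            w, b = sgd_step(w, b, xs[i], ys[i], lr)
    return w, b


# Toy data generated from y = 2x + 1 (an assumption for illustration).
sgd_w, sgd_b = sgd([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

Here the parameters change four times per pass over the four examples, and each step starts from the values produced by the step before it. On this noiseless data sgd_w and sgd_b should end up close to 2 and 1; on noisy data SGD would typically keep fluctuating around the minimum unless the learning rate is decayed.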