Is there another way to optimize gradient descent?

In code like this, which computes the partial derivatives, I noticed two nested for-loops, which means the time complexity is O(n^2). I think that will not be efficient on a big data set, so should we use a different, optimized form of gradient descent, like Stochastic Gradient Descent? sklearn uses Stochastic Gradient Descent instead of plain gradient descent. Am I right in my intuition that O(n^2) is not efficient on a big data set?

def compute_gradient(X, y, w, b): 
# moderator edit: code removed

For-loops are not needed. Matrix algebra is the preferred tool for practical use.
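For example, here is a minimal vectorized sketch (assuming the course's linear model f(x) = w·x + b and the mean squared error cost; compute_gradient_vectorized is a hypothetical name, not the assignment's function):

import numpy as np

def compute_gradient_vectorized(X, y, w, b):
    # X: (m, n) feature matrix, y: (m,) targets,
    # w: (n,) weights, b: scalar bias.
    m = X.shape[0]
    err = X @ w + b - y        # (m,) residuals -- replaces the loop over examples
    dj_dw = (X.T @ err) / m    # (n,) gradient w.r.t. w -- replaces the loop over features
    dj_db = err.sum() / m      # scalar gradient w.r.t. b
    return dj_dw, dj_db

The matrix products do the same arithmetic as the nested loops, but NumPy executes them in optimized compiled code, which is far faster than Python-level iteration.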

However, for-loops are used in this course because it’s a basic-level introduction, and knowledge of matrix algebra is not a prerequisite. For-loops are more intuitive for students with little (or no) linear algebra or programming experience.


Optimization methods are discussed later in the course.
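As a preview, here is a minimal sketch of stochastic gradient descent for the same linear model (sgd_linear is a hypothetical name; the course's own implementation may differ):

import numpy as np

def sgd_linear(X, y, w, b, alpha=0.01, epochs=10, seed=0):
    # Updates the parameters one example at a time instead of
    # averaging the gradient over the whole training set.
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    for _ in range(epochs):
        for i in rng.permutation(m):     # visit examples in a new random order each epoch
            err = X[i] @ w + b - y[i]    # residual for a single example
            w = w - alpha * err * X[i]   # per-example gradient step for w
            b = b - alpha * err          # per-example gradient step for b
    return w, b

Each epoch still touches every example, so the cost per pass is comparable to batch gradient descent, but the parameters are updated m times per epoch, which often reaches a good solution after fewer passes on large data sets.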


Sorry about my curiosity. I know these are dumb questions for you.
Thank you so much for helping.

Note that I’m going to delete the code from your message, because that function is part of a programming assignment later on.