# C1_W2_Lab02_Multiple_Variable_Soln, problem with understanding compute_gradient(X, y, w, b)

In C1_W2_Lab02_Multiple_Variable_Soln, in the compute-gradient section, we have the following function:

``````python
def compute_gradient(X, y, w, b):
    """
    Computes the gradient for linear regression
    Args:
      X (ndarray (m,n)): Data, m examples with n features
      y (ndarray (m,)) : target values
      w (ndarray (n,)) : model parameters
      b (scalar)       : model parameter

    Returns:
      dj_dw (ndarray (n,)): The gradient of the cost w.r.t. the parameters w.
      dj_db (scalar):       The gradient of the cost w.r.t. the parameter b.
    """
    m,n = X.shape           #(number of examples, number of features)
    dj_dw = np.zeros((n,))
    dj_db = 0.

    for i in range(m):
        err = (np.dot(X[i], w) + b) - y[i]
        for j in range(n):
            dj_dw[j] = dj_dw[j] + err * X[i, j]
        dj_db = dj_db + err
    dj_dw = dj_dw / m
    dj_db = dj_db / m

    return dj_db, dj_dw
``````

Since the derivatives are a cumulative sum over all examples, shouldn't it be
`dj_dw[j] += err * X[i, j]` and then divide by m? I'm having trouble associating this code with the provided formulas. Can someone help me?
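For reference, these are the gradient formulas the code implements, in the course's notation (where $f_{\mathbf{w},b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b$):

$$\frac{\partial J(\mathbf{w},b)}{\partial w_j} = \frac{1}{m}\sum_{i=0}^{m-1}\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right)x_j^{(i)}$$

$$\frac{\partial J(\mathbf{w},b)}{\partial b} = \frac{1}{m}\sum_{i=0}^{m-1}\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right)$$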

The code you propose would add dj_dw[j] to itself on every iteration of j, resulting in the gradients being 2 times too large.

You can either use `q += …` or `q = q + …`, but not both at the same time.
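A tiny sketch of that point: `q = q + x` and `q += x` accumulate identically, while combining both forms (`q += q + x`) adds the running total back into itself and overshoots:

```python
q1, q2, q3 = 0.0, 0.0, 0.0
for x in [1.0, 2.0, 3.0]:
    q1 = q1 + x   # explicit accumulation, as written in the lab
    q2 += x       # augmented assignment: exactly the same result
    q3 += q3 + x  # mixing both forms: adds q3 to itself, too large
print(q1, q2, q3)  # 6.0 6.0 11.0
```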

I corrected the code; in my opinion it should be `dj_dw[j] += err * X[i, j]`, and the same for b: `dj_db += err`. I've made the change and got the same results.

I'm just having a hard time understanding the code as it was written. I can't associate it with the original formula.

OMG! I'm feeling dumb right now hahahaha. I have just understood what you said. The cumulative sum is already happening; I just did not pay attention to the code. The inner for-loop can also be changed to `dj_dw += err * X[i, :]` to update the entire vector at once. It also yields the same result.
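To illustrate, here is a toy comparison (with made-up random data) showing that the element-wise inner loop and the vectorized row update produce the same gradient:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))   # 4 examples, 3 features (toy data)
y = rng.standard_normal(4)
w = rng.standard_normal(3)
b = 0.5
m, n = X.shape

# Element-wise inner loop, as in the lab
dj_dw_loop = np.zeros(n)
for i in range(m):
    err = np.dot(X[i], w) + b - y[i]
    for j in range(n):
        dj_dw_loop[j] += err * X[i, j]
dj_dw_loop /= m

# Inner loop replaced by a single vector update over row i
dj_dw_vec = np.zeros(n)
for i in range(m):
    err = np.dot(X[i], w) + b - y[i]
    dj_dw_vec += err * X[i, :]
dj_dw_vec /= m

print(np.allclose(dj_dw_loop, dj_dw_vec))  # True
```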

You can also omit the for-loop over the examples entirely and compute the gradient with a matrix product.
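A minimal sketch of that fully vectorized version (the function name `compute_gradient_matrix` is made up here, not part of the lab):

```python
import numpy as np

def compute_gradient_matrix(X, y, w, b):
    """Vectorized gradient for linear regression: no explicit loops."""
    m = X.shape[0]
    err = X @ w + b - y       # (m,) residual for every example at once
    dj_dw = (X.T @ err) / m   # (n,): entry j sums err[i] * X[i, j] over i
    dj_db = np.sum(err) / m   # scalar
    return dj_db, dj_dw

# Toy check with made-up data
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 2))
y = rng.standard_normal(5)
w = np.array([0.5, -1.0])
b = 0.1
dj_db, dj_dw = compute_gradient_matrix(X, y, w, b)
```

`X @ w` computes all m predictions in one matrix-vector product, and `X.T @ err` performs exactly the double loop's accumulation in one step.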