Issue with Assignment week 3

Hello Everyone,

I am having an issue with the compute_gradient function in the graded assignment.

My code is as follows:

def compute_gradient(X, y, w, b, lambda_=1):
    """
    Computes the gradient for logistic regression

    Args:
      X : (ndarray Shape (m,n)) variable such as house size
      y : (array_like Shape (m,1)) actual value
      w : (array_like Shape (n,1)) values of parameters of the model
      b : (scalar)                 value of parameter of the model
      lambda_: unused placeholder.
    Returns:
      dj_dw: (array_like Shape (n,1)) The gradient of the cost w.r.t. the parameters w.
      dj_db: (scalar)                The gradient of the cost w.r.t. the parameter b.
    """
    m, n = X.shape
    dj_dw = np.zeros(w.shape)
    dj_db = 0.

    for i in range(m):
        z_wb = 0
        for j in range(n):
            z_wb_ij = X[i, j] * (w[j])
            z_wb += z_wb_ij
        z_wb += b
        f_wb = sigmoid(z_wb)
        dj_db_i = f_wb - y[i]
        dj_db += dj_db_i
        for j in range(n):
            dj_dw_ij = (f_wb - y[i])* X[i][j]
            dj_dw[j] = dj_dw_ij
    dj_dw = dj_dw / m
    dj_db = dj_db_i/m

    return dj_db, dj_dw

initial_w = np.zeros(n)
initial_b = 0.

dj_db, dj_dw = compute_gradient(X_train, y_train, initial_w, initial_b)
print(f'dj_db at initial w (zeros):{dj_db}' )
print(f'dj_dw at initial w (zeros):{dj_dw.tolist()}' )

I think my code is correct, but I am still getting a different result and failing the assignment. Could anyone please help?

Thanks in advance.

Best Regards,

Hi @bhuvan_gunesh ,

Please revisit the hints on how to get dj_dw for each attribute.

Also, you have changed the definition of compute_gradient() by setting the default of lambda_ to 1. The default for lambda_ is None. Please don’t change the default value.

I will check it, but the value for dj_db is still wrong even though I think the code is correct.

Best Regards,

Yes, dj_db is not correct because the calculation is wrong. You will see where the problem is when you revisit the hints.
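
To illustrate the point about the dj_db calculation, here is a minimal sketch with made-up per-example errors (the `errors` array is hypothetical, not from the assignment data). It contrasts dividing the accumulated sum by m with dividing only the last per-example term by m, which is what `dj_db_i/m` does in the code posted above.

```python
import numpy as np

# Hypothetical per-example errors (f_wb - y[i]) for m = 4 examples.
errors = np.array([0.5, -0.5, 0.5, 0.25])
m = errors.shape[0]

dj_db = 0.0
for i in range(m):
    dj_db_i = errors[i]  # error for example i only
    dj_db += dj_db_i     # running sum over all examples

mean_of_sum = dj_db / m  # averages over every example
last_only = dj_db_i / m  # uses only the final example's error

print(mean_of_sum, last_only)
```

The two values differ whenever the last example's error is not equal to the sum of all errors, so the choice of variable in the final division matters.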

Hello Again,

Also, I checked, and the code is the same as in the hint.


Hi @bhuvan_gunesh ,

Please check your code below against the hint:


I changed the code for dj_dw as follows and it works:

dj_dw_ij = (f_wb - y[i])* X[i][j]
dj_dw[j] += dj_dw_ij

Many thanks for your help.

However, dj_db still doesn't work and I can't find the issue.

Can you help?


Hi @bhuvan_gunesh

I couldn’t find anything particularly wrong with the code after the changes you made. Try restarting the kernel, clearing all output, and rerunning the code. Hopefully that will work.
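
If rerunning does not help, one way to check a loop-based gradient independently of the notebook state is to compare it against a vectorized version on small made-up data. The sketch below is only an illustration: the `sigmoid` helper, both function names, and the data are assumptions, not the assignment's code or its grader.

```python
import numpy as np

def sigmoid(z):
    # Logistic function; a stand-in for the course's helper.
    return 1.0 / (1.0 + np.exp(-z))

def compute_gradient_loop(X, y, w, b, lambda_=None):
    # Loop version: both gradients accumulate over all m examples.
    m, n = X.shape
    dj_dw = np.zeros(w.shape)
    dj_db = 0.0
    for i in range(m):
        f_wb = sigmoid(np.dot(X[i], w) + b)
        err = f_wb - y[i]
        dj_db += err                   # sum of errors
        for j in range(n):
            dj_dw[j] += err * X[i, j]  # sum of error * feature
    return dj_db / m, dj_dw / m

def compute_gradient_vec(X, y, w, b):
    # Vectorized reference computing the same two quantities.
    err = sigmoid(X @ w + b) - y
    return err.mean(), X.T @ err / X.shape[0]

# Made-up data: 5 examples, 2 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
y = np.array([0.0, 1.0, 0.0, 1.0, 1.0])
w = np.zeros(2)
b = 0.0

db_loop, dw_loop = compute_gradient_loop(X, y, w, b)
db_vec, dw_vec = compute_gradient_vec(X, y, w, b)
print(np.isclose(db_loop, db_vec), np.allclose(dw_loop, dw_vec))
```

If the two versions disagree on either value, the mismatch points to which accumulation or final division to look at.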