Hello learners,
There are mainly 2 types of error:
- syntax error: our implemented function can't run at all, and doesn't produce any output
- non-syntax error (runtime/logic error): our implemented function runs through and produces output, but the output is not as expected
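For instance, here is a toy example (not from the lab) showing the difference:

```python
# A syntax error stops the function from even being defined, so nothing runs:
#
#   def f(x)            # <-- missing colon: SyntaxError
#       return x + 1
#
# A non-syntax error runs without complaint but gives a wrong answer:
def mean(values):
    total = 0
    for v in values:
        total += v
    return total / (len(values) - 1)   # bug: should divide by len(values)

print(mean([2, 4, 6]))   # prints 6.0, but the correct mean is 4.0
```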
Non-syntax error
I suggest we use (1) simple inputs and (2) printed variables to check our implemented function. To illustrate the idea, I took a function provided by an optional lab and edited it to make it wrong. Here it is:
Situation: implement the gradients defined by the following formulas (reconstructed here; they match the expected-outcome calculations further below):

\frac{\partial{J}}{\partial{w}} = \frac{1}{m}\sum_{i=0}^{m-1}\left(f_{w,b}(x^{(i)}) - y^{(i)}\right)x^{(i)}

\frac{\partial{J}}{\partial{b}} = \frac{1}{m}\sum_{i=0}^{m-1}\left(f_{w,b}(x^{(i)}) - y^{(i)}\right)

where f_{w,b}(x^{(i)}) = wx^{(i)} + b.
Wrong code:
1  def compute_gradient(x, y, w, b):
2      """
3      Computes the gradient for linear regression
4      Args:
5        x (ndarray (m,)): Data, m examples
6        y (ndarray (m,)): target values
7        w,b (scalar)    : model parameters
8      Returns
9        dj_dw (scalar): The gradient of the cost w.r.t. the parameters w
10       dj_db (scalar): The gradient of the cost w.r.t. the parameter b
11     """
12
13     # Number of training examples
14     m = x.shape[0]
15     dj_dw = 0
16     dj_db = 0
17
18     ### START CODE HERE ###
19     for i in range(m):
20         f_wb = w * x[i] + b
21         dj_dw_i = f_wb - y[i]          # <-- wrong line, missing "* x[i]"
22         dj_db_i = f_wb - y[i]
23         dj_db += dj_db_i
24     dj_dw += dj_dw_i                   # <-- wrong line, missing indentation
25     dj_dw = dj_dw / m
26     # <-- missing line "dj_db = dj_db / m"
27
28     if dj_db + 3 > 0:                  # <-- unneeded line
29         dj_db = 0                      # <-- unneeded line
30     ### END CODE HERE ###
31
32     return dj_dw, dj_db
Debugging steps:
Creating inputs
- We create a set of simple inputs. By Line 1, we know there are 4 inputs: `x`, `y`, `w`, `b`. By Lines 5 to 7, we know what they should look like.
- By Line 5, `x` is an array of size `(m,)`. Because we want it simple, let's create `x` with 3 examples (`m = 3`): `x = np.array([1, 2, 4])`
- By Line 6, `y` is an array of size `(m,)`. Let's create `y = np.array([2, 1, 3])`
- By Line 7, `w` and `b` are scalars, so let's create `w = 2`, `b = 1`
- Note that we use simple (but NON-ZERO) integers such as 1, 2, 3, and 4 so they are easy to work with.
- Created inputs:

x = np.array([1, 2, 4])
y = np.array([2, 1, 3])
w = 2
b = 1
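Optionally, before calling the function, a couple of assertions can confirm the hand-made inputs match what the docstring promises (this sketch assumes NumPy is imported as `np`, as in the lab):

```python
import numpy as np

x = np.array([1, 2, 4])
y = np.array([2, 1, 3])
w = 2
b = 1

# x and y should be 1-D arrays of the same length m; w and b are scalars
assert x.shape == y.shape == (3,)
assert np.isscalar(w) and np.isscalar(b)
```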
Adding print lines
- add print lines only between `### START CODE HERE ###` and `### END CODE HERE ###`
- print variables every time AFTER they are assigned a new value
- print the condition BEFORE every `if` statement
- print the returning variables
- add a number to identify which printed result refers to which print line; add the variable name too if preferred
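One possible way to follow the numbering convention above is a tiny helper (hypothetical, not part of the lab) so every print site emits its site number, the variable name, and the value in one consistent format:

```python
def dbg(site, name, value):
    # one line per print site: site number, variable name, current value
    line = f"{site} {name} {value}"
    print(line)
    return line  # returned only so the format is easy to check

# example usage inside a loop body (toy computation, for illustration only):
for i in range(2):
    dbg(2, "i", i)
    f_wb = 2 * i + 1
    dbg(3, "f_wb", f_wb)
```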
- After adding print lines:
1  def compute_gradient(x, y, w, b):
2      """
3      Computes the gradient for linear regression
4      Args:
5        x (ndarray (m,)): Data, m examples
6        y (ndarray (m,)): target values
7        w,b (scalar)    : model parameters
8      Returns
9        dj_dw (scalar): The gradient of the cost w.r.t. the parameters w
10       dj_db (scalar): The gradient of the cost w.r.t. the parameter b
11     """
12
13     # Number of training examples
14     m = x.shape[0]
15     dj_dw = 0
16     dj_db = 0
17
18     ### START CODE HERE ###
19     print(1, "m", m)                   # <-- ADDED
20     for i in range(m):
21         print(2, "i", i)               # <-- ADDED
22         f_wb = w * x[i] + b
23         print(3, "f_wb", f_wb)         # <-- ADDED
24         dj_dw_i = f_wb - y[i]          # <-- wrong line, missing "* x[i]"
25         print(4, "dj_dw_i", dj_dw_i)   # <-- ADDED
26         dj_db_i = f_wb - y[i]
27         print(5, "dj_db_i", dj_db_i)   # <-- ADDED
28         dj_db += dj_db_i
29         print(6, "dj_db", dj_db)       # <-- ADDED
30     dj_dw += dj_dw_i                   # <-- wrong line, missing indentation
31     print(7, "dj_dw", dj_dw)           # <-- ADDED
32     dj_dw = dj_dw / m
33     print(8, "dj_dw", dj_dw)           # <-- ADDED
34     # <-- missing line "dj_db = dj_db / m"
35
36     print(9, "dj_db + 3", dj_db + 3)   # <-- ADDED - print condition of `if` statement
37     if dj_db + 3 > 0:                  # <-- unneeded line
38         dj_db = 0                      # <-- unneeded line
39     print(10, "dj_db", dj_db)          # <-- ADDED
40     print(11, "dj_dw, dj_db", dj_dw, dj_db)  # <-- ADDED - print returning variables
41     ### END CODE HERE ###
42
43     return dj_dw, dj_db
Inspect the code with the printed output
- Run the code below
x = np.array([1, 2, 4])
y = np.array([2, 1, 3])
w = 2
b = 1
compute_gradient(x, y, w, b)
- Receive the output below
1 m 3
2 i 0
3 f_wb 3
4 dj_dw_i 1
5 dj_db_i 1
6 dj_db 1
2 i 1
3 f_wb 5
4 dj_dw_i 4
5 dj_db_i 4
6 dj_db 5
2 i 2
3 f_wb 9
4 dj_dw_i 6
5 dj_db_i 6
6 dj_db 11
7 dj_dw 6
8 dj_dw 2.0
9 dj_db + 3 14
10 dj_db 0
11 dj_dw, dj_db 2.0 0
- Inspect the output together with the formula, by first calculating the expected outcomes (here comes the benefit of using simple integers):

Formula:

\frac{\partial{J}}{\partial{w}} = \frac{1}{m}\sum_{i=0}^{m-1}\left(f_{w,b}(x^{(i)}) - y^{(i)}\right)x^{(i)}, \qquad \frac{\partial{J}}{\partial{b}} = \frac{1}{m}\sum_{i=0}^{m-1}\left(f_{w,b}(x^{(i)}) - y^{(i)}\right)

Expected outcomes:

\frac{\partial{J}}{\partial{w}} = \frac{1}{3}[(2\times1+1-2)\times1+(2\times2+1-1)\times2+(2\times4+1-3)\times4] = 11

\frac{\partial{J}}{\partial{b}} = \frac{1}{3}[(2\times1+1-2)+(2\times2+1-1)+(2\times4+1-3)] = 3.66667

Inspection work below:
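The expected outcomes can also be computed with a short vectorized sketch (assuming NumPy, as in the lab), which doubles as an independent cross-check of the hand calculation:

```python
import numpy as np

x = np.array([1, 2, 4])
y = np.array([2, 1, 3])
w, b = 2, 1

f_wb = w * x + b                          # predictions: [3, 5, 9]
dj_dw_expected = np.mean((f_wb - y) * x)  # (1 + 8 + 24) / 3 = 11.0
dj_db_expected = np.mean(f_wb - y)        # (1 + 4 + 6) / 3 ≈ 3.6667

print(dj_dw_expected, dj_db_expected)     # 11.0 3.6666666666666665
```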
1 m 3 # OK, because my created input has 3 samples
2 i 0 # OK, because i is iterated from 0 up to 2
3 f_wb 3 # OK, because the model is w * x + b, and for the first sample, it is 2*1+1=3
4 dj_dw_i 1 # OK, because by the formula, it should be (f - y) * x, so (3 - 2) * 1 = 1
5 dj_db_i 1 # OK, because by the formula, it should be (f - y) , so (3 - 2) = 1
6 dj_db 1 # OK, the purpose of this line is to accumulate dj_db_i, it should be 0 + 1 = 1
# WRONG !!! expecting a line to accumulate dj_dw_i, but there is none!
2 i 1 # OK, because i is iterated from 0 up to 2
3 f_wb 5 # OK, 2*2+1=5
4 dj_dw_i 4 # WRONG!!!! expecting (5 - 1) * 2 = 8
5 dj_db_i 4 # OK, (5 - 1) = 4
6 dj_db 5 # OK, (4 + 1) = 5
2 i 2 # OK, because i is iterated from 0 up to 2
3 f_wb 9 # OK, 4*2+1=9
4 dj_dw_i 6 # WRONG!!!! expecting (9 - 3) * 4 = 24
5 dj_db_i 6 # OK, (9 - 3) = 6
6 dj_db 11 # OK, (5 + 6) = 11
7 dj_dw 6 # WRONG!!! The loop has ended, and this variable should have accumulated all dj_dw_i which is 1 + 8 + 24 = 33
8 dj_dw 2.0 # WRONG!!! Because by the formula, it should be 33/3 = 11
# WRONG!!!! expecting a line to divide dj_db with m, but there is none. it should be 11/3 = 3.666667
9 dj_db + 3 14 # 14 is greater than 0, so the `if` statement is triggered
10 dj_db 0 # WRONG!!! `dj_db should be 3.666667`
11 dj_dw, dj_db 2.0 0 # WRONG!!! expecting 11 and 3.666667. ALSO, by the code's Line 9 and 10, both should be scalars, which they are.
Correct the code based on the finding
1  def compute_gradient(x, y, w, b):
2      """
3      Computes the gradient for linear regression
4      Args:
5        x (ndarray (m,)): Data, m examples
6        y (ndarray (m,)): target values
7        w,b (scalar)    : model parameters
8      Returns
9        dj_dw (scalar): The gradient of the cost w.r.t. the parameters w
10       dj_db (scalar): The gradient of the cost w.r.t. the parameter b
11     """
12
13     # Number of training examples
14     m = x.shape[0]
15     dj_dw = 0
16     dj_db = 0
17
18     ### START CODE HERE ###
19     print(1, "m", m)                   # <-- ADDED
20     for i in range(m):
21         print(2, "i", i)               # <-- ADDED
22         f_wb = w * x[i] + b
23         print(3, "f_wb", f_wb)         # <-- ADDED
24         dj_dw_i = (f_wb - y[i]) * x[i] # <-- FIXED - added "* x[i]"
25         print(4, "dj_dw_i", dj_dw_i)   # <-- ADDED
26         dj_db_i = f_wb - y[i]
27         print(5, "dj_db_i", dj_db_i)   # <-- ADDED
28         dj_db += dj_db_i
29         print(6, "dj_db", dj_db)       # <-- ADDED
30         dj_dw += dj_dw_i               # <-- FIXED - indented into the loop
31     print(7, "dj_dw", dj_dw)           # <-- ADDED
32     dj_dw = dj_dw / m
33     print(8, "dj_dw", dj_dw)           # <-- ADDED
34     dj_db = dj_db / m                  # <-- FIXED - added the missing line
35     print(8.1, "dj_db", dj_db)         # <-- ADDED
36
37     print(11, "dj_dw, dj_db", dj_dw, dj_db)  # <-- ADDED - print returning variables
38     ### END CODE HERE ###
39
40     return dj_dw, dj_db
Run the code again, and get the following outputs
1 m 3
2 i 0
3 f_wb 3
4 dj_dw_i 1
5 dj_db_i 1
6 dj_db 1
2 i 1
3 f_wb 5
4 dj_dw_i 8
5 dj_db_i 4
6 dj_db 5
2 i 2
3 f_wb 9
4 dj_dw_i 24
5 dj_db_i 6
6 dj_db 11
7 dj_dw 33
8 dj_dw 11.0
8.1 dj_db 3.6666666666666665
11 dj_dw, dj_db 11.0 3.6666666666666665 # as EXPECTED!
Remove the print lines and any other code added for this inspection work so they won't interfere with the grader
1  def compute_gradient(x, y, w, b):
2      """
3      Computes the gradient for linear regression
4      Args:
5        x (ndarray (m,)): Data, m examples
6        y (ndarray (m,)): target values
7        w,b (scalar)    : model parameters
8      Returns
9        dj_dw (scalar): The gradient of the cost w.r.t. the parameters w
10       dj_db (scalar): The gradient of the cost w.r.t. the parameter b
11     """
12
13     # Number of training examples
14     m = x.shape[0]
15     dj_dw = 0
16     dj_db = 0
17
18     ### START CODE HERE ###
19     for i in range(m):
20         f_wb = w * x[i] + b
21         dj_dw_i = (f_wb - y[i]) * x[i]
22         dj_db_i = f_wb - y[i]
23         dj_db += dj_db_i
24         dj_dw += dj_dw_i
25     dj_dw = dj_dw / m
26     dj_db = dj_db / m
27     ### END CODE HERE ###
28
29     return dj_dw, dj_db
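As a final sanity check before submitting, we can run the cleaned-up function against the simple inputs and compare with the hand-computed values. The function below just repeats the corrected code above, without the lab's line numbers, so the sketch is self-contained:

```python
import numpy as np

def compute_gradient(x, y, w, b):
    # corrected loop-based gradient for linear regression
    m = x.shape[0]
    dj_dw = 0
    dj_db = 0
    for i in range(m):
        f_wb = w * x[i] + b
        dj_dw += (f_wb - y[i]) * x[i]
        dj_db += f_wb - y[i]
    return dj_dw / m, dj_db / m

dj_dw, dj_db = compute_gradient(np.array([1, 2, 4]), np.array([2, 1, 3]), 2, 1)
assert dj_dw == 11.0                  # matches the expected outcome
assert abs(dj_db - 11 / 3) < 1e-9     # ~3.666667, matches the expected outcome
print("checks passed:", dj_dw, dj_db)
```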
Good luck!
Raymond