Hi,
I’ve submitted the C1_W2_Linear_Regression lab and would like to share with you some thoughts about the approach I’ve taken.
As we learnt in weeks 1 and 2, the predictions of the linear regression model can be calculated with a for loop as:
import numpy as np

m = x.shape[0]
f_wb = np.zeros(m)  # preallocate the predictions array
for i in range(m):
    f_wb[i] = w * x[i] + b
Or, more efficiently, with the NumPy vectorised function np.dot as:
f_wb = np.dot(x, w) + b
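As a quick sanity check, here is a small sketch (the values of x, w and b are made up purely for illustration) showing that both versions produce the same predictions:

# Made-up example data: 3 training examples, scalar parameters
x = np.array([1.0, 2.0, 3.0])
w = 2.0
b = 0.5

# For-loop version
m = x.shape[0]
f_wb_loop = np.zeros(m)
for i in range(m):
    f_wb_loop[i] = w * x[i] + b

# Vectorised version
f_wb_vec = np.dot(x, w) + b

print(np.allclose(f_wb_loop, f_wb_vec))  # prints True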
The cost function, or mean squared error (MSE), is calculated as:
cost = 0
for i in range(m):
    f_wb = np.dot(x[i], w) + b   # prediction for example i
    cost += (f_wb - y[i]) ** 2
total_cost = cost / (2 * m)
By using another vectorised function, np.sum, we no longer need to calculate the summation in a for loop, which makes the code more efficient and easier to write and read.
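For example, reusing the made-up x, w, b and m from the check above and a made-up target vector y, a vectorised cost could look like this:

y = np.array([2.4, 4.6, 6.4])  # made-up targets, purely for illustration

# Predictions for all m examples at once
f_wb = np.dot(x, w) + b
# np.sum replaces the summation loop over the squared errors
total_cost = np.sum((f_wb - y) ** 2) / (2 * m)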
Lastly, the gradients for gradient descent are calculated as:
m, n = x.shape
dj_dw = np.zeros(n)  # gradient with respect to each weight
dj_db = 0.0          # gradient with respect to the bias
for i in range(m):
    err = (np.dot(x[i], w) + b) - y[i]
    for j in range(n):
        dj_dw[j] += err * x[i, j]
    dj_db += err
dj_dw = dj_dw / m
dj_db = dj_db / m
Let’s see how we can get rid of each for loop:
- The for loop for the gradient of b, for i in range(m): dj_db += err, is replaced by the vectorised sum, as mentioned previously.
- The nested for loop for the gradient of w, for i in range(m): for j in range(n): dj_dw[j] += err * x[i, j], is replaced by the dot product of the transpose of \mathbf{x} and the error \epsilon.
By following these steps, we can calculate the gradients with just a few vectorised operations, as sketched below.
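Here is a minimal sketch of the fully vectorised gradients, assuming x is now the (m, n) feature matrix, y the target vector, and w, b the current parameters:

m, n = x.shape

# Error vector for all m examples at once
err = np.dot(x, w) + b - y    # shape (m,)

# The nested loop becomes a dot product with the transpose of x,
# and the summation for b becomes np.sum
dj_dw = np.dot(x.T, err) / m  # shape (n,)
dj_db = np.sum(err) / m       # scalar

These produce exactly the same values as the double for loop, just computed in a single pass by NumPy.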
I’ve done more than the exercise asked for, but I’ve learnt a lot of new things along the way.