Module 2 - Vectorization Part 2

letusc · July 9, 2025, 5:44am

At video time 2:36, the explanation of gradient descent with vectorization starts. I understand that there are 16 parameters (w1, w2, …, w16). What I don't understand is how there can be 16 derivative terms for these 16 weights. As I understand it, the derivative is 1/m * (sum over i = 0 to m-1 of (f_w_b(x(i)) - y(i)) * x(i)), and all 16 weights are used together to compute a single f_w_b(x(i)). So I am confused about how Prof. Ng ends up with a vector of 16 derivative values. I must be missing some key point here. Help please!

letusc · July 9, 2025, 5:53am

Oh, I think I know why, but I can't be sure unless someone confirms.
When calculating the derivative, (f_w_b(x(i)) - y(i)) is a single scalar. That scalar is then multiplied by x(i), which is a vector of size 16. Maybe that is what produces the sixteen partial derivative values.

gent.spah · July 9, 2025, 7:02am

Yes, that's right. You need to take the partial derivative with respect to each weight w_j, and the j-th partial derivative multiplies the scalar error by the j-th feature of x(i).
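
For anyone who wants to check this numerically, here is a minimal NumPy sketch. It assumes a linear model f_w_b(x) = w · x + b with 16 features. The function names and the random data are made up for illustration and are not taken from the course lab. The loop version follows the formula term by term (scalar error times the 16-element x(i)), and the vectorized version should give the same 16 values.

```python
import numpy as np

def gradient_loop(X, y, w, b):
    """Gradient of the cost w.r.t. w, accumulated one example at a time."""
    m, n = X.shape
    dj_dw = np.zeros(n)
    for i in range(m):
        err_i = np.dot(w, X[i]) + b - y[i]  # scalar: f_w_b(x(i)) - y(i)
        dj_dw += err_i * X[i]               # scalar * vector of length n
    return dj_dw / m

def gradient_vectorized(X, y, w, b):
    """Same gradient computed with matrix operations."""
    m = X.shape[0]
    err = X @ w + b - y                     # shape (m,), one error per example
    return X.T @ err / m                    # shape (n,), one entry per weight

# Illustrative random data: m = 5 examples, n = 16 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))
y = rng.normal(size=5)
w = rng.normal(size=16)
b = 0.5

g_loop = gradient_loop(X, y, w, b)
g_vec = gradient_vectorized(X, y, w, b)
print(g_loop.shape)
print(np.allclose(g_loop, g_vec))
```

Each entry of the result is the partial derivative for one weight, which is why the gradient has the same length as w.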
