In the Backprop lecture in week 3, I am a bit confused on the derivation of some of the derivatives.
How are these derivatives derived? Thanks!
By taking the derivatives. The point is that you need to know vector and matrix calculus to derive the formulas shown here. The principles are the same as in univariate calculus, but the added dimensions make things a bit more complicated.
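As a concrete instance (a sketch using the usual notation from the lectures, with a sigmoid output $a = \sigma(z)$ and cross-entropy loss): for a single example with label $y$,

$$L = -\big(y \log a + (1-y)\log(1-a)\big), \qquad \frac{da}{dz} = a(1-a)$$

so by the chain rule

$$\frac{\partial L}{\partial z} = \frac{\partial L}{\partial a}\cdot\frac{da}{dz} = \left(-\frac{y}{a} + \frac{1-y}{1-a}\right) a(1-a) = a - y$$

which is where the familiar $dZ = A - Y$ in the lecture comes from. The matrix versions of the formulas stack this same computation across all examples and units at once, which is where the extra bookkeeping comes in.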
The high-level answer is that Prof Ng has specifically designed these courses not to require any knowledge of calculus, even univariate calculus, so he does not show the actual derivation of most of the formulas. Here's a thread with links to more information about all this and to some background information about matrix calculus, if you have some math background and want to dig deeper.
The good news is that you don't need to know calculus; the flip side is that you then have to take Prof Ng's word for the formulas.
Thank you! This is a random question, but if I were looking for jobs as an applied ML Engineer, would matrix calculus be necessary for me to know? I am just trying to understand the industry standard!
If you are applying ML to solve real world problems, you don’t need to know matrix calculus. That’s why Prof Ng does not require it in these courses. You would only need that if you wanted to be a researcher working on developing new types of algorithms or enhancing one of the “frameworks” like TensorFlow or PyTorch. The calculus is “built in” to the algorithms in TF and Torch and other platforms.
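To make "built in" concrete, here is a minimal sketch using PyTorch's autograd (the variable names and the tiny one-layer setup are just illustrative):

```python
import torch

# A tiny "layer": weights w, one input example x, sigmoid activation.
w = torch.randn(3, requires_grad=True)   # parameters we want gradients for
x = torch.tensor([0.5, -1.0, 2.0])       # one training example
y = torch.tensor(1.0)                    # its label

a = torch.sigmoid(w @ x)                                  # forward pass
loss = -(y * torch.log(a) + (1 - y) * torch.log(1 - a))   # cross-entropy

loss.backward()   # autograd applies the chain rule (backprop) for us
print(w.grad)     # dLoss/dw, computed without writing any calculus by hand
```

You can check that `w.grad` matches the hand-derived $(a - y)\,x$ from the formula above; the framework tracks the operations in the forward pass and runs the chain rule for you, which is why practitioners rarely derive these gradients themselves.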