I tried to work out how to calculate dw (dL/dw), but I only got more confused.

Could someone please help me?

How do I take a derivative with respect to a matrix?

Can we use the chain rule with matrix derivatives?

After taking the derivative, how do I know which components need to be transposed?

Here’s what I tried, along with Andrew’s slides:

Matrix calculus is not covered in this course. If you want to delve deeper into that, here is a thread with links about the underlying mathematics and derivations of back prop and the like.
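In the meantime, here is a rough sketch of one way to sanity-check a matrix gradient. The setup is an assumption for illustration and is not taken from the slides: a single linear step Z = W X with a squared-error loss L = 0.5 * sum((Z - Y)^2), and made-up values for W, X and Y. The helper names (`matmul`, `transpose`, `grad_W`, `numerical_grad_W`) are hypothetical too. For this setup the chain rule gives dL/dZ = Z - Y and then dW = dZ X^T. One way to see where the transpose goes is that dW has to have the same shape as W: dZ is (2, 4) and X is (3, 4), so only dZ times X^T produces a (2, 3) result.

```python
# Assumed toy setup (not from the course slides): Z = W X, L = 0.5 * sum((Z - Y)^2).
# Matrices are plain lists of rows so the sketch needs no third-party library.

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]

def loss(W, X, Y):
    """Squared-error loss for the linear step Z = W X."""
    Z = matmul(W, X)
    return 0.5 * sum((z - y) ** 2 for zr, yr in zip(Z, Y) for z, y in zip(zr, yr))

def grad_W(W, X, Y):
    """Chain rule: dL/dZ = Z - Y, then dL/dW = dZ X^T (same shape as W)."""
    Z = matmul(W, X)
    dZ = [[z - y for z, y in zip(zr, yr)] for zr, yr in zip(Z, Y)]
    return matmul(dZ, transpose(X))

def numerical_grad_W(W, X, Y, eps=1e-6):
    """Central-difference estimate of dL/dW, one entry of W at a time."""
    grad = [[0.0] * len(W[0]) for _ in W]
    for i in range(len(W)):
        for j in range(len(W[0])):
            old = W[i][j]
            W[i][j] = old + eps
            plus = loss(W, X, Y)
            W[i][j] = old - eps
            minus = loss(W, X, Y)
            W[i][j] = old  # restore the original entry
            grad[i][j] = (plus - minus) / (2 * eps)
    return grad

# Made-up values: W is (2, 3), X is (3, 4), Y is (2, 4).
W = [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6]]
X = [[1.0, 2.0, 0.5, -1.0], [0.0, 1.0, -1.5, 2.0], [3.0, -2.0, 1.0, 0.5]]
Y = [[1.0, 0.0, 0.5, -0.5], [0.0, 1.0, -1.0, 2.0]]

dW = grad_W(W, X, Y)
dW_num = numerical_grad_W(W, X, Y)
max_diff = max(abs(a - b) for ra, rb in zip(dW, dW_num) for a, b in zip(ra, rb))
print("shape of dW:", len(dW), "x", len(dW[0]))
print("largest difference from numerical gradient:", max_diff)
```

If the analytic and numerical gradients agree closely, the transpose is probably in the right place. Putting it anywhere else in this example either fails on the shapes or gives a result that does not match W.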

OK, I’ll go through the thread you linked, thank you so much!!