Just finished the Week 3 assignment. I had a look at the hints after completing it and found that they could be improved, along with some of the descriptions in the course.
- The last 4 hints regarding the transpose are redundant: the result is basically the same without the transpose, just using matrix multiplication.
- It would be much clearer to state that each word is an observation with p features, which corresponds to an (n x p) data matrix, so each row is a word vector.
In the Deep Learning Specialization, Andrew prefers (p x n) to represent the data, but here it is (n x p). I think making this clear upfront is very important, as this is the final data structure before applying models.
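
To illustrate the two layouts, here is a minimal NumPy sketch. The toy words, the numbers, and the projection matrix `R` are all made up for illustration and are not taken from the assignment. It only shows that with one row per word the projection is a plain matrix product, and that the (p x n) layout gives the same numbers, just transposed.

```python
import numpy as np

# Hypothetical toy data: n = 3 words, p = 4 features per word.
words = ["king", "queen", "apple"]
X = np.array([
    [1.0, 2.0, 0.0, 1.0],   # word vector for "king"
    [1.0, 2.0, 1.0, 1.0],   # word vector for "queen"
    [0.0, 0.0, 3.0, 2.0],   # word vector for "apple"
])
n, p = X.shape  # (n x p): each row is one word vector

# Hypothetical projection onto k = 2 directions, shape (p x k).
R = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
    [0.0, 0.0],
])

# Row layout (n x p): project with a plain matrix product, no transpose.
projected_rows = X @ R           # shape (n, k)

# Column layout (p x n): each column is one word vector.
X_cols = X.T                     # shape (p, n)
projected_cols = R.T @ X_cols    # shape (k, n)

# Both layouts hold the same numbers; one is the transpose of the other.
same = np.allclose(projected_rows, projected_cols.T)
print(projected_rows.shape, projected_cols.shape, same)
```

Whichever layout is used, stating it once upfront tells the reader whether to index words by row or by column.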