I am not sure what they mean by that. I can think of two theories:
- They really mean the “feature vectors” in the W^{[1]} case. Those are legitimately column vectors.
- They mean it in the sense explained in the lectures, where Prof Ng treats the rows of W as the transposes of the w weight vectors as we had them in Logistic Regression. So they mean the w vectors before the transpose is applied.
If it is the second theory, then it is explained on this thread.
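To make the second theory concrete, here is a minimal NumPy sketch of how the rows of W relate to the Logistic Regression w vectors. The layer sizes here are made-up example values, not anything from the course:

```python
import numpy as np

n_x = 3       # number of input features (example value)
n_units = 4   # number of units in the layer (example value)

# Logistic Regression convention: each weight vector w is a column
# vector of shape (n_x, 1), and the pre-activation is z = w^T x + b.
w_vectors = [np.random.randn(n_x, 1) for _ in range(n_units)]

# Layer convention: W stacks the *transposed* w vectors as its rows,
# giving shape (n_units, n_x).
W = np.vstack([w.T for w in w_vectors])
assert W.shape == (n_units, n_x)

# For one sample x of shape (n_x, 1), z = W x + b computes every
# unit's w^T x at once, matching the per-vector computation row by row.
x = np.random.randn(n_x, 1)
b = np.zeros((n_units, 1))
z = W @ x + b
for i, w in enumerate(w_vectors):
    assert np.isclose(z[i, 0], (w.T @ x)[0, 0])
```

So each row of W really is one of the old w vectors, just written horizontally, which is why the lectures describe the rows as "w transpose".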
In either case, I think that way of explaining their answer is unnecessarily confusing. I will do some more research and then file a bug asking them to make this clearer.
Thanks for pointing this out.