The activation function is applied “elementwise”, so its derivative is also computed “elementwise”. That means $\frac{\partial A}{\partial Z}$ has the same shape as A and Z. In Prof Ng’s notation, that would be 1 x m, where m is the number of input samples.
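
Here is a small NumPy sketch of the shape argument. It assumes a sigmoid activation (the post does not say which activation is in use) and a made-up batch of m = 5 samples; the function names `sigmoid` and `sigmoid_derivative` are just illustrative, not taken from the course code.

```python
import numpy as np

def sigmoid(z):
    # Elementwise: NumPy applies the formula to every entry of z.
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    # Elementwise derivative of sigmoid: g'(z) = g(z) * (1 - g(z)).
    a = sigmoid(z)
    return a * (1.0 - a)

m = 5                                        # hypothetical number of samples
Z = np.linspace(-2.0, 2.0, m).reshape(1, m)  # shape (1, m)
A = sigmoid(Z)                               # shape (1, m)
dA_dZ = sigmoid_derivative(Z)                # shape (1, m), same as A and Z

print(Z.shape, A.shape, dA_dZ.shape)
```

Because each entry of A depends only on the matching entry of Z, storing one derivative per entry is enough, which is why the result keeps the 1 x m shape rather than becoming a full Jacobian matrix.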