I do not think that formulas (6) are correct given the log loss function (5). The log loss in (5) is a sum divided by -m. The derivative of a sum is the sum of the derivatives, divided by whatever sits outside the summation, which in this case is -m and not m.

Below formulas (6) it says that dL/da(i) * da(i)/dz(i) = (a(i) - y(i)), “as discussed in the videos”. However, what was discussed in the videos was a log loss calculation in which the minus sign was inside the summation. In fact there was no summation at all, since the loss was written for a single example:

L = -y(i)*ln(a(i)) - (1-y(i))*ln(1-a(i))

dL/dz= (a(i)-y(i))
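
For reference, the chain rule behind that single-example result can be written out as follows. This is my own sketch, assuming a = sigmoid(z) as in the videos, with the superscript (i) dropped for brevity:

```latex
\begin{aligned}
L &= -y \ln a - (1 - y) \ln(1 - a), \qquad a = \sigma(z) \\
\frac{\partial L}{\partial a} &= -\frac{y}{a} + \frac{1 - y}{1 - a} = \frac{a - y}{a(1 - a)} \\
\frac{\partial a}{\partial z} &= a(1 - a) \\
\frac{\partial L}{\partial z} &= \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z} = a - y
\end{aligned}
```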

In presentation (5) of the log loss formula, the term inside the summation carries no minus sign, so dL/da(i) * da(i)/dz(i) would be -(a(i) - y(i)). The minus from this derivative would then cancel with the minus outside the summation, which results in formulas (7).

L = y(i)*ln(a(i)) + (1-y(i))*ln(1-a(i))

dL/dz= -(a(i)-y(i))
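
A quick numerical check of both sign conventions, as a minimal sketch. It assumes a = sigmoid(z); the function names and the test point (z = 0.3, y = 1) are my own choices and do not come from the course material.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_minus_inside(z, y):
    # Single-example loss as in the videos: the minus sign is part of the term.
    a = sigmoid(z)
    return -y * math.log(a) - (1 - y) * math.log(1 - a)

def summand_minus_outside(z, y):
    # Summand as in presentation (5): the minus sign is left outside the sum.
    a = sigmoid(z)
    return y * math.log(a) + (1 - y) * math.log(1 - a)

def numeric_derivative(f, z, y, h=1e-6):
    # Central finite difference with respect to z.
    return (f(z + h, y) - f(z - h, y)) / (2 * h)

z, y = 0.3, 1.0  # arbitrary test point (assumption)
a = sigmoid(z)

d_inside = numeric_derivative(loss_minus_inside, z, y)
d_outside = numeric_derivative(summand_minus_outside, z, y)

print("minus inside: ", d_inside, "vs  a - y    =", a - y)
print("minus outside:", d_outside, "vs  -(a - y) =", -(a - y))
```

The first pair of numbers should agree with a - y and the second with -(a - y), up to finite-difference error, which is the sign flip described above.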

The bottom line is that formulas (6) would be correct if the minus sign were inside the summation in formula (5).