I noticed between the videos “Computing a Neural Network’s Output” and “Vectorizing Across Multiple Training Examples” the notation for describing the matrices becomes a little confusing.

For a single training example, the video defines the notation as:

a_i^{[l]}, where l = layer and i = node

For multiple training examples the notation is:

a^{[l](i)}, where l = layer and i = training example

So i now takes on a new meaning. In the same video, Andrew describes the X matrix as an n_x by m matrix, where m is the number of training examples; i then indexes those examples, running from 1 to m. Reusing i this way, first for nodes and then for training examples, is confusing for newbies.
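To make the two roles of the indices concrete, here is a quick NumPy sketch of the vectorized forward pass from the video (the layer sizes and variable names here are hypothetical, just for illustration):

```python
import numpy as np

n_x, m = 3, 5                  # n_x features per example, m training examples
n_1 = 4                        # nodes in layer 1 (hypothetical size)

X = np.random.randn(n_x, m)    # X is n_x by m, as Andrew describes
W1 = np.random.randn(n_1, n_x)
b1 = np.zeros((n_1, 1))

Z1 = W1 @ X + b1               # vectorized across all m examples at once
A1 = np.tanh(Z1)               # A1 has shape (n_1, m)

# A1[i, j] is the activation of node i in layer 1 for training example j:
# the row index plays the "node" role, the column index the "example" role.
print(A1.shape)
```

The two meanings of i in the videos correspond to the two different axes of this one matrix, which is exactly where the confusion comes from.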

Am I correct in my interpretation? If so, I suggest a more standardized approach in which i is not reused with multiple meanings, and the same super/subscripts are used across the videos for better consistency. I think the notation for multiple training examples would be better written as:

a_i^{[l](m)}, where l = layer, i = node, and m = training example

Lastly, I realize it makes sense in a loop to say "for i = 1 to m". So maybe the real issue is in the first video, where i was used to describe the node.