a1 is the probability for digit 0

a2 is the probability for digit 1

…

a10 is the probability for the digit 9.

What is your question exactly?

In the text of the presentation, you have “p(y=1)”, “p(y=2)”, …, “p(y=9)”, “p(y=10)”.

The pictures are for digits 0 to 9. What is “y” referring to? You don’t have pictures showing digits 1 to 10.

Are you referring to the index position? You can’t be referring to the image representation.

There is no “10” in the image representation. You would also be missing the “0”.

For the text to make sense, p(y=1) must refer to the glyph for “0”, p(y=2) to the glyph for “1”, …, and p(y=10) to the glyph for “9”.

Sorry for the delayed response. In the context you mentioned, ‘y’ refers to the output labels.

In the ‘Softmax’ video, the general case is presented for N outputs, with ‘y’ ranging from 1 to N. That is probably why Prof. Andrew used the example of ‘y’ ranging from 1 to 10.

In practice, however, for a task such as recognizing handwritten digits, the labels for ‘y’ typically range from 0 to 9.
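
To make the two conventions concrete, here is a minimal Python sketch. The logits are made-up values for illustration only, and the names softmax, z, and a are my own, not taken from the course code. It assumes the lecture's 1-based outputs a1 to a10 map to digits 0 to 9, as described at the top of this thread.

```python
import math

def softmax(z):
    """Return softmax probabilities for a list of logits."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the 10 output units (illustrative values only)
z = [0.5, 2.0, -1.0, 0.1, 0.0, 3.0, -0.5, 1.2, 0.3, -2.0]
a = softmax(z)

# Lecture convention: outputs are numbered 1..10, so a_j = P(digit j-1)
for j, p in enumerate(a, start=1):
    print(f"a{j} = P(digit {j - 1}) = {p:.4f}")

# Practical convention: the 0-based index of the largest probability
# is the predicted digit itself
predicted_digit = max(range(len(a)), key=lambda i: a[i])
print("predicted digit:", predicted_digit)
```

With 0-based indexing in Python, index i of the output list already matches digit i, which is one reason labels 0 to 9 are convenient in practice.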

Thank you! I appreciate it.