I’ve investigated a little, and it seems the images must be pre-processed into black and white so that the color intensity of each pixel can be evaluated.
But it’s not clear, at least to me, how a neural network using math functions can detect a curve in a digit, or tell whether the digit is a 9 or a 4.
It cannot be magic, even if it seems so; there must be an explanation of how it works under the hood.
If you can recommend a document to read so I can understand it more deeply, I will read it.
If you tell me that I will understand this in the Deep Learning course, that’s fine, I will take it.
Hi @gmazzaglia, this is a great question that I spent a lot of time trying to figure out when I started. I will try to explain how this works.
In this example you have a picture, and each pixel has a number associated with it. The number itself is not important; what matters is that it is a representation of the image. You convert the picture into numbers: a black pixel has the same number as any other black pixel, and a different number from a white one. The algorithm recognizes the pattern and learns how to discriminate between the different patterns of numbers, so it is not magic, it is math.
Let’s see an example.
You have a picture of the number 1.
Let’s say you have only six pixels (obviously this is an exaggerated simplification).
| Column 1 | Column 2 | Column 3 |
|----------|----------|----------|
| 0.002 | 1.313 | 0.002 |
| 0.001 | 1.312 | 0.002 |
The algorithm will look at this pattern and predict that it is similar to other patterns that represent the number one, which could look something like this:
| Column 1 | Column 2 | Column 3 |
|----------|----------|----------|
| 0 | 1 | 0 |
| 0 | 1 | 0 |
Note that in this case the numbers in the matrix represent the intensity of the picture at different pixels: a value of 0 means there is no writing at that pixel, and the intensity is 1 at the pixels where the writing is present.
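To make the "similar pattern" idea concrete, here is a minimal sketch in plain Python. It is only an illustration under some assumptions: the `templates` dictionary, the horizontal-bar pattern labeled `"-"`, and the use of cosine similarity are all hypothetical choices made for this example. A real neural network does not compare against hand-written templates; it learns its weights from many labeled images. The intuition is the same, though: the six pixel values are treated as a list of numbers, and the pattern they form is scored against what each class tends to look like.

```python
# Tiny 2x3 "images", flattened row by row into lists of six pixel intensities.
# These are the values from the first table above.
observed = [0.002, 1.313, 0.002,
            0.001, 1.312, 0.002]

# Hypothetical reference patterns (hand-written, NOT from a trained model).
templates = {
    "1": [0, 1, 0,   # vertical stroke in the middle column
          0, 1, 0],
    "-": [1, 1, 1,   # horizontal bar in the top row (made up for contrast)
          0, 0, 0],
}

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)

def classify(pixels, templates):
    """Score the pixels against every template and return the best label."""
    scores = {label: cosine_similarity(pixels, t)
              for label, t in templates.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores

label, scores = classify(observed, templates)
print("predicted:", label)
print("scores:", scores)
```

The noisy values (0.002, 1.313, ...) are not identical to the clean 0/1 pattern, but they point in almost the same direction, so the score for `"1"` comes out much higher than the score for the bar pattern. A trained network does something loosely comparable, except that the patterns it scores against are learned rather than written by hand.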
Understood, thanks @pastorsoto. It’s clearer now. I have a question about the activation functions that you use to detect that pattern.
Were they selected by trial and error? I mean, do you try out which combination of activation functions works best, or is there a pattern there too? For example, are certain combinations of functions better for images than others?