Throughout the course I’ve been thinking about a particular question. A figure from one of the labs illustrates what I mean more precisely:

How does unit 1 know that unit 0 has already separated two groups of classes (c0 with c1, and c2 with c3)? That is, isn’t there a possibility that unit 1 groups the classes c0, c1, c2, and c3 the same way unit 0 does? And if it is impossible for different neurons to group them in the same way, how do they essentially “know” what the other neurons have already classified?

Certainly neurons do not “talk” to each other directly; however, they do “interact” with each other through the cost function, because that is the one place where all the neurons’ weights come together to be evaluated on how well they are doing at minimizing the cost.

Their story is pretty mechanical: it starts with the neurons having different sets of initial weight values, and then the optimization algorithm (gradient descent) tells each weight how to change its value for a better cost.

If one of the neurons happens to find a very rewarding boundary that dramatically reduces the cost (such as the great boundary shown on the left of your screenshot), then at the next gradient descent update the algorithm may decide that the champion neuron should not move much, and of course that decision comes from estimating the gradients. So the champion neuron may just linger around its current position. The other neurons will also feel that victory, but they continue to find their own ways to contribute to reducing the cost, until another champion boundary, like the one on the right, is found.
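To make the “lingering” concrete, here is a toy sketch (my own invented example, not from the lab): two weights share one quadratic cost, one weight is already near its optimum, so its gradient, and therefore its update, is tiny, while the other still moves a lot.

```python
import numpy as np

# Toy cost shared by two weights: C(w) = (w0 - 1)^2 + (w1 - 5)^2.
# w0 has already found a near-optimal value (the "champion"); its gradient
# is tiny, so gradient descent barely moves it, while w1 still travels far.
w = np.array([0.99, 0.0])               # w0 close to its optimum 1.0, w1 far from 5.0
grad = 2 * (w - np.array([1.0, 5.0]))   # dC/dw for this quadratic cost
lr = 0.1
step = -lr * grad                       # gradient descent update for each weight
print(step)                             # w0 moves ~0.002, w1 moves 1.0
```

The sizes of the updates come entirely from the gradients, which is exactly the sense in which the algorithm “decides” that the champion weight should stay put.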

So, neurons 1 and 2 found the right boundaries under the guidance of gradient descent. Neuron 1 happened to find the boundary on the left and neuron 2 the one on the right because of their different initial weight values. If the initial weights had been swapped, then neuron 2 would have found the left boundary and neuron 1 the right one.

Thank you very much for the answer; it clarifies some of my doubts.

Although different neurons start with, a priori, different weights, there is still a remote possibility that they in fact do the same thing, that is, they converge to the same “victory” and draw the same decision boundary, right?

One last question: is the fact that different decision boundaries are drawn purely due to the weights of each neuron being initialized randomly and, thus, different neurons having different weights?

I do not know how many experiments you would have to run to find two differently initialized neurons ending up sharing the same weights. You may change your mind after doing the experiments yourself.
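One such experiment can be sketched in a few lines of NumPy (a minimal hand-rolled 2-2-1 sigmoid network I made up for this purpose, not the lab’s code). It shows the flip side of the question: if two hidden units start with *identical* weights, their gradients are identical at every step, so they never diverge and end up drawing the same boundary, whereas random initialization breaks the symmetry from step 0.

```python
import numpy as np

def train(W1, b1, W2, b2, X, y, lr=0.5, steps=1000):
    """Full-batch gradient descent on a tiny 2-2-1 sigmoid network."""
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(steps):
        A1 = sig(X @ W1 + b1)            # hidden activations, shape (n, 2)
        A2 = sig(A1 @ W2 + b2)           # output, shape (n, 1)
        dZ2 = A2 - y                     # gradient for sigmoid + cross-entropy
        dW2 = A1.T @ dZ2 / len(X)
        db2 = dZ2.mean(axis=0)
        dZ1 = (dZ2 @ W2.T) * A1 * (1 - A1)
        dW1 = X.T @ dZ1 / len(X)
        db1 = dZ1.mean(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return W1, W2

# XOR-style data: no single boundary separates the classes,
# so the two hidden units *need* to learn different boundaries.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Symmetric start: both hidden units get the SAME weights.
W1s, _ = train(np.full((2, 2), 0.5), np.zeros(2),
               np.full((2, 1), 0.5), np.zeros(1), X, y)
print(np.allclose(W1s[:, 0], W1s[:, 1]))  # True: the units never diverge

# Random start: symmetry is broken from the very first step.
rng = np.random.default_rng(0)
W1r, _ = train(rng.normal(size=(2, 2)), np.zeros(2),
               rng.normal(size=(2, 1)), np.zeros(1), X, y)
print(np.allclose(W1r[:, 0], W1r[:, 1]))  # False: each unit finds its own boundary
```

So random initialization is indeed what lets the units specialize; with identical starting weights the two hidden units receive identical gradients forever and remain clones of each other, which is the usual argument for why weights must not all be initialized to the same value.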