Hello @popaqy
To answer your question: “However, I am wondering how each neuron’s sigmoid function determines its own cost function? It must be a brilliant technique.”
Yes, there is indeed a brilliant technique and it is called backpropagation.
Going back to the basics: to update a weight parameter, what we really need is not the Cost itself, but rather the gradient \frac {\partial J} {\partial w}.
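Concretely, once we have that gradient, the gradient descent update for each weight is w := w - \alpha \, \frac {\partial J} {\partial w}, where \alpha is the learning rate (the same form applies to each bias b).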
There is only a single Cost value J for the overall network, which we are trying to minimize. But we can still find the derivative of that Cost w.r.t. every single parameter of the network - \frac {\partial J} {\partial w_{i,j}} and \frac {\partial J} {\partial b} for every layer. In this manner, we do not just update the final layer's (w, b) parameters; we update the (w, b) parameters in all the layers, all the way back to the first layer, such that the overall Cost J is minimized - and this adds to the magic of Neural Networks!
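To make that concrete, here is a minimal sketch (not the course's assignment code) of a 2-layer sigmoid network trained on a small made-up XOR-style dataset. The architecture, learning rate, and data are my own illustrative choices; the point is that backpropagation produces dJ/dW and dJ/db for every layer, and gradient descent then updates all of them, not just the last layer:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Tiny made-up dataset: 4 examples, 2 features, binary labels (XOR-like).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float).T  # shape (2, 4)
Y = np.array([[0, 1, 1, 0]], dtype=float)                      # shape (1, 4)
m = X.shape[1]

# Parameters for layer 1 (hidden, 3 units) and layer 2 (output, 1 unit).
W1, b1 = rng.standard_normal((3, 2)), np.zeros((3, 1))
W2, b2 = rng.standard_normal((1, 3)), np.zeros((1, 1))
alpha = 1.0  # learning rate (illustrative choice)

for step in range(5000):
    # Forward pass: compute activations layer by layer.
    Z1 = W1 @ X + b1
    A1 = sigmoid(Z1)
    Z2 = W2 @ A1 + b2
    A2 = sigmoid(Z2)

    # Single scalar Cost J (binary cross-entropy) for the whole network.
    J = -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))

    # Backward pass (chain rule): gradients of J w.r.t. every parameter.
    dZ2 = A2 - Y                        # dJ/dZ2 for sigmoid + cross-entropy
    dW2 = (dZ2 @ A1.T) / m
    db2 = dZ2.mean(axis=1, keepdims=True)
    dZ1 = (W2.T @ dZ2) * A1 * (1 - A1)  # propagate the error back through layer 1
    dW1 = (dZ1 @ X.T) / m
    db1 = dZ1.mean(axis=1, keepdims=True)

    # Gradient-descent update for ALL layers, not just the last one.
    W2 -= alpha * dW2; b2 -= alpha * db2
    W1 -= alpha * dW1; b1 -= alpha * db1

print("final cost:", J)
```

Notice that there is only one J in the loop, yet every W and b in the network gets its own gradient and its own update.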
If you want to know more about this, you can take a look here.