# Number of activation values

Hello

I am a bit confused about the number of activation values in each layer. According to the lecture, layer 2 with 15 neurons should output 15 activation values. But for the first neuron in this 2nd layer,
i.e. g(w1[2].a[1]+b1[2]), would there not be 25 values just for this neuron, since there are 25 activation values (i.e. a[1]) fed into this neuron alone?

Thank you

Eric

Hello Eric,

Here’s what’s happening. The first neuron in layer 2 receives the 25 activation values from layer 1, and each of those values is multiplied by its corresponding weight in that neuron (i.e. the neuron has 25 w’s and 1 b). The 25 products are then summed together with the bias term to produce one number, which goes through the activation function and becomes the one and only activation value output by that neuron.
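A quick sketch in numpy may help make the shapes concrete. The weight and input values below are random placeholders, and sigmoid is just one possible choice of g:

```python
import numpy as np

rng = np.random.default_rng(0)

def g(z):
    # sigmoid, as one example of an activation function
    return 1.0 / (1.0 + np.exp(-z))

a1 = rng.random(25)            # 25 activation values from layer 1

# One neuron in layer 2: 25 weights and 1 bias
w = rng.random(25)
b = 0.1

z = np.dot(w, a1) + b          # 25 products summed with the bias -> one number
a = g(z)                       # a single scalar: this neuron's one activation value

# The whole of layer 2: 15 neurons -> a (15, 25) weight matrix and 15 biases
W2 = rng.random((15, 25))
b2 = rng.random(15)
a2 = g(W2 @ a1 + b2)           # a2.shape == (15,): 15 activation values in total
```

So the 25 inputs are collapsed into one number per neuron, and with 15 neurons the layer outputs exactly 15 activation values.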

Raymond

So can I clarify: do the 25 weights all have different values, or do the 25 weights in the same neuron share the same value?

Thank you

Eric

They are supposed to be initialized randomly, and then take on different values throughout the training process, unless we knowingly regularize them so strongly that they are pushed to zero, or we are suffering from some other unwanted problem.
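As a small illustration of why the random start matters, here is a hedged sketch: if two neurons are initialized identically, they compute identical outputs (and would receive identical gradients), so they never diversify; random initialization breaks that symmetry. The scale 0.01 is just a common convention, not anything prescribed by the course:

```python
import numpy as np

rng = np.random.default_rng(42)
a1 = rng.random(25)                      # placeholder inputs from layer 1

# Constant initialization: two neurons with the same weights and bias
w_const_1 = np.zeros(25)
w_const_2 = np.zeros(25)
same_output = np.dot(w_const_1, a1) == np.dot(w_const_2, a1)  # always True

# Random initialization: the weights start out different from each other,
# so the two neurons already compute different outputs before any training
w_rand_1 = rng.standard_normal(25) * 0.01
w_rand_2 = rng.standard_normal(25) * 0.01
diff_output = np.dot(w_rand_1, a1) != np.dot(w_rand_2, a1)
```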

Cheers
Raymond

PS: this is a post that explains why neurons can diversify, and it links to another post that demonstrates how you can make neurons not diversify.

Thanks so much for that!