Do we need to find parameters for every example vector X, or do we improve a single parameter set using many examples? I thought that, just as in logistic regression, each unit has a single parameter set for the vector X, and we improve those parameters using multiple examples. Why would we find parameters for every example vector?
There is one set of weights and biases that are adjusted so they make the best fit to the entire set of data.
We do not have separate parameters for each example.
In a neural network, there is also a weight for every pair of units in adjacent layers. So the weights for each layer form a matrix, not just a vector as they were in logistic regression.
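If it helps to see the shapes, here is a minimal NumPy sketch of one hidden layer. The sizes (5 examples, 3 input features, 4 hidden units), the random data, and the names `X`, `W`, `b`, and `sigmoid` are all made up for illustration and are not from any particular course or library. The point is only that `W` and `b` have the same shape no matter how many examples there are.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: m examples, n_x input features, n_h hidden units.
m, n_x, n_h = 5, 3, 4
X = rng.normal(size=(n_x, m))    # each column is one example vector x

# One shared parameter set for the layer.
W = rng.normal(size=(n_h, n_x))  # row j holds the weights of hidden unit j
b = np.zeros((n_h, 1))           # one bias per hidden unit

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The same W and b are applied to every example (every column of X).
A = sigmoid(W @ X + b)

print(W.shape)  # (4, 3): depends on the layer sizes, not on m
print(A.shape)  # (4, 5): one column of activations per example
```

Each row of `W` plays the role of the weight vector of a single logistic regression unit, and training adjusts this one `W` and `b` using all of the columns of `X`.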