Hi,
It has been stated that the standard way of initializing the parameters is to create a matrix of random numbers for W and a matrix of zeros for b. What is the exact reason for this?
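You need to initialize W with random values for “symmetry breaking”. If every unit in a layer starts with the same weights, each unit computes the same output and receives the same gradient, so the units never learn different features. The biases can start at zero because the random W values already break the symmetry. Here’s a thread which discusses this in more detail. That symmetry breaking thread is referenced from the FAQ Thread, which is also worth a look on general principles.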
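To see symmetry breaking in action, here is a minimal NumPy sketch. It is not the course's code: the `train` helper, the toy data, the layer sizes, and the learning rate are all assumptions made for illustration. It trains a small one-hidden-layer network twice, once with every weight set to the same constant and once with small random weights, with the biases starting at zero in both cases.

```python
import numpy as np

def train(W1, W2, X, Y, steps=200, lr=0.5):
    """Plain gradient descent on a tanh/sigmoid net; biases start at zero."""
    n_h, m = W1.shape[0], X.shape[1]
    W1, W2 = W1.copy(), W2.copy()
    b1 = np.zeros((n_h, 1))
    b2 = np.zeros((1, 1))
    for _ in range(steps):
        # Forward pass
        A1 = np.tanh(W1 @ X + b1)
        A2 = 1.0 / (1.0 + np.exp(-(W2 @ A1 + b2)))
        # Backward pass for the cross-entropy loss
        dZ2 = A2 - Y
        dW2 = dZ2 @ A1.T / m
        db2 = dZ2.mean(axis=1, keepdims=True)
        dZ1 = (W2.T @ dZ2) * (1.0 - A1 ** 2)
        dW1 = dZ1 @ X.T / m
        db1 = dZ1.mean(axis=1, keepdims=True)
        # Parameter update
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return W1, W2

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 50))                    # toy inputs (assumed)
Y = (X[0] * X[1] > 0).astype(float).reshape(1, -1)  # toy labels (assumed)

# Case 1: every weight starts at the same constant value.
W1_const, _ = train(np.full((3, 2), 0.5), np.full((1, 3), 0.5), X, Y)

# Case 2: small random weights.
W1_rand, _ = train(rng.standard_normal((3, 2)) * 0.01,
                   rng.standard_normal((1, 3)) * 0.01, X, Y)

# Each row of W1 holds the incoming weights of one hidden unit.
rows_identical_const = np.allclose(W1_const, W1_const[0])
rows_identical_rand = np.allclose(W1_rand, W1_rand[0])
print("constant init, hidden units identical:", rows_identical_const)
print("random init, hidden units identical:", rows_identical_rand)
```

With the constant start, the hidden units should still have matching weights after training, so the layer behaves like a single unit. With the random start the rows differ even though b was all zeros, which is why zero biases are fine.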