Suppose your training data has dimensions (1000 rows, 400 columns) and you want to build an NN model to predict some output. If the first hidden layer consists of 25 units, the weights of the first layer have shape (something, 25). You have 1000 rows, so you want 1000 predictions, and the layer computes XW + b, so the weights must have shape (400, 25) for the matrix product to work. The bias term has 25 values, one for each unit, so the output is X * W + b = (1000, 400)(400, 25) + (25,) = (1000, 25) + (25,) = (1000, 25). After that you pass these output values through a nonlinear function (the activation function) so the network can do more complex calculations and extract features, and the same goes for the rest of the layers.
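To make the shapes concrete, here is a minimal NumPy sketch of that first layer. The random data, the small weight scale, and the choice of ReLU as the activation are my assumptions for illustration only; your model may use something different.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in training data: 1000 rows, 400 columns (random, for illustration)
X = rng.normal(size=(1000, 400))

# First hidden layer with 25 units: one column of weights per unit
W = rng.normal(size=(400, 25)) * 0.01
b = np.zeros(25)  # one bias value per unit

# (1000, 400) @ (400, 25) -> (1000, 25); b is broadcast across all 1000 rows
Z = X @ W + b

# Nonlinear activation (ReLU is assumed here)
A = np.maximum(Z, 0)

print(Z.shape, A.shape)  # (1000, 25) (1000, 25)
```

The bias is added to every row automatically through NumPy broadcasting, which is why a (25,) vector can be added to a (1000, 25) matrix.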
It takes the data x and does what I described above, but only after the weights have been updated and tuned by an optimization algorithm such as Adam or gradient descent.
These algorithms are used when we call model.fit(x, y) to fit the model: they tune the weights so that the predicted values get close to the real outputs.
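As a rough idea of what fitting does under the hood, here is a sketch of plain gradient descent on a single linear layer in NumPy. This is not what model.fit actually runs in any particular library (a real one works in batches, backpropagates through every layer, and often uses Adam); the data, the "true" weights, the learning rate, and the number of steps are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up training set: 100 rows, 3 features
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])  # hypothetical "real" weights
true_b = 0.5
y = X @ true_w + true_b

# Start from untrained weights and bias
w = np.zeros(3)
b = 0.0
lr = 0.1  # learning rate (assumed value)

# "Fit": repeatedly nudge w and b to reduce the mean squared error
for step in range(500):
    pred = X @ w + b
    err = pred - y
    grad_w = 2 * X.T @ err / len(y)  # gradient of MSE w.r.t. w
    grad_b = 2 * err.mean()          # gradient of MSE w.r.t. b
    w -= lr * grad_w
    b -= lr * grad_b

# "Predict": reuse the tuned w and b on new data
x_new = np.array([[1.0, 1.0, 1.0]])
y_new = x_new @ w + b
print(w, b, y_new)
```

After the loop, w and b should end up very close to the values that generated y, and predicting is just the same XW + b computation with those tuned values.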
Thank you so much for your elaborate response.
So here is what I understood. Please correct me if I am wrong.
Get the training set --> train your model to fit the values of the weights and biases --> use these values to predict