I may have missed it in week 1. I understand that model parameters relate to the size of the model, but is there a simple explanation of what a model parameter is? When you look at something like BloombergGPT, which has 50B parameters, how is that number derived, and what are some examples of specific parameters in a model?

A model's parameters are the values it learns during training — chiefly its weights (and biases). Things like the learning rate, the number of layers and neurons (the architecture), and the choice of optimizer are hyperparameters: settings you pick, not values the model learns. So when you build a model and finish training it, you save its parameters (its blueprint, let's say) so it can be used for predictions.

The Deep Learning Specialization will give you a good intro into neural networks, check it out.

Another way to explain what a PARAMETER is: think of it as a WEIGHT or COEFFICIENT of a FEATURE (i.e., a VARIABLE) in the model.

Let me illustrate with an example. You are trying to predict housing prices (call the price P). The prediction MODEL (e.g., a linear regression) has 3 FEATURES (number of bedrooms, number of bathrooms, average income in the neighborhood): x, y, z. The model, after training on the data, gives you this equation:

P = 3x + 2y + 4z

In this example, the coefficients 3, 2, and 4 are the PARAMETERS.
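To make "training finds the parameters" concrete, here is a small sketch using NumPy and made-up synthetic data (the feature values and the 3/2/4 rule are just for illustration). We generate prices from the equation above and then recover the parameters with ordinary least squares:

```python
import numpy as np

# Synthetic housing data: 100 examples, 3 features
# (bedrooms, bathrooms, neighborhood income), values made up for illustration
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 3))

# Generate prices from the "true" rule P = 3x + 2y + 4z (no noise, for clarity)
true_params = np.array([3.0, 2.0, 4.0])
P = X @ true_params

# "Training" here = fitting via ordinary least squares
learned_params, *_ = np.linalg.lstsq(X, P, rcond=None)
print(learned_params)  # approximately [3. 2. 4.]
```

Those three recovered numbers ARE the model's parameters — everything the model needs to make a prediction.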

In more complex models, such as deep neural networks, there are many more parameters; a figure like BloombergGPT's 50B means the model has roughly 50 billion of these learned weights.
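Counting parameters in a neural network is just arithmetic over the layer sizes. Here is a sketch for a small, hypothetical fully connected network (the layer sizes are made up, not from any real model): each layer contributes a weight matrix of size in_dim × out_dim plus one bias per output unit.

```python
# Hypothetical dense network: 3 inputs -> 16 hidden -> 16 hidden -> 1 output
layer_sizes = [3, 16, 16, 1]

total = 0
for in_dim, out_dim in zip(layer_sizes, layer_sizes[1:]):
    weights = in_dim * out_dim  # one weight per input-output connection
    biases = out_dim            # one bias per output unit
    total += weights + biases

print(total)  # 353
```

The same bookkeeping, applied to the much larger weight matrices of a transformer, is how counts like 50B are derived.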

If you need a good book recommendation, “Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow” by Aurélien Géron is an excellent one. Pages 11 and 24 of the 3rd edition give good explanations of concepts like FEATURE, PARAMETER, etc.

TY for the clarifications