Hey Sara,
I believe this post contains the answer to your question.
Note that
- Using a bigger model or a different architecture can be seen as a way of changing model capacity.
- A hyperparameter is any parameter that we keep fixed during the training process (it is set before training rather than learned). In that sense, the number of layers is a hyperparameter that changes model capacity, the coefficient of a norm penalty is a hyperparameter that lets us increase or decrease regularization, etc.
- The optimization algorithm definitely has an effect on learning, and its learning rate is probably the most important hyperparameter, but we do not usually treat it as an instrument for reducing bias or variance. It determines how fast we converge to some solution.
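
To make the distinction concrete, here's a minimal sketch using scikit-learn's `MLPClassifier` (just one convenient example; any framework exposes the same three knobs). The specific values below are arbitrary placeholders, not recommendations:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Toy data purely for illustration
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = MLPClassifier(
    hidden_layer_sizes=(64, 64),  # capacity knob: more/wider layers -> higher capacity
    alpha=1e-3,                   # regularization knob: L2 penalty coefficient
    learning_rate_init=1e-3,      # optimization knob: affects convergence speed,
                                  # not something we tune to trade bias for variance
    max_iter=500,
    random_state=0,
)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```

If the model underfits, you'd reach for `hidden_layer_sizes` (increase capacity) or lower `alpha`; if it overfits, you'd shrink the network or raise `alpha`; `learning_rate_init` you'd adjust only if training is unstable or too slow.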