This might be a bit of a confusing question, but I just realized that in the videos under the topic of bias/variance, Andrew seems to use the above-mentioned terms fairly interchangeably when referring to a bigger/smaller network. I am aware that some of those terms are conceptually related, but I don't want to search through all of the earlier material in this Specialization to find out whether the answer is there.

So how would you connect the terms? Is adding more neurons to a layer the same as adding more polynomial terms to the function, etc.?

Please be kind and ask me to be more specific if needed.

Bias and variance are concerned with underfitting and overfitting of the training data, while polynomials are a matter of feature engineering. These concepts are simple, but they can take a little time to digest. You may find it helpful, and understand the concepts better, if you revisit those lecture videos.

I feel I understand the concept of adding polynomials and the concept of feature engineering, but in the last video of the bias/variance series, Andrew suddenly begins to talk about how, with too big a network (referring to nodes/neurons and layers), one typically gets too much variance. So I was wondering how to connect that observation to feature engineering, because the two seem related.

But yeah, maybe I am just confusing different levels of complexity. I tried searching for the relevant videos, but there is just so much material that I was hoping someone could point me in the right direction.

Adding features adds complexity to the hypothesis (f_wb). Too much leads to overfitting.
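To make that concrete, here is a quick sketch (my own illustration, not from the course): fitting polynomials of increasing degree to the same few points. The data and the helper `train_mse` are made up for the example; the point is only that a higher-degree (more complex) hypothesis can always drive the *training* error down, which is exactly the overfitting risk.

```python
import numpy as np

# Toy data: 8 points that roughly follow a straight line (hand-picked values).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
y = np.array([0.1, 1.3, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8])

def train_mse(degree):
    # Fit a polynomial of the given degree, then measure error
    # on the SAME training points.
    coeffs = np.polyfit(x, y, degree)
    pred = np.polyval(coeffs, x)
    return np.mean((pred - y) ** 2)

# A degree-7 polynomial can pass through all 8 points, so its training
# error is near zero -- but that says nothing about generalization.
print("degree 1 train MSE:", train_mse(1))
print("degree 7 train MSE:", train_mse(7))
```

The degree-7 fit "wins" on training error only because it has enough complexity to chase the noise; on new points it would typically do worse than the degree-1 fit.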

With an NN, the equivalent complexity comes from adding too many hidden layers or hidden-layer units. Since the NN's hidden layers are learning new features from the previous layers, having too many units adds too much complexity, leading to overfitting.
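One way to see the parallel is to count the learnable parameters: each extra unit or layer adds more weights, just as each extra polynomial term adds more coefficients. Here is a small sketch (the helper `nn_param_count` and the layer sizes are my own illustration, not anything from the course):

```python
def nn_param_count(layer_sizes):
    """Count weights + biases for fully connected layers.

    layer_sizes is [n_inputs, hidden_1, ..., n_outputs].
    Each dense layer contributes (n_in * n_out) weights plus n_out biases.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# A small network: 2 inputs -> 4 hidden units -> 1 output.
small = nn_param_count([2, 4, 1])      # 2*4+4 + 4*1+1 = 17 parameters

# A bigger network: 2 inputs -> 64 -> 64 -> 1 output.
big = nn_param_count([2, 64, 64, 1])   # thousands of parameters

print(small, big)
```

More parameters means more capacity to fit the training set, including its noise, which is why a too-big network shows high variance in the same way a too-high-degree polynomial does.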