So we are calculating logistic regression many times with all the same variables (which connect to different nodes)? All the training examples are connected to every node in the next layer, so does every node calculate the same logistic regression? Need help understanding this concept! Thanks in advance!
Hi @maha and welcome to the course! Let’s try to clarify. It might be useful for you to conceptually separate the concept of a model’s architecture from the computation of its parameters with the data (training examples) through gradient descent.
As for architecture, it’s useful to have in mind the “graph” of the network pictured in the lectures. The model is a “fully connected” network, so each node is connected to every node in the ensuing layer. Correct! But note that while every node computes the same *kind* of function (a linear combination followed by a sigmoid), each node has its own W and b, so the nodes do not all compute the same logistic regression.
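To make “fully connected” concrete, here is a minimal NumPy sketch (the layer sizes and names like `W1` are made up for illustration). The weights of all the nodes in one layer stack into a matrix: one row per node, one column per node in the previous layer:

```python
import numpy as np

n_x, n_h = 3, 4   # hypothetical sizes: 3 input features, 4 hidden nodes

W1 = np.random.randn(n_h, n_x) * 0.01  # row i = the weight vector of hidden node i
b1 = np.zeros((n_h, 1))                # each node also gets its own bias

# Every row of W1 has n_x entries: node i is connected to all n_x inputs.
# The rows differ, so each node computes a *different* logistic-style unit.
print(W1.shape)  # (4, 3)
```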
Now suppose that you are handed the values of the W and b parameters for each node. In other words, somebody else trained the network (on the training examples, but pretend that was somebody else’s job!). With those parameters, you can compute the output for any example: a training example, a test example, or anything else. It’s a simple matter of plugging the values representing the example (e.g. a digitized image) into that parameterized network. The network is a complicated function, but conceptually it’s simple algebra: $y = f(x)$, where $x$ is a vector of values. You can do this for as many examples as you wish.
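Continuing the sketch above, “plugging an example into the parameterized network” is just a couple of matrix products and sigmoids (the output layer `W2`, `b2` is again invented for illustration):

```python
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# hypothetical output layer: one node reading all n_h hidden activations
W2 = np.random.randn(1, n_h) * 0.01
b2 = np.zeros((1, 1))

def f(x, W1, b1, W2, b2):
    """Evaluate the (already trained) network on one example x of shape (n_x, 1)."""
    a1 = sigmoid(W1 @ x + b1)   # each hidden node: its own z = w.x + b, then sigmoid
    a2 = sigmoid(W2 @ a1 + b2)  # the output node does the same on the hidden activations
    return a2

x = np.random.randn(n_x, 1)    # any example: train, test, or brand new
y_hat = f(x, W1, b1, W2, b2)   # just algebra once the parameters are fixed
```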
Now put yourself in the other person’s shoes. She “trained” the network by running gradient descent: starting from an initial guess of the parameters, she repeatedly computed the network’s output (and the resulting cost) for every training example and adjusted the parameters to reduce that cost. In this iterative process the network (function) is evaluated on every training example many times. So maybe this helps with your first paragraph.
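As a sketch of what that iterative process looks like for a single logistic-regression node (vectorized over all m examples, as in the course; the full network adds backpropagation on top, but the rhythm is the same):

```python
def train(X, Y, num_iterations=1000, learning_rate=0.1):
    """Gradient descent for one logistic-regression node.
    X: (n_x, m) holds m training examples; Y: (1, m) holds their labels."""
    n_x, m = X.shape
    w = np.zeros((n_x, 1))        # the initial guess of the parameters
    b = 0.0
    for _ in range(num_iterations):
        A = sigmoid(w.T @ X + b)  # evaluate on ALL m examples, every iteration
        dw = (X @ (A - Y).T) / m  # gradient of the cost w.r.t. w
        db = np.sum(A - Y) / m    # ... and w.r.t. b
        w -= learning_rate * dw   # step the parameters downhill
        b -= learning_rate * db
    return w, b
```

Notice that the forward evaluation `sigmoid(w.T @ X + b)` runs on every training example at every iteration, which is exactly the “many times” in your question.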
I find that it really helps to figure this stuff out with pencil and paper: write your own lecture notes as if you had to teach the material yourself. The assignments then provide further clarification.