As a followup to the diagram for feed-forward computation in a 3-layer network doing logistic regression, a first diagram for back-propagation in a 3-layer network doing logistic regression, but only back to layer 3.
This was amazingly hard to do. Many diagrams were discarded until a good approach was found, as were many pages of computing derivatives and thinking about whether the symbols say what they should say.
Enjoy!
Diagrams for back-propagation to layer 2 and layer 1 are currently in the works.
I think the diagram is moving more toward your own style, because you are introducing new symbols like \rho, K and \eta that the DLS, I believe, has not used. However, I do not mean there is any problem with that.
I have not gone into every bit of it but I spotted two things:
Personally, I would only consider J, the cost, as a function of W and b, and treat the data (X and Y) as fixed rather than as variables. In other words, I would write something like J(W, b; X, Y). I can differentiate J w.r.t. W or b but not X or Y, so I would consider K to be a space of only W and b.
"COLLECTED DURING FEED-FORWARD PHASE" appears a couple of times, but I wonder what exactly was collected, because I think no differentiation is done during the feed-forward phase. Think about this: we have a feed-forward phase at inference time, but there is no need for differentiation whatsoever because there is no training at inference.
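To make the first point concrete, here is a minimal sketch in plain Python of treating J as a function of the parameters only, with the data bound once and held fixed. It assumes a single logistic unit with binary cross-entropy cost; the helper names `make_cost` and `numerical_grad` and the toy data are hypothetical illustrations, not something taken from the diagram or the course.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_cost(X, Y):
    """Bind the fixed data (X, Y) and return J as a function of (w, b) only."""
    def J(w, b):
        total = 0.0
        for x, y in zip(X, Y):
            a = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            total += -(y * math.log(a) + (1 - y) * math.log(1 - a))
        return total / len(X)
    return J

def numerical_grad(J, w, b, eps=1e-6):
    """Central-difference gradient of J w.r.t. w and b, the only variables."""
    dw = []
    for i in range(len(w)):
        up = w[:i] + [w[i] + eps] + w[i + 1:]
        down = w[:i] + [w[i] - eps] + w[i + 1:]
        dw.append((J(up, b) - J(down, b)) / (2 * eps))
    db = (J(w, b + eps) - J(w, b - eps)) / (2 * eps)
    return dw, db

# Toy data (made up for illustration); it never changes after this point.
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [0, 1, 1]

J = make_cost(X, Y)                      # J(w, b; X, Y) with X, Y frozen
dw, db = numerical_grad(J, [0.0, 0.0], 0.0)
```

Because `X` and `Y` are captured in the closure, there is simply no way to ask for a derivative with respect to them, which matches reading J(W, b; X, Y) as a function on the space of W and b alone.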
You know, the next challenge for diagram makers like us is whether, after some weeks or months, we will still know how to read the diagram and what it means.
Thank you very much, Raymond. I will look into this once my RSI abates (I will definitely need aspirin).
For "Collected During Feedforward", it's just that the data is available/cacheable during feedforward in the training phase, so one can accumulate it at that point. Differentiation would just consist of locally evaluating the derivative of the activation functions on their respective z's. It's really reaching into the domain of implementation: how do we implement the backpropagation...
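As a rough illustration of that caching idea, here is a minimal sketch in plain Python. It assumes one scalar unit per layer, sigmoid activations everywhere, a single training example and binary cross-entropy loss; the function names and the layout of the cache are hypothetical and are not the notation used in the diagrams.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def forward(x, params, training=True):
    """Feed-forward through scalar layers given as (w, b) pairs.

    When training, cache (input activation, z, output activation) per layer.
    At inference nothing is cached, since no derivatives are needed.
    """
    a = x
    cache = []
    for w, b in params:
        z = w * a + b
        a_out = sigmoid(z)
        if training:
            cache.append((a, z, a_out))
        a = a_out
    return a, cache

def backward(y, params, cache):
    """Backpropagate the loss gradient, reusing values cached by forward()."""
    grads = [None] * len(params)
    _, _, a_last = cache[-1]
    dz = a_last - y  # sigmoid output + cross-entropy: dL/dz = a - y
    for l in reversed(range(len(params))):
        a_in, _, _ = cache[l]
        grads[l] = (dz * a_in, dz)  # (dL/dw, dL/db) for layer l
        if l > 0:
            w, _ = params[l]
            _, z_prev, _ = cache[l - 1]
            # Local derivative of the activation, evaluated on the cached z.
            dz = (w * dz) * sigmoid_prime(z_prev)
    return grads

# Three scalar layers with made-up parameters, one example (x=1.5, y=1.0).
params = [(0.5, -0.1), (-0.3, 0.2), (0.8, 0.05)]
a3, cache = forward(1.5, params)
grads = backward(1.0, params, cache)
```

The `training` flag reflects the distinction discussed above: the same forward code runs at inference, but the cache is only filled when a backward pass will follow.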
Also, the latest versions of the diagrams for backpropagation to layer 3 and backpropagation to layer 2 are out. Layer 1 is ready but I have to verify the last step; I feel I am still missing something.
Oh! Take care, David. I am sorry to hear about your RSI. I just googled it, but I can't say I know what it is like, not to mention how it feels, so please take care and I hope you will get well soon.
I think now I understand the "Collected During Feedforward" part. I think as long as the diagram is for you only, then to me as a volunteer mentor in this place, I will feel fine to see your sensible explanation.
But frankly speaking, I have not gone into them bit by bit because the way you present them isn't quite the way I would understand them. To me you are breaking the operations down to the smallest unit, I mean element-wise. That is OK, but the way you label and describe them is not really my way, so for me to go through them in detail, I would have to be very focused. So unless you need help, I will just glance at them. But if you do really need help, please let me know which part it is and I will focus on that.
Again, I think the real challenge is, after a few weeks, whether you will still think you like those diagrams. I think we can come back to them later.
Thank you for your concern, Raymond. It's better now; there is some damage somewhere due to ancient accidents and it seems to have been woken by the yEd editor ... it demands very precise point-click operations.
All right, we are basically done with diagramming. This seems to be correct, unless there is something I really didn't understand at all (this is always possible). Checked with a few pages of derivation by hand. No DeepSeek-r1.
Diagrams for backpropagation to layers 1,2,3 now exist!
Again, I think the real challenge is, after a few weeks, whether you will still think you like those diagrams.
Oh, I know about that part.
Ok, here are the direct links to the SVG versions. Graphml and PNG are in the file tree as siblings as usual under
SVG versions
Backpropagate to Layer 3
Backpropagate to Layer 2
Backpropagate to Layer 1
Previews
Absolutely unreadable scaled-down JPG versions, but they show the overall structure (and how to continue it to the left until memory/compute runs out).