A hopefully usable diagram for back-propagation in a 3-layer network doing logistic regression, back to layer 3

As a follow-up to the diagram for feed-forward computation in a 3-layer network doing logistic regression, here is a first diagram for back-propagation in that same network, but only back to layer 3.

This was amazingly hard to do :scream: Many diagrams were discarded until a good approach was found, as were many pages of derivative computations and notes on whether the symbols say what they should say.

Enjoy!

Diagrams for back-propagation to layer 2 and layer 1 are currently in the works.

The “graphml” raw diagram can be found here:


Hello, @dtonhofer,

I am glad that I have not missed this.

I think the diagram is becoming more your own style, because you are introducing new symbols like \rho, K and \eta that, I believe, the DLS has not used. However, I do not mean there is any problem with that :wink: :wink: .

I have not gone into every bit of it but I spotted two things:


  1. Personally (and I stress: personally), I would only consider J, the cost, as a function of W and b, and treat the data (X and Y) as fixed rather than as variables. In other words, I would write something like J(W, b; X, Y). I can differentiate J w.r.t. W or b but not w.r.t. X or Y, so I would consider K to be a space of only W and b (see the sketch right after this list).

  2. “COLLECTED DURING FEED-FORWARD PHASE” appeared a couple of times, but I wonder what exactly was collected, because I think no differentiation work happens during the feed-forward phase. Think about it this way: we also run a feed-forward phase at inference time, yet there is no need for differentiation whatsoever, because there is no training at inference.
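To make point 1 concrete, here is a minimal sketch of what I mean, assuming the usual binary cross-entropy cost over m examples and writing the network's output for example i as \hat{y}^{(i)} = a^{[3](i)}:

```latex
J(W, b;\, X, Y)
  = -\frac{1}{m} \sum_{i=1}^{m}
      \left[ y^{(i)} \log \hat{y}^{(i)}
           + \left(1 - y^{(i)}\right) \log\!\left(1 - \hat{y}^{(i)}\right) \right]
```

X and Y sit behind the semicolon as fixed data; the gradients \partial J / \partial W^{[l]} and \partial J / \partial b^{[l]} are then taken only over the parameter space.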

You know, the next challenge for diagram makers like us is whether, after some weeks or months, we will still know how to read the diagram and what it means :rofl: :rofl:

Have a good day, David.

Raymond


Thank you very much, Raymond, I will look into this once my RSI abates :sweat: (I will definitely need aspirin)

For “Collected During Feedforward”, it’s just that this data is available/cacheable during the feed-forward pass of the training phase, so one can accumulate it at that point. The differentiations then just consist in locally evaluating the derivative of the activation function on the respective z’s. It is really reaching into the domain of implementation: how do we implement the backpropagation?
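Concretely, here is a minimal sketch of what I mean (the names are made up for illustration, and I assume sigmoid activations everywhere):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_with_cache(X, params):
    """Feed-forward pass that also caches z, a and g'(z) per layer.

    params: list of (W, b) tuples, one per layer; X has shape (n_0, m).
    Returns the final activation and the per-layer cache used by backprop.
    """
    cache = []
    a = X
    for W, b in params:
        z = W @ a + b          # pre-activation of this layer
        a = sigmoid(z)         # activation of this layer
        dg = a * (1.0 - a)     # sigmoid'(z), "collected during feed-forward"
        cache.append({"z": z, "a": a, "dg": dg})
    return a, cache
```

During back-propagation, each layer's cached dg is reused element-wise to get from \partial J / \partial a^{[l]} to \partial J / \partial z^{[l]}, so no differentiation of the activation function has to happen in the backward pass itself.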


Also, the latest versions of the diagrams for backpropagation to layer 3 and to layer 2 are out. Layer 1 is ready, but I still have to verify the last step; I feel I am missing something.

All still here:

Thanks again,

– David


Oh! Take care, David. I am sorry to hear about your RSI. I just googled it, but I can’t say I know what it is like, let alone how it feels, so please take care and I hope you get well soon.

I think now I understand the “Collected During Feedforward” part. As long as the diagram is mainly for your own use, then, speaking as a volunteer mentor here, I am fine with your sensible explanation.

But frankly speaking, I have not gone through them bit by bit, because the way you present them isn’t quite the way I would understand them. To me, you are breaking the operations down to the smallest unit - I mean element-wise. That is OK, but the way you label and describe them is not really my way, so to go through them in detail I would have to be very focused :grin: So unless you need help, I will just glance at them. But if you really do need help, please let me know which part it is and I will focus on that.

Again, I think the real challenge is, after a few weeks, whether you will still think you like those diagrams. I think we can come back to them later.

Cheers!
Raymond


Thank you for your concern, Raymond. It’s better now; there is some damage somewhere from old accidents, and it seems to have been woken up by the yEd editor, which demands very precise point-and-click operations :joy::grimacing:

All right, we are basically done with the diagramming. This seems to be correct, unless there is something I really didn’t understand at all (which is always possible). I checked it against a few pages of derivation done by hand. No DeepSeek-r1 involved.
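(As an aside, a numerical gradient check is another way to gain confidence in a hand derivation. A minimal sketch, with made-up names, assuming the parameters are flattened into a single vector:)

```python
import numpy as np

def gradient_check(cost_fn, theta, analytic_grad, eps=1e-7):
    """Compare an analytic (backprop) gradient with central finite differences.

    cost_fn: scalar cost as a function of the flattened parameter vector theta.
    analytic_grad: the gradient vector produced by back-propagation.
    Returns a relative error; values around 1e-7 or smaller look healthy.
    """
    num_grad = np.zeros_like(theta)
    for i in range(theta.size):
        plus, minus = theta.copy(), theta.copy()
        plus[i] += eps
        minus[i] -= eps
        num_grad[i] = (cost_fn(plus) - cost_fn(minus)) / (2.0 * eps)
    denom = np.linalg.norm(num_grad) + np.linalg.norm(analytic_grad)
    return np.linalg.norm(num_grad - analytic_grad) / denom
```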

Diagrams for backpropagation to layers 1,2,3 now exist!

Again, I think the real challenge is, after a few weeks, whether you will still think you like those diagrams.

Oh, I know about that part. :sweat:

OK, here are the direct links to the SVG versions. Graphml and PNG versions are in the file tree as siblings, as usual, under

SVG versions

Backpropagate to Layer 3

Backpropagate to Layer 2

Backpropagate to Layer 1

Previews

Absolutely unreadable scaled-down jpg versions, but they show the overall structure (and how to continue it to the left until memory/compute runs out)

Backpropagate to Layer 3

Backpropagate to Layer 2

Backpropagate to Layer 1


Amazing! Seeing and reading serious work full of energy is the best thing here.

:raised_hands: :raised_hands: :raised_hands: :raised_hands:

Update on 2025-02-09: This is too complex and not actionable enough (also, the W’s are transposed relative to the course’s conventions, ARG!)
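For reference, and as far as I recall, the course convention for the forward step is the following, while my diagrams effectively carry the transposed W's:

```latex
Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]},
\qquad W^{[l]} \in \mathbb{R}^{\,n^{[l]} \times n^{[l-1]}}
```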

All readers please fall back to the following:

A post with a better diagram
