Hey guys. I have just finished the “ResNets” video and haven’t gone further yet. I understand that we use skip connections to avoid vanishing/exploding gradients when the network gets very deep. But take, say, A[0]: it is skipped ahead to the computation of A[2], where A[2] = g[2](Z[2] + A[0]). My question is: is this solution really that effective? Because where does Z[2] come from? It comes from Z[2] = W[2]A[1] + b[2], right? And where does A[1] come from? What I mean is: doesn’t it still require the full chain of computation I showed above? If this doesn’t make sense, let me know and I can make it clearer.
“Skip connection” does not mean that no data goes into the next layer. The output from a previous layer still goes into the next layer as usual. In parallel, we store it as a “residual” and carry it forward to a later layer. So it’s like a backup path that keeps the “network flow” active.
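To make that concrete, here is a minimal NumPy sketch of a two-layer residual block (the names and shapes are just illustrative, not from the course code):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

# Illustrative shapes and random weights, only for demonstration
a0 = np.random.randn(4, 1)                        # input to the block, A[0]
W1, b1 = np.random.randn(4, 4), np.zeros((4, 1))
W2, b2 = np.random.randn(4, 4), np.zeros((4, 1))

# Main path: a0 flows through the layers as usual
z1 = W1 @ a0 + b1
a1 = relu(z1)
z2 = W2 @ a1 + b2

# Shortcut: the SAME a0 is also added just before the final activation
a2 = relu(z2 + a0)
```

So A[0] is used twice: once feeding the main path, and once added back at the end via the shortcut.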
Oh, but what I meant was: say you have A[0], and the skip connection carries it to the computation of A[2], where A[2] = g[2](Z[2] + A[0]). My question was: given that, how is it effective? To get Z[2] you have to use A[1], W[2], and b[2], right? And to get A[1] you apply g[1] to Z[1], which in turn needs W[1], A[0], and b[1]. So do you get Z[2] via exactly the computation I showed above? If so, how are ResNets useful? Thanks ahead!
You already have all the variables you mentioned. Which variable is it that you cannot get?
Using Andrew’s chart, let’s pin down the point you are confused about.
Take l = 0, so that a^{[l]} = A^{[0]}. Then this should match your equations.
At the first layer,
Z^{[1]} = W^{[1]}A^{[0]} + b^{[1]}, \quad A^{[1]} = g(Z^{[1]}) \quad (here, A^{[1]} = a^{[l+1]} in Andrew’s picture)
At the second layer,
Z^{[2]} = W^{[2]}A^{[1]} + b^{[2]}, \quad A^{[2]} = g(Z^{[2]} + A^{[0]}) \quad (here, A^{[2]} = a^{[l+2]} in Andrew’s picture)
We have everything, since A^{[0]} goes into the first layer and is also carried to the second layer as the “residual”.
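Just to spell it out, substituting the first layer’s output into the second shows the full dependence explicitly; both the main path and the shortcut are built from quantities you already have:

A^{[2]} = g\left( W^{[2]}\, g(W^{[1]}A^{[0]} + b^{[1]}) + b^{[2]} + A^{[0]} \right) \quad (\text{main path} + \text{shortcut})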
So the thing I can’t understand is the skip connection itself, and I was also wondering whether this skip-connection technique is really effective. Based on what I know, you take the information in A[0] and skip ahead to the computation of A[2], where A[2] = g[2](Z[2] + A[0]). Maybe the real question I was asking is: what is the difference between the skip connection and the main path, in terms of computation?
That is what Andrew explains in “Why ResNets work”. I would recommend watching it again for a better understanding.
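The short version of the argument from that video: because of the shortcut, the block can easily learn the identity function. If regularization pushes W^{[2]} and b^{[2]} toward zero, then

A^{[2]} = g(W^{[2]}A^{[1]} + b^{[2]} + A^{[0]}) \approx g(A^{[0]}) = A^{[0]}

(the last equality holds when g is ReLU and A^{[0]} is itself a ReLU output, hence non-negative). So adding the block does not easily hurt the network, and it can help if the extra layers learn something useful.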
On the other hand, there are some other threads discussing ResNets. Here is one: ResNets Question
Oh, sorry about that; the reason I was confused is that I hadn’t watched the “Why ResNets work” video yet. Thanks!