C1W2 - Batch Normalization

weishun_soong · January 2, 2024, 8:56am

Anyone understood the following formula:
z[i][L] = ∑(i=0) w[i][l]*a[i][L-1]

Since i represents the node and L represents the layer, in a fully connected network, shouldn’t the value of z[i][L] factor in all values of a[i][L-1], where i=0 to n?

pastorsoto · January 2, 2024, 11:56am

Hi, great question!

The formula you’re referring to is a part of the calculations used in neural networks, specifically in the context of a fully connected layer. Let’s break it down:

z[i][L] represents the weighted input to node i in layer L.
w[i][l] represents the weight from node l in the previous layer (layer L-1) to node i in layer L.
a[l][L-1] is the activation of node l in the previous layer L-1 (the formula as posted writes this as a[i][L-1]).

The summation symbol ∑ indicates that to calculate z[i][L], you sum over all nodes l from the previous layer L-1, multiplying the activation a[l][L-1] of each node l by the weight w[i][l] connecting it to node i in the current layer L.

In a fully connected network, each node in layer L is connected to every node in layer L-1, so yes, z[i][L] should factor in the activations a[l][L-1] of all nodes l in the previous layer.
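
As a minimal sketch of that sum in plain Python: the layer sizes, the weight values, and the names `W`, `a_prev`, and `weighted_inputs` are made up for illustration, and the bias term is left out to match the formula as posted.

```python
# Hypothetical values, chosen only for illustration.
a_prev = [1.0, 2.0, 3.0]   # activations of the 3 nodes in layer L-1

# W[i][k] is the weight from node k in layer L-1 to node i in layer L.
# Layer L has 2 nodes here, so W has 2 rows of 3 weights each.
W = [
    [0.5, -1.0, 2.0],
    [1.5, 0.0, -0.5],
]

def weighted_inputs(W, a_prev):
    """Return z[i] = sum over k of W[i][k] * a_prev[k] for every node i."""
    return [sum(w_ik * a_k for w_ik, a_k in zip(row, a_prev)) for row in W]

z = weighted_inputs(W, a_prev)
print(z)  # [4.5, 0.0]
```

Each entry of `z` uses all three activations from the previous layer, with a different weight for each one.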

I hope this helps

weishun_soong · January 2, 2024, 6:23pm

Sorry, I didn’t quite understand the part of w[i][L] representing the weight from node l in the previous layer (layer L-1) to node i in layer L.

I thought w[i][l] referred to the weight from node i in layer [L]?

But this formula, ∑(i=0) w[i][l]*a[i][L-1], seems to suggest that we multiply the weight of each node by the respective output from the previous layer, rather than multiplying the weight of each node by all outputs from the previous layer.

For example, when i=0, the formula becomes: z[0][L] = w[0][L]*a[0][L-1].
I thought it should be something like z[0][L] = w[0][L]*a[0][L-1] + w[0][L]*a[1][L-1] + …+w[0][L]*a[n][L-1]?

Nydia · January 2, 2024, 7:17pm

Hi, I think the confusion may come from the missing upper limit in the sum. If we have n nodes in the (L-1)th layer, the summation goes from i=0 to i=n−1; in this manner z[i][L] includes every activation from the previous layer.
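
Written out with the upper limit made explicit, the sum would look roughly like this. The separate index k for the nodes of layer L-1 and the layer superscript on the weights are my own notation, not something taken from the course slides:

```latex
\begin{aligned}
z_i^{[L]} &= \sum_{k=0}^{n-1} w_{ik}^{[L]} \, a_k^{[L-1]} \\
          &= w_{i0}^{[L]} a_0^{[L-1]} + w_{i1}^{[L]} a_1^{[L-1]} + \dots + w_{i,n-1}^{[L]} a_{n-1}^{[L-1]}
\end{aligned}
```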

weishun_soong · January 3, 2024, 4:02am

Thank you so much for clarifying!

Nydia · January 3, 2024, 9:19am

You’re welcome, and happy learning!
