Hi! While I understand how to implement the forward propagation algorithm as it was described in Machine Learning Specialization Course 2, I find it difficult to understand how exactly our model learns the proper weights. First we generate some random weights to start with something. Then we feed our first neuron with them and with X. The next step is to apply an activation function; in my case it is the sigmoid. We repeat this over and over. Where exactly is the learning part? I suppose it’s about how the sigmoid function works, but I might be wrong. Thanks, and I wish you all the best.
My attempt to explain in a mostly non-mathematical way…
- Run forward propagation and produce an output value (prediction).
- Compute the error between the prediction and the known correct value (loss function).
- Make a small change to each weight in the direction that you think will reduce the total error (backward propagation).
- Repeat.
This process of iterating many times to gradually reduce the error between the predicted and known correct values is what we anthropomorphize as learning.
Note that math powers both the ‘compute the error’ and the ‘in the direction of reducing total error’ parts, and to really understand what learning means, you need to read and comprehend those equations/expressions.
EDIT: @TMosh uses the important word gradient below, which is part of the ‘in the direction of reducing error’ computation. “Learning” is a much more accessible way of describing this process than “iterative first-order optimization of a locally differentiable function.”
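To make those steps concrete, here is a minimal sketch of the full loop for a single-weight model. The toy data, the squared-error loss, and the learning rate alpha are all my own illustrative choices, not anything from the course:

import numpy as np

# Toy problem: learn w so that w * x matches y (the true relationship is y = 2x)
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.0, 4.0, 6.0, 8.0])

w = np.random.randn()       # start from a random weight
alpha = 0.01                # learning rate: how big each "small change" is

for step in range(1000):
    pred = w * X                        # 1. forward propagation -> prediction
    loss = np.mean((pred - Y) ** 2)     # 2. error between prediction and known value
    grad = np.mean(2 * (pred - Y) * X)  # direction in which the error grows...
    w = w - alpha * grad                # 3. ...so step the opposite way: w := w - alpha * dLoss/dw
                                        # 4. repeat

print(w)    # ends up very close to 2.0

Notice that the activation function only shapes the prediction; the learning itself happens entirely in the grad computation and the w = w - alpha * grad update.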
Using TensorFlow or sklearn, the learning happens behind the scenes. When you fit a model, in the background the layers are computing the gradients based on the results of forward propagation, as @ai_curious described. This learning process (updating the initial weights) is step 3 in his reply.
The same process is discussed in the earlier portions of the MLS course, where you implement functions that are usually called compute_gradients() and update_parameters().
If you haven’t reached that point yet, you’ll get there soon.
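For example, here is a rough sketch of what that looks like with TensorFlow. The dummy data, layer sizes, optimizer, and epoch count below are arbitrary choices of mine, not the assignment’s:

import numpy as np
import tensorflow as tf

# Dummy data: 100 examples with 2 features and a binary label
X = np.random.rand(100, 2)
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, activation='sigmoid'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# compile() chooses the loss and the optimizer that will update the weights
model.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.Adam(0.01))

# fit() runs the whole learning loop for you:
# forward prop -> loss -> gradients -> weight updates, repeated over the epochs
model.fit(X, y, epochs=20, verbose=0)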
The thing is, the code that was introduced as a “from scratch neural network” looks like this:
import numpy as np

def g(z):
    # g(z) is just the sigmoid function
    return 1.0 / (1.0 + np.exp(-z))

def my_dense(a_in, W, b):
    """
    Computes dense layer
    Args:
      a_in (ndarray (n, ))  : Data, 1 example
      W    (ndarray (n, j)) : Weight matrix, n features per unit, j units
      b    (ndarray (j, ))  : bias vector, j units
    Returns:
      a_out (ndarray (j,))  : j units
    """
    units = W.shape[1]
    a_out = np.zeros(units)
    for j in range(units):
        w = W[:, j]                     # weights feeding unit j
        z = np.dot(w, a_in) + b[j]      # linear combination plus bias
        a_out[j] = g(z)                 # g(z) is just a sigmoid function
    return a_out

def my_sequential(x, W1, b1, W2, b2):
    a1 = my_dense(x, W1, b1)            # layer 1 activations
    a2 = my_dense(a1, W2, b2)           # layer 2 (output) activation
    return a2

def my_predict(X, W1, b1, W2, b2):
    m = X.shape[0]
    p = np.zeros((m, 1))
    for i in range(m):
        p[i, 0] = my_sequential(X[i], W1, b1, W2, b2)
    return p
I don’t understand where the gradient is in this code. It just passes the inputs, computed with the sigmoid g(z), on to the next layers, without computing any gradient.
Never mind, I just found out that above this code we train a Keras model to generate the weights, and then we pass them to our code. Thanks for the help!
What your code fragment doesn’t include is any training loop. This is where the iterative computation of prediction, error, and gradient occurs. As shown, your model isn’t doing any learning and your observation is entirely correct…it’s just making a prediction on one static set of weights and inputs.
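For illustration only, here is a rough sketch of what that missing training loop could look like for these two layers. This is my own simplified version, not the course’s compute_gradients()/update_parameters(); the function name my_train, the binary cross-entropy loss, the single sigmoid output unit, and the one-example-at-a-time updates are all my assumptions:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def my_train(X, Y, W1, b1, W2, b2, alpha=0.1, epochs=100):
    m = X.shape[0]
    for epoch in range(epochs):
        for i in range(m):
            x, y = X[i], Y[i]

            # Forward propagation (the same math as my_dense / my_sequential)
            a1 = sigmoid(W1.T @ x + b1)         # hidden-layer activations, shape (j1,)
            a2 = sigmoid(W2.T @ a1 + b2)        # output prediction, shape (1,)

            # Backward propagation: gradients of the binary cross-entropy loss
            dz2 = a2 - y                        # error signal at the output
            dW2 = np.outer(a1, dz2)             # gradient with respect to W2
            db2 = dz2                           # gradient with respect to b2
            dz1 = (W2 @ dz2) * a1 * (1 - a1)    # error pushed back through layer 1
            dW1 = np.outer(x, dz1)              # gradient with respect to W1
            db1 = dz1                           # gradient with respect to b1

            # The "learning" step: nudge every parameter against its gradient
            W1 -= alpha * dW1
            b1 -= alpha * db1
            W2 -= alpha * dW2
            b2 -= alpha * db2

    return W1, b1, W2, b2

After a call like my_train(X, Y, W1, b1, W2, b2), passing the returned parameters to my_predict should give noticeably better predictions than the random starting weights, and that improvement is the learning everyone is describing above.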