I see that we have to set initial values for the weights, and they can't all be 0, so we use some random function to do it. The question is: is it possible that the random function gives us weights such that our algorithm never reaches convergence?
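To make the question concrete, here is a minimal sketch of the usual initialization, assuming NumPy and hypothetical layer sizes (5 inputs feeding 3 units). The point of the small random values is to break symmetry: if all weights were 0, every unit in a layer would compute the same thing and receive the same gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes: 5 inputs feeding a layer of 3 units.
n_in, n_out = 5, 3

# Small random values break the symmetry between units;
# all-zero weights would make every unit compute (and learn) the same thing.
W = rng.normal(loc=0.0, scale=0.01, size=(n_out, n_in))

print(W.shape)  # (3, 5)
```

Any such draw gives a usable starting point; what differs between draws is which local minimum gradient descent wanders toward and how fast, not whether the weights are "legal".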

In a Dense network, each unit in a layer is connected to all of the units in the adjacent layers.
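That connectivity fixes the number of weights between two layers: one weight per (unit, previous-unit) pair. A tiny sketch, using hypothetical sizes of 5 and 3 units for two adjacent layers:

```python
# In a dense (fully connected) layer, every unit connects to every
# unit in the previous layer, so there is one weight per pair.
n_prev, n_curr = 5, 3          # hypothetical sizes of two adjacent layers
num_weights = n_prev * n_curr  # 5 * 3 = 15 connections, hence 15 weights
print(num_weights)  # 15
```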

I see, but the formula confuses me: our L2 has only 1 unit (the j index), so we should have only 1 weight. Am I understanding correctly that the number of connections (and weights) differs from one layer type to another, and that I will learn this later?

j is just a variable addressing a particular unit in a layer. If you look at the enlarged portion of the diagram you posted, you can see that layer 3 has 3 units. Each circle within a layer is a unit.

So for layer 2, there are 5 units. If we were to take layer 2 unit 4, j would be 4 and l would be 2.
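The indexing above can be sketched in code. This is a hypothetical example (the layer sizes are made up: 4 units in layer 1, 5 in layer 2, 3 in layer 3) showing one common convention where W[l][j, k] is the weight from unit k in layer l−1 into unit j in layer l:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical layer sizes: layer 1 has 4 units, layer 2 has 5, layer 3 has 3.
sizes = {1: 4, 2: 5, 3: 3}

# W[l][j, k] = weight from unit k in layer l-1 into unit j in layer l.
W = {l: rng.normal(size=(sizes[l], sizes[l - 1])) for l in (2, 3)}

# "Layer 2, unit 4" (l = 2, j = 4, 1-indexed) owns one row of W[2]:
# its incoming weights, one per unit in layer 1.
incoming = W[2][4 - 1]
print(incoming.shape)  # (4,)
```

So j picks a row of the layer's weight matrix, and the row length is set by how many units feed into it, which is why the count of weights changes from layer to layer.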