C4w2 multiple times fitting

When the following code is run once, the accuracy starts from 40%, but when I run it 3-4 times, the accuracy increases to approximately 99%. Is this because the weights of the model are updated each time I run the code? In the previous models, the accuracy started from the lowest point whenever I fit the model, even if I had run the code 3-4 times. So is it down to weight initialization, which behaves differently in this case than in the earlier ones?

Hey @Mohammad_Hamza,
Weight initialization is indeed the reason. Had this code cell performed the weight initialization, the accuracy would start from the lowest point no matter how many times we ran it.

However, since the weights are loaded from a pre-trained model and only fine-tuned in this code cell, every run of the cell trains the model for another 5 epochs, starting from the weights left by the previous run. In this case, the training accuracy is most likely to increase, and the validation accuracy is most likely to decrease due to over-fitting.

Now, I am not sure which earlier assignments you are referring to, but yes, the crux is weights initialization only.

Cheers,
Elemento

But Sir, in the ResNet model in C4W2 the weights were initialized within the model, yet when I fit the model a second time, the accuracy started from the point where it stopped last time, as shown:


What is the reason in this case now?

Hey @Mohammad_Hamza,
I guess there is a gap between what I wanted to say and what I wrote in my last post, or let’s say that I wasn’t really explicit in stating that weights initialization is a process that is independent of transfer learning.

For understanding as to why this is happening, we just need to find out the line of code in which the weights are initialized. In the ResNet assignment, the line of code in which the weights are initialized is as follows:

model = ResNet50(input_shape = (64, 64, 3), classes = 6)

What does this mean for us? If you run the code starting from here till the point where model.fit() method is called, again and again, you will find that the weights are re-initialized every time, and the training starts from scratch. In simple words, irrespective of how many times you run the below lines of code together, the training will start from scratch.

model = ResNet50(input_shape = (64, 64, 3), classes = 6)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs = 10, batch_size = 32)

However, if you run the below lines of code only once;

model = ResNet50(input_shape = (64, 64, 3), classes = 6)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

and run the below line again and again;

model.fit(X_train, Y_train, epochs = 10, batch_size = 32)

You will find that the training continues from where it last stopped, since the weights are not initialized again. In fact, the weights are the ones that were found the last time the above line of code was run. I guess this answers your query as to why you are getting what you are getting.

Now, if you are wondering why these lines of code weren't put together in the assignment to avoid this kind of confusion, it is perhaps because keeping them separate allowed the developers to document each line individually, in a much simpler manner for the learners to grasp. It may also have been done to avoid overloading the learners with information.
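To make the difference between the two cases concrete, here is a minimal sketch in plain Python. It does not use Keras at all: `TinyLinearModel` is a hypothetical stand-in I made up for illustration, with weights initialized in the constructor (like the `ResNet50(...)` call) and a `fit` method that only updates whatever weights already exist (like `model.fit`).

```python
import random


class TinyLinearModel:
    """Hypothetical stand-in for a Keras model: predicts y = w * x."""

    def __init__(self, seed=0):
        # Weights are initialized here, at construction time only.
        self.w = random.Random(seed).uniform(-0.1, 0.1)

    def fit(self, xs, ys, epochs=5, lr=0.05):
        # fit() never re-initializes; it keeps updating the current self.w.
        losses = []
        for _ in range(epochs):
            grad = sum(2 * (self.w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
            self.w -= lr * grad
            loss = sum((self.w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
            losses.append(loss)
        return losses  # one mean squared error value per epoch


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # underlying relationship: y = 2x

# Case 1: construct and fit together, twice. Each run starts from scratch.
run_a = TinyLinearModel().fit(xs, ys)
run_b = TinyLinearModel().fit(xs, ys)

# Case 2: construct once, fit twice. The second call continues from the first.
model = TinyLinearModel()
first = model.fit(xs, ys)
second = model.fit(xs, ys)

print("case 1, first-epoch losses:", run_a[0], run_b[0])
print("case 2, end of first fit vs start of second:", first[-1], second[0])
```

In case 1 both runs report the same first-epoch loss, because the model is rebuilt each time. In case 2 the second call starts below where the first call ended, which is the behaviour you observed in the notebook.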

Now, here's a fun fact. I am not sure whether this has already been covered in the previous assignments or will be covered in future ones, but let me put it out there for anyone who is as curious as you are.

model = ResNet50(input_shape = None, classes = 6)

In the above line of code, you will find that we haven't defined the shape of the input. This means that the model doesn't know the shape of the weight matrices yet, and hence, the above line of code doesn't initialize the weights.

It is only after we pass the inputs to the model that the model figures out the shape of the weight matrices, and it initializes the weights, i.e., when the below line of code is run.

model.fit(X_train, Y_train, epochs = 10, batch_size = 32)

For more information about the same, check out the Version 12 of this kernel. So, now in your opinion, what do you think will happen if we run the below lines of code only once;

model = ResNet50(input_shape = None, classes = 6)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

and run the below line of code again and again;

model.fit(X_train, Y_train, epochs = 10, batch_size = 32)

Do you think that the weights will get initialized again every time the above line of code runs? If so, then let me correct you my friend. In this case, as well, the training will continue from the last point, i.e., the weights aren’t initialized from scratch.

This is because the above line of code only initializes the weights if they haven't been initialized yet, i.e., if the model doesn't have any weights. Once the model has some weights, the fit method will always use those weights instead of initializing them from scratch.
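The "initialize only if there are no weights yet" behaviour can be sketched the same way. Again, this is plain Python and not Keras: `LazyModel`, `_build`, and `build_count` are hypothetical names invented for this illustration, and the guard inside `fit` is my simplified picture of the behaviour described above, not the library's actual code.

```python
import random


class LazyModel:
    """Hypothetical stand-in for a model created without an input shape."""

    def __init__(self):
        self.weights = None   # input size unknown, so nothing to initialize yet
        self.build_count = 0  # counts how often weights were initialized

    def _build(self, n_features):
        rng = random.Random(0)
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
        self.build_count += 1

    def _predict(self, row):
        return sum(w * v for w, v in zip(self.weights, row))

    def fit(self, X, y, epochs=5, lr=0.05):
        # Weights are created on the first call only, once the data
        # reveals the number of input features.
        if self.weights is None:
            self._build(len(X[0]))
        n = len(X)
        for _ in range(epochs):
            errors = [self._predict(row) - t for row, t in zip(X, y)]
            grads = [
                2 * sum(e * row[j] for e, row in zip(errors, X)) / n
                for j in range(len(self.weights))
            ]
            self.weights = [w - lr * g for w, g in zip(self.weights, grads)]
        # Mean squared error after this call's final epoch.
        return sum((self._predict(row) - t) ** 2 for row, t in zip(X, y)) / n


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]  # consistent with weights [1, 2]

model = LazyModel()
weights_before_fit = model.weights  # still None: nothing built yet
loss_1 = model.fit(X, y)            # first call builds, then trains
loss_2 = model.fit(X, y)            # second call only trains further

print("builds:", model.build_count)
print("loss after first fit:", loss_1, "after second fit:", loss_2)
```

The weights are built exactly once, on the first `fit` call, and the second call picks up from there, so its loss is lower than the first call's.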

In fact, this query is the reason why I put the above line in my previous reply. When I was composing that reply, I checked the ResNet assignment for reference as well (in case that was the previous assignment you were referring to), but the same thing is supposed to happen in that assignment too, as you just pointed out. So I thought that perhaps you were referring to earlier assignments in earlier courses of the specialization. Anyway, I hope this helps.

Cheers,
Elemento