We have some doubts about the statements below. Could you please help us understand them?
Statement at 6:44: "And the idea is that if you pick a data set and maybe have enough data not just to train a single softmax unit but to train some other size neural network that comprises the last few layers of this final network that you end up using."
Statement at 7:01: "Finally, if you have a lot of data, one thing you might do is take this open source network and weights and use the whole thing just as initialization and train the whole network. Although again, if this was a 1000-way softmax and you have just three outputs, you need your own softmax output, the output of labels you care about. But the more labeled data you have for your task, or the more pictures you have of Tigger, Misty and neither, the more layers you could train, and in the extreme case you could use the weights you download just as initialization, so they would replace random initialization, and then you could do gradient descent, training and updating all the weights in all the layers of the network."
Statement 2 doubt: it says that the more labeled data we have, the more layers we could train. Does "more layers" refer to the total number of layers (frozen layers plus later layers), or only to the later, unfrozen layers?
One more doubt: does downloading the open-source implementation's weights mean that we should take the final trained weights of the open-source network, use them as initialization, and then run gradient descent to train all the weights?
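
To make the question concrete, here is a minimal sketch in plain Python of how we currently read the two statements. It is only an illustration under our own assumptions, not the lecture's code: `make_pretrained` and `prepare_for_transfer` are hypothetical helpers, the random numbers merely stand in for downloaded weights, and no real deep learning framework is involved. Each layer is a dictionary with its weights and a `trainable` flag that says whether gradient descent would update it.

```python
import copy
import random


def make_pretrained(layer_sizes, seed=0):
    """Stand-in for downloaded open-source weights (values are made up)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
        layers.append({"weights": weights, "trainable": True})
    return layers


def prepare_for_transfer(pretrained, num_classes, num_trainable_hidden, seed=1):
    """Copy the pretrained layers, swap in a new softmax layer, freeze the rest.

    num_trainable_hidden counts how many of the LAST hidden layers stay
    trainable; every earlier layer is frozen.
    """
    layers = copy.deepcopy(pretrained)
    hidden_count = len(layers) - 1
    if not 0 <= num_trainable_hidden <= hidden_count:
        raise ValueError("num_trainable_hidden must be between 0 and %d" % hidden_count)

    # Replace the old output layer (e.g. 1000 classes) with a freshly
    # initialized one sized for our own labels. It is always trained.
    rng = random.Random(seed)
    n_in = len(layers[-1]["weights"])
    new_output = [[rng.gauss(0.0, 0.1) for _ in range(num_classes)] for _ in range(n_in)]
    layers[-1] = {"weights": new_output, "trainable": True}

    # Freeze the early hidden layers; keep the last few trainable.
    first_trainable = hidden_count - num_trainable_hidden
    for index in range(hidden_count):
        layers[index]["trainable"] = index >= first_trainable
    return layers


# Three hidden layers plus a 1000-way softmax, as a toy "downloaded" network.
pretrained = make_pretrained([8, 16, 16, 16, 1000])

small_data = prepare_for_transfer(pretrained, num_classes=3, num_trainable_hidden=0)
more_data = prepare_for_transfer(pretrained, num_classes=3, num_trainable_hidden=2)
lots_of_data = prepare_for_transfer(pretrained, num_classes=3, num_trainable_hidden=3)

for name, net in [("small", small_data), ("more", more_data), ("lots", lots_of_data)]:
    print(name, [layer["trainable"] for layer in net])
```

In this reading the total number of layers never changes; only the number of later layers marked trainable grows with the amount of data. In the last case every layer is trainable, and the hidden layers start from the downloaded values instead of random ones before gradient descent updates them.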