How many images to collect for a dataset to train a pre-trained model

There is no “one size fits all” answer to a general question like this. The only general answer is “enough for the training to work successfully”. Of course that depends on the nature of your problem, and there are other design choices that you also need to get right for things to work (e.g. the type and architecture of the network you use).

Here’s a recent thread that shows some experimental results on dataset size for a particular problem, although it deals with convolutional nets. Those are a more advanced topic that is covered in DLS Course 4, but the same principles for evaluating the performance of a network apply. That may at least give you some concrete ideas for how to approach this question in a given situation.
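
One practical way to approach the question is to train on increasing subsets of the data you already have and see how validation accuracy changes with size. Here is a minimal sketch of that idea, and it is only an illustration: the data is synthetic (two 2-D Gaussian blobs), the "model" is a toy nearest-centroid classifier rather than a neural network, and all of the names (`make_data`, `fit_centroids`, `learning_curve`) are hypothetical. In a real project you would swap in your own images and your own fine-tuning routine.

```python
import random


def make_data(n, rng):
    """Synthetic stand-in for a labeled dataset: two 2-D Gaussian blobs."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 2.0
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((point, label))
    return data


def fit_centroids(train):
    """Toy 'training': compute the mean point of each class."""
    sums = {}
    for (x, y), label in train:
        sx, sy, count = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, count + 1)
    return {label: (sx / c, sy / c) for label, (sx, sy, c) in sums.items()}


def predict(centroids, point):
    """Predict the class whose centroid is closest to the point."""
    def sq_dist(label):
        cx, cy = centroids[label]
        return (point[0] - cx) ** 2 + (point[1] - cy) ** 2
    return min(centroids, key=sq_dist)


def accuracy(centroids, data):
    correct = sum(predict(centroids, p) == label for p, label in data)
    return correct / len(data)


def learning_curve(sizes, seed=0, val_size=500):
    """Train on the first n examples for each n and score on a fixed validation set."""
    rng = random.Random(seed)
    pool = make_data(max(sizes), rng)
    val = make_data(val_size, rng)
    results = []
    for n in sizes:
        subset = pool[:n]
        model = fit_centroids(subset)
        results.append((n, accuracy(model, subset), accuracy(model, val)))
    return results


if __name__ == "__main__":
    for n, train_acc, val_acc in learning_curve([10, 30, 100, 300, 1000]):
        print(f"n={n:5d}  train_acc={train_acc:.3f}  val_acc={val_acc:.3f}")
```

If validation accuracy is still climbing at the largest size you tried, more data is likely to help. If it has flattened out, collecting more images will probably matter less than revisiting the model or the training setup.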