How many images to collect for a dataset to train a pre-trained model

There is no “one size fits all” answer to a general question like this. The only general way to answer that is “enough to get the training to work successfully”. Of course that depends on the nature of your problem and there are other design choices that you need to get right in order for things to work (e.g. the type and architecture of the network you use).

Here’s a recent thread that shows some experimental results with dataset size for a particular problem, although it deals with convolutional nets. Those are a more advanced topic covered in DLS Course 4, but the same principles for evaluating the performance of a network apply. At least it may give some concrete ideas for how to approach this question in a given situation.

In the thread linked above, I tried a simple experiment on an untrained model: I ran it over and over with different numbers of inputs drawn from a given overall training set. It was easy and relatively quick to do because the model architecture was quite small and the dataset was only 15,000 records. Two caveats apply.

First, I was not using any transfer learning: the model was trained from scratch each time on its own input subset. I did this partly because that was the question I was exploring (how much labelled training data I would need to train a model from scratch), and partly because I could: each training run executed in roughly a minute. It took far longer to screen-capture and upload the images to this forum than to do all the training runs. If your model takes a week to train, you probably don’t have the luxury of figuring this out using my naive, brute-force approach.
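The brute-force experiment above can be sketched roughly like this. Everything here is a toy stand-in for illustration, not the actual model or dataset from the thread: a tiny logistic-regression model trained from scratch on subsets of increasing size, recording validation accuracy at each size to build a learning curve.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a labelled dataset (the real experiment used ~15,000 records).
n_total, n_features = 2000, 5
X = rng.normal(size=(n_total, n_features))
true_w = rng.normal(size=n_features)       # hidden "true" decision boundary
y = (X @ true_w > 0).astype(float)

X_val, y_val = X[-500:], y[-500:]          # held-out validation split
X_pool, y_pool = X[:-500], y[:-500]        # pool we draw training subsets from

def train_logreg(X, y, lr=0.1, epochs=200):
    """Train a logistic-regression model from scratch (no transfer learning)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)          # gradient step on weights
        b -= lr * np.mean(p - y)                  # gradient step on bias
    return w, b

def accuracy(w, b, X, y):
    return float(np.mean(((X @ w + b) > 0) == y))

# Re-train from scratch on subsets of increasing size and watch validation accuracy.
results = {}
for n in [10, 50, 100, 500, 1500]:
    w, b = train_logreg(X_pool[:n], y_pool[:n])
    results[n] = accuracy(w, b, X_val, y_val)
    print(f"trained on {n:5d} inputs -> validation accuracy {results[n]:.3f}")
```

Plotting validation accuracy against subset size shows where the curve flattens out, which is one concrete way to answer “how much data is enough” for a given problem.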

The good news is that transfer learning exists to leverage and improve upon an investment already made in model training. The article linked by @Christian_Simonis in that thread obtained good results from only a handful of training inputs. He had also previously provided a link to a more sophisticated approach than mine, again leveraging transfer learning, based on measuring how much uncertainty each additional training input removes.

So my takeaway for an untrained model is ‘it depends’, but don’t be surprised if it takes on the order of 10^4 inputs. For transfer learning, the answer is still ‘it depends’. It depends on how many layers are frozen and how many are being retrained (or whether the architecture is being extended). It depends on how similar the new training inputs are to the original inputs. I’m sure there are more variables. Nevertheless, expect to need far fewer than the 10^4 suggested by my non-transfer-learning case study; perhaps as few as on the order of 10.
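The frozen-layers idea can be illustrated with a minimal sketch. This is an assumption-laden toy, not any real pre-trained network: a fixed random projection stands in for the frozen early layers, and only a new linear head is trained on roughly ten labelled examples of the “new” task.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a pre-trained feature extractor: a fixed projection playing
# the role of frozen early layers (purely illustrative, not a real network).
n_features, n_hidden = 20, 8
W_frozen = rng.normal(size=(n_features, n_hidden))

def extract_features(X):
    """Frozen layers: never updated during fine-tuning."""
    return np.tanh(X @ W_frozen)

# A tiny labelled set for the *new* task; labels are linearly
# recoverable from the frozen features, so a small head can learn them.
X_new = rng.normal(size=(30, n_features))
head_true = rng.normal(size=n_hidden)
y_new = (extract_features(X_new) @ head_true > 0).astype(float)

X_train, y_train = X_new[:10], y_new[:10]   # order-of-magnitude 10 training inputs
X_test, y_test = X_new[10:], y_new[10:]

def train_head(F, y, lr=0.5, epochs=500):
    """Fine-tune only the new head (logistic regression) on frozen features."""
    w, b = np.zeros(F.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
        w -= lr * F.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

w, b = train_head(extract_features(X_train), y_train)
train_acc = float(np.mean(((extract_features(X_train) @ w + b) > 0) == y_train))
test_acc = float(np.mean(((extract_features(X_test) @ w + b) > 0) == y_test))
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The head has only a handful of parameters to learn, which is why so few examples can suffice; how well this works in practice depends, as noted above, on how similar the new task is to what the frozen layers were originally trained on.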

Hope this helps

By the way, if you come across some quality articles or do some of your own experiments, please share.
