In the video “Tips for getting started”, Andrew talks about trying out code on a small subset of data before training on the whole training set. For many complex algorithms, we know that they perform better when given more and more data.
So, if we give only a small subset of data to the algorithm, how will we know whether it is performing well or badly?
Hi @mukul1997
Welcome to our community.
I think what Dr. Andrew Ng suggests is just a sanity check of the model before spending hours training it on a large dataset.
He gives the example of a speech recognition system. He tried to overfit his model on just one audio clip from the training set and realized that the system returned ‘space, space, space, space, space, space’. Clearly it wasn’t working. There was no point in spending hours and hours training it on a giant training set if it couldn’t even fit a tiny one.
So the tip Andrew Ng recommends is to try the model on a small dataset first, to check that it works at all and avoid spending many hours training it on a giant dataset. He is not talking about estimating the model’s final performance — you still need the full dataset for that.
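To make the idea concrete, here is a minimal sketch of that kind of sanity check (my own illustrative example, not code from the course): a simple logistic regression trained on a tiny, hand-made subset of 8 points. If the implementation is correct, it should be able to fit (even overfit) such a small separable set perfectly; if it can’t, something is broken and there is no point training on millions of examples.

```python
import numpy as np

# Sanity check: before training on the full dataset, verify the model
# can fit a tiny subset. A working model should reach ~100% accuracy here.

# Tiny, linearly separable "subset" of 8 examples with 2 features
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
              [2.0, 2.0], [2.1, 1.9], [1.9, 2.2], [2.2, 2.1]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Simple logistic regression trained with full-batch gradient descent
w = np.zeros(2)
b = 0.0
lr = 0.5
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
    grad_w = X.T @ (p - y) / len(y)          # gradient of log loss w.r.t. w
    grad_b = np.mean(p - y)                  # gradient w.r.t. bias
    w -= lr * grad_w
    b -= lr * grad_b

train_acc = np.mean(((X @ w + b) > 0) == y)
print(train_acc)
```

If `train_acc` is well below 1.0 on a set this small and this easy, the bug is in the code (loss, gradients, data pipeline), not in the amount of data — which is exactly the kind of failure the small-subset check catches cheaply.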
Hope this helps
Regards