Course 3 Week 2 Data mismatch

Termu · January 4, 2023, 1:52pm

In the video on addressing data mismatch, the example synthesizes data from 10,000 hours of normal (quiet-background) voice recordings plus 1 hour of car noise.
Does this mean we only get 1 hour of data with in-car voice while the rest is just normal voice, and that is why the NN overfits? Or is it something else?
The instructor also said we can repeat the car noise 10,000 times so that it covers 10,000 hours,
so in the end we get 10,000 hours of in-car audio. How can it overfit if all the data is similar?

rmwkwok · January 5, 2023, 6:37am

Hello @Termu,

Let’s take a look again at what Andrew had said:

“One thing you could try is take this one hour of car noise and repeat it 10,000 times in order to add to this 10,000 hours of data recorded against a quiet background. If you do that, the audio will sound perfectly fine to the human ear, but there is a chance, there is a risk that your learning algorithm will overfit to the one hour of car noise.”

He said “there is a chance, there is a risk” that it can overfit. So using only 1 hour of car noise does not guarantee overfitting; it may or may not happen. Andrew went on to illustrate in the lecture why it may happen.

I think he meant that the 1 hour of car noise we collected might turn out not to be representative at all. For example, if person A recorded one hour of car noise while driving alone, then that hour of car noise CANNOT represent the car noise when there are passengers, such as a family with 2 kids.

If we used person A’s one hour of car noise, our model would be prone to overfitting to the kind of noise that person A collected, and it could fail when noise from a family is present.
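
To make the point concrete, here is a toy sketch of my own (not code from the course). It loops one short noise track under many clean clips, the way the one hour of car noise would be repeated under 10,000 hours of speech. The function names `synthesize` and `distinct_noise_offsets` and all the sizes are made up for illustration, and the "audio" is just random numbers.

```python
import random

def synthesize(clean_clips, noise, noise_gain=0.5):
    """Mix each clean clip with the next slice of a looping noise track."""
    mixed, pos = [], 0
    for clip in clean_clips:
        # Take len(clip) noise samples, wrapping around when the track ends.
        segment = [noise[(pos + i) % len(noise)] for i in range(len(clip))]
        mixed.append([s + noise_gain * n for s, n in zip(clip, segment)])
        pos = (pos + len(clip)) % len(noise)
    return mixed

def distinct_noise_offsets(num_clips, clip_len, noise_len):
    """Count how many different noise start positions the clips ever see."""
    return len({(k * clip_len) % noise_len for k in range(num_clips)})

random.seed(0)
# Toy sizes (hypothetical): 1000 clean clips of 10 samples, one 100-sample noise track.
clip_len, noise_len, num_clips = 10, 100, 1000
clean = [[random.gauss(0, 1) for _ in range(clip_len)] for _ in range(num_clips)]
noise = [random.gauss(0, 1) for _ in range(noise_len)]

mixed = synthesize(clean, noise)
unique_offsets = distinct_noise_offsets(num_clips, clip_len, noise_len)
print(len(mixed), unique_offsets)  # 1000 10
```

We end up with 1000 noisy clips, but they only ever contain 10 distinct noise segments. Repeating the track increases the amount of data without adding any new kinds of noise, so a model can end up memorizing those few segments instead of learning to handle car noise in general.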

Cheers,
Raymond

Termu · January 5, 2023, 7:27am

Thank you for the response. I had the same intuition.
