Week 2 - Addressing Data Mismatch (Video) - Car Audio Scenario

One possible way to overlay 1 hour of car noise on 10,000 or more hours of clean audio, without overfitting to the noise, is as follows:
Step 1: Divide the 1-hour noise recording into 60 one-minute fragments.
Step 2: Join those 60 fragments in random order to synthesize noise of a longer duration. 10^60 possibilities.

This may not stop the model from recognising the 60 individual fragments, but it will remove the effect of their sequential order, compared with replicating the same audio 10,000 times.
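A minimal Python sketch of the two steps, under loud assumptions: each fragment is just a list of samples, and the names `synthesize_noise`, `fragments`, and `n_slots` are made up for illustration. Loading real audio and mixing it with the clean speech are left out.

```python
import random

def synthesize_noise(fragments, n_slots, rng=None, with_replacement=True):
    """Concatenate n_slots randomly chosen fragments into one noise track.

    fragments: list of sample lists (hypothetical stand-in for 1-minute clips).
    Returns the concatenated track and the fragment indices that were used.
    """
    rng = rng or random.Random()
    if with_replacement:
        # Any fragment may appear more than once in the track.
        picks = [rng.randrange(len(fragments)) for _ in range(n_slots)]
    else:
        # Each fragment appears at most once; requires n_slots <= len(fragments).
        picks = rng.sample(range(len(fragments)), n_slots)
    track = []
    for i in picks:
        track.extend(fragments[i])
    return track, picks

# Toy data: 60 "fragments" of 4 samples each instead of real one-minute clips.
fragments = [[float(i)] * 4 for i in range(60)]
track, order = synthesize_noise(fragments, n_slots=10, rng=random.Random(0))
```

Calling this once per clean utterance, with a fresh random order each time, gives every utterance a differently ordered noise background built from the same 60 fragments.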

Any thoughts on that?

Where does your 10^60 come from?

It is an oversimplified example.
Assuming we need to synthesize 10 minutes of audio from 60 different 1-minute fragments,
the number of ways to fill those 10 slots is 60^10 (about 6.0 × 10^17) with replacement, or 60P10 = 60!/50! (about 2.7 × 10^17) without replacement. The 10^60 above has the base and exponent swapped.
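A quick way to check the counts for the 10-minute case, assuming Python 3.8+ for `math.perm`. With 60 fragments and 10 slots, ordered arrangements number 60^10 with replacement and 60!/50! without. The variable names are only illustrative.

```python
import math

n_fragments = 60  # one-minute pieces cut from the 1-hour noise recording
n_slots = 10      # minutes of noise to synthesize

# Ordered arrangements when a fragment may be reused: 60 choices for each slot.
with_replacement = n_fragments ** n_slots

# Ordered arrangements when each fragment is used at most once: 60! / 50!.
without_replacement = math.perm(n_fragments, n_slots)

print(f"with replacement:    {with_replacement:.3e}")
print(f"without replacement: {without_replacement:.3e}")
```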