How can tiling data reduce the number of training epochs?

It says:
Tile/copy our data to increase the training set size and reduce the number of training epochs.

I don’t understand this.
If we want to increase the training set size, shouldn’t we collect new data instead of copying existing data? How is copying useful?

My understanding is that each epoch does something like gradient descent. So if epochs is 10, gradient descent is applied 10 times to the parameters (w and b), right? Does copying the existing data increase the number of gradient descent updates? If so, why do we still need epochs?

Hi @Jinyan_Liu,

Check this out. It explains why we want to do that.

Note that copying data does NOT replace getting new data. If you want to combat overfitting, you need to find new data.
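To make the mechanics concrete, here is a minimal sketch (not from the course materials) of how tiling interacts with mini-batch training. It assumes a fixed batch size, a toy dataset of 8 placeholder samples, and hypothetical values for `tiles` and `epochs`: copying the data k times gives k times as many mini-batch updates per epoch, so fewer epochs are needed for the same total number of updates.

```python
import numpy as np

# Toy dataset: 8 placeholder samples (values are just for illustration).
x = np.arange(8, dtype=float)

batch_size = 4
tiles = 5      # assumed: copy the dataset 5 times
epochs = 2     # assumed: number of passes over the tiled data

x_tiled = np.tile(x, tiles)   # 8 * 5 = 40 samples

# With mini-batch training at a fixed batch size, the number of
# gradient updates per epoch grows with the dataset size:
updates_per_epoch = len(x_tiled) // batch_size   # 40 // 4 = 10
total_updates = updates_per_epoch * epochs       # 10 * 2 = 20

print(updates_per_epoch, total_updates)
```

So the epoch count alone doesn’t determine how many times the parameters are updated; the dataset size (including copies) matters too.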


Now I understand!
The number of gradient descent updates is not the same as the number of epochs!
So whether it’s 40*5 or 10*20, gradient descent is applied the same number of times!
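That equivalence can be checked with a quick sketch. The `updates_per_pass` value below is an assumption (how many mini-batches one pass over the original, uncopied data produces); the 40*5 and 10*20 figures are the copies-times-epochs schedules from the discussion above.

```python
# Total gradient updates = copies * epochs * (mini-batch updates per
# pass over the original data), assuming a fixed batch size.
updates_per_pass = 3   # assumed: original data yields 3 mini-batches

schedule_a = 40 * 5 * updates_per_pass    # 40 copies, 5 epochs
schedule_b = 10 * 20 * updates_per_pass   # 10 copies, 20 epochs

# Both schedules apply the same number of parameter updates.
print(schedule_a == schedule_b)
```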
Thank you!