Will the tile function result in overfitting?

In week 1 of the 2nd course, we used the np.tile() function to replicate the data so that the model trains better. But since many rows are now duplicates, won’t this cause the model to overfit and have trouble generalizing?

Hi @Utsav_Sharma1,

That’s a good question. The answer is that it depends on the results. Replicating data is one method of balancing the ratios in a dataset.
We often have to experiment with the dataset in various ways to improve the model. If the results get worse, you can reduce the amount of duplicated data and see whether that helps.

In addition to what was said:

With a fixed batch size, duplicating the data will get you to the same parameters in half the number of epochs, because each epoch now contains twice as many update steps.

As for overfitting, the effect is exactly the same as keeping the data unduplicated and doubling the number of epochs.
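
To make that equivalence concrete, here is a minimal sketch in plain NumPy. It is not the course lab code: the `train` function, the toy data, the batch size, and the learning rate are all made up for illustration. It assumes mini-batch gradient descent with a fixed batch size and no shuffling. With shuffling, the two runs would match only approximately, and with full-batch gradient descent on an averaged loss, duplicating rows would not change the updates at all.

```python
import numpy as np

def train(X, y, epochs, batch_size=4, lr=0.1):
    """Hypothetical helper: mini-batch gradient descent for linear
    regression, visiting the batches in order (no shuffling)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for start in range(0, len(X), batch_size):
            xb = X[start:start + batch_size]
            yb = y[start:start + batch_size]
            err = xb @ w + b - yb              # prediction error on this batch
            w -= lr * (xb.T @ err) / len(xb)   # gradient step for the weights
            b -= lr * err.mean()               # gradient step for the bias
    return w, b

# Toy data (assumed for illustration): 8 examples, 2 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
y = X @ np.array([2.0, -1.0]) + 0.5

# Replicate the whole dataset twice, as np.tile does in the lab.
X_tiled = np.tile(X, (2, 1))   # shape (16, 2)
y_tiled = np.tile(y, 2)        # shape (16,)

# 5 epochs on the tiled data vs. 10 epochs on the original data.
w_tiled, b_tiled = train(X_tiled, y_tiled, epochs=5)
w_orig, b_orig = train(X, y, epochs=10)

same = bool(np.allclose(w_tiled, w_orig) and np.isclose(b_tiled, b_orig))
print(same)
```

Under these assumptions both runs apply the same sequence of batch updates, so the comparison should come out True. Tiling adds no new information; it only changes how many updates happen per epoch.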

So you mean that it is going to overfit in both cases: when the data is replicated and when we increase the number of epochs?

Yes, it has the same effect.