Hi Mentor,

In this lab, what does “ideal curve” mean? I'm unable to understand the statement below. Can someone explain it briefly?

  • the ‘ideal’ curves represent the generator model to which noise was added to achieve the data set

Hey @Anbu,
The data used to plot this curve is artificially generated; you can find the code for it in the plt_overfit.py script. You will also find that some noise has been added to this artificially generated data so that it resembles real-world data more closely.

So, the expression “generator model to which noise was added to achieve the data set” refers to the model (all the different segments of code) used to create the dataset. The statement says that the “ideal” curves represent this model. In other words, the “ideal” curves show the mapping between x_0 and x_1 that this model creates, before any noise is added. Let me know if this helps.


Thanks, sir, for the reply. But what is the use of having ideal curves drawn using artificial noise data? What is it showing us?

Does “generator model” mean the model used to create the dataset? Is it the final prediction model?

Can you please explain, sir?

Hey @Anbu,
The aim of the lab is to illustrate the concept of “overfitting”, which I assume you have understood from the lecture videos. If not, please review the lecture videos once again.

Getting to the lab, we are not using “artificial noise data”; we are using “artificially generated data to which noise is added”. Please read this carefully to understand the difference between the two. If the latter expression seems confusing, just assume that we are using a real-world dataset with 2 features, x_0 and x_1.

I believe that once you understand that “artificially generated data to which noise is added” is equivalent to any other real-world dataset, you won’t have to think much about what the “generator model” is and what it does.

But if you still want to know, let me try. A real-world dataset is created by collecting samples from different sources. So, tell me this: how would an artificial dataset be created? You will need some function that takes in the input variables and produces some output, which will be your true label (or true value), right? This function is none other than our generator model in this case.
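To make the idea concrete, here is a minimal sketch of a generator model with noise added on top. It is not the code from plt_overfit.py; the quadratic function, the names ideal_curve and make_dataset, and the noise scale are all assumptions chosen for illustration.

```python
import numpy as np

def ideal_curve(x):
    # Hypothetical generator model: a noise-free quadratic.
    # This is an assumption, not the function used in the lab.
    return x ** 2

def make_dataset(n_samples=20, noise_scale=0.5, seed=0):
    # Build inputs, the noise-free "ideal" targets, and the noisy targets.
    rng = np.random.default_rng(seed)
    x = np.linspace(-2.0, 2.0, n_samples)
    y_ideal = ideal_curve(x)
    # Gaussian noise makes the data look more like real measurements.
    y_noisy = y_ideal + noise_scale * rng.normal(size=n_samples)
    return x, y_ideal, y_noisy

x, y_ideal, y_noisy = make_dataset()
```

Plotting y_ideal against x would give the “ideal” curve, while y_noisy is the dataset a model would actually be trained on.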

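As for whether it is the final prediction model, that query seems a bit abstract. The predictions are the values that are supposed to be similar to the true labels/values. The generator model produces those true labels; the prediction model is the one you fit to the data afterwards.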

I hope this helps.

P.S. - Please don’t refer to me as Sir. I am just a learner like you.


It’s an easy way to generate a data set for testing your code.
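As a rough illustration of why generated data is handy for testing, the sketch below fits polynomials to noisy data whose true curve is known. The quadratic generator, the degrees 2 and 6, and the noise scale are assumptions made for this example and are not taken from the lab.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical generator model: y = x^2 (coefficients for np.polyval).
true_coeffs = np.array([1.0, 0.0, 0.0])
x = np.linspace(-2.0, 2.0, 30)
y_ideal = np.polyval(true_coeffs, x)
y_noisy = y_ideal + 0.5 * rng.normal(size=x.size)

def training_mse(degree):
    # Least-squares polynomial fit, scored on the same noisy points.
    coeffs = np.polyfit(x, y_noisy, degree)
    return float(np.mean((np.polyval(coeffs, x) - y_noisy) ** 2))

mse_low = training_mse(2)
mse_high = training_mse(6)

# Sanity check for the fitting code: with no noise, a degree-2 fit
# should recover the generator's coefficients.
clean_fit = np.polyfit(x, y_ideal, 2)

print("training MSE, degree 2:", mse_low)
print("training MSE, degree 6:", mse_high)
print("coefficients recovered from noise-free data:", clean_fit)
```

The more flexible fit never has a higher training error, but that alone does not show it is closer to the ideal curve. Comparing a fit against y_ideal is only possible because the data was generated, which is what makes this kind of dataset useful for checking your code.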