MLE for Gaussian population - Reading

Hello everyone. I have 2 questions. In this explanation, when you have n samples X = (X1, X2, …, Xn), I understand that each of X1, X2, …, Xn is a sample from a population. Why does each Xi follow a Normal distribution, rather than X following a Normal distribution? I think X is the random variable and X1, X2, …, Xn are values of this random variable. And if X1 is a value, why do we have x1, x2, …, xn in the likelihood function? Thank you!

Hey @Luong_Nguyen_Dinh,

I can’t think of any instance where I have seen an individual and unique notation to represent random samples drawn from a particular distribution. I even searched on the web for one, but I couldn’t find any.

So, instead of interpreting this notation as X_i follows a Normal Distribution, please interpret this as “X_i are independent and identically distributed samples, drawn from a normal distribution”. In fact, this is the only notation for IID samples that I have come across up until now.

If you find a well-known mathematical notation for the above quoted interpretation, please do let us know, and we will be more than happy to modify the reading item.

You are correct on this, thanks a lot for pointing out this discrepancy. Either the random samples should be represented with x_1, x_2, ..., x_n, or in the definition of likelihood, X_1, X_2, ... , X_n should be used. I will pass this on to the team for correction purposes.


Hi guys!

This can be quite confusing.

You are correct: you can also consider X to be a random variable. However, to work with the likelihood function, it is better to consider X_1, X_2, \ldots, X_n as n distinct draws from the same distribution.

Regarding the likelihood function, in fact the text is correct.

For each X_i, f_{X_i} is the PDF of X_i. Since X_i \sim N(\mu, \sigma^2), this PDF is a real-valued function defined on all of \mathbb{R}, so we can evaluate f_{X_i}(x) for any x \in \mathbb{R}; the result is the density at that specific value.

Considering X = (X_1, X_2, \ldots, X_n) with X_{i} \sim N(\mu, \sigma^2), we define the likelihood function as the product of all the f_{X_i} (in this case they are the same function, because the X_i come from the same distribution). We then consider one observation of this random vector, written as x = (x_1, x_2, \ldots, x_n), and evaluate each f_{X_i} at the corresponding x_i.

I will add this definition in the likelihood function.
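
For reference, here is how the likelihood described above could be written out, assuming the draws are independent (as stated earlier in the thread) and using the notation L(\mu, \sigma; x). This is a sketch and not necessarily the exact wording of the reading item:

```latex
L(\mu, \sigma; x)
  = \prod_{i=1}^{n} f_{X_i}(x_i)
  = \prod_{i=1}^{n} \frac{1}{\sigma \sqrt{2\pi}}
    \exp\!\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)
```

Here the uppercase X_i only appear as subscripts that say which PDF is being used, while the lowercase x_i are the real numbers at which those PDFs are evaluated.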


Hey @lucas.coutinho,
I guess I am more confused now. Are X_1, X_2, ... sample sets or samples? As far as I know, “sample” can mean two different things.

Let’s say that I have a Normal Distribution, N (0, 1), and from which I draw one value at random, say a_1, then a_1 is referred to as a sample. However, if I draw say 3 values at random, A = \{a_1, a_2, a_3\}, then A is also referred to as a sample.

Now, coming to the likelihood function, why are we computing it for any x \in R? Aren’t we supposed to compute it for the observed values, which as per the reading item are X_1, X_2, ...?



I also feel like you!

Hey guys!

I will try to explain a bit better. I will also add our Curriculum Developer, @magdalena.bouza, to the thread, so maybe she can improve the wording a bit to avoid such confusion.

In the case being discussed, even though we are assuming that we have n samples, we treat X as a random vector, so each f_{X_i} is the PDF of that specific random variable. Even though f_{X_i} does not return a probability itself (for that we must look at the area under the curve given by f_{X_i}), it does return a likelihood, where higher values mean that the point is more likely to be observed. The function L(\mu, \sigma; x) is therefore defined for any value at which f_{X_i} is defined; since we are dealing with normal variables, that means all real numbers. For any real vector x (now a vector of numbers, not of random variables or functions), this function gives the likelihood of that specific set of n numbers being observed.
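
To make this concrete, here is a minimal Python sketch that evaluates the likelihood for two different real vectors. The function names `normal_pdf` and `likelihood` and the example numbers are made up for illustration; they are not taken from the reading item.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at the real number x (not a probability)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def likelihood(mu, sigma, xs):
    """L(mu, sigma; x): product of the normal PDF evaluated at each x_i."""
    result = 1.0
    for x_i in xs:
        result *= normal_pdf(x_i, mu, sigma)
    return result


# Two hypothetical vectors of real numbers (illustrative values only).
near_mean = [0.1, -0.2, 0.3]
far_from_mean = [2.5, -3.0, 4.0]

# Under mu = 0, sigma = 1, the vector closer to the mean gets the higher likelihood.
l_near = likelihood(0.0, 1.0, near_mean)
l_far = likelihood(0.0, 1.0, far_from_mean)
print(l_near > l_far)
```

Note that `likelihood` accepts any list of real numbers, which mirrors the point that L(\mu, \sigma; x) is defined for every real vector x, not only for the values that happened to be observed.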


Hey @lucas.coutinho,
Let me try to present my sources of confusion based off your replies. Please correct me wherever I am wrong.

When you mentioned X above, are you trying to refer to X = (X_1, X_2, ... X_n) or an individual X_i where i \in \{1, 2, ... n\}?

Also, when you mentioned “random vector”, what exactly is meant by that? Are you saying that it is a set of random elements? I believe in that case, it is the same as a sample set (or a set of elements / samples, where each element / sample is a random draw from the population)?



Hey @Elemento!

I was actually talking about X as a random vector. The confusion is that X_i is not a real number. It is a random variable: a function that takes a value \omega \in \Omega from the sample space \Omega and returns (in this case) a real number. We call it a random variable because we do not know exactly what value X_i will take, since we do not control which \omega occurs. What we can do is assign probabilities to its values. So when we write X = (X_1, X_2, \ldots, X_n), we are talking about X as a random vector of random variables, i.e., functions whose exact values we do not know, but for which we do know which values are more or less likely to happen.

So when we are talking about actual real numbers that come from realizations of the X_i's, we usually write them as x_i, to make the distinction between the function that represents the random variable, X_i, and the value that we actually observed, x_i.
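
If it helps, the distinction can be sketched in Python. This is only an illustration under a strong simplifying assumption: the outcome \omega is modelled as an integer seed, and the name `random_vector` is hypothetical.

```python
import random


def random_vector(omega, n=3, mu=0.0, sigma=1.0):
    """Plays the role of X = (X_1, ..., X_n): a function of the outcome omega.

    Assumption for illustration: omega is an integer seed, so the same omega
    always produces the same n real numbers.
    """
    rng = random.Random(omega)
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Before any outcome is fixed, random_vector is "uppercase X": a function.
# Once an outcome omega occurs, we get "lowercase x": a list of real numbers.
x = random_vector(42)
print(x)

# The same outcome always yields the same realization.
print(x == random_vector(42))
```

In practice we never get to see or choose \omega; we only see the realized numbers x_1, x_2, \ldots, x_n, which is why the likelihood is written in terms of the lowercase values.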

Is it more clear now? Please tell me if it isn’t, and we can continue further. I know that it can be quite complicated, but those are in fact complex definitions.


Hey @lucas.coutinho,
I believe that I have gained some clarity, but not completely. Let me present my query here.

What I have understood from your replies is that X is not a random variable, instead, it is a random vector. And each of the X_i is a random variable for i \in \{1, 2, ..., n\}.

Now, if I am correct on this, why is the following mentioned:

Suppose you have n samples, X = (X_1, X_2, ..., X_n)

Instead, wouldn’t it be better if it were something like:

Suppose you have a random vector, X consisting of n random variables, where each random variable X_i models the distribution of a single sample.

Please do correct me wherever I am wrong.


Hi @Elemento!

You are correct. It is confusing indeed. When we talk about samples, we usually consider them as random variables, because we do not know their values until we actually observe them, so the only thing we can describe is their behaviour as random variables. I will ask our Curriculum Developer to make it a bit clearer in the text. Thanks!

Hey @lucas.coutinho,
Finally, I guess I got it. Thanks a ton :partying_face:

I guess, the major source of confusion is the fact that throughout the lecture videos, when we talk about say n “samples” from the population, i.e., n random draws from the population, we had the observed values only. For instance, we were talking about 3 rolls of a fair die and we had 4, 1, 2. In other words, we simply had x = \{x_1, x_2, ..., x_n\}.

However, in this reading item, the first statement is concerned with the generic case (i.e., we haven’t observed any values yet), and that is why everything is represented by a random variable. And since the second statement is about the observed values, we come back to the lowercase x notation.

I really hope that I am correct now :joy: Please do correct me, if I am wrong. And yes indeed, it would be great, if the reading item can be elaborated a little bit more, so that it is easier to understand.

@Luong_Nguyen_Dinh, please feel free to let us know if you are still confused about this.