MLE for Gaussian population - Reading

Luong_Nguyen_Dinh · July 14, 2023, 5:35am

Hello everyone. I have two questions. The explanation says that we have n samples X = (X1, X2, …, Xn), which I take to mean that each of X1, X2, …, Xn is a sample from the population. First, why does each Xi follow a Normal distribution, rather than X? I think of X as the random variable and of X1, X2, …, Xn as values of that random variable. Second, if X1 is a value, why does the likelihood function use x1, x2, …, xn? Thank you!

Elemento · July 17, 2023, 5:37pm

Hey @Luong_Nguyen_Dinh,

I can’t think of any instance where I have seen an individual and unique notation to represent random samples drawn from a particular distribution. I even searched on the web for one, but I couldn’t find any.

So, instead of interpreting this notation as X_i follows a Normal Distribution, please interpret this as “X_i are independent and identically distributed samples, drawn from a normal distribution”. In fact, this is the only notation for IID samples that I have come across up until now.

If you find a well-known mathematical notation for the interpretation quoted above, please do let us know, and we will be more than happy to modify the reading item.

On your second question, you are correct; thanks a lot for pointing out this discrepancy. Either the random samples should be written as x_1, x_2, ..., x_n, or the definition of the likelihood should use X_1, X_2, ..., X_n. I will pass this on to the team for correction.

Cheers,
Elemento

lucas.coutinho · July 18, 2023, 11:56pm

Hi guys!

This can be quite confusing.

You are correct: you can also consider X to be a random variable. However, to work with the likelihood function, it is better to consider X_1, X_2, \ldots, X_n as n distinct draws from the same distribution.

Regarding the likelihood function, the text is in fact correct.

For each X_i, f_{X_i} is the PDF of X_i. When X_i \sim N(\mu, \sigma^2), this PDF is a real-valued function defined on all of \mathbb{R}, so we can evaluate f_{X_i}(x) for any x \in \mathbb{R}; the result is the density at that specific value.

Taking X = (X_1, X_2, \ldots, X_n) with X_{i} \sim N(\mu, \sigma^2), we define the likelihood function as the product of all the f_{X_i} (in this case they are the same function, because the X_i come from the same distribution). We then consider one observation of this random vector, written x = (x_1, x_2, \ldots, x_n).

I will add this definition in the likelihood function.
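
For reference, the product described above can be written out explicitly. This is only a sketch: it assumes the X_i are independent (the i.i.d. reading given earlier in the thread) and borrows the L(\mu, \sigma; x) notation used later in the discussion.

```latex
L(\mu, \sigma; x)
  = \prod_{i=1}^{n} f_{X_i}(x_i)
  = \prod_{i=1}^{n} \frac{1}{\sigma \sqrt{2\pi}}
    \exp\!\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)
```

The capital X_i only label which PDF is used; the arguments x_i are ordinary real numbers.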

Cheers,
Lucas

Elemento · July 19, 2023, 4:57am

Hey @lucas.coutinho,
I guess I am more confused now. Are X_1, X_2, ... sample sets or individual samples? As far as I know, “sample” can mean two different things.

Let’s say that I have a Normal Distribution, N(0, 1), from which I draw one value at random, say a_1. Then a_1 is referred to as a sample. However, if I draw, say, 3 values at random, A = \{a_1, a_2, a_3\}, then A is also referred to as a sample.

Now, coming to the likelihood function, why are we computing it for any x \in R? Aren’t we supposed to compute it for the observed values, which as per the reading item are X_1, X_2, ...?

Cheers,
Elemento

Luong_Nguyen_Dinh · July 19, 2023, 10:35am

I feel the same way!

lucas.coutinho · July 20, 2023, 5:45pm

Hey guys!

I will try to explain a bit better. I will also add our Curriculum Developer, @magdalena.bouza, to the thread, so that she can improve the wording to avoid this confusion.

In the case being discussed, even though we are assuming that we have n samples, we treat X as a random vector, so each f_{X_i} is the PDF of that specific random variable. Although f_{X_i} does not return a probability itself (for that we must look at the area under the curve given by f_{X_i}), it does return a likelihood, where higher values mean that the point is more likely to be observed. The function L(\mu, \sigma; x) is therefore defined for any value at which f_{X_i} is defined; since we are dealing with normal variables, that means all real numbers. For any real vector x (now a vector of numbers, not of random variables or functions), this function gives the likelihood of that specific set of n numbers being observed.
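
A minimal Python sketch of this point, under stated assumptions: the helper names `normal_pdf` and `likelihood` and the three observed numbers are made up for illustration, and only the standard library is used. It shows that the likelihood takes plain real numbers as input, that a density value is not a probability (it can exceed 1), and that parameters closer to the data give a larger likelihood.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at the real number x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def likelihood(mu, sigma, xs):
    """L(mu, sigma; x): product of the common PDF over the observed numbers."""
    result = 1.0
    for x in xs:
        result *= normal_pdf(x, mu, sigma)
    return result

# A density value is not a probability: it can exceed 1 when sigma is small.
peak = normal_pdf(0.0, 0.0, 0.1)

# Hypothetical observation x = (x_1, x_2, x_3), invented for this example.
xs = [0.2, -0.1, 0.4]
near = likelihood(0.0, 1.0, xs)  # parameters close to the data
far = likelihood(5.0, 1.0, xs)   # parameters far from the data
print(peak, near, far)
```

Note that `likelihood` never touches a random variable: it is an ordinary function of (mu, sigma) and a list of numbers, which is why it can be evaluated at any real vector.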

Cheers,
Lucas

Elemento · July 22, 2023, 4:18am

Hey @lucas.coutinho,
Let me try to present my sources of confusion based on your replies. Please correct me wherever I am wrong.

When you mentioned X above, are you trying to refer to X = (X_1, X_2, ... X_n) or an individual X_i where i \in \{1, 2, ... n\}?

Also, when you mentioned “random vector”, what exactly is meant by that? Are you saying that it is a set of random elements? I believe in that case, it is the same as a sample set (or a set of elements / samples, where each element / sample is a random draw from the population)?

Cheers,
Elemento


lucas.coutinho · July 24, 2023, 2:05pm

Hey @Elemento!

I was actually talking about X as a random vector. The confusion is that X_i is not a real number. It is a random variable: a function that takes a value \omega \in \Omega from the sample space \Omega and returns (in this case) a real number. We call it a random variable because we do not know exactly what value X_i will take, since we do not control the outcome \omega. What we can do is assign probabilities to its values. So when we write X = (X_1, X_2, \ldots, X_n), we are talking about X as a random vector of random variables, i.e., functions whose values we do not know exactly, but for which we do know which values are more or less likely.

When we are talking about actual real numbers that come from realizations of the X_i, we usually write them as x_i, to make this distinction between the function that represents the random variable, X_i, and the value that we actually observed, x_i.
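
The distinction can be sketched in Python. Everything here is an illustrative assumption rather than part of the reading item: the sample space is modeled as the cube (0, 1)^n, each X_i is built by applying the inverse normal CDF to the i-th coordinate of \omega (which gives independent normal variables), and the names `X`, `draw_outcome`, `MU`, `SIGMA`, and `N` are hypothetical.

```python
import random
from statistics import NormalDist

MU, SIGMA = 0.0, 1.0  # assumed parameters, chosen only for illustration
N = 3

def X(i, omega):
    """Random variable X_i: a function mapping an outcome omega to a real number."""
    return NormalDist(MU, SIGMA).inv_cdf(omega[i])

def draw_outcome(rng, n):
    """Pick one outcome omega in (0, 1)^n; this is the part we do not control."""
    omega = []
    while len(omega) < n:
        u = rng.random()
        if u > 0.0:  # inv_cdf needs a value strictly between 0 and 1
            omega.append(u)
    return tuple(omega)

rng = random.Random(0)  # fixed seed so the run is repeatable
omega = draw_outcome(rng, N)

# Realizations: plain real numbers obtained by evaluating each X_i at omega.
x = tuple(X(i, omega) for i in range(N))
print(x)
```

Before `draw_outcome` runs, only the functions X_i exist, and all we can describe is their distribution. After it runs, `x` holds ordinary numbers, and these are what get plugged into the likelihood.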

Is it more clear now? Please tell me if it isn’t, and we can continue further. I know that it can be quite complicated, but those are in fact complex definitions.

Thanks,
Lucas

Elemento · August 2, 2023, 3:00pm

Hey @lucas.coutinho,
I believe that I have gained some clarity, but not completely. Let me present my query here.

What I have understood from your replies is that X is not a random variable, instead, it is a random vector. And each of the X_i is a random variable for i \in \{1, 2, ..., n\}.

Now, if I am correct on this, why is the following stated:

Suppose you have n samples, X = (X_1, X_2, ..., X_n)

Wouldn’t it be better to say something like:

Suppose you have a random vector X consisting of n random variables, where each random variable X_i models the distribution of a single sample.

Please do correct me wherever I am wrong.

Cheers,
Elemento

lucas.coutinho · August 11, 2023, 4:45pm

Hi @Elemento!

You are correct. It is confusing indeed. When we talk about samples, we usually consider them as random variables, because we do not know their values until we actually observe them, so the only thing we can describe is their behaviour as random variables. I will ask our Curriculum Developer to make this a bit clearer in the text. Thanks!

Elemento · August 21, 2023, 5:20am

Hey @lucas.coutinho,
Finally, I guess I got it. Thanks a ton

I guess, the major source of confusion is the fact that throughout the lecture videos, when we talk about say n “samples” from the population, i.e., n random draws from the population, we had the observed values only. For instance, we were talking about 3 rolls of a fair die and we had 4, 1, 2. In other words, we simply had x = \{x_1, x_2, ..., x_n\}.

However, in this reading item, the first statement concerns the general case (i.e., we haven’t observed any values yet), and that is why everything is represented by a random variable. Since the second statement is about the observed values, we come back to the lowercase x notation.

I really hope that I am correct now. Please do correct me if I am wrong. And yes, it would be great if the reading item could be elaborated a little more, so that it is easier to understand.

@Luong_Nguyen_Dinh, please feel free to let us know if you are still confused about this.

Cheers,
Elemento
