Central Limit Theorem - Continuous Random Variable

Dear Mentor,

From this lecture (Time: 3:21 / 7:36),
Central Limit Theorem - Continuous Random Variable | Coursera

Could you please guide me on how to understand this part of the equation? May I have an example?

Quoted from the lecture

since all variables are identically distributed,
this is 1/n times n times the mean of X, which is simply the mean of X

Thank you.

Hello @JJaassoonn

It is saying that all the circled terms are the same: each of them is just E[X], the mean of X.
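For reference, the step quoted from the lecture can be written out as a short derivation. This is only a sketch, assuming X_1, ..., X_n are identically distributed with the same mean as X; the summation notation is an assumption and may differ from the lecture slide.

```latex
E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
= \frac{1}{n}\sum_{i=1}^{n} E[X_i]
= \frac{1}{n}\cdot n \cdot E[X]
= E[X]
```

The first equality uses linearity of expectation, and the second uses the fact that every E[X_i] equals E[X].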

What can you think of that can make the answer (to existing or any potentially new question) complete?

Cheers,
Raymond

Dear Mr Raymond,

At the time 0:30 / 7:36 from the lecture, X follows a uniform distribution U(0,15).

My thought is that X = [X1, X2, X3, …, Xn],

The first equation is the mean of X1 + the mean of X2 + … + the mean of Xn
The second equation is the mean of X1, X2, …, Xn

It seems like something is weird.

Please correct my mistake.

I have no clue about this question.

Thank you.

Great, @JJaassoonn! Thanks for sharing your thought!

This is not the case. Each X_i is one variable. X is not a collection of those variables.
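To make the distinction concrete, here is a minimal sketch, assuming 5 independent variables X_1, ..., X_5 that each follow U(0, 15) as in the lecture. The names `n_variables`, `n_samples`, `draws` and `per_variable_means` are made up for illustration. Each row holds the samples of one variable, and every variable on its own has a mean close to 7.5.

```python
import numpy as np

rng = np.random.default_rng(0)

n_variables = 5       # hypothetical: X_1 ... X_5
n_samples = 100_000   # samples drawn from each variable

# Row i holds the samples generated by one variable, each following U(0, 15).
draws = rng.uniform(low=0.0, high=15.0, size=(n_variables, n_samples))

# Each variable has its own mean, and all of them are close to 7.5.
per_variable_means = draws.mean(axis=1)
print(per_variable_means)

# (1/n) * (E[X_1] + ... + E[X_n]) is therefore also close to 7.5.
print(per_variable_means.sum() / n_variables)
```

So X is not the list [X1, X2, …, Xn]; it is the common distribution that every X_i follows.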

Let’s forget about those X_i for now.

Let’s say V_1 is the variable standing for the height of people on Planet A, and the variable follows the Gaussian distribution N(1.7, 1). What is E[V_1]?

Cheers,
Raymond

Dear Mr Raymond,

V1 is a variable that follows the Gaussian distribution N(1.7, 1).
E[V1] = 1.7

May I know whether it is correct?

Thank you.

Hello @JJaassoonn,

That’s correct! So, if you were holding a device that has a button and a screen, such that whenever you push the button once, it shows a new number generated following the Gaussian distribution N(1.7, 1), how would you use the device to demonstrate that the expected value is 1.7?

Cheers,
Raymond

Dear Mr Raymond,

I am sorry for my delayed response.

In my opinion, to use the device to demonstrate that the expected value is 1.7, we have to set the range of variable V1 to be from -1.3 to 4.7 (within 3 standard deviations).

Besides, we should set the chance of getting a number within 1 standard deviation to be 68.2%, a number between 1 and 2 standard deviations to be 27.2%, and a number between 2 and 3 standard deviations to be 4.2%.

May I know if this assumption is correct?

Thank you

It’s alright. I believe it is we who control our own time, and it is enjoyable to work with learners who are spending time and effort, not saving them.

No. You don’t need to set anything, because I told you that:

Any number generated is already following N(1.7, 1.).

In fact, you cannot set anything, because the device has only one button and one screen. Pressing the button only gives you a new number that is generated following N(1.7, 1.).

@JJaassoonn, think about how to use the numbers generated by the device.

In fact, I always encourage learners to take their time :wink: No need to rush, @JJaassoonn! :wink:

Cheers,
Raymond

Dear Mr Raymond,

Since the device has only one button, pressing the button once is equal to one trial. If I take a limited number of trials, e.g. 10 trials, this is not sufficient to demonstrate that the expected value is 1.7. However, if I let the number of trials tend to infinity, I can get an expected value of 1.7.

May I know if this is correct?

Thank you.

Hello @JJaassoonn, very well said! The expected value is what the average of the trials/samples converges to as you collect more and more of them. With a very small sample size, the average you get tends to deviate more from the true mean. With more samples, the deviation tends to become smaller. As the number of samples goes to infinity, the deviation goes to 0.
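As a rough illustration of this, the sketch below plays the role of the device: it draws samples from N(1.7, 1) and prints how far the sample average is from 1.7 for growing sample sizes. The seed, the list of sizes, and the names `true_mean` and `deviations` are arbitrary choices for illustration. Because the samples are random, the deviation is not guaranteed to shrink at every single step, only to tend towards 0 overall.

```python
import numpy as np

rng = np.random.default_rng(10)

true_mean = 1.7   # the mean of N(1.7, 1), i.e. the expected value
deviations = {}

for size in (10, 100, 1_000, 10_000, 100_000, 1_000_000):
    # Each call is like pressing the button `size` times.
    samples = rng.normal(loc=true_mean, scale=1.0, size=size)
    # Distance between the average of the samples and the true mean.
    deviations[size] = abs(samples.mean() - true_mean)
    print(f'size={size:>9}: average={samples.mean():.4f}, '
          f'deviation={deviations[size]:.4f}')
```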

We say the device generates samples for a variable V_1 that follows the Gaussian distribution N(1.7, 1.), and the expected value of V_1 is 1.7. Here, I want to make sure you know that, during all this time discussing the device, we have been dealing with only one variable. A sample is NOT a variable. OK? The variable follows a distribution, and we generate samples from that distribution. If you are not sure, you may want to google some readings with keywords like “statistics variables vs samples”.

I remember we have discussed in the DLS/MLS forums, so I trust you can find yourself a Python environment. I would like you to study and run the following code, and make sure you see the difference by progressively changing the size parameter from 10 to 100, to 1000, to 10000, and so on. I also trust you can find explanations to any library function used on the Internet, so I did not add comments about that.

import numpy as np
from matplotlib import pyplot as plt

rng = np.random.default_rng(10)
samples = rng.normal(loc=0., scale=1., size=10)

print(
    f'Mean and variance of the samples: {samples.mean()}, {samples.var()}'
)

plt.hist(samples, bins=100)
plt.show()

Please take your time playing with the code.

Coming back to the central limit theorem lecture, can you write another piece of code that demonstrates the content of the lecture?

I mean, you know what the theorem is about from the lecture. You know how to have a variable to generate samples in Python. You know how to plot the distribution to see if anything shows a normal distribution.

This is not going to be easy, so please take your time. I would like to see what you will get us, and then we will move on from there.
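As one possible starting point, and not the only way to do it, here is a minimal sketch that assumes the lecture's setting of X following U(0, 15). It averages `n` variables many times and checks whether the averages look normal. The values of `n` and `n_repeats`, the seed, and all variable names are made up for illustration. A text histogram is printed to keep the sketch dependency-free; `plt.hist(sample_means, bins=100)` would show the same shape graphically.

```python
import numpy as np

rng = np.random.default_rng(10)

n = 30              # hypothetical: number of variables X_1 ... X_n being averaged
n_repeats = 10_000  # how many times the average is recomputed

# Each row is one experiment: one sample from each of X_1 ... X_n ~ U(0, 15).
draws = rng.uniform(low=0.0, high=15.0, size=(n_repeats, n))
sample_means = draws.mean(axis=1)

# U(0, 15) has mean 7.5 and variance 15**2 / 12 = 18.75, so the averages
# should have mean close to 7.5 and standard deviation close to sqrt(18.75 / n).
print(f'Mean of the averages: {sample_means.mean()}')
print(f'Std of the averages:  {sample_means.std()}')
print(f'Expected from theory: 7.5 and {np.sqrt(18.75 / n)}')

# Crude text histogram: a bell shape suggests a normal distribution.
counts, edges = np.histogram(sample_means, bins=15)
for count, left in zip(counts, edges[:-1]):
    print(f'{left:6.2f} | ' + '#' * int(60 * count / counts.max()))
```

Try changing `n` to 1, 2, 5 and 30 to see how the shape moves from flat towards a bell curve.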

Cheers,
Raymond