Hi guys,
I reckon that the explanation of CLT in the video “Central Limit Theorem - Continuous Random Variable” is incorrect or confusing.
First of all, the explanation and code about the CLT in the lab after the video are correct. The rough idea is,
- Toss k dice each trial and record the mean value of these k dice.
- The distribution of this mean approaches a normal distribution as k → infinity, but in practice k > 30 is usually taken to suffice.
- We can draw a plot with k=30 and a large number of trials to verify the theorem. We can increase k to see the effect.
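To make the lab's idea concrete without plotting, here is a rough sketch of my own (not the lab's code). It assumes fair six-sided dice; the names `sample_means` and `excess_kurtosis` are made up for illustration. It keeps the number of trials fixed and large, and varies only k. Excess kurtosis is 0 for a normal distribution, so it should move toward 0 as k grows, while the variance of the mean should shrink roughly like (35/12)/k.

```python
import random
import statistics

FACES = [1, 2, 3, 4, 5, 6]  # fair six-sided die (assumption)

def sample_means(k, trials, seed=0):
    """Return `trials` values, each the mean of k dice tossed in one trial."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(FACES, k=k)) for _ in range(trials)]

def excess_kurtosis(xs):
    """Fourth standardized moment minus 3; equals 0 for a normal distribution."""
    m = statistics.fmean(xs)
    var = statistics.fmean([(x - m) ** 2 for x in xs])
    fourth = statistics.fmean([(x - m) ** 4 for x in xs])
    return fourth / var ** 2 - 3

if __name__ == "__main__":
    # Same (large) number of trials every time; only k changes.
    for k in (1, 3, 30):
        means = sample_means(k, trials=20000)
        print(k,
              round(statistics.fmean(means), 3),
              round(statistics.pvariance(means), 4),
              round(excess_kurtosis(means), 3))
```

The point of the design is that `trials` only controls how precisely we estimate the shape, while `k` is what changes the shape itself.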
However, the idea conveyed by the video seems to be different. It seems to state,
- Toss 3 dice each trial and record the mean value of these 3 dice.
- Draw plots with different numbers of trials: 5, 25, 50 and 100.
- The plot for 100 trials will look most like a normal distribution while the plot for 5 won’t.
If this is indeed what the video meant to convey, then it is an incorrect explanation of the theorem.
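To illustrate why I think so, here is a small sketch of my own, assuming fair six-sided dice (the function names are hypothetical). With 3 dice per trial, the distribution of the result is fixed and can be enumerated exactly over all 216 outcomes. Adding more trials only makes the empirical frequencies converge to that fixed distribution; it does not change the distribution. I work with the sum of the 3 dice instead of the mean to avoid floating-point keys; the mean is just the sum divided by 3.

```python
import random
from collections import Counter
from fractions import Fraction
from itertools import product

FACES = [1, 2, 3, 4, 5, 6]  # fair six-sided die (assumption)

def exact_pmf_sum_of_3():
    """Exact distribution of the sum of 3 dice, by enumerating all 6**3 outcomes."""
    counts = Counter(sum(roll) for roll in product(FACES, repeat=3))
    return {s: Fraction(c, 6 ** 3) for s, c in sorted(counts.items())}

def empirical_pmf(count, seed=0):
    """Relative frequencies of the sum of 3 dice over `count` simulated trials."""
    rng = random.Random(seed)
    counts = Counter(sum(rng.choices(FACES, k=3)) for _ in range(count))
    return {s: counts[s] / count for s in range(3, 19)}

def max_gap(count, seed=0):
    """Largest absolute difference between empirical and exact probabilities."""
    exact = exact_pmf_sum_of_3()
    emp = empirical_pmf(count, seed)
    return max(abs(emp[s] - float(exact[s])) for s in exact)

if __name__ == "__main__":
    # Number of dice stays at 3; only the number of trials changes.
    for count in (5, 25, 50, 100, 100000):
        print(count, round(max_gap(count), 4))
```

The gap should generally shrink as `count` grows, which is the law of large numbers at work on the histogram, not the CLT.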
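The code below reflects my understanding of the video. The video seems to say that the distribution is not quite a normal distribution when the argument count is 5, but will approximate a normal distribution when count is 100. This is NOT the CLT. Increasing the number of trials only gives a smoother estimate of the same underlying distribution (the mean of 3 dice); the CLT is about what happens as the number of dice averaged in each trial grows.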
I wonder if the video meant something else.
code.py (934 Bytes)
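import numpy as np
import seaborn as sns

dice = [1, 2, 3, 4, 5, 6]  # faces of a fair die (assumed; not shown in this excerpt)

def plot_video(count):
    # One array per die; each holds that die's outcome for all `count` trials.
    array1 = np.array([np.random.choice(dice) for _ in range(count)])
    array2 = np.array([np.random.choice(dice) for _ in range(count)])
    array3 = np.array([np.random.choice(dice) for _ in range(count)])
    array = (array1 + array2 + array3) / 3  # mean of the 3 dice in each trial
    sns.histplot(array, stat='frequency')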