Confused between sample size and number of samples in CLT and Law of Large Numbers

Hey @karthikeyan_S2,
Welcome, and we are glad that you could become a part of our community :partying_face:

Thanks for creating this thread. I was actually looking for something like this, since I myself had some confusion while trying to understand the lecture videos “Law of Large Numbers” and “Central Limit Theorem - Discrete Random Variable”. In my opinion, the videos mostly end up confusing learners, whether they have some prior knowledge or are beginners.


Let’s begin by discussing the discrepancies in Week 3 (Lecture 1) that I have come across so far (Central Limit Theorem - Discrete Random Variable):

Discrepancy 1

  • The first lecture video defines the concept of “sample”, but nowhere in the lecture videos is it mentioned that “sample” can mean 2 different things.
  • In fact, the lecture videos themselves use “sample” for 2 different meanings.
    • “Population and Sample → 0:32”: A sample is a smaller subset that we actually observe or measure (Here, “sample” refers to the whole set of observations)
    • “Population and Sample → 2:50”: If you thought that the first one was better, that is correct because you always want to take random samples (Here, “sample” refers to the individual observations)
  • Let’s understand it better with the help of an example. Consider a fair die being rolled. Say we roll it 4 times and get (1, 5, 4, 2).
  • Here, 1, 5, 4 and 2 are each referred to as a sample. But the confusing part is that (1, 5, 4, 2) as a whole is also referred to as a sample.
  • Say we roll the die 4 more times and get (2, 6, 3, 1). This is another sample, consisting of 4 samples.
  • Therefore, for our discussion, we have 2 samples, each consisting of 4 individual samples, i.e., the number of samples is 2 and the sample size is 4.
  • For reference, check out this video, time-stamp 2:10 onwards. Unlike Luis, Sal has explicitly mentioned this.
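The dice example above can be sketched in a few lines of Python. This is only an illustration: the names `roll_sample`, `sample_size` and `number_of_samples` are my own (they do not come from the lectures), and the seed is arbitrary. The rolls are random, so they will not match (1, 5, 4, 2) and (2, 6, 3, 1).

```python
import random

rng = random.Random(0)  # arbitrary fixed seed, only for reproducibility


def roll_sample(sample_size):
    """Return ONE sample: a list of `sample_size` rolls of a fair six-sided die."""
    return [rng.randint(1, 6) for _ in range(sample_size)]


sample_size = 4        # how many rolls are inside each sample
number_of_samples = 2  # how many such samples we draw

samples = [roll_sample(sample_size) for _ in range(number_of_samples)]

for i, sample in enumerate(samples, start=1):
    print(f"sample {i}: {sample} (sample size = {len(sample)})")

print("number of samples:", len(samples))
print("total individual rolls:", sum(len(s) for s in samples))
```

Keeping two separately named variables makes it hard to mix up the two meanings, which is exactly what a single overloaded word “sample” does not do.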

Discrepancy 2

  • In the Coursera videos, the same notation has been used for different concepts in different videos.
  • Now, of course, we can use the same variable to denote different things, after all, they are variables :joy:
  • But I believe that the important parameters should always be denoted with different variables, and the notation should be consistent throughout the videos.
  • You will find that in the lecture videos “n” denotes both the sample size and the number of samples. Here are 2 references:
    • “Population and Sample → 2:03”: In this example, the population size is 10,000, denoted by N, and the sample size could be anything smaller, from 1-9,999, that’s denoted by n (Here, n denotes sample size)
    • “Law of Large Numbers → 2:32”: So if n is the number of samples (Here, n denotes number of samples)
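To see why the two roles of “n” should get separate names, here is a minimal simulation sketch, assuming the convention from Discrepancy 1 that a sample is one whole set of rolls. The helper `sample_mean` and the specific sizes are my own choices, not something taken from the lectures. In the first loop the quantity that grows is the sample size, which is the quantity that goes to infinity in the Law of Large Numbers. In the second part the sample size is fixed and the number of samples only controls how many sample means we collect to look at their spread.

```python
import random
import statistics

rng = random.Random(0)  # arbitrary fixed seed, only for reproducibility


def sample_mean(sample_size):
    """Roll a fair die `sample_size` times and return the mean of that one sample."""
    rolls = [rng.randint(1, 6) for _ in range(sample_size)]
    return statistics.mean(rolls)


# Law of Large Numbers view: ONE sample whose size grows.
# The sample mean should drift toward the expected value 3.5.
for sample_size in (10, 1_000, 100_000):
    print(f"sample size {sample_size:>6}: mean = {sample_mean(sample_size):.3f}")

# Central Limit Theorem view: MANY samples, each of the same fixed size.
sample_size = 30           # rolls per sample
number_of_samples = 2_000  # how many sample means we collect
means = [sample_mean(sample_size) for _ in range(number_of_samples)]

print("mean of the sample means:", round(statistics.mean(means), 3))
print("std of the sample means: ", round(statistics.stdev(means), 3))
```

For a fair die the standard deviation of a single roll is about 1.71, so with a sample size of 30 the spread of the sample means should come out near 1.71 / √30 ≈ 0.31. Changing `number_of_samples` should leave that spread roughly unchanged, while changing `sample_size` should change it.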

Conclusion

  • These are the 2 major discrepancies I have observed so far, along with a lot of minor ones. In my opinion, they make the videos more confusing than beneficial for learners.
  • In fact, you are not the first one to feel that the videos are confusing or incorrect. Check out the following threads: Thread 1 and Thread 2.
  • At this point, I would suggest that you follow other content on the web, since the team is still working on fixing the content. For starters, you can check out this playlist, which I believe is an amazing source of knowledge.

P.S. - @lucas.coutinho please take note of this thread. Currently, the content of Week 3, Lecture 1 is extremely confusing and needs some major changes on an urgent basis. I guess adding a note regarding this would be helpful for learners, so that they can avoid getting confused in the meantime.

Cheers,
Elemento
