CDF of a random variable X (discrete or continuous) follows a uniform distribution

In the final programming assignment of Week 1 of the course Probability & Statistics for ML and Data Science, it is stated that:

“In the lectures, you learned that if 𝑋 is a random variable with CDF 𝐹, then 𝐹(𝑋) follows a uniform distribution between 0 and 1. In other words, the new random variable 𝐹(𝑋) will be uniformly distributed between 0 and 1.”

I re-watched the video “Sampling from distribution”, as suggested in the response to the same doubt posted by another student.

What I am concerned about and where I am getting confused is the following line
with which the video begins:
“Computers generate random numbers from uniform distribution in the given interval.”

Specifically, why should we take random numbers that are distributed uniformly?

Reference in the lab/assignment:
The following line is the beginning of the cell “Exercise 1: Uniform Generator” in the assignment:
“The natural first step is to create a function capable of generating random data that comes from the uniform distribution.”

Also, I might understand this better if someone could highlight the issues I would face if I don’t take random numbers from a “uniform distribution”.

Keeping that aside for a while, here is what happens in the video: the instructor takes the random numbers (probabilities between 0 and 1), finds the corresponding x values on the horizontal axis (which the assignment terms finding F^(-1)(Y)), and finally says that those x values obey the distribution we chose.

Someone please help me understand this. I had no doubts at all until this point.
Thanks.

Hi @Devarapalli_Vamsi, great question. When we take random samples from a distribution, we want numbers that obey that distribution. In other words, you want numbers that have an equal chance of being selected, like the outcomes of a die, but you want to imitate probabilities, not outcomes. This is a way to mimic real-world behavior: if you count the outcomes of a die over 6 trials, it is unlikely that every possibility appears the same number of times, but if you run 100,000 trials, the chance of each possibility appearing in roughly equal proportion increases.

I hope this helps

Sorry @pastorsoto, I didn’t get you. Could you please elaborate?

Your question touches on a fundamental concept in statistics and computer simulations: the use of uniformly distributed random numbers as a basis for generating random samples from other distributions.

Firstly, the reason we often start with uniformly distributed random numbers is due to the way computers generate randomness. Most computer-generated random numbers are actually pseudorandom, meaning they are produced by deterministic algorithms. These algorithms are designed to produce numbers that behave as if they are random. For practical purposes, these numbers are generated uniformly across a specific range, typically between 0 and 1.
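
To make “pseudorandom but uniform” concrete, here is a minimal sketch of a linear congruential generator, one of the simplest pseudorandom schemes. This is only an illustration, not the generator the course or NumPy actually uses, and the constants are my own choice (the classic Numerical Recipes ones):

```python
# A minimal linear congruential generator (LCG). The constants below are the
# classic Numerical Recipes choices; they are an illustrative assumption, not
# necessarily what the course or NumPy uses internally.

def lcg_uniform(seed, n, a=1664525, c=1013904223, m=2**32):
    """Return n pseudorandom floats in [0, 1) generated deterministically from `seed`."""
    state = seed
    samples = []
    for _ in range(n):
        state = (a * state + c) % m  # deterministic integer update
        samples.append(state / m)    # scale to [0, 1): approximately uniform
    return samples

print(lcg_uniform(seed=42, n=5))
```

The sequence is completely determined by the seed, yet over many draws the values spread out roughly evenly over [0, 1), which is exactly the property the sampling trick below relies on.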

Now, why is this uniform distribution useful? It’s because of its simplicity and the fact that it covers the entire range of probabilities (from 0 to 1). This uniformity allows us to map these random values to any other probability distribution using various techniques.

Regarding the specific concept you mentioned, the Cumulative Distribution Function (CDF), let’s delve a bit deeper:

  • The CDF, 𝐹(𝑥), of a random variable 𝑋 gives the probability that 𝑋 will take a value less than or equal to 𝑥. It’s a function that grows from 0 to 1 as 𝑥 moves across the range of possible values.

  • If you take a random variable 𝑌 that is uniformly distributed between 0 and 1, and apply the inverse CDF of 𝑋 to it (denoted as 𝐹^(-1)(𝑌)), you effectively transform 𝑌 into a random variable that follows the distribution of 𝑋.

This transformation is crucial in statistical simulations. It allows us to generate random samples from any probability distribution, given that we can generate uniformly distributed random numbers and we know the CDF of the target distribution.
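
To make the inverse-CDF idea concrete, here is a small sketch of inverse transform sampling for an Exponential(λ) target, assuming NumPy is available. It is my own example, not the assignment’s solution:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0  # rate of the target Exponential(lam) distribution

# Step 1: draw Y ~ Uniform(0, 1).
y = rng.uniform(0.0, 1.0, size=100_000)

# Step 2: apply the inverse CDF of the target distribution.
# For Exponential(lam): F(x) = 1 - exp(-lam * x), so F^{-1}(y) = -ln(1 - y) / lam.
x = -np.log(1.0 - y) / lam

# The transformed samples behave like Exponential(lam) draws:
print(x.mean())  # close to the theoretical mean 1 / lam = 0.5
```

This is the same procedure the video walks through graphically: pick a uniform value on the vertical axis of the CDF and read off the corresponding x on the horizontal axis.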

If you don’t start with a uniform distribution, the transformation becomes much more complex and less general. For example, if you start with a distribution that is not uniform, you would first need to understand and mathematically characterize that distribution before you could map it to another distribution. This is often impractical or impossible with arbitrary non-uniform distributions.
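
To illustrate what goes wrong without a uniform starting point, here is a quick sketch (again my own illustration, assuming NumPy) that feeds the same inverse CDF once with uniform inputs and once with non-uniform, Beta-distributed inputs; only the uniform inputs reproduce the target distribution’s mean:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0
inv_cdf = lambda y: -np.log(1.0 - y) / lam  # inverse CDF of Exponential(lam)

u = rng.uniform(0.0, 1.0, size=100_000)  # uniform inputs: the correct recipe
b = rng.beta(5.0, 2.0, size=100_000)     # non-uniform inputs on (0, 1)

print(inv_cdf(u).mean())  # close to the true mean 1 / lam = 0.5
print(inv_cdf(b).mean())  # noticeably larger: P(Y <= F(x)) != F(x) when Y is not uniform
```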

In summary, starting with uniformly distributed random numbers is a matter of practicality and simplicity. It allows for a standardized way to simulate random samples from virtually any probability distribution, which is a cornerstone technique in statistics, machine learning, and data science.

Let me know if this helps

Thanks for your lucid explanation @pastorsoto; your response cleared my doubt and it’s crystal clear now. Thanks again.

Hey @Deepti_Prasad, thanks for your additional example-oriented explanation!
It is much clearer now.
And yeah, you didn’t confuse me. :blush:

Attaching the response that I got on Stack Exchange. Thought this might be helpful for the community.

I had explained it the same way :slight_smile:

Happy that the doubt you had is cleared now.

Happy Learning!!!

Regards
DP