C3 W1 A1 Exercise 5 - Number of Samples in C_i vs Total Number of Samples

tenaciousjzh · August 17, 2023, 1:15am

I’m working on Exercise 5 and have gotten myself a little stuck. In the function definition for compute_breed_proportions(df), the code has a comment with a placeholder to calculate the following:

# Compute the probability of each class (breed)
# You can get the number of rows in a dataframe by using len(dataframe)
prob_class = None/None

I believe the division that is supposed to be performed is:
prob_class = the number of samples in C_i that have x_k / total number of samples in C_i

In the dataframe we’re working with, what is x_k? Is it the height for that particular breed?
Would the total number of samples in C_i be equal to the number of samples where C_i has that feature (like height)?

I keep thinking I’m just going to end up with

prob_class = len(df_breed) / len(df_breed) == 1

for that and my spidey sense is tingling, lol

tenaciousjzh · August 18, 2023, 6:20pm

I think I’ve got a point of clarification here that could help if someone else gets hung up on this. Exercise 5 is only asking for the class probability P(C_i). Don’t worry about x_k yet, as that is handled in Exercise 6.
It’s about how many samples there are in each class (0, 1, 2) compared to the total number of samples across all classes.

tenaciousjzh · August 18, 2023, 6:21pm

Another way to put it is that we’re computing P(C_i), which is one of the parts of the overall Bayes algorithm. The other parts of the algorithm are handled further on in the assignment.
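
For anyone who wants to see where this piece fits, here is a rough sketch of the usual naive Bayes decomposition. The symbols N_i (number of samples in class C_i) and N (total number of samples) are my own notation, not the notebook's, and I'm assuming the assignment follows the standard form:

```latex
% Class prior: the quantity Exercise 5 asks for
P(C_i) = \frac{N_i}{N}

% Standard naive Bayes: the prior times the per-feature conditionals
P(C_i \mid x_1, \dots, x_n) \propto P(C_i) \prod_{k=1}^{n} P(x_k \mid C_i)
```

The first line only counts rows per class. The conditional terms P(x_k | C_i) in the second line are where x_k comes in.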

lucas.coutinho · August 18, 2023, 8:44pm

Hi @tenaciousjzh!

I believe the division that is supposed to be performed is:
prob_class = the number of samples in Ci that have xk / total number of samples in Ci

This is not correct. What we want here is the value P(C_i), i.e., the probability that a random sample belongs to the class C_i. We are not yet computing any conditional probability!
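
To make the idea concrete, here is a minimal sketch on toy data. It is not the assignment's solution: the helper name class_priors, the column names "breed" and "height", and the values are all made up for illustration. It only shows the counting, with the rows in one class divided by all rows:

```python
import pandas as pd

# Toy data. Column names and values are assumptions for illustration only.
df = pd.DataFrame({
    "breed": [0, 0, 0, 1, 1, 2],
    "height": [35.0, 36.5, 34.2, 50.1, 48.7, 60.3],
})


def class_priors(df, class_col="breed"):
    """Return {class: P(class)}: rows in that class over total rows."""
    priors = {}
    for c in df[class_col].unique():
        df_c = df[df[class_col] == c]          # rows belonging to class c
        priors[int(c)] = len(df_c) / len(df)   # class count / total count
    return priors


priors = class_priors(df)
print(priors)
```

Note that the denominator is len(df), the whole dataframe, not the per-class subset. Dividing the subset by itself is what gives the suspicious value of 1, and with the full dataframe as the denominator the values across all classes sum to 1.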
