What is syntax X[idx == 0]?

someone555777 · April 29, 2023, 12:21pm

Hi! I am looking at the hints for your solution of compute_centroids(X, idx, K) and I really can’t understand what you are suggesting. See the screenshot.

Is it some kind of built-in syntax for NumPy matrices that holds the connection to a cluster?

Shreya_Hegde · April 29, 2023, 12:44pm

“idx” is an array containing, for each data point, the index of its nearest cluster (index_value).
You are going to build this array before defining the compute_centroids function.
for example:
idx=[1,0,2,1]
datapoint 0 - is nearest to cluster 1
datapoint 1 - is nearest to cluster 0
datapoint 2 - is nearest to cluster 2
datapoint 3 - is nearest to cluster 1
“idx” can be replaced with any name of your choice. It’s not a built-in function.

X = array that contains all the data points
X[idx==1] gives you all the data points that have been assigned to cluster 1
For the above example,
X[idx==1] returns the data points at positions 0 and 3 (the ones assigned to cluster 1 according to “idx”)
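
A minimal sketch of the example above, assuming X and idx are NumPy arrays (the comparison is not elementwise on plain Python lists). The point coordinates are made up purely for illustration:

```python
import numpy as np

# Hypothetical data: four 2-D points, values chosen only for illustration.
X = np.array([[1.0, 2.0],
              [5.0, 6.0],
              [9.0, 9.0],
              [3.0, 4.0]])
idx = np.array([1, 0, 2, 1])   # nearest cluster for each data point

mask = (idx == 1)              # True at positions 0 and 3
points_in_cluster_1 = X[mask]  # rows 0 and 3 of X

print(mask.tolist())                 # [True, False, False, True]
print(points_in_cluster_1.tolist())  # [[1.0, 2.0], [3.0, 4.0]]

# Column-wise mean of the selected rows.
cluster_1_mean = points_in_cluster_1.mean(axis=0)
print(cluster_1_mean.tolist())       # [2.0, 3.0]
```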

ai_curious · April 29, 2023, 2:38pm

I might have included the words boolean and indexing in the explanation.

This code leverages NumPy to perform multiple steps in a single expression. First, the comparison operator == creates a new array holding the boolean result of comparing each value of an existing array with a value (here 0). This elementwise behavior comes from NumPy arrays; comparing a plain Python list with 0 returns a single False. Then, that array of boolean values is used to index, or filter, a subset out of an existing array. This second new array consists of each element where the boolean comparison evaluated to True. The array used to generate the booleans and the array to which the booleans are applied can be the same or, as in this example, different, as long as the length of the boolean array matches the first dimension of the array being indexed.

boolean_list = (one_existing_list == 0)  # create first new array with boolean values, e.g. [True False … True]
filtered_list = another_existing_list[boolean_list]  # create second new array by keeping only the True 'rows'
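
For anyone who wants to try this in a console, here is a small runnable sketch of the two steps. The array names and values are placeholders chosen for illustration:

```python
import numpy as np

# Placeholder arrays of the same length.
one_existing_array = np.array([0, 1, 0, 2])
another_existing_array = np.array([10, 20, 30, 40])

# Step 1: elementwise comparison produces one boolean per element.
boolean_array = (one_existing_array == 0)
print(boolean_array.tolist())   # [True, False, True, False]

# Step 2: the boolean array selects the positions that are True.
filtered_array = another_existing_array[boolean_array]
print(filtered_array.tolist())  # [10, 30]

# With a plain Python list the comparison is not elementwise:
plain_list_result = ([0, 1, 0, 2] == 0)
print(plain_list_result)        # False
```

The last lines show why the expression only behaves this way on NumPy arrays (or pandas objects) and not on built-in lists.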

You can find discussions of this usage by searching the interweb for python boolean mask or python boolean indexing. It is a NumPy feature, and pandas supports it too. It’s pretty common in Python-based data sciencey stuff and can also be applied in multidimensional cases.

HTH

someone555777 · April 29, 2023, 7:35pm

What is this syntax? Can you explain it to me? I am seeing it for the first time. Is it something like:
boolean_list = (i for i in one_existing_list if i == 0)

someone555777 · April 29, 2023, 7:36pm

And this too? Indexing a list by a list?

ai_curious · April 29, 2023, 8:03pm

Not exactly. As I attempted to describe in the narrative in my first reply above, this step creates a new list of the same length as one_existing_list where each element of the new list has a boolean value. Whether each element value is True or False depends on whether the conditional == is satisfied.
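
To make the difference concrete, here is a rough pure-Python sketch (not how NumPy implements it internally) comparing the elementwise comparison with the generator from the question:

```python
one_existing_list = [0, 1, 0, 2]

# Elementwise comparison: one boolean per element, same length as the input.
boolean_list = [i == 0 for i in one_existing_list]
print(boolean_list)     # [True, False, True, False]

# The generator from the question instead keeps only the matching values.
matching_values = list(i for i in one_existing_list if i == 0)
print(matching_values)  # [0, 0]
```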

ai_curious · April 29, 2023, 8:08pm

Yes, but it uses the list of boolean values produced by the conditional and keeps only the elements of the target list where the corresponding element of the boolean list is True.
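
A pure-Python sketch of the same selection, for comparison only (the list names and values are placeholders):

```python
from itertools import compress

target_list = [10, 20, 30, 40]
boolean_list = [True, False, True, False]

# Keep each value whose matching boolean is True.
kept = [value for value, keep in zip(target_list, boolean_list) if keep]
print(kept)             # [10, 30]

# itertools.compress from the standard library performs the same selection.
kept_with_compress = list(compress(target_list, boolean_list))
print(kept_with_compress)  # [10, 30]
```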

someone555777 · April 29, 2023, 8:14pm

Can you give me links to the docs, please? Because I can’t understand anything and can’t reproduce it in the Python console.

ai_curious · April 29, 2023, 8:24pm

As I wrote above…

Maybe worth reading through some of those

someone555777 · April 30, 2023, 8:40am

So this is only NumPy functionality, right?

someone555777 · April 30, 2023, 8:48am

Can you explain to me how this part works?

ai_curious · April 30, 2023, 11:16am

True Positives (tp) means the number of times a positive prediction is correct, right? So how do you know if the prediction is correct? You create a boolean list of the predictions that have a positive value (True iff == 1) and a boolean list of all the true labels that have a positive value (True iff == 1), then AND those two lists together and count how many True values result.

I’m not sure what your background is, or your purpose for taking these classes, but I highly recommend adopting the practice of writing little test scripts to work through these questions on your own. In this example, print out what predictions consists of. Print out what (predictions == 1) consists of. Print out what ((predictions == 1) & (y_val == 1)) consists of. Now not only do you have the answer to this question, but you start to develop a skill for answering future questions as well. Good luck on your journey.
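
A little test script of the kind described above might look like this. The values of predictions and y_val are invented for illustration; only the variable names come from the discussion:

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
predictions = np.array([1, 0, 1, 1, 0])
y_val = np.array([1, 0, 0, 1, 1])

print(predictions)
print(predictions == 1)  # True where the prediction is positive
print(y_val == 1)        # True where the label is positive

# Elementwise AND: True only where both are positive.
both_positive = (predictions == 1) & (y_val == 1)
print(both_positive.tolist())  # [True, False, False, True, False]

# Summing booleans counts the True values (True counts as 1).
tp = int(np.sum(both_positive))
print(tp)                # 2
```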

vibhukmenon · May 11, 2023, 3:57am

It works with other libraries too…like pandas

someone555777 · August 18, 2023, 6:03pm

So, I would like to summarize.

In NumPy and pandas we have a mechanism called a boolean mask, or boolean indexing, that lets us convert an array of elements into an array of False and True values:

>>> import numpy as np
>>> l = np.array([1,2,3])
>>> l>1 
array([False,  True,  True])

>>> l==2 
array([False,  True, False])

This new array of True and False values can be applied as a filter to another array that corresponds to the first and usually has the same length. We then get just the elements of another_existing_list at the positions where boolean_list is True.

In the initial case, the filter was derived from idx and applied to the array X.

Sometimes we want to keep only the positions that are True in both of two filters (masks). We can do it like this:

>>> (l>1) & (l>2)
array([False, False,  True])
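
As a small follow-up sketch, a combined mask can be applied to a second array in the same way. The array names below is a hypothetical addition for illustration:

```python
import numpy as np

l = np.array([1, 2, 3])
names = np.array(["a", "b", "c"])  # hypothetical second array, same length as l

# Parentheses are needed because & and | bind more tightly than > and ==.
both = (l > 1) & (l > 2)      # True only where both conditions hold
either = (l == 1) | (l > 2)   # True where at least one condition holds

print(names[both].tolist())    # ['c']
print(names[either].tolist())  # ['a', 'c']
```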
