My question: while assigning points from the X array to their corresponding centroids,
the given formula is
How does the X array know which idx == k values should be transferred to the corresponding centroid? Is idx a keyword here? I did not understand; please help me understand.
First, get all the data points in X that are assigned to centroid k, so your points code would be X[idx == k].
Then calculate centroid[k] as the NumPy mean of points, making sure to set the axis parameter to 0.
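To make those two steps concrete, here is a minimal sketch on made-up data. The toy values of X and idx, and the names K and centroids, are my own assumptions for illustration and are not taken from the assignment.

```python
import numpy as np

# Hypothetical toy data: 5 points in 2-D and K = 2 centroids.
X = np.array([[1.0, 2.0],
              [1.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0],
              [4.0, 0.0]])
idx = np.array([0, 0, 1, 1, 0])  # idx[i] is the centroid assigned to row X[i]
K = 2

centroids = np.zeros((K, X.shape[1]))
for k in range(K):
    points = X[idx == k]                     # rows of X assigned to centroid k
    centroids[k] = np.mean(points, axis=0)   # axis=0 averages down each column

print(centroids)
```

With axis=0 the mean is taken over the selected rows, so each centroid keeps one value per feature. This sketch assumes every centroid has at least one point assigned; the mean of an empty selection would be nan.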
The code idx == k
will return a vector of True/False values (the equivalent of 1’s and 0’s). The True entries will be in the positions where the idx value is the centroid k.
Then, when this vector is used to index the X matrix, the rows marked True will be copied out to the “points” variable.
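A small example may help to see this in action. In NumPy the comparison yields True/False values, which play the role of the 1's and 0's. The arrays below are invented for illustration only, and mask is just a name I chose.

```python
import numpy as np

# Hypothetical data: 4 examples, each assigned to centroid 0, 1 or 2.
X = np.array([[10, 11],
              [20, 21],
              [30, 31],
              [40, 41]])
idx = np.array([1, 0, 1, 2])

k = 1
mask = (idx == k)   # element-wise comparison: one True/False per row of X
points = X[mask]    # keeps only the rows of X where mask is True

print(mask)
print(points)
```

Here mask is True at positions 0 and 2, so points contains the first and third rows of X. The selection works only because mask has exactly one entry per row of X.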
Thank you Tmosh and Deepti Prasad,
I got your points, but is there any relation between idx and the X array? If I use
i==K instead of idx==k, it does not work. I did not understand the relation between the X array and idx.
Hello vemula,
Of course there is a relation between X and idx.
In exercise 1, the instructions ask you to implement find_closest_centroids.
The function takes the data matrix X and the locations of all centroids.
It should output a one-dimensional array idx (which has the same number of elements as X) that holds, for every training example, the index of the closest centroid (a value in {0,…,𝐾−1}, where 𝐾 is the total number of centroids). (Note: The index range 0 to K-1 varies slightly from what is shown in the lectures (i.e. 1 to K) because Python list indices start at 0 instead of 1.)
So using i == k would be incorrect: it does not use the one-dimensional array idx, so it cannot pick out which data points in the data matrix belong to centroid k. Hence idx == k is correct.
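Here is a rough sketch of how idx comes out of X, so you can see why the two line up row by row. This is only my own illustration under assumed toy data, using squared Euclidean distance; it is not the assignment's reference solution.

```python
import numpy as np

def find_closest_centroids(X, centroids):
    """Return idx, where idx[i] is the index of the centroid closest to X[i]."""
    m = X.shape[0]
    idx = np.zeros(m, dtype=int)
    for i in range(m):
        # Squared distance from example i to every centroid.
        distances = np.sum((centroids - X[i]) ** 2, axis=1)
        idx[i] = np.argmin(distances)
    return idx

# Hypothetical toy data.
X = np.array([[0.0, 0.0], [0.5, 0.0], [9.0, 9.0], [10.0, 10.0]])
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])

idx = find_closest_centroids(X, centroids)
print(idx)             # one entry per row of X
print(X[idx == 1])     # the rows of X assigned to centroid 1
```

Because idx[i] was computed from X[i], the comparison idx == k gives one True/False per row of X. If i and k are both single integers, i == k gives just one True/False, which cannot say which rows to keep.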
Hope that clears your doubt
Regards
DP
Thank you very much Deepti