Pandas Question (Week 2 Notebook 1)

I’m confused about how this works:

for sentiment in data.sentiment.unique():
    ix = index[data.sentiment == sentiment]
    ax.scatter(data.iloc[ix].positive, data.iloc[ix].negative, c=colors[int(sentiment)], s=0.1, marker='*', label=sentiments[int(sentiment)])

data.sentiment == sentiment evaluates to True or False, so how does it get a value out of this?

Hi Alistair,

You are correct that ‘data.sentiment == sentiment’ evaluates to True or False, but the result is not a single bool. It is an array-like of booleans (a pandas Series), with one True or False value for every row in the dataframe.

So, let’s say the data dataframe has 100 rows with 2 unique sentiment values: happy and sad. Then data.sentiment.unique() will return those 2 values: happy and sad.
When you loop over them and write index[data.sentiment == sentiment], it’s equivalent to writing index[[True, True, False, …, False, True]], with True wherever the row’s sentiment matches the current sentiment in the for loop.
Indexing with that boolean mask returns only the indices where the value is True and filters out the ones where it is False.
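Here is a small sketch of the same idea that you can run on its own. The toy dataframe and its values are made up for illustration, and I’m assuming index holds the row positions 0..n-1; the notebook may define it differently (for example as data.index), but the masking step works the same way.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the notebook's real dataframe.
data = pd.DataFrame({
    "positive": [0.9, 0.1, 0.8, 0.2, 0.7],
    "negative": [0.1, 0.9, 0.2, 0.8, 0.3],
    "sentiment": [1.0, 0.0, 1.0, 0.0, 1.0],
})

# Assumption: index is just the row positions 0..n-1.
index = np.arange(len(data))

# Comparing a column to a scalar gives one True/False per row.
mask = data.sentiment == 1.0
print(mask.tolist())   # [True, False, True, False, True]

# Indexing with the boolean mask keeps only the positions where it is True.
# (.to_numpy() is only here to make the conversion explicit.)
ix = index[mask.to_numpy()]
print(ix.tolist())     # [0, 2, 4]

# Those positions then select the matching rows, as in the loop body.
selected = data.iloc[ix]
print(selected.positive.tolist())   # [0.9, 0.8, 0.7]
```

In the loop from the question, this happens once per unique sentiment, so each call to ax.scatter only plots the rows belonging to that sentiment.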

Hope this helps.

Thank you!! That makes sense! :slight_smile: