C3M5_Assignment Exercise 6. SettingWithCopyWarning

Hi, I am having trouble with exercises 6, 7 and 8 (Programming Assignment: Analyzing Chlorophyll levels in Australian Coral Reefs).

{edited by mentor}

I got this warning

/tmp/ipykernel_84/2104055361.py:9: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

Hello, @shofagbile,

Welcome to this community! :raising_hands:

I see that this is your first post, so I modified it to only show the warning, because sharing work publicly isn’t allowed here. Also, it’s actually better to open a new topic for your own case, so I have split your post out to create a new topic for you. You may respond here for any follow-up of this, but if you have a new question (like for exercise 7 or 8), please open another new topic with a full description of your question and a copy of the errors and autograder messages.

Alright, now to the warning you have shared here. First, it won’t actually hurt anything: you can pass the assignment with this warning present. If any test failed or any marks were deducted, there must be some other reason, and we will need to see the full error of the failed test or the autograder’s message for the exercise that didn’t receive full marks. For these cases, please open a new topic.

If the assignment is your only concern, you may skip the rest of my message. For using pandas in the long run, though, I think everybody should know about this warning and how pandas changes this behavior in version 3, the latest major version (we are using version 2 here).

This is a very frequent warning; if you google it, you will find many discussions spanning many years. I often get it myself, too. You get this warning when you assign values to an object that was itself obtained by indexing another dataframe (a sub-dataframe or a series). Pandas cannot tell you in advance whether that object is a view or a copy, so it is unpredictable whether the assignment will also affect the dataframe it came from. The assignment will affect the sub-dataframe, which is expected and good. Whether it also changes the original dataframe is not guaranteed either way, whatever your intention was, and that is why pandas warns.

Assigning to a slice of something that was itself indexed out of a dataframe is called “chained assignment”. For example,

df["foo"][df["bar"] > 5] = 100

You see two pairs of brackets one after another - this is chained indexing. You first index the column “foo” (which returns a series) and then index the rows of that series that satisfy the condition. Exercise 6 also has chained indexing and chained assignment, and you will see another example similar to it further below.
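The warning text itself suggests .loc[row_indexer, col_indexer] = value. As a minimal sketch of that suggestion, here is the chained assignment above rewritten as a single indexing step. The toy data is made up for illustration; only the column names “foo” and “bar” come from the example.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the example above.
df = pd.DataFrame({"foo": [1, 2, 3, 4], "bar": [2, 6, 4, 9]})

# One indexing step: select the rows where bar > 5 and the column "foo"
# at the same time, so pandas assigns directly into df and no
# intermediate series is involved.
df.loc[df["bar"] > 5, "foo"] = 100

print(df["foo"].tolist())  # [1, 100, 3, 100]
```

Because the row condition and the column label go into one .loc call, there is no intermediate object whose view-or-copy status is in question.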

For an official explanation of this warning and chained indexing/chained assignment, please first read this and then this one. In particular, you may be more interested in the following example, which I took as a screenshot from the first link, because it is just what’s happening in Exercise 6:

In the example, we expect foo to change, but pandas cannot guarantee whether df will get changed, too. If you didn’t expect df to change and it did, that is exactly the case the warning is for. In Exercise 6 of our assignment, df_bur13 takes the place of df in the example. As far as the assignment is concerned, does it matter if df_bur13 was also changed? The answer is No - it does not matter, because df_bur13 won’t be graded afterwards.

Could I have avoided the warning (or the unpredictable behavior) in the example? Yes - if our intention is to create a new dataframe, then we could have changed the first line of the function to foo = df[['bar', 'baz']].copy(). Since foo is now separated from df as a standalone copy, any change to foo is guaranteed not to change df, and we won’t see that warning. Similarly, we could have done the same to Exercise 6 but, again, it won’t matter.
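A minimal sketch of that .copy() fix follows. The names foo, df, bar and baz come from the example; the function name do_something, the new column quux and the toy data are assumptions made only for illustration.

```python
import pandas as pd

def do_something(df):
    # .copy() makes foo an independent dataframe, so pandas knows that
    # later assignments to foo can never reach df.
    foo = df[["bar", "baz"]].copy()
    # Hypothetical modification: add a new column to the copy.
    foo["quux"] = 0
    return foo

# Hypothetical toy data for illustration.
df = pd.DataFrame({"bar": [1, 2], "baz": [3, 4]})
foo = do_something(df)

print(list(foo.columns))  # ['bar', 'baz', 'quux']
print(list(df.columns))   # ['bar', 'baz']
```

The explicit copy costs some memory, but it states the intention clearly: foo is a new dataframe and df must stay as it is.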

This unpredictability and the warning occur in pandas version 2, which is used by the assignment and, I think, by many systems. As the second link above reports, with the “Copy-on-Write” feature enabled by default, this unpredictability and the warning will be gone in pandas version 3, which is supported by new systems with Python 3.10 or higher. Note that the feature can also be enabled in pandas version 2 with pd.options.mode.copy_on_write = True.
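As a rough sketch of what Copy-on-Write changes, the snippet below turns the feature on and then modifies a sub-dataframe. The version check is my own assumption for keeping the sketch runnable on both major versions, and the toy data is made up.

```python
import pandas as pd

# Assumption: the option only needs to be set on pandas 2. On pandas 3,
# Copy-on-Write is already the default, as described above.
if int(pd.__version__.split(".")[0]) < 3:
    pd.options.mode.copy_on_write = True

# Hypothetical toy data for illustration.
df = pd.DataFrame({"bar": [1, 2, 3], "baz": [4, 5, 6]})

# No explicit .copy() here: with Copy-on-Write, foo always behaves like
# a copy of df, and the data is only duplicated when foo is modified.
foo = df[["bar", "baz"]]
foo.loc[0, "bar"] = 100

print(foo.loc[0, "bar"])  # 100
print(df.loc[0, "bar"])   # 1
```

With the feature on, the outcome is predictable: foo changes and df does not, which is why the warning is no longer needed.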

What we learners should be concerned about here is that enabling the feature also adds something to the list of “Don’t do”. See this for an example - the idea is that you want to avoid creating unnecessary copies that take up some of your precious memory when you are already dealing with a large volume of data.

This is my first time writing about this and it may not be very good, so please let me know if you are not sure about anything here or just google for some better explanations. You or I may also share better references here for everyone.

Cheers,
Raymond

Thanks for your response. I believe I have the right line of code and I got a result, but it is different from the expected result. The warning was the only unusual thing I observed, which is why I thought it was the cause.
This got me stuck on ex. 6. I have attached the result here.

(attachments)

Hello, @shofagbile,

No worries! It is only natural that we learners want to investigate a problem ourselves and suggest where it might have gone wrong.

I have read each and every number in your OLS regression result and they are all the same as mine. Also, that SettingWithCopyWarning was fine, so at this point, I don’t see any sign that anything in your work went wrong.

Did you see any error message (other than the warning) that you can share a screenshot of with us? Or, if the autograder deducted any marks from your submission, could you please share a screenshot of that, too? If the screenshot can’t cover the whole text message from the grader, please copy and paste it here along with the screenshot.

Thanks,
Raymond

Hi,
Thank you so much for your help, your response was very helpful. With your comments that you ran the code yourself and got the same result I was encouraged to check all the exercises again and make some corrections. I passed the graded lab with 100%.
I have my certificate now.

Thanks
Shola

That’s good news, Shola. Congratulations on your achievements!

Cheers,
Raymond