C3M5 Lesson 2 Practice Lab: Flight delays and cancellations - Working with time series data

Hello!

In the Practice Lab “Flight Delays and Cancellations – Working with Time Series Data” (Course: Python for Data Analytics, Module 5: Time Series), in Step 5: Speed of Changes, we first need to calculate the percentage change for the Arrivals_Delayed column and save it in the variable cancel_pct_change.

Later in this step, we need to find the rows in df_per_month where the percentage change is greater than 50%. However, this is done by referring to the variable cancel_pct_change, which does not belong to the df_per_month DataFrame.

Is it possible to select data from a DataFrame by referring to a variable that does not belong to the DataFrame itself?

Thanks in advance!

Hi @VeronikaS!

The short answer to your question is: yes, it is possible, and the notebook does this intentionally.

The longer answer:
The key idea is that pandas applies a boolean filter by matching index labels; the filter does not have to be a column of the DataFrame.

So when you derive a series_from_df from the original_df (for example with `pct_change()`), your series_from_df still has the same index as the original_df.

When you then write something like `original_df[series_from_df > 0.5]`, you are basically saying: “Give me the rows of original_df whose index labels correspond to True in this series_from_df condition.”
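Here is a minimal sketch of that idea. The numbers and dates are made up for illustration and are not the lab's data; only the names df_per_month, Arrivals_Delayed and cancel_pct_change come from the notebook.

```python
import pandas as pd

# Hypothetical monthly data; the real df_per_month in the lab has other values.
df_per_month = pd.DataFrame(
    {"Arrivals_Delayed": [100, 160, 120, 200]},
    index=pd.to_datetime(
        ["2023-01-01", "2023-02-01", "2023-03-01", "2023-04-01"]
    ),
)

# pct_change() returns a new Series that keeps the DataFrame's index.
cancel_pct_change = df_per_month["Arrivals_Delayed"].pct_change()

# The comparison produces a boolean Series with that same index.
# The first value is NaN, and NaN > 0.5 evaluates to False.
mask = cancel_pct_change > 0.5

# pandas keeps the rows whose index label maps to True in the mask.
big_jumps = df_per_month[mask]
print(big_jumps)
```

Note that cancel_pct_change is never added to df_per_month as a column; the shared index is what links the two objects.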

This works because df_per_month and cancel_pct_change share the same index. If the boolean Series were missing some of the DataFrame's index labels, pandas could not align the two and would raise an error.
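To see the failure case, here is a small sketch with a hypothetical DataFrame and a mask that lacks one of its labels. As far as I know, current pandas versions raise an IndexingError here (often preceded by a reindexing warning), but the code below catches a generic Exception so it does not depend on the exact error class.

```python
import pandas as pd

# Hypothetical data, only for illustration.
df = pd.DataFrame({"value": [10, 20, 30]}, index=["a", "b", "c"])

# This mask has no entry for label "c", so it cannot be aligned with df.
short_mask = pd.Series([True, False], index=["a", "b"])

try:
    df[short_mask]
    raised = False
except Exception as exc:  # assumption: an IndexingError in current pandas
    raised = True
    print(type(exc).__name__)
```

In the lab you do not run into this, because the percentage-change Series is computed directly from df_per_month and therefore inherits its index.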

Happy learning in 2026 :partying_face:


@imgabidotcom Thank you for the excellent explanation! I also wish you happy learning and testing in 2026!
