Week 4 Airflow "Additional Notes about Airflow"

In the “Defining Dependencies” section there are several examples of expressing dependencies in Python. Could the fourth one be written as follows, instead of using chain? task0 >> [task1 >> task3, task2 >> task4] >> task5

Here’s the reference picture:

Hello @easf

I added a dummy task to the DAG in the Airflow 101 - Building Your First Data Pipeline lab, and tried the syntax you proposed. Here are the results.
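The syntax below, which matches the one you proposed, doesn’t produce the expected behavior. Each inner expression, such as get_new_users_task >> dummy_after_users_task, evaluates to its right-hand task, so the list ends up holding only the last task of each branch, and start_task is wired to those last tasks instead of to the first task of each branch. Below are the DAG definition snippet and the DAG graph from Airflow.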

start_task >> \
[get_new_users_task >> dummy_after_users_task, \
get_session_task >> get_users_info_task >> save_complete_session_task] \
>> cleanup_task >> end_task
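To see why the one-liner wires the DAG differently, here is a minimal sketch that runs without Airflow. The Task class is a hypothetical toy stand-in, not Airflow's operator class; it only mimics the assumption that `>>` records an edge and returns its right-hand operand. It uses the generic task0..task5 names from the question.

```python
# Toy stand-in for an Airflow operator: NOT Airflow's class, only a mimic of
# how `>>` records an edge and which operand it hands back.
class Task:
    def __init__(self, name):
        self.name = name
        self.downstream = set()  # names of tasks that run after this one

    def __rshift__(self, other):
        # task >> task  or  task >> [tasks]: add edges, return the RIGHT operand
        for t in (other if isinstance(other, list) else [other]):
            self.downstream.add(t.name)
        return other

    def __rrshift__(self, other):
        # [tasks] >> task: lists have no `>>`, so Python falls back to this
        for t in other:
            t.downstream.add(self.name)
        return self


task0, task1, task2, task3, task4, task5 = (Task(f"task{i}") for i in range(6))

# The one-liner from the question. The list is built first and becomes
# [task3, task4], because each inner `>>` returns its right operand.
task0 >> [task1 >> task3, task2 >> task4] >> task5

# Collect every (upstream, downstream) pair and print it.
edges = sorted((t.name, d)
               for t in (task0, task1, task2, task3, task4, task5)
               for d in t.downstream)
for upstream, downstream in edges:
    print(f"{upstream} -> {downstream}")
```

In this sketch task0 ends up pointing at task3 and task4, while task1 and task2 have no upstream task at all, which is the same shape of problem as in the snippet above.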

However, if you prefer not to use the chain function, there is a workaround: declare the dependencies in separate statements, as follows.

start_task >> [get_new_users_task, get_session_task]
get_session_task >> get_users_info_task >> save_complete_session_task
get_new_users_task >> dummy_after_users_task
[dummy_after_users_task, save_complete_session_task] >> cleanup_task
cleanup_task >> end_task

This leads to the following DAG chart.


To confirm that the task dependencies are correct, I used XCom to pass a value from the get_new_users task to the added dummy_after_users task, and it worked as expected.


Awesome! Thank you for your answer, Amir. I really appreciate it. In this case, it seems way better to use chain.