Capstone Project Part 2, 4.2 - DAG for Songs Data in RDS Source: Airflow task failed

I ran into a problem when running 4.2 - DAG for Songs Data in RDS Source: the task failed with the following details:

[2024-10-20, 14:04:39 UTC] {taskinstance.py:3310} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 767, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 733, in _execute_callable
    return ExecutionCallableRunner(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/operator_helpers.py", line 252, in run
    return self.func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/baseoperator.py", line 406, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/operators/glue.py", line 223, in execute
    glue_job_run = self.glue_job_hook.job_completion(self.job_name, self._job_run_id, self.verbose)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/glue.py", line 297, in job_completion
    ret = self._handle_state(job_run_state, job_name, run_id, verbose, next_log_tokens)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/glue.py", line 346, in _handle_state
    raise AirflowException(job_error_message)
airflow.exceptions.AirflowException: Exiting Job jr_df9fbaec9871e8282bba5f8baf23277af20b42635f7d3cdfbeb8f8d23a622d6e Run State: FAILED

Can someone please help me sort it out?

Thanks

Unfortunately, I have run out of time and ideas for how to fix it. As far as I can tell, everything was done correctly up to 4.2 - DAG for Songs Data in RDS Source, so your prompt assistance would be much appreciated.

Total score 60/100

Deployment and Successful Runs of Glue Jobs 30/30
Existence of Objects in DBT S3 Bucket 30/30
Existence and Syntax Check of DAG Python Files 0/40

Hello @MiloAndelic
I was not able to reproduce the issue, and my DAGs ran successfully. Can you please tell us which of the tasks in your DAG fails and provide the logs?
You can select each of the tasks from the chart like this:


Then you can see the logs for the failed task from here.
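
The Airflow traceback for a Glue task only reports that the Glue run ended in the FAILED state; the underlying error is recorded on the Glue side. As a rough sketch (not part of the lab materials), something like the following could pull the run ID out of the Airflow exception line and ask Glue for the recorded error. It assumes boto3 is installed and AWS credentials are configured. The helper names parse_run_id and glue_error_message are hypothetical, and the Glue job name has to be supplied by you.

```python
import re

# Matches the message raised by the Airflow Glue hook, for example:
# "Exiting Job jr_abc123 Run State: FAILED"
RUN_PATTERN = re.compile(r"Exiting Job (jr_\w+) Run State: (\w+)")


def parse_run_id(log_line):
    """Return (run_id, state) from an Airflow Glue failure line, or None."""
    match = RUN_PATTERN.search(log_line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def glue_error_message(glue_client, job_name, run_id):
    """Ask Glue for the error message recorded for one job run.

    glue_client is expected to behave like boto3.client("glue").
    """
    response = glue_client.get_job_run(JobName=job_name, RunId=run_id)
    job_run = response["JobRun"]
    # ErrorMessage is only present when Glue recorded an error for the run.
    return job_run.get("ErrorMessage", "(no error message recorded)")


# Possible usage (job name is a placeholder you must replace):
#   import boto3
#   run_id, state = parse_run_id(failed_log_line)
#   print(glue_error_message(boto3.client("glue"), "<your-glue-job-name>", run_id))
```

The Glue client is passed in as an argument so the helpers can be tried out with a stub before pointing them at a real account.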

The log is already provided above, @Amir_Zare; the task that failed is rds_extract_glue_job. Is there any way for me to fix it?
Also, I have just tried the lab again but wasn’t able to access the Jupyter notebook at http://54.145.114.192:8953/lab?token=6359cf64d8e61b1fe9a04da502e0e9331b5bc8953159f154:

This site can’t be reached

54.145.114.192 took too long to respond.

ERR_CONNECTION_TIMED_OUT

Could you please clarify this?

Hello @MiloAndelic
Thank you for the details; I needed to know which task in your DAG is failing. The rds_extract_glue_job task uses the parameter SCRIPTS_BUCKET_NAME, which you are required to fill in in step 4.2.2. These parameters should be copied from the Terraform outputs. Did you do this correctly?
As for your problem with the Jupyter notebook, the address you provided should contain a port number at the end. You might have missed that while copying the address.
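
One way to double-check the SCRIPTS_BUCKET_NAME step is to compare the Terraform outputs against the text of the DAG file. The following is only a sketch under assumptions: it relies on the JSON produced by `terraform output -json` (one object per output, each with a "value" field), and it simply checks whether each output value appears anywhere in the DAG source. The function names and the output name used in the usage comment are hypothetical, not taken from the lab.

```python
import json


def terraform_values(output_json):
    """Map each output name to its value, given `terraform output -json` text."""
    outputs = json.loads(output_json)
    return {name: entry["value"] for name, entry in outputs.items()}


def values_missing_from_dag(dag_source, values):
    """Return the names of outputs whose value never appears in the DAG source."""
    return [name for name, value in values.items() if str(value) not in dag_source]


# Possible usage (file name and output name are placeholders):
#   import subprocess
#   raw = subprocess.run(["terraform", "output", "-json"],
#                        capture_output=True, text=True, check=True).stdout
#   with open("<your_dag_file>.py") as handle:
#       print(values_missing_from_dag(handle.read(), terraform_values(raw)))
```

An output that shows up in the returned list is not necessarily a mistake, since not every Terraform output has to be used in the DAG, but a missing bucket name would be worth a second look.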

Thanks @Amir_Zare. Please note that the Jupyter notebook address I provided did include a port number at the end.

As suggested, I have just tried to collect the requested details but wasn’t able to open the lab at all due to “Your total lab spend of $20.135681 has exceeded the total budget of $20”. Can you please sort this out quickly, as I need to complete the lab soon?

Can you please fill out this lab refresh form? Our team will refresh your lab ASAP, and I think you won’t face the same problems afterwards.

Thanks @Amir_Zare, I have just filled out and submitted the form. Can you check it and refresh the lab?

No problem, @MiloAndelic. Unfortunately, I don’t have the necessary access. The team will see it and refresh your lab.

It seems that @jessica-dlai might have the access. Can she help, @Amir_Zare?

Actually, I’m not sure who has the access for this. Please be patient. The form was made for cases similar to yours, and it will surely be handled 🙂

Hi @MiloAndelic,

Our lead lab engineer is notified when a learner submits the form. She’ll work through the backlog of requests as soon as possible. Thank you for your patience.

Jess

Hi @MiloAndelic, your account budget should be refreshed now. Please try the lab again and let us know if there are any issues.

Thanks @Jessica-DLAI for helping resolve the issue! Everything ran smoothly without any Airflow DAG hiccups this time. Your and @Amir_Zare's support is appreciated. Have a fantastic day!
