C4 - W4 - capstone project 2

When I run the Airflow pipeline, I get the following error:

84fc8623986e
*** Found local files:
*** * /opt/airflow/logs/dag_id=deftunes_songs_pipeline_dag/run_id=scheduled__2020-02-01T00:00:00+00:00/task_id=rds_extract_glue_job/attempt=1.log
[2024-11-06, 15:03:50 UTC] {local_task_job_runner.py:123} ▶ Pre task execution logs
[2024-11-06, 15:03:50 UTC] {glue.py:188} INFO - Initializing AWS Glue Job: de-c4w4a2-rds-extract-job. Wait for completion: True
[2024-11-06, 15:03:50 UTC] {glue.py:365} INFO - Checking if job already exists: de-c4w4a2-rds-extract-job
[2024-11-06, 15:03:50 UTC] {base_aws.py:606} WARNING - Unable to find AWS Connection ID 'aws_default', switching to empty.
[2024-11-06, 15:03:50 UTC] {base_aws.py:180} INFO - No connection ID provided. Fallback on boto3 credential strategy (region_name='us-east-1'). See: Configuration - Boto3 1.35.54 documentation
[2024-11-06, 15:03:51 UTC] {credentials.py:1075} INFO - Found credentials from IAM Role: de-c4w4a2-ec2-role
[2024-11-06, 15:03:52 UTC] {glue.py:209} INFO - You can monitor this Glue Job run at: https://console.aws.amazon.com/gluestudio/home?region=us-east-1#/job/de-c4w4a2-rds-extract-job/run/jr_cd4d57ad843f53b0ec46d7c214ded9a2b54b7b6532fff950b6869eb865956d19
[2024-11-06, 15:03:52 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:03:58 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:04:04 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:04:10 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:04:17 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:04:23 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:04:29 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:04:35 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:04:41 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:04:47 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:04:53 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:04:59 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:05:05 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:05:11 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:05:17 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:05:23 UTC] {glue.py:348} INFO - Polling for AWS Glue Job de-c4w4a2-rds-extract-job current run state with status RUNNING
[2024-11-06, 15:05:30 UTC] {glue.py:345} INFO - Exiting Job jr_cd4d57ad843f53b0ec46d7c214ded9a2b54b7b6532fff950b6869eb865956d19 Run State: FAILED
[2024-11-06, 15:05:30 UTC] {taskinstance.py:3310} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 767, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 733, in _execute_callable
    return ExecutionCallableRunner(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/operator_helpers.py", line 252, in run
    return self.func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/baseoperator.py", line 406, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/operators/glue.py", line 223, in execute
    glue_job_run = self.glue_job_hook.job_completion(self.job_name, self._job_run_id, self.verbose)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/glue.py", line 297, in job_completion
    ret = self._handle_state(job_run_state, job_name, run_id, verbose, next_log_tokens)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/glue.py", line 346, in _handle_state
    raise AirflowException(job_error_message)
airflow.exceptions.AirflowException: Exiting Job jr_cd4d57ad843f53b0ec46d7c214ded9a2b54b7b6532fff950b6869eb865956d19 Run State: FAILED
[2024-11-06, 15:05:30 UTC] {taskinstance.py:1225} INFO - Marking task as UP_FOR_RETRY. dag_id=deftunes_songs_pipeline_dag, task_id=rds_extract_glue_job, run_id=scheduled__2020-02-01T00:00:00+00:00, execution_date=20200201T000000, start_date=20241106T150350, end_date=20241106T150530
[2024-11-06, 15:05:30 UTC] {taskinstance.py:340} ▶ Post task execution logs

The DAG is deftunes_songs_pipeline_dag and the failing task is rds_extract_glue_job. I have already completed step 4.2.2.
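
The Airflow log above only reports Run State: FAILED; the reason for the failure is recorded on the Glue job run itself (the console link in the log shows it too). A minimal sketch of reading it with boto3, assuming the credentials in use are allowed to call glue:GetJobRun. The helper name summarize_job_run is hypothetical, and the client is passed in so that boto3 is only needed by the caller:

```python
from typing import Any, Dict, Optional


def summarize_job_run(glue_client: Any, job_name: str, run_id: str) -> Dict[str, Optional[str]]:
    """Return the state and error message (if any) of a single Glue job run.

    glue_client is expected to behave like boto3.client("glue").
    """
    response = glue_client.get_job_run(JobName=job_name, RunId=run_id)
    job_run = response["JobRun"]
    # ErrorMessage is only present when the run reported an error.
    return {
        "state": job_run.get("JobRunState"),
        "error": job_run.get("ErrorMessage"),
    }


# Example usage (requires boto3 and AWS credentials; values taken from the log above):
#   import boto3
#   glue = boto3.client("glue", region_name="us-east-1")
#   print(summarize_job_run(
#       glue,
#       "de-c4w4a2-rds-extract-job",
#       "jr_cd4d57ad843f53b0ec46d7c214ded9a2b54b7b6532fff950b6869eb865956d19",
#   ))
```

The printed error message usually says which argument or permission the job tripped over, which helps narrow down which configuration value to recheck.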

Hello @felipeyamate, you can get the scripts_bucket, data_lake_bucket, and role ARN values from the terminal output. They should look like this:

scripts_bucket = "de-c4w4a2-<ACCOUNT_ID>-us-east-1-scripts"
data_lake_bucket = "de-c4w4a2-<ACCOUNT_ID>-us-east-1-data-lake"
ARN: arn:aws:iam::<ACCOUNT_ID>:role/Cloud9-de-c4w4a2-glue-role

Replace <ACCOUNT_ID> with your own account ID. Thank you.
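
As a rough illustration of the naming pattern above, the sketch below fills the three values in from an account ID. The function name expected_values and its parameters are hypothetical, and the terminal output remains the source of truth; this only helps spot a typo in a hand-edited value.

```python
from typing import Dict


def expected_values(account_id: str,
                    region: str = "us-east-1",
                    prefix: str = "de-c4w4a2") -> Dict[str, str]:
    """Build the bucket names and Glue role ARN following the pattern shown above."""
    return {
        "scripts_bucket": f"{prefix}-{account_id}-{region}-scripts",
        "data_lake_bucket": f"{prefix}-{account_id}-{region}-data-lake",
        "glue_role_arn": f"arn:aws:iam::{account_id}:role/Cloud9-{prefix}-glue-role",
    }


# Example usage (the account ID can be looked up with boto3 if credentials are configured):
#   import boto3
#   account_id = boto3.client("sts").get_caller_identity()["Account"]
#   for name, value in expected_values(account_id).items():
#       print(f"{name} = {value}")
```

Comparing these strings with the values used in the DAG file is a quick way to check that no placeholder was left unreplaced.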

thanks