Course: Data Modeling, Transformation, and Serving
Week: Week 4 – Serving Data
Assignment: Programming Assignment 4 – Capstone Project Part 1 (C4_W4_Assignment_1.ipynb)
Hi everyone,
I’m currently working on C4_W4_Assignment_1.ipynb from the Coursera platform, and I’ve been stuck for almost a month due to repeated AWS Glue job failures.
Landing Zone – Extract Jobs (Glue)
After successfully running Terraform, I executed the following Glue extract jobs:
- de-c4w4a1-rds-extract-job → SUCCEEDED
- de-c4w4a1-api-users-extract-job → FAILED
- de-c4w4a1-api-sessions-extract-job → FAILED
Both API-based jobs fail with the following error:
ConnectTimeout: HTTPConnectionPool(host='ec2-3-219-211-66.compute-1.amazonaws.com', port=80):
Max retries exceeded with url: /users?start_date=2020-01-01&end_date=2020-01-31
As a result, neither API job has a single successful run; only the RDS extract works.
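To narrow this down, here is a minimal sketch of a reachability check I can run outside Glue. It assumes the failure is at the network level, which I have not confirmed. The host and port are copied from the error message above; the function name `can_connect` is just illustrative. It only tests whether a TCP connection opens within the timeout, not whether the API itself responds correctly.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers timeouts, refused connections, and DNS lookup failures.
        return False


if __name__ == "__main__":
    # Host and port taken from the ConnectTimeout error in the Glue job logs.
    host = "ec2-3-219-211-66.compute-1.amazonaws.com"
    print(can_connect(host, 80))
```

If this also returns False from a network that should be able to reach the endpoint, my guess is that the API instance is down or not accepting traffic on port 80, rather than something wrong in the Glue script itself.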
Transformation Zone – Transform Jobs
Because the extract jobs are failing, the Transformation Glue jobs are also failing, and I’m unable to proceed to the Redshift / dbt serving layer part of the assignment.
Has anyone faced a similar API timeout issue in Week 4? Any guidance on how to resolve this or proceed would be really helpful.