Capstone Week 4 - API Extract Jobs FAILED with ConnectTimeoutError (Glue to EC2 API on port 8080)

Hello everyone,

I’m stuck on the Week 4 Capstone Project (Part 2). The API extract jobs fail with a ConnectTimeoutError when they try to connect to the EC2 API endpoint on port 8080.

Error from Glue job logs (both users and sessions jobs):
ConnectTimeout: HTTPConnectionPool(host='ec2-52-20-108-145.compute-1.amazonaws.com', port=8080): Max retries exceeded with url: /users?start_date=2020-01-01&end_date=2020-01-31

I already tried:

But I am still getting ConnectTimeout (not a 404 or any other HTTP error).

The RDS extract job SUCCEEDED, but the API jobs fail.

This question belongs to the Data Engineering Specialization, but you had posted it in a different category, so I moved your post to the right place!

Hello @Mohammed004,

You need to use the correct APIEndpoint from CloudFormation. It should look like this:
ec2-**-**-**-**.compute-1.amazonaws.com

Then replace it in two places in modules/extract_job/glue.tf.
One for users:
"--api_url" = "http://ec2-**-**-**-**.compute-1.amazonaws.com/users"

and one for sessions.

Note that you should check the APIEndpoint in CloudFormation every time you start the lab from scratch, and update its value in those two places for the jobs to succeed. Thanks.
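
If it helps, here is a minimal Python sketch for sanity-checking the value before pasting it into glue.tf. It is only an illustration, not part of the lab: the function names are hypothetical, the /sessions path is an assumption (only /users appears above), and the hostname pattern assumes the compute-1.amazonaws.com form shown in this thread. The connection check runs from your own machine, so it does not prove that Glue can reach the API through its own network path.

```python
import re
import socket
from urllib.parse import urlsplit

# Assumed hostname shape, taken from the examples in this thread:
# ec2-<a>-<b>-<c>-<d>.compute-1.amazonaws.com
EC2_HOST_PATTERN = re.compile(
    r"^ec2-\d{1,3}(-\d{1,3}){3}\.compute-1\.amazonaws\.com$"
)


def build_api_urls(api_endpoint: str) -> dict:
    """Return the --api_url values for the users and sessions jobs.

    The "sessions" path is an assumption; check it against your glue.tf.
    """
    host = api_endpoint.strip()
    if not EC2_HOST_PATTERN.match(host):
        # Catches slips such as "compute1" instead of "compute-1".
        raise ValueError(f"unexpected APIEndpoint format: {host!r}")
    return {name: f"http://{host}/{name}" for name in ("users", "sessions")}


def can_connect(url: str, timeout: float = 5.0) -> bool:
    """Try a plain TCP connection to the URL's host and port (80 if unset)."""
    parts = urlsplit(url)
    port = parts.port or 80
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False


if __name__ == "__main__":
    # Hostname copied from the error message above; use your own APIEndpoint.
    urls = build_api_urls("ec2-52-20-108-145.compute-1.amazonaws.com")
    for job, url in urls.items():
        print(job, url, "reachable" if can_connect(url) else "not reachable")
```

A ValueError here points to a copy or typing mistake in the hostname, while "not reachable" with a well-formed hostname suggests the endpoint is stale (for example, left over from an earlier lab session) or the port is wrong.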