C4W4 - Capstone Project Part 1

Hi everyone,

I am encountering an issue in step 4.1.7. After running the command:
aws glue start-job-run --job-name <JOB-NAME> | jq -r '.JobRunId'
While the “de-c4w4a1-rds-extract-job” job succeeds, the other two jobs consistently fail. This is my second attempt at solving the lab, and I’ve double-checked every step, but the result remains the same. I have two days left to complete this lab, and I really do not want to pay for another month. Has anyone faced a similar issue, or does anyone have suggestions for troubleshooting?

There are lots of posts on the forum about “glue” issues.

I’m not a mentor for that course, but my advice is to try a Forum search for the word “glue” and see how others have fixed this issue.

Hello @aufome
The jobs use the python scripts in the terraform/assets/extract_jobs/ folder. If the jobs are failing, either there is some error in these files that you filled out in step 4.1.1, or one of the terraform/modules/extract_job/s3.tf and terraform/modules/extract_job/glue.tf files has defects. Can you please double check your answers in these files?

Dear DE Teams,

Would you please help? I am really frustrated trying to resolve the issue below. I have tried to resolve it more than 4 times with no luck. I paid $50 for another month in the hope of completing the Capstone Project and getting the “Specialization Certificate”.

Extract Jobs: Created and Ran Successfully, as shown below

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue start-job-run --job-name de-c4w4a1-api-sessions-extract-job | jq -r '.JobRunId'
jr_8fd06e6313d369f3104aa6b677fe5afaa1e78ff0142ee983898407a5a0db0ea9

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue start-job-run --job-name de-c4w4a1-rds-extract-job | jq -r '.JobRunId'
jr_d687e2112d300733553bd9bed15b58def03d805e09bf16030b56a157ec603a4f

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a1-api-users-extract-job --run-id jr_8e5fa6952bb3c9aa594fbfb4cf0f8e549fa4dd3f42ae7c58d3eace5fb0a72dd7 --output text --query "JobRun.JobRunState"
SUCCEEDED

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a1-api-sessions-extract-job --run-id jr_8fd06e6313d369f3104aa6b677fe5afaa1e78ff0142ee983898407a5a0db0ea9 --output text --query "JobRun.JobRunState"
SUCCEEDED

Transform Jobs: Created, but the Runs Failed, as shown below

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue start-job-run --job-name de-c4w4a1-songs-transform-job | jq -r '.JobRunId'
jr_db9c5e521413d232aa4d276315255f7ab2e0130e8eb74249acfb606ee4625a60

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue start-job-run --job-name de-c4w4a1-songs-transform-job | jq -r '.JobRunId'

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a1-json-transform-job --run-id jr_b7eade906e0b4145a1a4c88050e21d36b9181fe48454d8d4e5d4c293e0e5958f --output text --query "JobRun.JobRunState"
FAILED

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a1-songs-transform-job --run-id jr_db9c5e521413d232aa4d276315255f7ab2e0130e8eb74249acfb606ee4625a60 --output text --query "JobRun.JobRunState"
FAILED

I wonder whether the following is the cause of the problem, as I am not sure what action is required.
In http://98.83.21.188:8888/lab/tree/terraform/modules/transform_job/glue.tf, the default arguments are:
default_arguments = {
  "--enable-job-insights" = "true"
  "--job-language" = "python"
  # Set "--catalog_database" to aws_glue_catalog_database.transform_db.name
  "--catalog_database" = aws_glue_catalog_database.transform_db.name
  # Set "--ingest_date" to the server's current date in Pacific Time (UTC-7), in "yyyy-mm-dd" format.
  # (replace the placeholder <PACIFIC-TIME-CURRENT-DATE>)
  "--ingest_date" = "UTC-7"

What should the placeholder <PACIFIC-TIME-CURRENT-DATE> be replaced with? I tried leaving it intact, “UTC-7”, and “ingest_date”, all with no luck, and the transform job still fails.

I would be thankful for your help and support. Many thanks in advance.

@TMosh @Amir_Zare

I managed to solve the problem—turns out it was a simple mistake! The error occurred because I used double quotes when running the
aws glue start-job-run --job-name de-c4w4a1-api-users-extract-job | jq -r '.JobRunId'
command. Thank you for your help and interest.
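
For anyone who hits the same quoting problem: a minimal sketch of one way to sidestep it is to let the AWS CLI extract the run ID itself with its `--query` and `--output text` options, so that no `jq` filter has to be quoted at all. The function name `start_glue_job` is my own and is not part of the lab.

```shell
# Start a Glue job and print only its run ID.
# start_glue_job is a hypothetical helper name, not part of the lab.
start_glue_job() {
  job_name="$1"
  # --query takes a JMESPath expression; --output text prints the bare value.
  aws glue start-job-run \
    --job-name "$job_name" \
    --query 'JobRunId' \
    --output text
}

# Example (needs AWS credentials, so it is left commented out):
# run_id="$(start_glue_job de-c4w4a1-api-users-extract-job)"
```

Capturing the ID in a variable like this also avoids copying the long `jr_...` string by hand into the follow-up `get-job-run` command.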

@WafeeqAjoor You can search “PST time” on Google to find the current date. As of now, it is “2025-01-05.” You can use this date in the relevant lines.
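
The date can also be printed from the lab terminal instead of searched for. This is only a sketch, assuming the machine has time zone data for `America/Los_Angeles` installed; the function name `pacific_today` is my own.

```shell
# Print today's date in US Pacific time as yyyy-mm-dd.
# Assumes tz data for America/Los_Angeles is installed; without it,
# date typically falls back to UTC.
pacific_today() {
  TZ="America/Los_Angeles" date +%Y-%m-%d
}

pacific_today
```

Note that this zone is UTC-8 in winter and UTC-7 in summer, while the comment in glue.tf mentions UTC-7, so the two can disagree for one hour of the day.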

Thank you very much, aufome, for your advice. I googled PST and inserted the date “2025-01-05”, but the “Transform” job run still fails! What are the possible reasons for the failure? Is there a way to debug the failed job run? Did your “Transform” job run successfully? Many thanks for your support.

@WafeeqAjoor I no longer have access to the labs, so I’m unable to investigate further. Could you explore AWS Glue to clarify the error after running the glue job?
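
One way to do that from the terminal, as a minimal sketch: the job run returned by `aws glue get-job-run` carries an `ErrorMessage` field next to `JobRunState`, and both can be printed with the same `--query` option used earlier in this thread for the state. The helper name `glue_run_report` is hypothetical.

```shell
# Print the state and error message of one Glue job run.
# glue_run_report is a hypothetical helper name, not part of the lab.
glue_run_report() {
  job_name="$1"
  run_id="$2"
  # The JMESPath multiselect returns both fields on one tab-separated line.
  aws glue get-job-run \
    --job-name "$job_name" \
    --run-id "$run_id" \
    --output text \
    --query 'JobRun.[JobRunState, ErrorMessage]'
}

# Example (needs AWS credentials, so it is left commented out):
# glue_run_report de-c4w4a1-songs-transform-job "$run_id"
```

If the message is too short to be useful, the run's logs in CloudWatch are probably the next place to look.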