C4W4 - Capstone Project Part 1

Hi everyone,

I am encountering an issue in step 4.1.7. After running the command:
aws glue start-job-run --job-name <JOB-NAME> | jq -r '.JobRunId'
While the “de-c4w4a1-rds-extract-job” job succeeds, the other two jobs consistently fail. This is my second attempt at solving the lab, and I’ve double-checked every step, but the result remains the same. I have two days left to complete this lab, and I really do not want to pay for another month. Has anyone faced a similar issue, or does anyone have suggestions for troubleshooting?

There are lots of posts on the forum about “glue” issues.

I’m not a mentor for that course, but my advice is to try a Forum search for the word “glue” and see how others have fixed this issue.

Hello @aufome
The jobs use the python scripts in the terraform/assets/extract_jobs/ folder. If the jobs are failing, either there is some error in these files that you filled out in step 4.1.1, or one of the terraform/modules/extract_job/s3.tf and terraform/modules/extract_job/glue.tf files has defects. Can you please double check your answers in these files?

Dear DE Teams,

Would you please help? I am really frustrated trying to resolve the issue below. I have tried to resolve it more than 4 times with no luck. I paid $50 for another month in the hope of completing the Capstone Project and getting the Specialization Certificate.

Extract Jobs: Created and Run Successfully as shown below

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue start-job-run --job-name de-c4w4a1-api-sessions-extract-job | jq -r '.JobRunId'
jr_8fd06e6313d369f3104aa6b677fe5afaa1e78ff0142ee983898407a5a0db0ea9

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue start-job-run --job-name de-c4w4a1-rds-extract-job | jq -r '.JobRunId'
jr_d687e2112d300733553bd9bed15b58def03d805e09bf16030b56a157ec603a4f

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a1-api-users-extract-job --run-id jr_8e5fa6952bb3c9aa594fbfb4cf0f8e549fa4dd3f42ae7c58d3eace5fb0a72dd7 --output text --query "JobRun.JobRunState"
SUCCEEDED

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a1-api-sessions-extract-job --run-id jr_8fd06e6313d369f3104aa6b677fe5afaa1e78ff0142ee983898407a5a0db0ea9 --output text --query "JobRun.JobRunState"
SUCCEEDED

Transform Jobs: Created but Run Failed as shown below

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue start-job-run --job-name de-c4w4a1-songs-transform-job | jq -r '.JobRunId'
jr_db9c5e521413d232aa4d276315255f7ab2e0130e8eb74249acfb606ee4625a60

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue start-job-run --job-name de-c4w4a1-songs-transform-job | jq -r '.JobRunId'

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a1-json-transform-job --run-id jr_b7eade906e0b4145a1a4c88050e21d36b9181fe48454d8d4e5d4c293e0e5958f --output text --query "JobRun.JobRunState"
FAILED

(jupyterlab-venv) abc@083116fbf8bd:~/workspace/terraform$ aws glue get-job-run --job-name de-c4w4a1-songs-transform-job --run-id jr_db9c5e521413d232aa4d276315255f7ab2e0130e8eb74249acfb606ee4625a60 --output text --query "JobRun.JobRunState"
FAILED

I am wondering whether the following is the cause of the problem, as I am not sure about the required action.
The default arguments in http://98.83.21.188:8888/lab/tree/terraform/modules/transform_job/glue.tf are:
default_arguments = {
  "--enable-job-insights" = "true"
  "--job-language" = "python"
  # Set "--catalog_database" to aws_glue_catalog_database.transform_db.name
  "--catalog_database" = aws_glue_catalog_database.transform_db.name
  # Set "--ingest_date" to the server's current date in Pacific Time (UTC-7), in "yyyy-mm-dd" format.
  # (replace the placeholder <PACIFIC-TIME-CURRENT-DATE>)
  "--ingest_date" = "UTC-7"

What should <PACIFIC-TIME-CURRENT-DATE> be replaced with? I tried leaving it intact, and also tried “UTC-7” and “ingest_date”, with no luck: the transform job still fails.

I would be thankful for your help and support. Many thanks in advance.

@TMosh @Amir_Zare

I managed to solve the problem—turns out it was a simple mistake! The error occurred because I used double quotes when running the
aws glue start-job-run --job-name de-c4w4a1-api-users-extract-job | jq -r '.JobRunId'
command. Thank you for your help and interest.

@WafeeqAjoor You can search “PST time” on Google to find the current date. As of now, it is “2025-01-05.” You can use this date in the relevant lines.
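
If a terminal is handy, the same date can be printed there instead of searching. This is only a sketch: `pacific_date` and `utc_minus_7_date` are hypothetical helper names, and the thread does not say whether the lab expects the real Pacific zone (which is UTC-8 in winter) or the fixed UTC-7 offset named in the glue.tf comment, so both are shown. Note that POSIX TZ strings invert the sign, so `UTC+7` means seven hours behind UTC.

```shell
# Hypothetical helper: today's date in the real US Pacific zone (PST/PDT),
# as yyyy-mm-dd. Assumes the tz database is installed on the machine.
pacific_date() {
  TZ="America/Los_Angeles" date +%Y-%m-%d
}

# Hypothetical helper: today's date at a fixed UTC-7 offset, as yyyy-mm-dd.
# In a POSIX TZ string the sign is inverted, so "UTC+7" is 7 hours behind UTC.
utc_minus_7_date() {
  TZ="UTC+7" date +%Y-%m-%d
}

pacific_date
utc_minus_7_date
```

The two values differ only in the last hour before Pacific midnight while daylight saving time is not in effect. Whichever one is used, the printed value is what would go in place of <PACIFIC-TIME-CURRENT-DATE>.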

Thank you very much, aufome, for your advice. I did google PST and inserted the date “2025-01-05”, but the transform job run still fails! What are the possible reasons for the failure? Is there a way to debug the failed job run? Did your transform job run successfully? Many thanks for your support.

@WafeeqAjoor I no longer have access to the labs, so I’m unable to investigate further. Could you explore AWS Glue to clarify the error after running the glue job?
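
One way to see why a run failed from the same terminal is to ask for the run's error message instead of only its state. A minimal sketch, assuming the AWS CLI is configured as in the lab; `glue_run_error` is a hypothetical helper name, while `ErrorMessage` is a field of the `JobRun` object that `get-job-run` returns.

```shell
# Hypothetical helper: print the error message recorded for one Glue job run.
# $1 = job name, $2 = run id (the jr_... value printed by start-job-run).
glue_run_error() {
  aws glue get-job-run --job-name "$1" --run-id "$2" \
    --output text --query "JobRun.ErrorMessage"
}

# Example with the IDs posted above (left commented out; it needs AWS access):
# glue_run_error de-c4w4a1-songs-transform-job \
#   jr_db9c5e521413d232aa4d276315255f7ab2e0130e8eb74249acfb606ee4625a60
```

If the message alone is not enough, the run's logs in CloudWatch, linked from the run details in the Glue console, usually contain the full Python traceback.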

Hello @WafeeqAjoor
Sorry for the inconvenience. Please check the following items.

  1. I see that your last update was 5 hours ago, which was around midnight Pacific Time. If you ran the extract jobs before Pacific midnight, it’s possible that the Pacific date changed while you were going through the lab. If possible, please redo the lab so that all your commands run on the same Pacific day.
  2. Please check your completion of the file de-c4w4a1-transform-songs-job.py. The mentioned glue job uses this file as its executable.
  3. Make sure that all extract jobs are executed successfully before proceeding to the transform jobs.
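
The third point can be scripted so that a transform job never starts before the extract jobs have finished. This is a rough sketch, not the lab's official procedure: the helper names are made up, the 30-second polling interval is arbitrary, and `--query JobRunId --output text` is used in place of the `jq` pipe from the lab instructions.

```shell
# Hypothetical helper: start a Glue job and print its run id.
start_glue_run() {
  aws glue start-job-run --job-name "$1" --output text --query "JobRunId"
}

# Hypothetical helper: poll one run until it reaches a final state.
# $1 = job name, $2 = run id, $3 = seconds between polls (default 30).
# Returns 0 on SUCCEEDED, 1 on a failed final state, 2 if the CLI call fails.
wait_for_glue_run() {
  local job="$1" run_id="$2" interval="${3:-30}" state
  while true; do
    state=$(aws glue get-job-run --job-name "$job" --run-id "$run_id" \
      --output text --query "JobRun.JobRunState") || return 2
    case "$state" in
      SUCCEEDED) return 0 ;;
      FAILED|STOPPED|TIMEOUT|ERROR)
        echo "$job ($run_id): $state" >&2
        return 1 ;;
    esac
    sleep "$interval"
  done
}

# Start a job, print "<job> <run id>", then wait for the result.
run_and_wait() {
  local job="$1" run_id
  run_id=$(start_glue_run "$job") || return 2
  echo "$job $run_id"
  wait_for_glue_run "$job" "$run_id" "${2:-30}"
}

# Example order, using the job names from this thread (left commented out;
# it needs AWS access). The && chain stops at the first job that fails.
# run_and_wait de-c4w4a1-rds-extract-job &&
# run_and_wait de-c4w4a1-api-users-extract-job &&
# run_and_wait de-c4w4a1-api-sessions-extract-job &&
# run_and_wait de-c4w4a1-json-transform-job &&
# run_and_wait de-c4w4a1-songs-transform-job
```

Because each run id is captured in a variable, the status check always refers to the run that was just started, which avoids mixing up run ids when copying them by hand.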