My jobs (de-c4w4a1-json-transform-job and de-c4w4a1-songs-transform-job) fail when I run them from the VS terminal. I have compared my files from Capstone Project Part 1 with the same files in Capstone Project Part 2. The variables in my Part 1 files match those in the Part 2 files, except that Part 2 has a few more lines for date variables. These are the files I compared:
terraform/assets/transform_jobs folder:
a. de-c4w4a1-transform-songs-job.py
b. de-c4w4a1-transform-json-job.py
terraform/modules/transform_job/s3.tf
terraform/modules/transform_job/glue.tf
terraform/main.tf : lines 16 to 30
terraform/outputs.tf : lines 22 to 34
I don't see any error in the log. Could you please advise me on how to troubleshoot these jobs? I have reviewed the files above a couple of times. Here is the information on my end.
In de-c4w4a1-transform-songs-job.py, the error shown in Glue is {ValueError: time data 'yyyy-mm-dd' does not match format '%Y-%m-%d'}. My date variables are set as follows. Could you advise me?
Lines 63-65 (original code):
ingest_date = args["ingest_date"]
date_object = datetime.strptime(ingest_date, "%Y-%m-%d")
ingest_date_str = date_object.strftime("%Y_%m_%d")
Hello @Adazhu,
I think the code in de-c4w4a1-transform-songs-job is correct. The issue with de-c4w4a1-json-transform-job is related to the S3 data lake bucket.
Could you check that the following folder is not missing from the S3 bucket:
de-c4w4a1-[ACCOUNT ID]-us-east-1-data-lake/landing_zone/api/users/[TODAY DATE]
If it is missing, your Landing Zone jobs probably didn't create the folder that the transform job needs in order to succeed. Check that you used the correct [API-ENDPOINT] in the terraform/modules/extract_job/**glue.tf** file.
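The folder check above can also be scripted. The following is only a minimal sketch, assuming boto3 is available and AWS credentials are configured; `landing_prefix` and `prefix_exists` are hypothetical helper names, and the bucket name and date must be filled in with your own values (the exact date format of the folder is whatever your Landing Zone job wrote).

```python
def landing_prefix(ingest_date: str) -> str:
    """Build the users landing-zone prefix mentioned above for a given date string."""
    return f"landing_zone/api/users/{ingest_date}"


def prefix_exists(s3_client, bucket: str, prefix: str) -> bool:
    """Return True if at least one object exists under the given prefix."""
    # list_objects_v2 reports how many keys it returned in "KeyCount".
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return response.get("KeyCount", 0) > 0


# Example usage (uncomment and replace the placeholders with your own values):
# import boto3
# s3 = boto3.client("s3")
# bucket = "de-c4w4a1-[ACCOUNT ID]-us-east-1-data-lake"
# print(prefix_exists(s3, bucket, landing_prefix("[TODAY DATE]")))
```

If this returns False for the date you passed to the transform job, the extract job most likely never wrote to that folder.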
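When you fill in the dates in the other file, transform_job/glue.tf, use the correct format (year-month-day) with an actual date. The ValueError you posted shows that the job received the literal placeholder text yyyy-mm-dd instead of a real date.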
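To see why the placeholder fails, here is a small sketch that reuses the same strptime/strftime calls as the job script. The wrapper function `to_ingest_date_str` and the sample date are illustrative assumptions, not part of the course code.

```python
from datetime import datetime


def to_ingest_date_str(ingest_date: str) -> str:
    """Parse a year-month-day string and return it with underscores."""
    try:
        date_object = datetime.strptime(ingest_date, "%Y-%m-%d")
    except ValueError as exc:
        # Raised when the argument is not a real date, e.g. the literal "yyyy-mm-dd".
        raise ValueError(
            f"ingest_date must be a real date such as 2024-01-15, got {ingest_date!r}"
        ) from exc
    return date_object.strftime("%Y_%m_%d")


print(to_ingest_date_str("2024-01-15"))  # a real date parses fine

try:
    to_ingest_date_str("yyyy-mm-dd")  # the unreplaced placeholder
except ValueError as err:
    print(err)
```

The code in the job script is fine as written; it only needs a real date passed in as the ingest_date argument.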
Another possible mistake is that the .py extension is included in the script name in s3.tf but not in the second file (glue.tf). The script name should match in both. Hope this helps.