I completed the scripts for the two Glue jobs (de-c4w4a1-songs-transform-job and de-c4w4a1-json-transform-job), but when I run them I get these errors (see the screenshots below). The errors seem to come from the date format in the scripts, but I don’t know how to fix them.
I put “yyyy-mm-dd” in place of the PACIFIC-TIME-CURRENT-DATE placeholder.
Hello @MartinAmouzou,
Could you check parts 4.1.3 and 4.2.3? In the terraform/modules/extract_job/glue.tf and terraform/modules/transform_job/glue.tf files you need to replace the placeholders with actual dates, using the current date in Pacific time (e.g. “2025-01-08”) in two places. Thanks
Rerun the part in 4.2.6 after you have updated the glue.tf file.
Check whether the landing jobs created the correct folder in the data lake bucket. If not, check that you have the correct API-Endpoint in terraform/modules/extract_job/glue.tf.
You might need to run the ingestion jobs again if you make any of the changes above.
One of the .py files in part 4.2.1 may also have a typo.
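On the date placeholders: a literal string such as “yyyy-mm-dd” is a format description, not a date, so anything that tries to parse it as a date will fail. Here is a minimal sketch of that check in plain Python. The helper name is_real_date is made up for illustration, and the Glue scripts may parse the value differently (for example through Spark), so treat this only as a quick sanity check before you put a value into glue.tf.

```python
from datetime import datetime


def is_real_date(value: str) -> bool:
    """Return True if value parses as a calendar date in YYYY-MM-DD form.

    Hypothetical helper for illustration; not part of the lab code.
    """
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        # Raised for text that is not a date (like the format string itself)
        # and for impossible dates such as month 13.
        return False


if __name__ == "__main__":
    for candidate in ("yyyy-mm-dd", "2025-01-08"):
        print(candidate, is_real_date(candidate))
```

If the value you typed into the placeholder fails this check, it is probably what the job is complaining about.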
Hello @aurelio_gialluca,
I think @MartinAmouzou had the wrong date in step 4.2.3, in the terraform/modules/transform_job/glue.tf file (replace it with today’s date in Pacific time, just Google it, e.g. “2025-01-12”).
The second thing he mentioned might be in one of the two .py files in step 4.2.1 (a method like columns or ingest_on had a typo, I guess). Hope that helps.
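If you would rather not Google it, today’s date in Pacific time can also be computed locally. This is a rough sketch that assumes Python 3.9+ (for the standard zoneinfo module) and that time zone data is available on your machine. The function name pacific_date is made up for illustration, and “America/Los_Angeles” is the usual IANA zone name for US Pacific time.

```python
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo

# IANA zone for US Pacific time; handles PST/PDT switching automatically.
PACIFIC = ZoneInfo("America/Los_Angeles")


def pacific_date(now_utc: Optional[datetime] = None) -> str:
    """Return the Pacific-time calendar date as a YYYY-MM-DD string.

    now_utc is an optional timezone-aware datetime, mainly for testing;
    by default the current time is used.
    """
    if now_utc is None:
        now_utc = datetime.now(timezone.utc)
    return now_utc.astimezone(PACIFIC).date().isoformat()


if __name__ == "__main__":
    # Prints the value to paste into the placeholder.
    print(pacific_date())
```

Note that the Pacific date can be one day behind your local date, depending on where you are and what time it is.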
Hi. Please help me. What exactly is the error in the Python files? Also, I changed the Pacific dates, but only in transformation… @Georgios
@paniJc,
That’s correct: you change the dates to Pacific time only in terraform/modules/transform_job/glue.tf. In terraform/modules/extract_job/glue.tf you keep the dates provided to you, start 2020-01-01 to end 2020-01-31.
In the Python files, are you using the correct methods? For example:
df.withColumn("ingest_on", F.to_date(F.lit(ingest_date), etc…
Hope it helps