Hello @cchristyraj
You are supposed to replace the <PACIFIC-TIME-CURRENT-DATE> placeholder in two places in the terraform/modules/transform_job/glue.tf file. The value should be the current date in Pacific Time, in yyyy-mm-dd format, for example 2024-11-28.
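In case it helps, here is a minimal Python sketch for producing that value. It assumes Python 3.9+ (for the standard zoneinfo module) and that time zone data for America/Los_Angeles is available on your machine. The function name pacific_date_string is made up for illustration and is not part of the lab.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def pacific_date_string(now=None):
    """Return the date in Pacific Time, formatted as yyyy-mm-dd.

    `now` is an optional timezone-aware datetime (hypothetical parameter,
    handy for testing); if omitted, the current time is used.
    """
    pacific = ZoneInfo("America/Los_Angeles")
    if now is None:
        now = datetime.now(tz=pacific)
    else:
        # Convert an aware datetime from any zone into Pacific Time.
        now = now.astimezone(pacific)
    return now.strftime("%Y-%m-%d")


if __name__ == "__main__":
    # Paste the printed value into glue.tf in place of the placeholder.
    print(pacific_date_string())
```

Using a named time zone rather than a fixed offset means the result stays correct whether or not daylight saving time is in effect.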
Hi, Amir! I'm still hitting this issue after supplying the current Pacific date. My snippets from glue.tf are below. What am I missing?
default_arguments = {
  "--enable-job-insights" = "true"
  "--job-language" = "python"
  # Set "--catalog_database" to aws_glue_catalog_database.transform_db.name
  "--catalog_database" = aws_glue_catalog_database.transform_db.name
  # Set "--ingest_date" to the server's current date in Pacific Time (UTC-7), in "yyyy-mm-dd" format.
  # (replace the placeholder <PACIFIC-TIME-CURRENT-DATE>)
  "--ingest_date" = "2024-12-19"
default_arguments = {
  "--enable-job-insights" = "true"
  "--job-language" = "python"
  "--catalog_database" = aws_glue_catalog_database.transform_db.name
  # Set "--ingest_date" to the server's current date in Pacific Time (UTC-7), in "yyyy-mm-dd" format.
  # (replace the placeholder <PACIFIC-TIME-CURRENT-DATE>)
  "--ingest_date" = "2024-12-19"
For the songs-transform job I get AttributeError: 'DataFrame' object has no attribute 'duration'. For the json-transform job I get AnalysisException: Path does not exist: s3://de-c4w4a1-533267286350-us-east-1-data-lake/landing_zone/api/users/2024-12-19. I would appreciate your advice.
Hello @ArtK
It looks like you have filled in the <PACIFIC-TIME-CURRENT-DATE> placeholder in the transform_job/glue.tf file correctly, yet the data the songs-transform job reads doesn't have the required columns, and the json-transform job can't find the data it needs. So my guess is that your issue is with the extract jobs. After you run your extract jobs, you can check the address you see in the exception, namely s3://de-c4w4a1-sensitivedatahere-us-east-1-data-lake/landing_zone/api/users/2024-12-19, and verify that the files the transform job wants to read actually exist there.
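If you prefer to check this from a script rather than the S3 console, something like the sketch below might work. It assumes boto3 is installed and your AWS credentials are configured. The helper names split_s3_uri and prefix_has_objects are hypothetical, and the bucket name in the commented usage is a placeholder for your own.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix' into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def prefix_has_objects(s3_client, uri):
    """Return True if at least one object exists under the given S3 URI.

    `s3_client` is expected to behave like boto3.client("s3").
    """
    bucket, prefix = split_s3_uri(uri)
    # MaxKeys=1 is enough: we only need to know whether anything is there.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return response.get("KeyCount", 0) > 0


# Example usage (assumes boto3 and valid credentials; bucket is a placeholder):
# import boto3
# uri = "s3://<your-data-lake-bucket>/landing_zone/api/users/2024-12-19"
# print(prefix_has_objects(boto3.client("s3"), uri))
```

If this prints False after the extract jobs have finished, the transform job has nothing to read for that date, which would match the "Path does not exist" exception.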
Thanks, Amir! I think I messed it up somehow by proceeding to the next step (servicing) before the jobs were completed. Anyway, my lab expired the other day, and when I re-did it just now, all worked like a charm. Thanks!
Hello @Amir_Zare
I seem to have the same issue, but I have the mentioned data files from the extract jobs in place.
But I still get this error in AWS Glue for the job de-c4w4a1-songs-transform-job: "ValueError: time data '' does not match format '%Y-%m-%d'"
In the respective glue.tf, it's set to the current date:
default_arguments = {
…
  # Set "--ingest_date" to the server's current date in Pacific Time (UTC-7), in "yyyy-mm-dd" format.
  # (replace the placeholder <PACIFIC-TIME-CURRENT-DATE>)
  "--ingest_date" = "2025-02-13"
…
My extract jobs run with SUCCESS.
I do not know where to fix this. I have now been stuck at the same point four times. Could you please help?
Hello @max85
It seems you have missed one of the placeholders, or you might have forgotten to save the changes to the file before deploying your Terraform components. The error is saying that you still have "<PACIFIC-TIME-CURRENT-DATE>" in your files. Please make sure that you replace it with the current date in both instances in glue.tf (lines 27 and 69) and save the file before running the terraform commands.
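As a quick sanity check before you run terraform again, you could scan the saved file for any leftover placeholder. This is only a sketch: the function name find_placeholder_lines is made up, the path is taken from earlier in this thread, and it only skips comment lines that start with #, since the instruction comments in glue.tf mention the placeholder too.

```python
from pathlib import Path

PLACEHOLDER = "<PACIFIC-TIME-CURRENT-DATE>"


def find_placeholder_lines(text, placeholder=PLACEHOLDER):
    """Return 1-based line numbers of non-comment lines still containing the placeholder."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Skip '#' comment lines: the instruction comments name the placeholder.
        if line.lstrip().startswith("#"):
            continue
        if placeholder in line:
            hits.append(number)
    return hits


if __name__ == "__main__":
    # Path as given earlier in the thread; adjust it to your working directory.
    path = Path("terraform/modules/transform_job/glue.tf")
    if path.exists():
        leftover = find_placeholder_lines(path.read_text())
        print("Placeholder still present on lines:", leftover or "none")
    else:
        print(f"File not found: {path}")
```

Because it reads the file from disk, it also catches the case where the edit was made in the editor but never saved.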