In section 5.4 (Presentation) I get the following error. Everything works until this point.
QueryFailed: TABLE_NOT_FOUND: line 11:4: Table ‘awsdatacatalog.curated_zone.orders’ does not exist. You may need to manually clean the data at location ‘s3://de-c3w2a1-474000284478-us-east-1-data-lake/athena_output/sales_report/tables/94c07b54-be07-4692-8750-5183e5951746’ before retrying. Athena will not delete data in your account.
There is no ‘sales_report’ folder at that location in S3.
Hello @mkorangestripe
I could reproduce the issue after my de-c3w2a1-csv-transformation-job failed. Could you check step 4.1.1 for any None values in the script de_c3w2a1_batch_transform.py that could cause the job run to fail? Hope it helps.
I checked de_c3w2a1_batch_transform.py. The only place I see ‘None’ is in comments. These are the parts I’ve updated. Any ideas? Thanks
[Code Removed]
Hello @mkorangestripe,
Your de_c3w2a1_batch_transform.py script seems to have a bug in line 157 (as well as the IntegerType[DoubleType] you had before). That might be why the job failed and caused the error in step 5.4:
# Handle the DoubleType case similar to the IntegerType case above.
# Use pd.to_numeric with errors='coerce'.
# Buggy line (indexes DoubleType instead of the DataFrame source_pd):
source_pd[field_name] = pd.to_numeric(DoubleType[field_name], errors='coerce')
# Use this instead:
source_pd[field_name] = pd.to_numeric(source_pd[field_name], errors='coerce')
Also, please remove the code from your post, since sharing it is against the Code of Conduct. Thank you.
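For reference, here is a minimal standalone sketch of what pd.to_numeric with errors='coerce' does to a column, independent of the lab script. The sample DataFrame and the "amount" column name are made up for illustration; only source_pd and field_name mirror the names used above.

```python
import pandas as pd

# Hypothetical sample data (not from the lab): a string column with one
# unparseable value and one missing value.
source_pd = pd.DataFrame({"amount": ["10.5", "3", "n/a", None]})
field_name = "amount"

# Index the DataFrame itself, not a type class. With errors="coerce",
# values that cannot be parsed become NaN instead of raising an error.
source_pd[field_name] = pd.to_numeric(source_pd[field_name], errors="coerce")

print(source_pd[field_name].tolist())
print(source_pd[field_name].dtype)
```

After the conversion the column should be numeric, with the unparseable and missing entries stored as NaN, so it is worth checking how many NaN values appear before writing the data out.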