Week 2, Assignment 2: Building a Data Lakehouse with AWS Lake Formation and Apache Iceberg

Hello,

I am stuck on Week 2, Assignment 2 of the Data Storage and Queries module, at this step:

5.4. Now you will create tables that may involve running aggregations or joins across multiple tables from the curated zone. The first one will involve returning the average sales per month and year and storing them in the table sales_report.

I get this error when I run the code:

Output exceeds the size limit. Open the full output data in a text editor

---------------------------------------------------------------------------
QueryFailed                               Traceback (most recent call last)
Cell In[21], line 30
     10 ctas_query = f"""
     11 CREATE TABLE {PRESENTATION_DATABASE_NAME}.sales_report
     12 WITH (
   (…)
     26 ORDER BY YEAR(orderdate), MONTH(orderdate);
     27 """
     29 # Execute the query using AWS Athena
---> 30 response = wr.athena.start_query_execution(
     31     sql=ctas_query,
     32     database=PRESENTATION_DATABASE_NAME,
     33     wait=True,
     34     s3_output=f's3://{DATA_LAKE_BUCKET_NAME}/athena_output/sales_report'
     35 )
     37 # Print execution status
     38 print(f"Query Execution Status: {response['Status']['State']}")

File ~/miniconda/lib/python3.11/site-packages/awswrangler/_config.py:735, in apply_configs.<locals>.wrapper(*args_raw, **kwargs)
    733     del args[name]
    734 args = {**args, **keywords}
--> 735 return function(**args)

File ~/miniconda/lib/python3.11/site-packages/awswrangler/athena/_executions.py:163, in start_query_execution(sql, database, s3_output, workgroup, encryption, kms_key, params, paramstyle, boto3_session, client_request_token, athena_cache_settings, athena_query_wait_polling_delay, data_source, wait)

--> 236     raise exceptions.QueryFailed(response["Status"].get("StateChangeReason"))
    237 if state == "CANCELLED":
    238     raise exceptions.QueryCancelled(response["Status"].get("StateChangeReason"))

QueryFailed: TABLE_NOT_FOUND: line 12:4: Table 'awsdatacatalog.curated_zone.orders' does not exist. You may need to manually clean the data at location 's3://de-c3w2a1-533267108172-us-east-1-data-lake/athena_output/sales_report/tables/9ee5b178-8a1a-4f12-b5c1-048bc2eb87b3' before retrying. Athena will not delete data in your account.

Any solutions?

Hello @san_mir,

I could reproduce the issue after my de-c3w2a1-csv-transformation-job failed. In step 4.1.1, could you check the script de_c3w2a1_batch_transform.py for any None values that cause the job run to fail?
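If it helps, here is a minimal sketch of how you might scan the script for leftover None placeholders instead of reading it line by line. The helper name find_none_placeholders and the relative path are my own assumptions, not part of the lab, and the check is a simple text match (it ignores anything after a # on a line, so it can miss or misread unusual lines).

```python
import re
from pathlib import Path

# Matches `name = None` and `key=None`, but not comparisons such as `== None` or `!= None`.
NONE_PATTERN = re.compile(r"(?<![=!<>])=\s*None\b")


def find_none_placeholders(source: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for every line that still assigns None."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        code = line.split("#", 1)[0]  # naive: drops everything after the first '#'
        if NONE_PATTERN.search(code):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Assumption: the script sits in the current working directory.
    script = Path("de_c3w2a1_batch_transform.py")
    if script.exists():
        for number, line in find_none_placeholders(script.read_text()):
            print(f"{number}: {line}")
```

Any line it prints is a candidate placeholder that still needs a real value before the Glue job is run again.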

After you run all the jobs successfully, you should be able to see all the folders in S3_data_lake if you go to the curated zone.

Continue from step 5.1 to grant access to the above tables in the curated_zone. After that, the query in step 5.4 should finish with the status SUCCEEDED.
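Before rerunning the CTAS query, you could also confirm that the tables it reads are actually registered in the catalog. This is only a sketch: find_missing_tables and glue_table_exists are names I made up, the only table I know is required is orders (from your error message), and I am assuming wr.catalog.does_table_exist from awswrangler is available in the lab environment.

```python
from typing import Callable, Iterable


def find_missing_tables(
    database: str,
    required: Iterable[str],
    table_exists: Callable[[str, str], bool],
) -> list[str]:
    """Return the names in `required` that `table_exists` reports as absent."""
    return [table for table in required if not table_exists(database, table)]


def glue_table_exists(database: str, table: str) -> bool:
    """Check the Glue Data Catalog through awswrangler (imported only when called)."""
    import awswrangler as wr

    return wr.catalog.does_table_exist(database=database, table=table)


# Example call inside the notebook (requires valid AWS credentials):
#   missing = find_missing_tables("curated_zone", ["orders"], glue_table_exists)
#   print("Missing tables:", missing or "none")
```

If orders shows up as missing, the problem is upstream in the Glue jobs or the Lake Formation grants, not in the query from step 5.4. Add the other tables your query joins to the list as needed.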

Hello,

I am trying to redo the lab from the start. In step 1.2.2, when I click the 'GO TO AWS CONSOLE' link after running the above code, no pop-up opens and no AWS window appears. This was not happening before.

Any solutions?

Hello @san_mir,

I had this issue before; perhaps the lab exceeded its usage limit. You could check whether other parts throw errors as well, for example by running source scripts/setup.sh in the terminal. Please submit this form, since a lab refresh seems to fix it. Note that it takes 1-2 business days to complete. Hope it helps.
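A quicker check than the whole setup script might be to ask STS who you are, since that call fails right away when the lab credentials are no longer valid. This is a rough sketch: check_credentials is a hypothetical helper of mine, and I am assuming boto3 is installed in the lab environment and picks up the same credentials as the AWS CLI.

```python
from typing import Callable, Optional


def check_credentials(
    get_identity: Optional[Callable[[], dict]] = None,
) -> tuple[bool, str]:
    """Return (True, caller ARN) if the credentials work, else (False, error text)."""
    if get_identity is None:
        # Imported lazily so the function can be exercised without boto3.
        import boto3

        get_identity = boto3.client("sts").get_caller_identity
    try:
        identity = get_identity()
    except Exception as exc:  # botocore raises ClientError for invalid tokens
        return False, str(exc)
    return True, identity.get("Arn", "")


# Example call in the lab terminal or notebook:
#   ok, detail = check_credentials()
#   print("valid" if ok else "invalid", detail)
```

If this reports invalid credentials, nothing else in the lab that touches AWS will work until the session is refreshed.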

I tried to run source scripts/setup.sh but get this error:

coder@c1cafc8aaa3c:~/project$ source scripts/setup.sh

An error occurred (AuthFailure) when calling the DescribeVpcs operation: Authorization header or parameters are not formatted correctly.

An error occurred (AuthFailure) when calling the DescribeSubnets operation: Authorization header or parameters are not formatted correctly.

An error occurred (InvalidClientTokenId) when calling the GetCallerIdentity operation: The security token included in the request is invalid.

An error occurred (InvalidClientTokenId) when calling the DescribeDBInstances operation: The security token included in the request is invalid.

An error occurred (InvalidClientTokenId) when calling the DescribeDBInstances operation: The security token included in the request is invalid.

An error occurred (InvalidClientTokenId) when calling the GetCallerIdentity operation: The security token included in the request is invalid.

I was not getting this error before. I have submitted the form for lab refresh.

Hello @san_mir,

Thanks for submitting the form; a lab refresh seems to fix it. You could check the vocapi_logger file that you submitted. In my case, the reason was this:

Error: Failed operation: Start. Your total lab usage time of 2400 minutes has exceeded the total allocated time of 2400 minutes
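To look for a line like that in your own log without scrolling through every iteration, something along these lines might do. It is a sketch only: find_error_lines is a name I made up, and I am assuming error entries start with "Error:" as in my case.

```python
def find_error_lines(log_text: str) -> list[str]:
    """Return every line of the log that starts with 'Error:' (after trimming spaces)."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if line.strip().startswith("Error:")
    ]


# Example call, with the path to your vocapi_logger file filled in by you:
#   from pathlib import Path
#   for line in find_error_lines(Path(path_to_vocapi_logger).read_text()):
#       print(line)
```

An empty result would suggest the log contains no error entry in that form, in which case the cause may be recorded differently or not at all.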

Hello,

I will wait for the refresh. AWS is still not working for me; I checked just now. Will I be able to do the lab over the weekend, or will it be solved by Monday? Here is my vocapi_logger file:


vocapi response status: success
Extracted AWS Console URL
AWS Console URL written to /home/coder/.aws/aws_console_url
Run count incremented to: 5
Run count written to /home/coder/vocapi_call_run_count
Iteration 4 completed at Fri Mar 28 12:10:02 AM UTC 2025
Iteration 5 of 8 started at Fri Mar 28 12:10:02 AM UTC 2025
Sleeping for 900 seconds before next iteration…
Running the vocapi_call_get_aws_console_url at Fri Mar 28 12:25:02 AM UTC 2025
Current run count: 5
Using LAB_SPACE_ID: qpnsrdijiima
START_LAB_DATA received: {'lab_data': {'partid': '3581682', 'assignmentid': '3581681', 'courseid': '62825'}}



vocapi response status: success
Extracted AWS Console URL
AWS Console URL written to /home/coder/.aws/aws_console_url
Run count incremented to: 6
Run count written to /home/coder/vocapi_call_run_count
Iteration 5 completed at Fri Mar 28 12:25:03 AM UTC 2025
Iteration 6 of 8 started at Fri Mar 28 12:25:03 AM UTC 2025
Sleeping for 900 seconds before next iteration…
Running the vocapi_call_get_aws_console_url at Fri Mar 28 12:40:03 AM UTC 2025
Current run count: 6
Using LAB_SPACE_ID: qpnsrdijiima
START_LAB_DATA received: {'lab_data': {'partid': '3581682', 'assignmentid': '3581681', 'courseid': '62825'}}



vocapi response status: success
Extracted AWS Console URL
AWS Console URL written to /home/coder/.aws/aws_console_url
Run count incremented to: 7
Run count written to /home/coder/vocapi_call_run_count
Iteration 6 completed at Fri Mar 28 12:40:05 AM UTC 2025
Iteration 7 of 8 started at Fri Mar 28 12:40:05 AM UTC 2025
Sleeping for 900 seconds before next iteration…