Capstone Project Part 1 - ETL and Data Modeling: issue with schemas

For section 5.2 of the “C4_W4_Assignment_1” notebook, I have a problem with schemas and tables. I get some None schemas after running “%sql SHOW SCHEMAS FROM DATABASE dev”:

The following queries also fail afterwards: “%sql select * from deftunes_transform.songs limit 10” and “%sql select * from deftunes_transform.sessions limit 10”. Both raise “(psycopg2.errors.InternalError_) AwsClientException: EntityNotFoundException from glue - Entity Not Found”.

This usually happens when the Glue catalog objects for the lab weren’t created correctly, or when the ETL step that populates the deftunes_transform schema didn’t run to completion. When SHOW SCHEMAS returns None entries, the database itself is reachable (the connection and the query both work), but the corresponding Glue schema was never registered, so downstream queries fail with EntityNotFoundException.

A few things to check:

  1. Re-run the ETL notebook (Section 5.1) — the jobs there create the tables in Glue. If a job failed earlier, the schema may not exist at all.

  2. In your Spark/ETL logs, confirm that the songs and sessions DataFrames were written to the Glue catalog successfully. The exact writer call depends on the lab code, but it should look something like:

    .write.format("delta").option("mergeSchema", "true").saveAsTable(...)

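  3. Refresh your metadata. These statements are not available in every SQL engine, so if they raise a syntax error, skip this step:

    %sql REFRESH DATABASE dev;
    %sql REFRESH SCHEMA deftunes_transform;
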
  4. If the schemas still show as None, the fix is typically to reset the entire lab environment — this recreates the Glue folders, the S3 structure, and the Delta tables used in the assignment. Several students reported that partial workspace resets cause exactly this behavior.
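
Before resorting to a full reset, it can help to ask Glue directly which tables it has registered, since the EntityNotFoundException comes from Glue rather than from the SQL engine. The sketch below is only an illustration: the helper name missing_glue_tables is made up for this post, and it assumes the Glue database is also called deftunes_transform, which may not match how the lab names it. It only relies on the get_tables call of a boto3 Glue client.

```python
def missing_glue_tables(glue_client, database, expected_tables):
    """Return the expected table names that Glue does not list for `database`.

    `glue_client` is assumed to behave like boto3.client("glue").
    Returns None when the Glue database itself does not exist.
    (Hypothetical helper written for this thread, not part of the lab code.)
    """
    found = set()
    kwargs = {"DatabaseName": database}
    try:
        # get_tables is paginated: keep following NextToken until it is absent.
        while True:
            page = glue_client.get_tables(**kwargs)
            found.update(table["Name"] for table in page.get("TableList", []))
            token = page.get("NextToken")
            if not token:
                break
            kwargs["NextToken"] = token
    except glue_client.exceptions.EntityNotFoundException:
        # Same error class that surfaces in the failing %sql queries.
        return None
    return sorted(set(expected_tables) - found)


# Example usage (needs boto3 and AWS credentials for the lab account):
# import boto3
# glue = boto3.client("glue")
# print(missing_glue_tables(glue, "deftunes_transform", ["songs", "sessions"]))
```

A result of None would suggest the Glue database was never created, so the ETL step in Section 5.1 is the place to look. A non-empty list would suggest the database exists but one of the table writes failed.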

Once the environment is reset and the ETL steps re-executed, the schemas should appear normally and the queries should run without Glue errors. Let us know what you see after a fresh run.