C4W4 Capstone project 2. Glue ETL jobs throwing access error

In the second capstone project, when I run the transform Glue ETL jobs, this error is thrown:

AnalysisException: Insufficient Lake Formation permission(s) on de_c4w4a2_transform_db (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException; Request ID: 1f805925-459a-469b-a88d-847d090dc658; Proxy: null)

I need help fixing this.

Hello @sahil251298
Sorry for the inconvenience. Can you please fill out the lab refresh form and try again after your lab is refreshed?

Hi Amir, I’m having the same permission problem, and I followed the link you provided. So now I should just wait, is that correct? Thank you.

Hello @rrosa
Yes. Please wait up to 3 business days and then try again.

Hi Amir,

I hope you’re doing well!

I’m not sure if I’ve done something wrong or missed a step, but when running the following command in the Visual Studio Code terminal as instructed in the Jupyter notebook:
aws glue start-job-run --job-name de-c4w4a2-json-transform-job

I encountered the same issue. In the AWS Console, under AWS Glue > ETL Jobs, the job run has a “Failed” status with the following exception message:

AnalysisException: Insufficient Lake Formation permission(s) on de_c4w4a2_transform_db 
(Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException; 
Request ID: f3e13364-62d8-4273-b96f-32154bc047b8; Proxy: null)

Additionally, I noticed that my running user in the AWS Console (top-right corner) is prefixed with voclabs/.

Could you please help me identify what might be missing so I can successfully complete the project? I’d greatly appreciate any guidance you can provide.

Thank you so much in advance! :blush:

Best regards,
Rui Rosa

Hello @rrosa
The Insufficient Lake Formation permission error can be fixed via a lab refresh. Please fill out this lab refresh form and try again after 3 working days.

Hi @Amir_Zare, I’m getting the following error after running the glue start-job-run command. Can I get help with this, please?
An error occurred (AccessDeniedException) when calling the StartJobRun operation: User: arn:aws:sts::471112906237:assumed-role/VSCodeInstanceRole/i-0353661b5af2b206e is not authorized to perform: glue:StartJobRun on resource: arn:aws:glue:us-east-1:471112906237:job/glue_api_users_extract_job because no identity-based policy allows the glue:StartJobRun action

Hello @Edmund_Koh
I guess you are using the wrong value for the job name in the aws glue start-job-run command. Please provide the full command you used so that we can check it.

Hey @Amir_Zare, I’ve attached my input:
[Screenshot 2025-02-12 at 3.16.48 PM]

@Edmund_Koh As I said, glue_api_user_extract_job is not the correct job name. You need to get the value of the output named glue_api_user_extract_job from terraform and use that value as the job name in the command.
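
For anyone hitting the same thing, here is a rough sketch of what that looks like in the terminal. It assumes you run it from the directory that holds the lab's Terraform state and that the output is a plain string; the function name `start_glue_job_from_output` is made up for illustration and is not part of the lab materials.

```shell
# Hypothetical helper: read the real Glue job name from a Terraform output,
# then pass that value (not the output's name) to start-job-run.
start_glue_job_from_output() {
    output_name="$1"

    # -raw prints the bare string value, without surrounding quotes
    job_name="$(terraform output -raw "$output_name")" || return 1

    if [ -z "$job_name" ]; then
        echo "Terraform output '$output_name' is empty" >&2
        return 1
    fi

    aws glue start-job-run --job-name "$job_name"
}

# Example call, using the output name mentioned above:
# start_glue_job_from_output glue_api_user_extract_job
```

The point is only that the name of the Terraform output and the name of the Glue job are two different strings; the job name is whatever value Terraform prints for that output.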

Hi Amir,

It took me a little while to retry, but I finally got around to it today, and everything ran smoothly! I just wanted to update you and sincerely thank you for your help—I really appreciate it!

Best regards,
Rui Rosa

Hello @Amir_Zare ,

I am getting a similar exception:
AnalysisException: Insufficient Lake Formation permission(s) on de_c4w4a2_silver_db (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException; Request ID: d5bbb718-dbe8-4026-ac11-e8e59703fe2c; Proxy: null)

I tried to use the lab refresh link above that you provided, but it tells me it is no longer accepting responses.

What should I do? Any ideas?

I found a link in this article that allowed me to submit a refresh. I’m standing by for that!

Hi @bchibbard
The link I added earlier in this thread is obsolete. You can use this link instead.

Many thanks, @Amir_Zare . They fixed my ticket and I’ve gotten past that.

I am now getting these in section 3.1.2:

Error: creating Glue Data Quality Ruleset (songs_dq_ruleset): operation error Glue: CreateDataQualityRuleset, https response error StatusCode: 400, RequestID: e74a4630-31e4-4a4a-87be-5d453217273d, InvalidInputException: Entity Not Found (Service: AmazonDataCatalog; Status Code: 400; Error Code: EntityNotFoundException; Request ID: d0df79ee-c457-4225-bddd-1306eef1245a; Proxy: null)

  with module.data_quality.aws_glue_data_quality_ruleset.songs_dq_ruleset,
  on modules/data_quality/glue.tf line 1, in resource "aws_glue_data_quality_ruleset" "songs_dq_ruleset":
   1: resource "aws_glue_data_quality_ruleset" "songs_dq_ruleset" {

It’s the same error for all three rulesets. The fact that the first ruleset, which we don’t need to modify, isn’t working seems suspicious.

Any ideas? Is it something I may have missed?

@bchibbard you are getting the Entity Not Found exception. Terraform is probably not finding the table songs or the database for which it is supposed to create the ruleset. My guess is that something went wrong in the previous steps. Please make sure that all the previous Glue jobs have run to completion and succeeded before moving on to this step. You can manually check that the database and table in question exist from the AWS Console.
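
If the terminal is more convenient than the Console, the same existence check can be sketched with the AWS CLI. This is only an illustration: `check_glue_table` is a made-up helper name, and the database name in the example call is an assumption, since the thread does not say which database is expected to hold the songs table. A failed lookup here can also mean missing permissions rather than a missing table, so the message says "not found or not accessible".

```shell
# Hypothetical helper: report whether a Glue Data Catalog database and table
# can be looked up with the current credentials.
check_glue_table() {
    db_name="$1"
    table_name="$2"

    # get-database exits non-zero if the database is missing or access is denied
    if ! aws glue get-database --name "$db_name" >/dev/null 2>&1; then
        echo "database $db_name: not found or not accessible"
        return 1
    fi

    # get-table exits non-zero if the table is missing or access is denied
    if ! aws glue get-table --database-name "$db_name" --name "$table_name" >/dev/null 2>&1; then
        echo "table $db_name.$table_name: not found or not accessible"
        return 1
    fi

    echo "table $db_name.$table_name: found"
}

# Example call (database name is an assumption, substitute your own):
# check_glue_table de_c4w4a2_silver_db songs
```

If either lookup fails, rerunning the earlier Glue jobs and confirming they reach a succeeded state is the first thing to try before applying the Terraform data quality module again.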