C4W4 Capstone Part 2 Issues - Insufficient Lake Formation permission(s) Glue Jobs

I am having a problem running two Glue jobs, ‘glue_json_transformation_job’ and ‘glue_songs_transformation_job’. Both jobs try to write data to the database ‘de_c4w4a2_silver_db’, and both fail with the error ‘AnalysisException: Insufficient Lake Formation permission(s) on de_c4w4a2_silver_db (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException; Request ID: e15fa4db-2dcc-4361-bd0f-e9e884cc87ef; Proxy: null)’. I tried to grant the IAM role of the two Glue jobs access to this database, but I can’t; apparently only an admin account can do this. The three jobs in the previous stage, glue_api_users_extract_job, glue_sessions_users_extract_job and glue_rds_extract_job, all ran successfully. I filled out the support form once and received an email saying that my issue was resolved. However, the error still occurs when I restart the project. I really hope this can be resolved for good, because I have been stuck on it for 2 days.

Hello @pmk209,

Sorry for the inconvenience. It looks like the lab refresh didn’t resolve your issue. Perhaps a staff member, @hawraa.salami, could look into this. In the meantime, feel free to submit the form again. I hope this helps.

I have resubmitted the form. Thanks for your reply.

@hawraa.salami I can’t work on Capstone Project Part 2 of C4W4. It has been nearly impossible to do the labs with all the problems that arise. I can’t run terraform apply because many issues arise with the Glue policies and Redshift schemas. The lab needs maintenance.

@hawraa.salami This is the error, and after it I can’t run the Glue jobs:

Error: creating Glue Connection (de-c4w4a2-connection-rds): operation error Glue: CreateConnection, https response error StatusCode: 400, RequestID: 9d8d89e3-8b24-4e58-b83d-5be9e0877641, AlreadyExistsException: Connection already exists.

  with module.extract_job.aws_glue_connection.rds_connection,
  on modules/extract_job/glue.tf line 2, in resource "aws_glue_connection" "rds_connection":
   2: resource "aws_glue_connection" "rds_connection" {


Error: creating IAM Role (de-c4w4a2-glue-role): operation error IAM: CreateRole, https response error StatusCode: 409, RequestID: 855b9fa7-fee3-4be4-ae31-b76435cec20b, EntityAlreadyExists: Role with name de-c4w4a2-glue-role already exists.

  with module.extract_job.aws_iam_role.glue_role,
  on modules/extract_job/iam.tf line 1, in resource "aws_iam_role" "glue_role":
   1: resource "aws_iam_role" "glue_role" {


Error: pq: Schema "deftunes_serving" already exists

  with module.serving.redshift_schema.serving_schema,
  on modules/serving/redshift.tf line 1, in resource "redshift_schema" "serving_schema":
   1: resource "redshift_schema" "serving_schema" {


Error: pq: Schema "deftunes_transform" already exists

  with module.serving.redshift_schema.transform_external_from_glue_data_catalog,
  on modules/serving/redshift.tf line 9, in resource "redshift_schema" "transform_external_from_glue_data_catalog":
   9: resource "redshift_schema" "transform_external_from_glue_data_catalog" {

This error occurs because you ran ‘terraform apply’ after the lab server was rebooted without running terraform destroy first, so the resources from the earlier run still exist. You need to delete them manually using AWS CLI commands. For example, for an existing connection, you can use ‘aws glue delete-connection --connection-name <connection-name>’.
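
As a rough sketch of the manual cleanup described above, the helpers below wrap the AWS CLI calls for the two AWS-side leftovers named in the errors (the Glue connection and the IAM role). The function names are hypothetical, and this assumes your lab user is allowed to run these IAM and Glue calls, which may not be the case. The Redshift schemas are not covered here.

```shell
# Sketch only: helper names are hypothetical; the aws subcommands are standard.

delete_stale_glue_connection() {
  # $1: Glue connection name, e.g. de-c4w4a2-connection-rds
  aws glue delete-connection --connection-name "$1"
}

delete_stale_iam_role() {
  # $1: IAM role name, e.g. de-c4w4a2-glue-role
  role="$1"
  # Managed policies have to be detached before the role can be deleted.
  for arn in $(aws iam list-attached-role-policies --role-name "$role" \
      --query 'AttachedPolicies[].PolicyArn' --output text); do
    aws iam detach-role-policy --role-name "$role" --policy-arn "$arn"
  done
  # Inline policies have to be removed as well.
  for name in $(aws iam list-role-policies --role-name "$role" \
      --query 'PolicyNames[]' --output text); do
    aws iam delete-role-policy --role-name "$role" --policy-name "$name"
  done
  aws iam delete-role --role-name "$role"
}

# Example calls, using the names from the error output:
#   delete_stale_glue_connection de-c4w4a2-connection-rds
#   delete_stale_iam_role de-c4w4a2-glue-role
```

After the leftovers are gone, running terraform apply again should be able to recreate them.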

@pmk209 Thank you, I did that, but now I can’t run the Glue jobs:

aws glue start-job-run --job-name glue_api_users_extract_job | jq -r '.JobRunId'

An error occurred (AccessDeniedException) when calling the StartJobRun operation: User: arn:aws:sts::339712957310:assumed-role/voclabs/user3823079=dyuqfeojgjei is not authorized to perform: glue:StartJobRun on resource: arn:aws:glue:us-east-1:339712957310:job/glue_api_users_extract_job because no identity-based policy allows the glue:StartJobRun action

Hello @floojeda,

After you run terraform apply, the terminal should show an output like this: glue_api_users_extract_job = "de-c4w4a1-api-users-extract-job"
Could you try using that job name in the command instead:
aws glue start-job-run --job-name de-c4w4a1-api-users-extract-job | jq -r '.JobRunId'
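
To avoid copying the job name by hand, a small sketch like the one below reads it from the Terraform outputs and passes it to the AWS CLI. The function name start_glue_job is hypothetical, and it assumes you run it from the directory where terraform apply was executed and that the output key is the one printed in your terminal.

```shell
# Sketch only: start_glue_job is a hypothetical helper name.
start_glue_job() {
  # $1: Terraform output key, e.g. glue_api_users_extract_job
  job_name=$(terraform output -raw "$1") || return 1
  # --query/--output text prints just the run ID, so jq is not needed.
  aws glue start-job-run --job-name "$job_name" \
    --query 'JobRunId' --output text
}

# Example call:
#   start_glue_job glue_api_users_extract_job
```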

@Georgios Oh, thank you for your quick response! It worked!

Hello, I’m having the same issue:
AnalysisException: Unable to verify existence of default database: com.amazonaws.services.glue.model.AccessDeniedException: Insufficient Lake Formation permission(s) on default (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException; Request ID: 5ca24ae5-faa4-4b49-8690-34097f1d6d5a; Proxy: null)

I’m pretty sure I’ve followed all the steps correctly. I rebooted the lab 3 times, but this happens every time I run the transform jobs.

The 3 extract Jobs worked correctly… I was hoping to finish the capstone project by tomorrow since I believe that my license will then expire and I had no issues with previous labs before… I will fill out the form @Georgios shared, but I think they take a couple of business days to respond?

Thanks in advance

Hello @hawraa.salami,

It seems @AndresCathalifaud has submitted the form because of the Insufficient Lake Formation permissions error. Since the lab refresh takes 1-2 business days, is there anything he can do so that he can finish tomorrow? Thanks in advance.

I forgot to mention: this thread, created by pmk209, is about Part 2 of the Capstone project, but I’m having issues with Part 1. I read his post and it’s basically the same problem, though.

@AndresCathalifaud Your account has been refreshed. Could you please try again and let us know if you encounter other issues.

cc: @Mubsi

Thank you so much, I’ll go ahead and try again right now !

Let’s gooo, I managed to get through the transform section that was not working before. Thank you!

Now I’m having the same issue with Part 2 of the Capstone Project. I think the issue started after trying to run
terraform apply -target=module.serving
for the first time… The session got stuck, crashed, and now the console crashes after any terraform apply instruction I give it

Submitted a new form to restart the lab.
Thanks in advance

I managed to solve my issue (my last problem was with terraform apply -target=data_quality) by reading the error log generated by the apply command. It showed conflicts with resources that had been created previously. I went to AWS and manually deleted a database, a role and a connection, then rebooted and tried again. This time the terraform apply for data_quality finished correctly and I was able to run the Airflow tasks.
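
For anyone doing the same log reading, here is a minimal sketch that pulls the "already exists" errors out of saved Terraform output and pairs each one with the Terraform resource address from the following "with ..." line. The function name list_already_existing and the log file name apply.log are hypothetical; it assumes the log has the same shape as the errors posted earlier in this thread.

```shell
# Sketch only: list_already_existing is a hypothetical helper name.
# Reads Terraform output on stdin and prints one line per conflict:
#   <terraform resource address><TAB><error message>
list_already_existing() {
  awk '
    /Error: .*already exists/ { err = $0; next }
    err != "" && /^[^A-Za-z]*with / {
      sub(/^[^A-Za-z]*with /, "")   # drop the leading "with "
      sub(/,$/, "")                 # drop the trailing comma
      printf "%s\t%s\n", $0, err
      err = ""
    }
  '
}

# Example usage (apply.log is a hypothetical file name):
#   terraform apply -no-color 2>&1 | tee apply.log
#   list_already_existing < apply.log
```

Each printed line names a resource that has to be removed manually before the next terraform apply.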

Despite following the steps in the correct order, I’m also facing the same problem at step 2.5 of the C4W4_Part2 project.

Both glue_json_transformation_job and glue_songs_transformation_job are encountering the following error.
AnalysisException: Insufficient Lake Formation permission(s) on de_c4w4a2_silver_db (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException; Request ID: f61dd05a-181a-4489-bd0f-d8bd89261d50; Proxy: null)

Moreover, I’m unable to fill out the lab refresh form due to the following issue:

“Data Engineering - Lab Issue Question
The form Data Engineering - Lab Issue Question is no longer accepting responses.
Try contacting the owner of the form if you think this is a mistake.”

Kindly help.

Hello @pawanshirbhate,

Please fill out the lab refresh form; this is the correct link. The other link is outdated. The refresh should take 1-2 business days, since it is a manual process performed by the engineers. Thank you.

Thanks. I have submitted the form. What preventive measures should I take to avoid this issue in future attempts?