C4W4 Capstone Part 2

I get these errors when trying to run the Glue jobs, after setting up the architecture with Terraform.

Hello @Francisco_Machado,

It seems that in the terminal you should get:
glue_api_users_extract_job = "de-c4w4a2-api-users-extract-job"

To run this Glue job, you would need to use this command:
aws glue start-job-run --job-name de-c4w4a2-api-users-extract-job | jq -r '.JobRunId'

Thank you
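For anyone who wants to see whether the run actually succeeds, here is a minimal sketch that starts the job and polls its state. It assumes the AWS CLI is already configured for the lab account; the function name `run_glue_job` and the 30-second default polling interval are my own choices, not part of the lab. It uses the CLI's built-in `--query` option instead of `jq`.

```shell
#!/usr/bin/env bash
# Sketch: start a Glue job run and wait until it reaches a final state.
# run_glue_job and the polling interval are illustrative, not from the lab.

run_glue_job() {
  local job_name="$1"
  local poll_seconds="${2:-30}"   # assumed default, adjust as needed
  local run_id state

  # --query/--output text extracts the run ID without needing jq
  run_id=$(aws glue start-job-run --job-name "$job_name" \
             --query 'JobRunId' --output text) || return 1
  echo "Started $job_name, run ID: $run_id"

  while true; do
    state=$(aws glue get-job-run --job-name "$job_name" --run-id "$run_id" \
              --query 'JobRun.JobRunState' --output text) || return 1
    echo "State: $state"
    case "$state" in
      SUCCEEDED) return 0 ;;
      FAILED|STOPPED|TIMEOUT|ERROR|EXPIRED) return 1 ;;
    esac
    sleep "$poll_seconds"
  done
}
```

Usage would look like `run_glue_job de-c4w4a2-api-users-extract-job`. The function returns a non-zero status if the run ends in any state other than SUCCEEDED.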


Hello @Georgios,

I have a problem running Terraform. It always fails with the message "AlreadyExistsException: Database already exists". I had the same problem with the first capstone project.

Do you have any ideas for how to fix this, please?

Thanks

Hello @abedelb,

If any of your jobs fails with "Insufficient Lake Formation Permissions", you should complete this form and wait two business days. The "Database already exists" error seems to be fixed by a lab refresh. Hope it helps.

@Georgios I have refreshed it 10 times and it still doesn't work!

Hello @floojeda,

I understand your frustration, and I am sorry for the inconvenience. However, if you get "Insufficient Lake Formation Permissions" errors after running your jobs, the error will persist no matter how many times you restart the lab session (even 10). The solution is to fill in this form; it should take 2-3 business days (weekends excluded) for your lab to be refreshed. I hope it helps.

But my issue is with the Glue policy.

@Georgios This is the error, and after it I can't run the Glue jobs:


Error: creating Glue Connection (de-c4w4a2-connection-rds): operation error Glue: CreateConnection, https response error StatusCode: 400, RequestID: 9d8d89e3-8b24-4e58-b83d-5be9e0877641, AlreadyExistsException: Connection already exists.

  with module.extract_job.aws_glue_connection.rds_connection,
  on modules/extract_job/glue.tf line 2, in resource "aws_glue_connection" "rds_connection":
   2: resource "aws_glue_connection" "rds_connection" {


Error: creating IAM Role (de-c4w4a2-glue-role): operation error IAM: CreateRole, https response error StatusCode: 409, RequestID: 855b9fa7-fee3-4be4-ae31-b76435cec20b, EntityAlreadyExists: Role with name de-c4w4a2-glue-role already exists.

  with module.extract_job.aws_iam_role.glue_role,
  on modules/extract_job/iam.tf line 1, in resource "aws_iam_role" "glue_role":
   1: resource "aws_iam_role" "glue_role" {


Error: pq: Schema "deftunes_serving" already exists

  with module.serving.redshift_schema.serving_schema,
  on modules/serving/redshift.tf line 1, in resource "redshift_schema" "serving_schema":
   1: resource "redshift_schema" "serving_schema" {


Error: pq: Schema "deftunes_transform" already exists

  with module.serving.redshift_schema.transform_external_from_glue_data_catalog,
  on modules/serving/redshift.tf line 9, in resource "redshift_schema" "transform_external_from_glue_data_catalog":
   9: resource "redshift_schema" "transform_external_from_glue_data_catalog" {
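The errors above say that the Glue connection and the IAM role already exist in the account, which usually means they were created in an earlier session and are not tracked in the current Terraform state. As a rough diagnostic sketch (not the fix suggested in this thread), the following checks whether those two resources are present. The function names `report` and `check_leftovers` are made up for illustration; the resource names are taken from the error messages.

```shell
#!/usr/bin/env bash
# Sketch: check whether resources named in the Terraform errors already exist.
# report and check_leftovers are illustrative helper names.

# Run a lookup command quietly and report on its exit status.
report() {
  local label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "$label: already exists"
  else
    echo "$label: not found (or no permission to read it)"
  fi
}

check_leftovers() {
  report "Glue connection de-c4w4a2-connection-rds" \
    aws glue get-connection --name de-c4w4a2-connection-rds
  report "IAM role de-c4w4a2-glue-role" \
    aws iam get-role --role-name de-c4w4a2-glue-role
}
```

Running `check_leftovers` and then `terraform state list` from the terraform directory lets you compare what exists in AWS with what Terraform is tracking. This only reads information; it does not change or delete anything.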

Hello @floojeda,

Sorry for the inconvenience. I am not a moderator, so I can't track all your posts and can't be of much help. However, someone else had a similar error (post), and the other mentor gave the same solution. Hope it helps.

I am getting the following error for the ruleset:
╷
│ Error: creating Glue Data Quality Ruleset (songs_dq_ruleset): operation error Glue: CreateDataQualityRuleset, https response error StatusCode: 400, RequestID: 9f62c98b-4414-4ceb-bb1e-70a465951030, InvalidInputException: Entity Not Found (Service: AmazonDataCatalog; Status Code: 400; Error Code: EntityNotFoundException; Request ID: 3003dea7-c7d8-4542-98c3-47039ef1f3cd; Proxy: null)
│
│ with module.data_quality.aws_glue_data_quality_ruleset.songs_dq_ruleset,
│ on modules/data_quality/glue.tf line 1, in resource "aws_glue_data_quality_ruleset" "songs_dq_ruleset":
│ 1: resource "aws_glue_data_quality_ruleset" "songs_dq_ruleset" {
│
╵
╷
│ Error: creating Glue Data Quality Ruleset (sessions_dq_ruleset): operation error Glue: CreateDataQualityRuleset, https response error StatusCode: 400, RequestID: c48618ee-4611-4897-9e6d-296c0165d436, InvalidInputException: Entity Not Found (Service: AmazonDataCatalog; Status Code: 400; Error Code: EntityNotFoundException; Request ID: e48f4e0b-f9f6-409a-bfe9-3e544d975f35; Proxy: null)
│
│ with module.data_quality.aws_glue_data_quality_ruleset.sessions_dq_ruleset,
│ on modules/data_quality/glue.tf line 10, in resource "aws_glue_data_quality_ruleset" "sessions_dq_ruleset":
│ 10: resource "aws_glue_data_quality_ruleset" "sessions_dq_ruleset" {
│
╵
╷
│ Error: creating Glue Data Quality Ruleset (users_dq_ruleset): operation error Glue: CreateDataQualityRuleset, https response error StatusCode: 400, RequestID: aea7f8e1-c809-4a69-b86d-5afbcfc214c2, InvalidInputException: Entity Not Found (Service: AmazonDataCatalog; Status Code: 400; Error Code: EntityNotFoundException; Request ID: d1e62464-99de-4c27-97d8-8d1ad1907a8c; Proxy: null)
│
│ with module.data_quality.aws_glue_data_quality_ruleset.users_dq_ruleset,
│ on modules/data_quality/glue.tf line 19, in resource "aws_glue_data_quality_ruleset" "users_dq_ruleset":
│ 19: resource "aws_glue_data_quality_ruleset" "users_dq_ruleset" {
│
╵
abc@41c96fdce5f4:~/workspace/terraform$

Hello @dkovi2020,

Someone else had a similar issue (post). Could you try deleting the data quality rulesets from the AWS console? Hope it helps.
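If the console is awkward to use, the same deletion can probably be done from the lab terminal. This is only a sketch, assuming the AWS CLI is configured and your lab role is allowed to delete rulesets; the function name `delete_dq_rulesets` and the `DRY_RUN` switch are my own additions. The ruleset names are the ones shown in the error output.

```shell
#!/usr/bin/env bash
# Sketch: delete Glue Data Quality rulesets by name.
# delete_dq_rulesets and DRY_RUN are illustrative, not part of the lab.
# DRY_RUN defaults to 1, so nothing is deleted unless you set DRY_RUN=0.

delete_dq_rulesets() {
  local name
  for name in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would delete ruleset: $name"
    elif aws glue delete-data-quality-ruleset --name "$name"; then
      echo "deleted ruleset: $name"
    else
      echo "could not delete ruleset: $name" >&2
    fi
  done
}
```

For example, `delete_dq_rulesets songs_dq_ruleset sessions_dq_ruleset users_dq_ruleset` only prints what it would do, while prefixing the call with `DRY_RUN=0` performs the deletion. `aws glue list-data-quality-rulesets` shows which rulesets currently exist.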