I am getting the error below. On my first attempt the ETL job failed, so I rebooted the lab, and now terraform apply fails as well. Full terminal output:
coder@71c13ac84688:~/project/terraform$ terraform init
Initializing the backend…
Initializing modules…
Initializing provider plugins…
- Reusing previous version of hashicorp/aws from the dependency lock file
- Using previously-installed hashicorp/aws v5.98.0
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
coder@71c13ac84688:~/project/terraform$ terraform plan
module.etl.data.aws_iam_policy_document.glue_access_policy: Reading…
module.etl.data.aws_security_group.db_sg: Reading…
module.etl.data.aws_iam_policy_document.glue_base_policy: Reading…
module.etl.data.aws_caller_identity.current: Reading…
module.etl.data.aws_subnet.public_a: Reading…
module.etl.data.aws_iam_policy_document.glue_access_policy: Read complete after 0s [id=2526717434]
module.etl.data.aws_iam_policy_document.glue_base_policy: Read complete after 0s [id=3940084333]
module.etl.aws_iam_role.glue_role: Refreshing state… [id=de-c1w4-glue-role]
module.etl.data.aws_caller_identity.current: Read complete after 0s [id=590183988889]
module.etl.data.aws_security_group.db_sg: Read complete after 0s [id=sg-01c6de5b714f7242d]
module.etl.data.aws_subnet.public_a: Read complete after 0s [id=subnet-0b2d6ccd8a392b399]
module.etl.aws_glue_connection.rds_connection: Refreshing state… [id=590183988889:de-c1w4-rds-connection]
module.etl.aws_iam_role_policy.task_role_policy: Refreshing state… [id=de-c1w4-glue-role:de-c1w4-glue-role-policy]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create
Terraform will perform the following actions:
  # module.etl.aws_glue_catalog_database.ml_database will be created
  + resource "aws_glue_catalog_database" "ml_database" {
      + arn          = (known after apply)
      + catalog_id   = (known after apply)
      + description  = "Database for performing ml from OLTP data"
      + id           = (known after apply)
      + location_uri = (known after apply)
      + name         = "de-c1w4-ml-db"
      + tags_all     = (known after apply)
      + create_table_default_permission (known after apply)
    }
  # module.etl.aws_glue_crawler.s3_crawler will be created
  + resource "aws_glue_crawler" "s3_crawler" {
      + arn           = (known after apply)
      + database_name = "de-c1w4-ml-db"
      + id            = (known after apply)
      + name          = "de-c1w4-ml-db-crawler"
      + role          = "arn:aws:iam::590183988889:role/de-c1w4-glue-role"
      + tags_all      = (known after apply)
      + recrawl_policy {
          + recrawl_behavior = "CRAWL_NEW_FOLDERS_ONLY"
        }
      + s3_target {
          + path = "s3://de-c1w4-590183988889-us-east-1-datalake/gold"
        }
      + schema_change_policy {
          + delete_behavior = "LOG"
          + update_behavior = "LOG"
        }
    }
  # module.etl.aws_glue_job.etl_job will be created
  + resource "aws_glue_job" "etl_job" {
      + arn               = (known after apply)
      + connections       = [
          + "de-c1w4-rds-connection",
        ]
      + default_arguments = {
          + "--enable-job-insights" = "true"
          + "--glue_connection"     = "de-c1w4-rds-connection"
          + "--glue_database"       = "de-c1w4-ml-db"
          + "--job-language"        = "python"
          + "--target_path"         = "s3://de-c1w4-590183988889-us-east-1-datalake"
        }
      + glue_version      = "4.0"
      + id                = (known after apply)
      + max_capacity      = (known after apply)
      + name              = "de-c1w4-etl-job"
      + number_of_workers = 2
      + role_arn          = "arn:aws:iam::590183988889:role/de-c1w4-glue-role"
      + tags_all          = (known after apply)
      + timeout           = 5
      + worker_type       = "G.1X"
      + command {
          + name            = "glueetl"
          + python_version  = "3"
          + runtime         = (known after apply)
          + script_location = "s3://de-c1w4-590183988889-us-east-1-scripts/de-c1w4-etl-job.py"
        }
      + execution_property (known after apply)
      + notification_property (known after apply)
    }
Plan: 3 to add, 0 to change, 0 to destroy.
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if you run "terraform apply" now.
coder@71c13ac84688:~/project/terraform$ terraform apply
module.etl.data.aws_security_group.db_sg: Reading…
module.etl.data.aws_iam_policy_document.glue_base_policy: Reading…
module.etl.data.aws_subnet.public_a: Reading…
module.etl.data.aws_iam_policy_document.glue_access_policy: Reading…
module.etl.data.aws_caller_identity.current: Reading…
module.etl.data.aws_iam_policy_document.glue_base_policy: Read complete after 0s [id=3940084333]
module.etl.data.aws_iam_policy_document.glue_access_policy: Read complete after 0s [id=2526717434]
module.etl.aws_iam_role.glue_role: Refreshing state… [id=de-c1w4-glue-role]
module.etl.data.aws_caller_identity.current: Read complete after 0s [id=590183988889]
module.etl.data.aws_subnet.public_a: Read complete after 0s [id=subnet-0b2d6ccd8a392b399]
module.etl.data.aws_security_group.db_sg: Read complete after 0s [id=sg-01c6de5b714f7242d]
module.etl.aws_glue_connection.rds_connection: Refreshing state… [id=590183988889:de-c1w4-rds-connection]
module.etl.aws_iam_role_policy.task_role_policy: Refreshing state… [id=de-c1w4-glue-role:de-c1w4-glue-role-policy]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create
Terraform will perform the following actions:
  # module.etl.aws_glue_catalog_database.ml_database will be created
  + resource "aws_glue_catalog_database" "ml_database" {
      + arn          = (known after apply)
      + catalog_id   = (known after apply)
      + description  = "Database for performing ml from OLTP data"
      + id           = (known after apply)
      + location_uri = (known after apply)
      + name         = "de-c1w4-ml-db"
      + tags_all     = (known after apply)
      + create_table_default_permission (known after apply)
    }
  # module.etl.aws_glue_crawler.s3_crawler will be created
  + resource "aws_glue_crawler" "s3_crawler" {
      + arn           = (known after apply)
      + database_name = "de-c1w4-ml-db"
      + id            = (known after apply)
      + name          = "de-c1w4-ml-db-crawler"
      + role          = "arn:aws:iam::590183988889:role/de-c1w4-glue-role"
      + tags_all      = (known after apply)
      + recrawl_policy {
          + recrawl_behavior = "CRAWL_NEW_FOLDERS_ONLY"
        }
      + s3_target {
          + path = "s3://de-c1w4-590183988889-us-east-1-datalake/gold"
        }
      + schema_change_policy {
          + delete_behavior = "LOG"
          + update_behavior = "LOG"
        }
    }
  # module.etl.aws_glue_job.etl_job will be created
  + resource "aws_glue_job" "etl_job" {
      + arn               = (known after apply)
      + connections       = [
          + "de-c1w4-rds-connection",
        ]
      + default_arguments = {
          + "--enable-job-insights" = "true"
          + "--glue_connection"     = "de-c1w4-rds-connection"
          + "--glue_database"       = "de-c1w4-ml-db"
          + "--job-language"        = "python"
          + "--target_path"         = "s3://de-c1w4-590183988889-us-east-1-datalake"
        }
      + glue_version      = "4.0"
      + id                = (known after apply)
      + max_capacity      = (known after apply)
      + name              = "de-c1w4-etl-job"
      + number_of_workers = 2
      + role_arn          = "arn:aws:iam::590183988889:role/de-c1w4-glue-role"
      + tags_all          = (known after apply)
      + timeout           = 5
      + worker_type       = "G.1X"
      + command {
          + name            = "glueetl"
          + python_version  = "3"
          + runtime         = (known after apply)
          + script_location = "s3://de-c1w4-590183988889-us-east-1-scripts/de-c1w4-etl-job.py"
        }
      + execution_property (known after apply)
      + notification_property (known after apply)
    }
Plan: 3 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
module.etl.aws_glue_catalog_database.ml_database: Creating…
╷
│ Error: creating Glue Catalog Database (de-c1w4-ml-db): operation error Glue: CreateDatabase, https response error StatusCode: 400, RequestID: ee1195ed-4a9f-4c4b-bc54-0d444fe78103, AlreadyExistsException: Database already exists.
│
│ with module.etl.aws_glue_catalog_database.ml_database,
│ on modules/etl/glue.tf line 1, in resource "aws_glue_catalog_database" "ml_database":
│ 1: resource "aws_glue_catalog_database" "ml_database" {
│
╵
coder@71c13ac84688:~/project/terraform$ aws glue start-job-run --job-name de-c1w4-etl-job | jq -r '.JobRunId'
An error occurred (EntityNotFoundException) when calling the StartJobRun operation: Failed to start job run due to missing metadata.
coder@71c13ac84688:~/project/terraform$
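Reading the log, the database de-c1w4-ml-db seems to have survived the lab reboot while the Terraform state no longer tracks it, so the apply stops before the crawler and the job are created. That would also explain why start-job-run cannot find the job. One possible way out is to import the existing database into the state and apply again. The sketch below only prints the import command for review and does not run it. The helper names are hypothetical, and it assumes the Glue catalog ID is the AWS account ID shown in the log (the default) and that the provider's catalog_id:name import format applies.

```shell
#!/usr/bin/env bash
# Sketch only: prints the command instead of executing it, so nothing is changed.
# All values are copied from the log above.
ACCOUNT_ID="590183988889"   # assumed to be the Glue catalog ID (the default)
DB_NAME="de-c1w4-ml-db"
TF_ADDRESS="module.etl.aws_glue_catalog_database.ml_database"

# Hypothetical helper: build an import ID of the form <catalog_id>:<name>.
glue_db_import_id() {
  printf '%s:%s\n' "$1" "$2"
}

# Hypothetical helper: build (but do not run) the terraform import command.
build_import_cmd() {
  printf 'terraform import %s %s\n' "$TF_ADDRESS" "$(glue_db_import_id "$ACCOUNT_ID" "$DB_NAME")"
}

build_import_cmd
```

If the printed command looks right, running it and then terraform apply should leave only the crawler and the job to create. Deleting the leftover database with aws glue delete-database --name de-c1w4-ml-db and re-applying is an alternative, provided nothing else in the lab depends on it.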