C2W3 Assignment 1 Great Expectations yaml

After loading and initializing Great Expectations, which worked, I followed the steps to get the two bucket names to insert into the great_expectations.yml file. When I opened the file, it looked like this:

# Welcome to Great Expectations! Always know what to expect from your data.
#
# Here you can define datasources, batch kwargs generators, integrations and
# more. This file is intended to be committed to your repo. For help with
# configuration please:
#   - Read our docs: Connect to a Data Source | Great Expectations
#   - Join our slack channel: Slack

# config_version refers to the syntactic version of this config file, and is used in maintaining backwards compatibility
# It is auto-generated and usually does not need to be changed.
config_version: 3

# Datasources tell Great Expectations where your data lives and how to get it.
# Read more at Connect to a Data Source | Great Expectations
datasources: {}

# This config file supports variable substitution which enables: 1) keeping
# secrets out of source control & 2) environment-based configuration changes
# such as staging vs prod.
#
# When GX encounters substitution syntax (like `my_key: ${my_value}` or
# `my_key: $my_value`) in the great_expectations.yml file, it will attempt
# to replace the value of `my_key` with the value from an environment
# variable `my_value` or a corresponding key read from this config file,
# which is defined through the `config_variables_file_path`.
# Environment variables take precedence over variables defined here.
#
# Substitution values defined here can be a simple (non-nested) value,
# nested value such as a dictionary, or an environment variable (i.e. ${ENV_VAR})
#
# Configure credentials | Great Expectations
config_variables_file_path: uncommitted/config_variables.yml

# The plugins_directory will be added to your python path for custom modules
# used to override and extend Great Expectations.
plugins_directory: plugins/

stores:
  # Stores are configurable places to store things like Expectations, Validations
  # Data Docs, and more. These are for advanced users only - most users can simply
  # leave this section alone.
  #
  # Three stores are required: expectations, validations, and
  # evaluation_parameters, and must exist with a valid store entry. Additional
  # stores can be configured for uses such as data_docs, etc.
  expectations_store:
    class_name: ExpectationsStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      base_directory: expectations/

  validations_store:
    class_name: ValidationsStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      base_directory: uncommitted/validations/

  evaluation_parameter_store:
    # Evaluation Parameters enable dynamic expectations. Read more here:
    # Introduction to GX Core | Great Expectations
    class_name: EvaluationParameterStore

  checkpoint_store:
    class_name: CheckpointStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      suppress_store_backend_id: true
      base_directory: checkpoints/

  profiler_store:
    class_name: ProfilerStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      suppress_store_backend_id: true
      base_directory: profilers/

expectations_store_name: expectations_store
validations_store_name: validations_store
evaluation_parameter_store_name: evaluation_parameter_store
checkpoint_store_name: checkpoint_store

data_docs_sites:
  # Data Docs make it simple to visualize data quality in your project. These
  # include Expectations, Validations & Profiles. The are built for all
  # Datasources from JSON artifacts in the local repo including validations &
  # profiles from the uncommitted directory. Read more at Data Docs | Great Expectations
  local_site:
    class_name: SiteBuilder
    # set to false to hide how-to buttons in Data Docs
    show_how_to_buttons: true
    store_backend:
      class_name: TupleFilesystemStoreBackend
      base_directory: uncommitted/data_docs/local_site/
    site_index_builder:
      class_name: DefaultSiteIndexBuilder

anonymous_usage_statistics:
  enabled: True

It does not contain the bucket replacement tags mentioned in the instructions. How should I proceed? Should I just update the yaml to exactly match the instructions plus the actual bucket names?

Tim Hayes

Hello @thazed,

You need to edit the great_expectations.yml file as instructed for the expectations, validations, and checkpoint stores, and finally the S3 site.
First replace each old TupleFilesystemStoreBackend block with the TupleS3StoreBackend code provided in the instructions. Then replace the placeholder bucket names in that code with the actual names of your two buckets. Hope it helps.
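
For illustration only, here is a rough sketch of what one store and the S3 site might look like after the swap. It is not the assignment's exact code, so use the blocks from the instructions where they differ. The names `<artifacts-bucket-name>` and `<docs-bucket-name>`, the `prefix` value, and the `s3_site` key are placeholders I made up for this example; which of your two buckets goes where is whatever the instructions say.

```yaml
stores:
  expectations_store:
    class_name: ExpectationsStore
    store_backend:
      # Was: TupleFilesystemStoreBackend with base_directory: expectations/
      class_name: TupleS3StoreBackend
      # Hypothetical placeholder: put one of your actual bucket names here.
      bucket: <artifacts-bucket-name>
      # Assumed folder inside the bucket; follow the instructions if they give one.
      prefix: expectations/

data_docs_sites:
  # Hypothetical site name; keep the name used in the instructions.
  s3_site:
    class_name: SiteBuilder
    store_backend:
      class_name: TupleS3StoreBackend
      # Hypothetical placeholder: put your other actual bucket name here.
      bucket: <docs-bucket-name>
    site_index_builder:
      class_name: DefaultSiteIndexBuilder
```

The validations and checkpoint stores would follow the same pattern as the expectations store above: keep the store's own `class_name`, and change only the `store_backend` section.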