Hello @mizou,
As Amir pointed out, we are not allowed to share solutions. If you run into a specific problem in a lab, describe it in detail and everyone will be happy to help you find the solution on your own.
C2W3A1: the initial syntax of the file `great_expectations.yml`, which is generated by the command `great_expectations init`, does not match the supplied lab instructions: the keyword `bucket` is not recognized. Also, the key `prefix` no longer seems to be used; it looks like it has been replaced by `base_directory`.
The command `great_expectations store list` fails with the following error(s):
(jupyterlab-venv) voclabs:~/environment $ great_expectations store list
Traceback (most recent call last):
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/util.py", line 98, in instantiate_class_from_config
class_instance = class_(**config_with_defaults)
TypeError: __init__() got an unexpected keyword argument 'bucket'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/util.py", line 98, in instantiate_class_from_config
class_instance = class_(**config_with_defaults)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/store/expectations_store.py", line 151, in __init__
super().__init__(
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/store/store.py", line 87, in __init__
self._store_backend = instantiate_class_from_config(
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/util.py", line 100, in instantiate_class_from_config
raise TypeError(
TypeError: Couldn't instantiate class: TupleFilesystemStoreBackend with config:
store_name expectations_store
bucket de-c2w3a1-546183455181-us-east-1-gx-artifacts
base_directory expectations/
manually_initialize_store_backend_id d7853c25-2df6-475f-8963-8ef82548e6b4
filepath_suffix .json
root_directory /home/ec2-user/environment/gx
__init__() got an unexpected keyword argument 'bucket'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ec2-user/environment/jupyterlab-venv/bin/great_expectations", line 8, in <module>
sys.exit(main())
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/cli/cli.py", line 146, in main
cli()
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/click/core.py", line 1685, in invoke
super().invoke(ctx)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/click/decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/cli/store.py", line 13, in store
ctx.obj.data_context = ctx.obj.get_data_context_from_config_file()
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/cli/cli.py", line 43, in get_data_context_from_config_file
context: FileDataContext = toolkit.load_data_context_with_error_handling( # type: ignore[assignment] # will exit if error
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/cli/toolkit.py", line 420, in load_data_context_with_error_handling
context = get_context(context_root_dir=directory)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/data_context/context_factory.py", line 263, in get_context
context = _get_context(**kwargs)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/data_context/context_factory.py", line 302, in _get_context
file_context = _get_file_context(
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/data_context/context_factory.py", line 383, in _get_file_context
return FileDataContext(
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/data_context/file_data_context.py", line 67, in __init__
super().__init__(
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/data_context/serializable_data_context.py", line 68, in __init__
super().__init__(runtime_environment=runtime_environment)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/core/usage_statistics/usage_statistics.py", line 266, in usage_statistics_wrapped_method
result = func(*args, **kwargs)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/data_context/abstract_data_context.py", line 301, in __init__
self._init_primary_stores(self.project_config_with_variables_substituted.stores)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/data_context/abstract_data_context.py", line 4399, in _init_primary_stores
self._build_store_from_config(store_name, store_config)
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/data_context/abstract_data_context.py", line 4332, in _build_store_from_config
new_store = Store.build_store_from_config(
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/store/store.py", line 319, in build_store_from_config
new_store = instantiate_class_from_config(
File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/util.py", line 100, in instantiate_class_from_config
raise TypeError(
TypeError: Couldn't instantiate class: ExpectationsStore with config:
store_name expectations_store
store_backend {'class_name': 'TupleFilesystemStoreBackend', 'bucket': 'de-c2w3a1-546183455181-us-east-1-gx-artifacts', 'base_directory': 'expectations/', 'manually_initialize_store_backend_id': 'd7853c25-2df6-475f-8963-8ef82548e6b4', 'filepath_suffix': '.json'}
runtime_environment {'root_directory': '/home/ec2-user/environment/gx'}
Couldn't instantiate class: TupleFilesystemStoreBackend with config:
store_name expectations_store
bucket de-c2w3a1-546183455181-us-east-1-gx-artifacts
base_directory expectations/
manually_initialize_store_backend_id d7853c25-2df6-475f-8963-8ef82548e6b4
filepath_suffix .json
root_directory /home/ec2-user/environment/gx
__init__() got an unexpected keyword argument 'bucket'
(jupyterlab-venv) voclabs:~/environment $
The content of the great_expectations.yml file is shared below:
# Welcome to Great Expectations! Always know what to expect from your data.
#
# Here you can define datasources, batch kwargs generators, integrations and
# more. This file is intended to be committed to your repo. For help with
# configuration please:
# - Read our docs: https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/connect_to_data_overview/#2-configure-your-datasource
# - Join our slack channel: http://greatexpectations.io/slack
# config_version refers to the syntactic version of this config file, and is used in maintaining backwards compatibility
# It is auto-generated and usually does not need to be changed.
config_version: 3
# Datasources tell Great Expectations where your data lives and how to get it.
# Read more at https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/connect_to_data_overview
datasources: {}
# This config file supports variable substitution which enables: 1) keeping
# secrets out of source control & 2) environment-based configuration changes
# such as staging vs prod.
#
# When GX encounters substitution syntax (like `my_key: ${my_value}` or
# `my_key: $my_value`) in the great_expectations.yml file, it will attempt
# to replace the value of `my_key` with the value from an environment
# variable `my_value` or a corresponding key read from this config file,
# which is defined through the `config_variables_file_path`.
# Environment variables take precedence over variables defined here.
#
# Substitution values defined here can be a simple (non-nested) value,
# nested value such as a dictionary, or an environment variable (i.e. ${ENV_VAR})
#
#
# https://docs.greatexpectations.io/docs/guides/setup/configuring_data_contexts/how_to_configure_credentials
config_variables_file_path: uncommitted/config_variables.yml
# The plugins_directory will be added to your python path for custom modules
# used to override and extend Great Expectations.
plugins_directory: plugins/
stores:
# Stores are configurable places to store things like Expectations, Validations
# Data Docs, and more. These are for advanced users only - most users can simply
# leave this section alone.
#
# Three stores are required: expectations, validations, and
# evaluation_parameters, and must exist with a valid store entry. Additional
# stores can be configured for uses such as data_docs, etc.
  expectations_store:
    class_name: ExpectationsStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      bucket: de-c2w3a1-546183455181-us-east-1-gx-artifacts
      #prefix: expectations/
      base_directory: expectations/
  validations_store:
    class_name: ValidationsStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      bucket: de-c2w3a1-546183455181-us-east-1-gx-artifacts
      #prefix: validations/
      base_directory: uncommitted/validations/
  evaluation_parameter_store:
    # Evaluation Parameters enable dynamic expectations. Read more here:
    # https://docs.greatexpectations.io/docs/reference/evaluation_parameters/
    class_name: EvaluationParameterStore
  checkpoint_store:
    class_name: CheckpointStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      suppress_store_backend_id: true
      bucket: de-c2w3a1-546183455181-us-east-1-gx-artifacts
      #prefix: checkpoints/
      base_directory: checkpoints/
  profiler_store:
    class_name: ProfilerStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      suppress_store_backend_id: true
      base_directory: profilers/
expectations_store_name: expectations_store
validations_store_name: validations_store
evaluation_parameter_store_name: evaluation_parameter_store
checkpoint_store_name: checkpoint_store
data_docs_sites:
  # Data Docs make it simple to visualize data quality in your project. These
  # include Expectations, Validations & Profiles. The are built for all
  # Datasources from JSON artifacts in the local repo including validations &
  # profiles from the uncommitted directory. Read more at https://docs.greatexpectations.io/docs/terms/data_docs
  local_site:
    class_name: SiteBuilder
    # set to false to hide how-to buttons in Data Docs
    show_how_to_buttons: true
    store_backend:
      class_name: TupleFilesystemStoreBackend
      bucket: de-c2w3a1-546183455181-us-east-1-gx-docs
      base_directory: uncommitted/data_docs/local_site/
    site_index_builder:
      class_name: DefaultSiteIndexBuilder
anonymous_usage_statistics:
  enabled: True
CONCLUSION: I am blocked and cannot complete this lab/assignment. Please share the appropriate configuration changes, which do not appear to be covered by the lab notebook. Thanks.
@stodic @mizou Thank you for your questions. When you run the command "great_expectations init", the stores for validations, expectations and checkpoints are configured as local files by default. This is why, when you open the yml file great_expectations.yml, you will see, for example, under "expectations_store" that the backend is of type file system ("TupleFilesystemStoreBackend"), and for that file system you need to define the "base_directory".
What you are asked to do in the question is to update the configuration of the backend store: instead of using a local store for the validations, expectations and checkpoints stores, you need to define the backend stores as an "S3 bucket". For that, you need to completely change the block under store_backend.
So class_name should be TupleS3StoreBackend, then you need to specify the bucket, and finally you need to provide the prefix (because in S3 we don't have directories like in a regular file system; each object has a key that starts with a prefix).
You need to do that for expectations_store, validations_store and checkpoint_store.
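As a rough sketch of what that description amounts to, the backend block for the expectations store could look something like the following. The bucket value is a placeholder (use the artifacts bucket name from your own lab environment), and the prefix value is an assumption taken from the commented-out line in the file posted above; follow the lab notebook wherever it differs.

```yaml
stores:
  expectations_store:
    class_name: ExpectationsStore
    store_backend:
      # S3 backend instead of the default TupleFilesystemStoreBackend
      class_name: TupleS3StoreBackend
      # Placeholder: replace with your own bucket name
      bucket: <your-gx-artifacts-bucket>
      # S3 key prefix; takes the place of base_directory
      prefix: expectations/
```

If the class is left as TupleFilesystemStoreBackend, the bucket key is what produces the "unexpected keyword argument 'bucket'" error in the traceback above, since the filesystem backend is configured with base_directory instead.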
And finally for data_docs_sites, you would also need to change the block to how it is defined in the Jupyter notebook.
Hope that helps! Please let us know if you have any further questions.