C2W3 Assignment 3: Testing Data Quality with Great Expectations

Course 2 Week 3: I am doing the Great Expectations lab.
Despite completing all the steps up to checkpoint.run() (which fails; the steps before it give no errors), when I submit the assignment the grader says I did not do any of the steps up to checkpoint.run(). When I check the S3 buckets, the expectation suite, the S3 artifacts bucket, and the validation folders are all created, but they are all ignored. I do not know why my checkpoint run fails, but even allowing for that I should be getting 85/100, yet I only get 30/100. I have done the lab twice. An additional note: the YAML file parameters to be set up are incorrect; I was getting a constant error when I added the bucket parameter, because the class_name provided in the parameter file is incorrect. Could you please have a look at the exercise? I am unable to go forward because I keep failing this assignment. Thanks

Hello @AQ_2023, I couldn't reproduce any errors, especially in the YAML file. Sorry you tried twice with no success, but make sure to replace the correct stores with the two different bucket names. Could you post your submission report so we can check further? It should look like this:

Hello Georgios, thanks for your fast reply.

In the configuration (YAML) file, for example, the guide sets the expectations_store with

class_name: TupleFilesystemStoreBackend

and this apparently results in an error once the bucket parameter is set.
When I change class_name to this type:

class_name: TupleS3StoreBackend

and also make sure that

base_directory: expectations/

is changed to

prefix: expectations/

then the YAML file works; otherwise I would continuously get an error.
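
For reference, my expectations_store section ended up looking roughly like this (a sketch; the bucket name here is a placeholder, not the value used in the lab):

stores:
  expectations_store:
    class_name: ExpectationsStore
    store_backend:
      class_name: TupleS3StoreBackend
      bucket: <your-artifacts-bucket>  # placeholder
      prefix: expectations/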

[Executed at: Mon Sep 30 8:47:54 PDT 2024]

Test 1 passed: Created Cloud9 environment.

Test 2 failed: No graded exercises found in the submission notebook. Please try again.

Test 3 failed: No graded exercises found in the submission notebook. Please try again.

Test 4 failed: The expectation file does not exist.

Test 5 failed: No graded exercises found in the submission notebook. Please try again.

Test 6 passed: The checkpoint file exists with the correct content.

Test 7 failed: No folders found in S3 docs bucket. Please try again.

Hello, thanks for the information. Could you try copying the YAML to a notebook first, just in case? In the new file you need "TupleS3StoreBackend" and "prefix: expectations"; you only change the artifact bucket (in 3 places) and the docs bucket (in one place).

In case of an error you can then revert to the copy you made, so you won't have to repeat everything.

Hint: the blocks in the instructions should have the correct indentation, so check against the original file; it has to look identical. Finally, add the names of the buckets and change nothing else. Thanks
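
For reference, the docs bucket goes in the data_docs_sites section; in a typical Great Expectations YAML it looks roughly like this (a sketch with a placeholder bucket name, not the exact file from the lab):

data_docs_sites:
  s3_site:
    class_name: SiteBuilder
    store_backend:
      class_name: TupleS3StoreBackend
      bucket: <your-docs-bucket>  # placeholder
    site_index_builder:
      class_name: DefaultSiteIndexBuilder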

I have an issue with the solution I created for the validations: when it loops over the batches I get too much output.
This is the syntax I have used:

validations = [
    {"batch_request": batches, "expectation_suite_name": "expectation_suite_name"}
    for batch in batches
]
validations

This is my output (abridged here: the full output repeats the same three Batch objects inside each of the three validation entries):

[{'batch_request': [Batch(datasource=SQLDatasource(type='sql', name='de-c2w3a1-db-datasource', id=None, assets=[TableAsset(name='de-c2w3a1-trips', type='table', id=None, order_by=[], batch_metadata={}, splitter=SplitterColumnValue(column_name='vendor_id', method_name='split_on_column_value'), table_name='trips', schema_name=None)], connection_string=ConfigStr('${MYSQL_CONNECTION_STRING}'), create_temp_table=False, kwargs={}), data_asset=TableAsset(name='de-c2w3a1-trips', type='table', id=None, order_by=[], batch_metadata={}, splitter=SplitterColumnValue(column_name='vendor_id', method_name='split_on_column_value'), table_name='trips', schema_name=None), batch_request=BatchRequest(datasource_name='de-c2w3a1-db-datasource', data_asset_name='de-c2w3a1-trips', options={'vendor_id': 1}), data=<great_expectations.execution_engine.sqlalchemy_batch_data.SqlAlchemyBatchData object at 0x7fadd0861700>, id='de-c2w3a1-db-datasource-de-c2w3a1-trips-vendor_id_1', metadata={'vendor_id': 1}, batch_markers={'ge_load_time': '20240930T204247.156885Z'}, batch_spec={'type': 'table', 'data_asset_name': 'de-c2w3a1-trips', 'table_name': 'trips', 'schema_name': None, 'batch_identifiers': {'vendor_id': 1}, 'splitter_method': 'split_on_column_value', 'splitter_kwargs': {'column_name': 'vendor_id'}}, batch_definition={'datasource_name': 'de-c2w3a1-db-datasource', 'data_connector_name': 'fluent', 'data_asset_name': 'de-c2w3a1-trips', 'batch_identifiers': {'vendor_id': 1}}),
Batch(..., options={'vendor_id': 2}, ...),
Batch(..., options={'vendor_id': 4}, ...)],
'expectation_suite_name': 'expectation_suite_name'},
{'batch_request': [... the same three Batch objects ...], 'expectation_suite_name': 'expectation_suite_name'},
{'batch_request': [... the same three Batch objects ...], 'expectation_suite_name': 'expectation_suite_name'}]

Additionally, in the context exercise I am using the following code:

context.add_or_update_checkpoint(checkpoint_name)

and I get this output, which is less than what is expected from the assignment's sample output:

{
  "action_list": [
    {
      "name": "store_validation_result",
      "action": {
        "class_name": "StoreValidationResultAction"
      }
    },
    {
      "name": "store_evaluation_params",
      "action": {
        "class_name": "StoreEvaluationParametersAction"
      }
    },
    {
      "name": "update_data_docs",
      "action": {
        "class_name": "UpdateDataDocsAction"
      }
    }
  ],
  "batch_request": {},
  "class_name": "Checkpoint",
  "config_version": 1.0,
  "evaluation_parameters": {},
  "module_name": "great_expectations.checkpoint",
  "name": "de-c2w3a1-checkpoint-trips-1727729147.8883522",
  "profilers": [],
  "runtime_configuration": {},
  "validations": []
}

checkpoint_result = checkpoint.run()

results in an error, and I do not know what the reason for that is. Here is the error message:


TypeError                                 Traceback (most recent call last)
Cell In[18], line 1
----> 1 checkpoint_result = checkpoint.run()

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/core/usage_statistics/usage_statistics.py:266, in usage_statistics_enabled_method.<locals>.usage_statistics_wrapped_method(*args, **kwargs)
--> 266 result = func(*args, **kwargs)

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/checkpoint/checkpoint.py:265, in BaseCheckpoint.run(self, template_name, run_name_template, expectation_suite_name, batch_request, validator, action_list, evaluation_parameters, runtime_configuration, validations, profilers, run_id, run_name, run_time, result_format, expectation_suite_ge_cloud_id)
--> 265 substituted_runtime_config: dict = self.get_substituted_config(runtime_kwargs=runtime_kwargs)

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/checkpoint/checkpoint.py:370, in BaseCheckpoint.get_substituted_config(self, runtime_kwargs)
--> 370 config_kwargs: dict = self.get_config(mode=ConfigOutputModes.JSON_DICT)

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/core/config_peer.py:69, in ConfigPeer.get_config(self, mode, **kwargs)
--> 69 config_kwargs = config.to_json_dict()

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/types/base.py:3065, in CheckpointConfig.to_json_dict(self)
--> 3065 dict_obj: dict = self.to_dict()

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/types/__init__.py:137, in DictDot.to_dict(self)
--> 137 for key in self.property_names(include_keys=self.include_field_names, exclude_keys=self.exclude_field_names)

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/types/__init__.py:230, in DictDot.property_names(self, include_keys, exclude_keys)
--> 230 assert_valid_keys(keys=exclude_keys, purpose="exclusion")

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/types/__init__.py:212, in DictDot.property_names.<locals>.assert_valid_keys(keys, purpose)
--> 212 _ = self[name]

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/types/__init__.py:70, in DictDot.__getitem__(self, item)
--> 70 return getattr(self, item)

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/types/base.py:181, in BaseYamlConfig.commented_map(self)
--> 181 return self._get_schema_validated_updated_commented_map()

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/types/base.py:152, in BaseYamlConfig._get_schema_validated_updated_commented_map(self)
--> 152 schema_validated_map: dict = self._get_schema_instance().dump(self)

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/marshmallow/schema.py:547, in Schema.dump(self, obj, many)
--> 547 processed_obj = self._invoke_dump_processors(PRE_DUMP, obj, many=many, original_data=obj)

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/marshmallow/schema.py:1068, in Schema._invoke_dump_processors(self, tag, data, many, original_data)
--> 1068 data = self._invoke_processors(tag, pass_many=False, data=data, many=many, original_data=original_data)

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/marshmallow/schema.py:1222, in Schema._invoke_processors(self, tag, pass_many, data, many, original_data, **kwargs)
--> 1222 data = processor(data, many=many, **kwargs)

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/types/base.py:2767, in CheckpointConfigSchema.prepare_dump(self, data, **kwargs)
--> 2767 data = copy.deepcopy(data)

File /usr/lib64/python3.9/copy.py:153, in deepcopy(x, memo, _nil)
--> 153 y = copier(memo)

File ~/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/types/base.py:3045, in CheckpointConfig.__deepcopy__(self, memo)
--> 3045 value_copy = safe_deep_copy(data=value, memo=memo)

[... recursive frames through great_expectations.types.safe_deep_copy and /usr/lib64/python3.9/copy.py (deepcopy, _reconstruct, _deepcopy_dict, _deepcopy_list) elided ...]

File /usr/lib64/python3.9/copy.py:161, in deepcopy(x, memo, _nil)
    159 reductor = getattr(x, "__reduce_ex__", None)
    160 if reductor is not None:
--> 161 rv = reductor(4)

TypeError: cannot pickle '_thread.lock' object


Hello, I am not sure why the grader is not giving you any points, since I can complete the lab without any issues. Just try to make sure you understand how to edit that expectations YAML file. Hope it helps.

Even though I have the validation and other outputs in the S3 buckets, the grader ignores them, and I am really stuck.

Is the syntax for the validations correct? This is what I had for the validations:

validations = [
    {"batch_request": batches, "expectation_suite_name": "expectation_suite_name"}
    for batch in batches
]
validations

Could anyone please help? I received my grading, and even though I do not get any error when I do the expectation suite exercise, the grader gives me zero for it. I use this syntax:

# Add an expectation suite name to the context
expectation_suite_name = f"{LAB_PREFIX}-expectation-suite-trips-taxi-db"

### START CODE HERE ### (~ 1 line of code)

# None.None(expectation_suite_name=None)
context.add_or_update_expectation_suite(expectation_suite_name="expectation_suite_name")

### END CODE HERE ###

And later, when I create a checkpoint, I get no error using this syntax:

context.add_or_update_checkpoint(checkpoint_name)

but the output is completely different from what the assignment shows. This is my output from executing that syntax:

{
  "action_list": [
    {
      "name": "store_validation_result",
      "action": {
        "class_name": "StoreValidationResultAction"
      }
    },
    {
      "name": "store_evaluation_params",
      "action": {
        "class_name": "StoreEvaluationParametersAction"
      }
    },
    {
      "name": "update_data_docs",
      "action": {
        "class_name": "UpdateDataDocsAction"
      }
    }
  ],
  "batch_request": {},
  "class_name": "Checkpoint",
  "config_version": 1.0,
  "evaluation_parameters": {},
  "module_name": "great_expectations.checkpoint",
  "name": "de-c2w3a1-checkpoint-trips-1727729147.8883522",
  "profilers": [],
  "runtime_configuration": {},
  "validations": []
}

Could anyone please tell me if I have done anything wrong with the expectation suite and with the checkpoint? Any hints or tips would be highly appreciated. Big thanks in advance.

Hello @AQ_2023, in exercise 5 you are supposed to get an error because you use the whole batches instead of batch.batch_request as the parameter. (That would also explain the "cannot pickle '_thread.lock'" TypeError above: a Batch object carries a live database connection that the checkpoint config cannot deep-copy, while a BatchRequest is plain configuration data.) Try fixing your code and re-submit. A screenshot of your submission report is helpful:

Hi Georgios,
I tried to resolve the issue as you mentioned, and this is my submission report.

Thanks

Test 3 failed: Output item in ex02 is incorrect: {'expectation_suite_name': 'expectation_suite_name', 'ge_cloud_id': None, 'expectations': [], 'data_asset_type': None, 'meta': {'great_expectations_version': '0.18.9'}}. Please try again

Hello @AQ_2023, yes, exercise 2 looks good. Did you get any errors after creating the YAML file this time (the code is provided, just change the bucket names)? If you have a bug early on, you will keep reproducing errors even if the later steps are correct. Does the submission report show that exercise 1 is now correct?

Also, as I said before, keep a copy of the YAML file before you change the bucket names, so you don't have to start over, but only if you are sure it's correct; we don't want to reproduce the same bugs.

About exercise 5: you have made a mistake in "batch_request": None.None. Try adding batch.batch_request instead of the whole batches. See if it can pass the grader. If not, it might be like exercise 2: correct, but affected by earlier issues.
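
In other words, the list comprehension should end up looking something like this (a sketch; here expectation_suite_name stands for the variable holding your suite name, not a quoted string):

validations = [
    {"batch_request": batch.batch_request, "expectation_suite_name": expectation_suite_name}
    for batch in batches
]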

Feel free to ask for any hints if you are stuck. Thanks

Hi Georgios, no, there were no errors after creating the YAML file this time. Thank you for that.
For the batches I used the batch_request syntax.

For exercise 6 I used this syntax:

context.add_or_update_checkpoint(checkpoint_name)

but the output I get is much smaller than what the assignment shows for exercise 6:
{
  "action_list": [
    {
      "name": "store_validation_result",
      "action": {
        "class_name": "StoreValidationResultAction"
      }
    },
    {
      "name": "store_evaluation_params",
      "action": {
        "class_name": "StoreEvaluationParametersAction"
      }
    },
    {
      "name": "update_data_docs",
      "action": {
        "class_name": "UpdateDataDocsAction"
      }
    }
  ],
  "batch_request": {},
  "class_name": "Checkpoint",
  "config_version": 1.0,
  "evaluation_parameters": {},
  "module_name": "great_expectations.checkpoint",
  "name": "de-c2w3a1-checkpoint-trips-1727729147.8883522",
  "profilers": [],
  "runtime_configuration": {},
  "validations": []
}

Any tips on how I could fix exercise 6 so that I get the exact output specified in the assignment?
Thanks

Hello @AQ_2023, exercise 6 has the same logic as exercise 2. The difference is that you use the checkpoint instead of the expectation_suite. Just use the same context with add_or_update checkpoint, and in the parentheses pass checkpoint=checkpoint as well. Thanks

Exercise 6
I tried exactly this syntax, as you said:

### START CODE HERE ### (~ 1 line of code)

checkpoint.context.add_or_update_checkpoint(checkpoint=checkpoint)

### END CODE HERE ###

and I get this error:

AttributeError                            Traceback (most recent call last)
Cell In[19], line 4
      1 ### START CODE HERE ### (~ 1 line of code)
      2
      3 # None.None(checkpoint=None)
----> 4 checkpoint.context.add_or_update_checkpoint(checkpoint=checkpoint)
      6 ### END CODE HERE ###

AttributeError: 'Checkpoint' object has no attribute 'context'

I do not know how I can resolve this, as I followed exactly what you said.

Thanks

Also, when I ran this syntax:

checkpoint_result = checkpoint.run()

I get a huge error:

Error running action with name update_data_docs
Traceback (most recent call last):
  File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/util.py", line 98, in instantiate_class_from_config
    class_instance = class_(**config_with_defaults)
TypeError: __init__() got an unexpected keyword argument 'base_directory'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ec2-user/environment/jupyterlab-venv/lib64/python3.9/site-packages/great_expectations/data_context/util.py", line 98, in instantiate_class_from_config
    class_instance = class_(**config_with_defaults)

You can try removing the first checkpoint before context: just use context and keep the rest of the cell as it is. Hope it will pass exercise 6 now. Thanks
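
That is, the line should end up looking something like:

context.add_or_update_checkpoint(checkpoint=checkpoint)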


Hi Georgios,
Many thanks, it worked!
Thank you
