# Convert the dataframe to CSV, since slice_functions
# works only with tfdv.generate_statistics_from_csv
CSV_PATH = 'slice_sample.csv'
train_df.to_csv(CSV_PATH)
# Calculate statistics for the sliced dataset
sliced_stats = tfdv.generate_statistics_from_csv(CSV_PATH, stats_options=slice_stats_options)
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.8 interpreter.
WARNING:tensorflow:From /opt/conda/lib/python3.8/site-packages/tensorflow_data_validation/utils/stats_util.py:247: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version. Instructions for updating: Use eager execution and: tf.data.TFRecordDataset(path)
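One detail of the CSV export step is worth checking: by default, pandas `to_csv` also writes the dataframe index as an unnamed first column, which may then appear as an extra column when the file is read back for statistics. The sketch below only illustrates that pandas behavior. `sample_df` and `csv_header` are hypothetical stand-ins (the real `train_df` is defined elsewhere), and whether `index=False` is wanted for the sliced statistics above is an assumption to verify, not something the original cell states.

```python
import csv
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for train_df; the real dataframe is defined elsewhere.
sample_df = pd.DataFrame({"age": [39, 50], "sex": ["Male", "Female"]})


def csv_header(df, **to_csv_kwargs):
    """Write df to a temporary CSV file and return its header row as a list."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "slice_sample.csv")
        df.to_csv(path, **to_csv_kwargs)
        with open(path, newline="") as f:
            return next(csv.reader(f))


# Default call: the index is written as a first column with an empty name.
header_default = csv_header(sample_df)

# With index=False: only the dataframe's own columns are written.
header_no_index = csv_header(sample_df, index=False)

print(header_default)
print(header_no_index)
```

If the extra index column turns out to be unwanted in the generated statistics, passing `index=False` to `to_csv` is one way to leave it out.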