C2_W1_Lab1_TFDV_Exercise - Slicing Doubt

The lab notebook mentions the following under the data slicing section:

  • If you want to be more specific, then you can map the specific value to the feature name. For example, if you want just Male, then you can declare it as features={'sex': [b'Male']}. Notice that the string literal needs to be passed in as bytes with the b' prefix.
  • You can also pass in several features if you want. For example, if you want to slice through both the sex and race features, then you can do features={'sex': None, 'race': None}

I tried the following:
from tensorflow_data_validation.utils import slicing_util

slice_fn = slicing_util.get_feature_value_slicer(features={'sex': [b'Male'], 'race': [b'Asian']})

When I view the result, sex is Male only, as expected, but race is not Asian; it shows up as White. Can somebody please explain what I am doing wrong here?


There are no records where race = “Asian”. Here are the unique values of race in the training data:

>>> train_df.race.unique()
array(['White', 'Black', 'Asian-Pac-Islander', 'Amer-Indian-Eskimo',
       'Other'], dtype=object)

When you specify a dictionary like {'sex': [b'Male'], 'race': [b'Asian-Pac-Islander']}, the records picked are those that satisfy both conditions. The generated datasets are ['All Examples', 'race_Asian-Pac-Islander_sex_Male']
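To make the "both conditions" behavior concrete, here is a minimal pure-Python sketch of the matching rule. It is only a stand-in for the semantics described above, not TFDV's actual implementation; the `matches` helper and the sample rows are made up for illustration.

```python
# Illustrative only: a pure-Python stand-in for the slicing semantics,
# not TFDV's actual implementation. `matches` is a hypothetical helper.
def matches(record, features):
    """Return True if the record satisfies every feature constraint."""
    for name, allowed in features.items():
        if name not in record:
            return False
        # None means "any value"; otherwise the value must match exactly.
        if allowed is not None and record[name] not in allowed:
            return False
    return True


# Hypothetical sample rows, with byte-string values as in the lab.
records = [
    {"sex": b"Male", "race": b"White"},
    {"sex": b"Male", "race": b"Asian-Pac-Islander"},
    {"sex": b"Female", "race": b"Asian-Pac-Islander"},
]

wrong = {"sex": [b"Male"], "race": [b"Asian"]}
right = {"sex": [b"Male"], "race": [b"Asian-Pac-Islander"]}

# b"Asian" is not a value in the data, so nothing satisfies both conditions.
no_hits = [r for r in records if matches(r, wrong)]
# The full value b"Asian-Pac-Islander" selects the intended rows.
hits = [r for r in records if matches(r, right)]
```

The match is on the whole value, so b'Asian' does not select rows whose race is b'Asian-Pac-Islander'.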

On the other hand, when you specify the features as {'sex': None, 'race': None}, you’ll have the following slices generated:
Datasets generated: ['All Examples', 'race_White_sex_Male', 'race_Black_sex_Male', 'race_Black_sex_Female', 'race_White_sex_Female', 'race_Asian-Pac-Islander_sex_Male', 'race_Amer-Indian-Eskimo_sex_Male', 'race_Other_sex_Female', 'race_Asian-Pac-Islander_sex_Female', 'race_Amer-Indian-Eskimo_sex_Female', 'race_Other_sex_Male']
Type of sliced_stats elements: <class 'tensorflow_metadata.proto.v0.statistics_pb2.DatasetFeatureStatistics'>
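As a rough illustration of why None produces one slice per combination of observed values, the sketch below groups a few made-up rows by a key built from the feature names and values. The key format is inferred from the dataset names in the output above; `slice_key` is a hypothetical helper, not a TFDV function.

```python
from collections import Counter


# Hypothetical helper: builds a name like 'race_White_sex_Male'.
# The format (feature names sorted, parts joined by '_') is an assumption
# inferred from the dataset names listed in this thread.
def slice_key(record, feature_names):
    parts = []
    for name in sorted(feature_names):
        parts.append(f"{name}_{record[name].decode()}")
    return "_".join(parts)


# Made-up sample rows with byte-string values.
records = [
    {"sex": b"Male", "race": b"White"},
    {"sex": b"Male", "race": b"White"},
    {"sex": b"Female", "race": b"Black"},
    {"sex": b"Male", "race": b"Asian-Pac-Islander"},
]

# One bucket per (race, sex) combination that actually occurs in the data.
counts = Counter(slice_key(r, ["sex", "race"]) for r in records)
slice_names = sorted(counts)
```

Only combinations that occur in the data get a slice, which is why the list contains exact values such as 'Asian-Pac-Islander' and never 'Asian'.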

It seems like you are looking for the last option: generate all the slices, then pick a few of them and compare their statistics.