What does min_domain_mass mean in the first lab?

I don't understand what min_domain_mass means in this line:

country_feature.distribution_constraints.min_domain_mass = 0.9

The lab says that it relaxes the minimum fraction of values that must come from the domain of a particular feature. What does that mean? Can anyone provide an example, please?

Please read comments here


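# Columns to check (train_df and serving_df are the lab's DataFrames)
fields = ['payer_code', 'medical_specialty']
for field in fields:
    # The "domain" of a feature is the set of values seen in the training data
    unique_in_train = train_df[field].unique()
    # Count the serving rows whose value belongs to that domain
    serving_values_within_train_domain = serving_df[field].isin(unique_in_train).sum()
    # Roughly the fraction that min_domain_mass is compared against
    fraction_of_values_within_training_domain = serving_values_within_train_domain / len(serving_df)
    print(f'{fraction_of_values_within_training_domain * 100:.3f} % of {field} in serving_df is within domain of train_df')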


99.993 % of payer_code in serving_df is within domain of train_df
99.895 % of medical_specialty in serving_df is within domain of train_df
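
To tie these fractions back to min_domain_mass, here is a small toy sketch in plain pandas. It does not call TFDV; the data, the domain_mass helper, and the passes/fails check are all made up for illustration, and they only mimic how I understand the constraint: the fraction of values that fall inside the feature's domain must be at least min_domain_mass. With a threshold of 1.0 (all values must come from the domain), a single unseen value is flagged. Relaxing it to 0.9 tolerates up to 10% unseen values.

```python
import pandas as pd

# Hypothetical toy data, not taken from the lab's dataset.
train_df = pd.DataFrame({"country": ["US", "CA", "MX", "US", "CA"]})
# 20 serving rows: 19 use countries seen in training, 1 ("FR") does not.
serving_df = pd.DataFrame(
    {"country": ["US"] * 10 + ["CA"] * 6 + ["MX"] * 3 + ["FR"]}
)

def domain_mass(train_col: pd.Series, serving_col: pd.Series) -> float:
    """Fraction of serving values that also appear in the training column."""
    domain = train_col.unique()
    return float(serving_col.isin(domain).mean())

mass = domain_mass(train_df["country"], serving_df["country"])
print(f"fraction of serving values within the training domain: {mass:.2f}")

# Illustrative check only: compare the fraction with two thresholds.
for min_domain_mass in (1.0, 0.9):
    verdict = "passes" if mass >= min_domain_mass else "fails"
    print(f"min_domain_mass = {min_domain_mass}: {verdict}")
```

In the lab's output above, both fields are more than 99% within the training domain, so a threshold of 0.9 is comfortably satisfied, while a threshold of 1.0 would not be.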