Question about appending a value to the schema

I am a little confused about the code below

tfdv.get_domain(schema, 'feature_column_name').value.append('string')

When we append a value to the schema, what exactly are we appending? Where does the appended value come from?

Thank you!

@Zihao_Geng my understanding is the following:

  1. Appending a value is used for categorical features.
  2. The value is appended to that feature's domain in the schema, i.e. its list of allowed values.
  3. The value mainly comes from your domain knowledge.

Let’s say that in the training dataset, for whatever reason, a categorical feature only contains the values “red” and “green”. When you generate the schema, it will list only these 2 values as possible values. And when you assign numbers to those categories, only 2 possible values [0, 1] will be prepared.
But since you know the feature can take the values “red”, “green” and “blue”, you can add “blue” to the schema so you won’t run into issues when “blue” values show up.
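
To make the idea concrete, here is a rough plain-Python analogy of what happens. It is only a sketch, not how TFDV is implemented: `infer_domain` and `find_unexpected` are hypothetical helpers I made up for illustration, and the color lists are made-up data. The domain is just the list of values seen in training, and appending "blue" by hand stops it from being flagged later.

```python
# Plain-Python analogy; these helpers are hypothetical and are NOT part of TFDV.

def infer_domain(values):
    """Collect the distinct values seen in the training data, in first-seen order."""
    domain = []
    for v in values:
        if v not in domain:
            domain.append(v)
    return domain


def find_unexpected(values, domain):
    """Return the distinct values that are missing from the domain, sorted."""
    return sorted(set(values) - set(domain))


# Made-up data: training happens to contain only two of the three colors.
train_colors = ["red", "green", "green", "red"]
serving_colors = ["green", "blue", "red"]

color_domain = infer_domain(train_colors)
before = find_unexpected(serving_colors, color_domain)

# Domain knowledge tells us "blue" is legitimate, so we add it by hand.
# This plays the role of:
#   tfdv.get_domain(schema, 'feature_column_name').value.append('blue')
color_domain.append("blue")
after = find_unexpected(serving_colors, color_domain)

print("domain:", color_domain)
print("unexpected before append:", before)
print("unexpected after append:", after)
```

So the string you append is not computed from anywhere; it is a value you decide to allow because you know it is valid for that feature.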

I hope this helps
