C3W3 - tf.data (input pipeline) in the context of TFX ExampleGen component

When going through the material on tf.data, I was left a bit confused about how it relates to TFX with regard to ingesting data. Would you use tf.data to create the input (TFRecords) to the TFX ExampleGen component?


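From what I understand (and I may be wrong), TFRecord is a more primitive form than tf.data.Dataset: TFRecord is a file format holding serialized records, while tf.data.Dataset is the API for building an input pipeline. You can create a dataset directly from TFRecord files with tf.data.TFRecordDataset. Going the other way is not automatic; you have to serialize each element and write it out yourself, for example with tf.io.TFRecordWriter.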
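To make the TFRecord vs. tf.data relationship concrete, here is a minimal sketch (not taken from the course material). The file path and the feature name "x" are made up for illustration. It writes a few tf.train.Example records to a TFRecord file and then reads them back as a tf.data.Dataset.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical path and feature name, chosen only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.tfrecord")

# Write: serialize each value as a tf.train.Example and append it to the file.
with tf.io.TFRecordWriter(path) as writer:
    for value in [1, 2, 3]:
        example = tf.train.Example(
            features=tf.train.Features(
                feature={
                    "x": tf.train.Feature(
                        int64_list=tf.train.Int64List(value=[value])
                    )
                }
            )
        )
        writer.write(example.SerializeToString())

# Read: TFRecordDataset yields the raw serialized records as a tf.data.Dataset;
# map() parses each record back into a dict of tensors.
feature_spec = {"x": tf.io.FixedLenFeature([], tf.int64)}
dataset = tf.data.TFRecordDataset(path).map(
    lambda record: tf.io.parse_single_example(record, feature_spec)
)

values = [int(parsed["x"].numpy()) for parsed in dataset]
print(values)
```

So the TFRecord file is the data on disk, and the tf.data.Dataset is the pipeline that streams it into your code.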

To comment specifically on why it was used and how it relates to TFX's ExampleGen, can you tell me where tf.data appears in week 3?

Hi @jwarmenhoven, I think this question is similar to yours. I hope it gives you some additional information.

Happy learning,

Week 3 - Video 2 : High-Performance Ingestion.

I think the different levels of abstraction confused me, as well as the distinction between data source and data format. After going over the TF/TFX documentation, I better understand how it all connects.