Why ExampleGen generates just train_set and eval_set

Hi there,

I’m about to finish the MLOps course and I wanted to get my hands a little bit dirty with a project using a TFX pipeline.

But I was wondering why ExampleGen generates only a train and an eval split of the data instead of train, eval and test splits?

I’m a little bit lost on how I am going to test my model, and which data to use, if the model has already seen the eval set during training.

Looking forward to hearing from you!

Which lab are you referring to?

Dear @Aly_SEGNANE,

Welcome to the community.

ExampleGen produces only a train and an eval split by default (in a 2:1 ratio) because that covers the standard workflow: the train split is used to fit the model and the eval split is used to measure it on data that was not used for fitting. If you also need a held-out test split, you can give ExampleGen a custom split configuration or add further components to your TFX pipeline to suit your requirements.
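To make the customization concrete: as far as I know, ExampleGen's splits are defined by hash buckets, and the TFX guide describes passing an `output_config` built from `example_gen_pb2.SplitConfig` with one `Split(name=..., hash_buckets=...)` entry per split. The snippet below is not TFX code. It is only a plain-Python sketch of the hash-bucket idea, and the 8/1/1 bucket counts, the `row-<i>` keys and the `assign_split` helper are my own assumptions for illustration.

```python
import hashlib

# Hypothetical bucket counts: 8 train, 1 eval, 1 test (roughly 80/10/10).
SPLITS = [("train", 8), ("eval", 1), ("test", 1)]


def assign_split(record_key, splits=SPLITS):
    """Map a record to a split by hashing its key into one of the buckets."""
    total_buckets = sum(count for _, count in splits)
    digest = hashlib.sha256(record_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % total_buckets

    # Walk the cumulative bucket ranges: [0, 8) train, [8, 9) eval, [9, 10) test.
    upper = 0
    for name, count in splits:
        upper += count
        if bucket < upper:
            return name
    raise AssertionError("bucket index fell outside all split ranges")


# Count how many of 10,000 made-up record keys land in each split.
counts = {name: 0 for name, _ in SPLITS}
for i in range(10_000):
    counts[assign_split(f"row-{i}")] += 1

print(counts)
```

Because the assignment depends only on the hash of the record, the same record always lands in the same split, so a test split carved out this way stays unseen by training and evaluation across pipeline runs.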

No, it’s not a lab of the course. I just wanted to play with TFX on my own.

thank you @Girijesh !!!

Thank you for confirming the source of your question. Moving forward, kindly tag such questions with general so that mentors don’t have to look for a notebook before answering.

Happy learning.
