Week 3 - Data Pipeline Comps for ML Prod: Sklearn and preprocessing

Hi, I’d appreciate some help!

I’m a bit confused about some of the preprocessing being done in this section, e.g. the Week 3 assignment. I guess it’s only for the sake of the course? But a number of transformations and feature engineering jobs are being executed without the TFX framework.

Shouldn’t all these transformations be recorded and logged within the TFX framework to preserve lineage and provenance?


Yeah, in a production pipeline every processing step should be part of the pipeline.

Thanks @gent.spah. So these jobs outside TFX are just for practical/academic purposes, right? I understand one can translate/package those scikit-learn (et al.) steps into TF flows… It would be interesting to know how that comes together…

There might be steps that cannot be done with TFX’s standard components but can still be integrated into the pipeline, for example as a custom component.
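
To make the lineage point a bit more concrete, here is a rough, framework-free sketch of what "recording a step" amounts to. It is not the TFX or ML Metadata API: `LineageStore`, `run_step`, and `fingerprint` are hypothetical names made up for illustration, and `standardize` is a plain-Python stand-in for something like an sklearn scaler. The idea is only that every step runs through one wrapper that logs its parameters plus a hash of its input and output.

```python
import hashlib
import json
import statistics
from dataclasses import dataclass, field


def fingerprint(obj) -> str:
    """Short, stable hash of a JSON-serialisable object (hypothetical helper)."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class LineageStore:
    """Toy stand-in for a metadata store; NOT the ML Metadata API."""
    records: list = field(default_factory=list)

    def run_step(self, name, fn, data, **params):
        """Run fn(data, **params) and record what went in and what came out."""
        result = fn(data, **params)
        self.records.append({
            "step": name,
            "params": params,
            "input_id": fingerprint(data),
            "output_id": fingerprint(result),
        })
        return result


def standardize(values):
    """Z-score scaling in plain Python, standing in for an sklearn scaler."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def clip(values, low, high):
    """Clamp every value into the closed interval [low, high]."""
    return [min(max(v, low), high) for v in values]


store = LineageStore()
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = store.run_step("standardize", standardize, raw)
clipped = store.run_step("clip", clip, scaled, low=-1.0, high=1.0)

# The output_id of one step equals the input_id of the next,
# so the records can be chained back from any result to the raw data.
for record in store.records:
    print(record)
```

A step that runs outside such a wrapper (like the preprocessing done outside the framework in the assignment) leaves no record, so there is nothing to trace a result back through. In a real pipeline the framework's own metadata tracking would play the role of this toy store.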