C2W3 - Exercise 8, StatisticsGen confusion

My code is generating the error:

TypeError: Argument schema should be a Channel of type <class 'tfx.types.standard_artifacts.Schema'> (got feature {
  name: "Cover_Type"
  value_count {
    min: 1
    max: 1
  }… etc…

My code is/was:

statistics_gen_updated = StatisticsGen(
    schema=new_schema,
    examples=example_gen.outputs['examples'],
    stats_options=tfdv.StatsOptions(schema=new_schema,
                                    infer_type_from_schema=True))

The TFX documentation seems a bit confusing to me.
Looking here:
https://www.tensorflow.org/tfx/versions/r1.0/api_docs/python/tfx/v1/components/StatisticsGen
sort of confirms that I should be passing something of a data type called tfx.types.Channel, which is a “concept that connects data consumers and producers”. It seems to inherit from tfx.types.Artifact, which in turn seems to be used as a sort of type hint for a Tuple of ml_metadata.proto.Artifact and ml_metadata.proto.ArtifactType…
But when I load in the new schema with “new_schema = tfdv.load_schema_text(schema_file)”,
the data type of new_schema is tensorflow_metadata.proto.v0.schema_pb2.Schema. I can’t find this anywhere in the TFX documentation, but I guess it’s not a tfx.types.Channel, otherwise I wouldn’t have got the error that I got.
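
To make the mismatch concrete, here is a minimal plain-Python sketch of the kind of check the error message suggests. It is not TFX code: SchemaProto, Channel and check_schema_arg are hypothetical stand-ins invented for illustration, and the real component may well validate its arguments differently.

```python
# Stand-in classes only; none of these are the real TFX or TFDV types.
class SchemaProto:
    """Hypothetical stand-in for the raw schema object loaded from a file."""


class Channel:
    """Hypothetical stand-in for a typed channel that wraps artifacts."""

    def __init__(self, artifact_type):
        self.artifact_type = artifact_type


def check_schema_arg(value):
    """Reject anything that is not a Channel, roughly as the error suggests."""
    if not isinstance(value, Channel):
        raise TypeError(
            "Argument schema should be a Channel "
            f"(got {type(value).__name__})"
        )
    return value


raw_schema = SchemaProto()                        # what I was passing
wrapped_schema = Channel(artifact_type="Schema")  # what the argument wants

try:
    check_schema_arg(raw_schema)
    raw_result = "accepted"
except TypeError as err:
    raw_result = f"rejected: {err}"

wrapped_result = check_schema_arg(wrapped_schema)
print(raw_result)
print(wrapped_result.artifact_type)
```

The point is only that a raw schema object and a channel wrapping a schema are different types, so the raw object fails the check even though its contents are fine.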

I’ve managed to hack my way through this exercise in the process of writing this question, by sort of combining bits of previous ungraded labs and bits of the Importer documentation (which seems to be relevant to ImporterNode, though I’m not sure whether one of the two names is now deprecated, or whether one is a subclass of the other but has no documentation in TFX yet…)

All in all, it’s a bit guessy and trial-and-errory… but it’s OK, it feels like I’m learning… something.
I’m not complaining. :slight_smile:

Hello @shahin. I think you are close. One of the instructions for Exercise 8 is to use StatisticsGen to compute the statistics with the schema you updated in the previous section.

In Exercise 7, we used ImporterNode to handle the curated schema. At the end of Exercise 7, we also displayed what the curated schema looks like.

Try plugging the curated schema into the schema argument for StatisticsGen for Exercise 8 and see what happens.

Hi @davidlowe, as I stated in my post: “I’ve managed to hack my way through this exercise…”
i.e. I had already solved this one.

I did not post my correct solution because that’s not allowed.

(I did post the incorrect solution that I’d tried first, which was the original reason for posting the question. I left it here (a) to highlight to the course-makers that their hint to use stats_options appears to be incorrect (as has already been pointed out by someone else in an older post), and (b) to explain to others who may also be stuck on this how I eventually hacked out the apparently correct answer.)

But thanks anyway.