C3W4 Lab 2 - TF_GraphCopyFunction is null?

The Lab 2 TFX Evaluator Colab gives a long and winding traceback when calling the Trainer in this cell:

```python
from tfx.proto import trainer_pb2

trainer = Trainer(
    module_file=os.path.abspath(_census_trainer_module_file),
    examples=transform.outputs['transformed_examples'],
    transform_graph=transform.outputs['transform_graph'],
    schema=schema_gen.outputs['schema'],
    train_args=trainer_pb2.TrainArgs(num_steps=50),
    eval_args=trainer_pb2.EvalArgs(num_steps=50))
context.run(trainer, enable_cache=False)
```

At the top of this giant traceback is this error:

```
ERROR:absl:udf_utils.get_fn {'train_args': '{\n "num_steps": 50\n}', 'eval_args': '{\n "num_steps": 50\n}', 'module_file': None, 'run_fn': None, 'trainer_fn': None, 'custom_config': 'null', 'module_path': 'census_trainer@./pipeline/_wheels/tfx_user_code_Trainer-0.0+cc29a9c825e35b0142dabcb732b412ff69124c1b5f5c6eee2546a102dcbf15c9-py3-none-any.whl'} 'run_fn'
```

Further down, the traceback seems to be related to TF_GraphCopyFunction being null:

```
tensorflow.python.framework.errors_impl.InvalidArgumentError: 'func' argument to TF_GraphCopyFunction cannot be null
```

The second such error occurs at the "Run the Evaluator component" cell:

```python
# Setup and run the Evaluator component
evaluator = Evaluator(
    examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'],
    baseline_model=model_resolver.outputs['model'],
    eval_config=eval_config)
context.run(evaluator, enable_cache=False)
```

It throws the same TF_GraphCopyFunction-is-null error, but it also shows these three errors at the top of the traceback:

> ERROR:absl:udf_utils.get_fn {'eval_config': '{\n  "metrics_specs": [\n    {\n      "metrics": [\n        {\n          "class_name": "ExampleCount"\n        },\n        {\n          "class_name": "BinaryAccuracy",\n          "threshold": {\n            "change_threshold": {\n              "absolute": -1e-10,\n              "direction": "HIGHER_IS_BETTER"\n            },\n            "value_threshold": {\n              "lower_bound": 0.5\n            }\n          }\n        },\n        {\n          "class_name": "BinaryCrossentropy"\n        },\n        {\n          "class_name": "AUC"\n        },\n        {\n          "class_name": "AUCPrecisionRecall"\n        },\n        {\n          "class_name": "Precision"\n        },\n        {\n          "class_name": "Recall"\n        },\n        {\n          "class_name": "MeanLabel"\n        },\n        {\n          "class_name": "MeanPrediction"\n        },\n        {\n          "class_name": "Calibration"\n        },\n        {\n          "class_name": "CalibrationPlot"\n        },\n        {\n          "class_name": "ConfusionMatrixPlot"\n        }\n      ]\n    }\n  ],\n  "model_specs": [\n    {\n      "label_key": "label"\n    }\n  ],\n  "slicing_specs": [\n    {},\n    {\n      "feature_keys": [\n        "race"\n      ]\n    },\n    {\n      "feature_keys": [\n        "sex"\n      ]\n    }\n  ]\n}', 'feature_slicing_spec': None, 'fairness_indicator_thresholds': None, 'example_splits': 'null', 'module_file': None, 'module_path': None} 'custom_eval_shared_model'
> ERROR:absl:There are change thresholds, but the baseline is missing. This is allowed only when rubber stamping (first run).
> ERROR:absl:udf_utils.get_fn {'eval_config': '{\n  "metrics_specs": [\n    {\n      "metrics": [\n        {\n          "class_name": "ExampleCount"\n        },\n        {\n          "class_name": "BinaryAccuracy",\n          "threshold": {\n            "change_threshold": {\n              "absolute": -1e-10,\n              "direction": "HIGHER_IS_BETTER"\n            },\n            "value_threshold": {\n              "lower_bound": 0.5\n            }\n          }\n        },\n        {\n          "class_name": "BinaryCrossentropy"\n        },\n        {\n          "class_name": "AUC"\n        },\n        {\n          "class_name": "AUCPrecisionRecall"\n        },\n        {\n          "class_name": "Precision"\n        },\n        {\n          "class_name": "Recall"\n        },\n        {\n          "class_name": "MeanLabel"\n        },\n        {\n          "class_name": "MeanPrediction"\n        },\n        {\n          "class_name": "Calibration"\n        },\n        {\n          "class_name": "CalibrationPlot"\n        },\n        {\n          "class_name": "ConfusionMatrixPlot"\n        }\n      ]\n    }\n  ],\n  "model_specs": [\n    {\n      "label_key": "label"\n    }\n  ],\n  "slicing_specs": [\n    {},\n    {\n      "feature_keys": [\n        "race"\n      ]\n    },\n    {\n      "feature_keys": [\n        "sex"\n      ]\n    }\n  ]\n}', 'feature_slicing_spec': None, 'fairness_indicator_thresholds': None, 'example_splits': 'null', 'module_file': None, 'module_path': None} 'custom_extractors'
> E

I think the notebook completes as expected, but I'm not entirely sure. Is there a definitive output I can look for to confirm it's all still working correctly despite the error messages?
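The closest thing to a sanity check I've tried so far is looking for a SavedModel under the pipeline's output directory: if training really completed, the Trainer should have written a `saved_model.pb` somewhere under its artifact directory regardless of the noisy logs. A minimal sketch (the `./pipeline` root is an assumption based on where this notebook writes its artifacts; adjust to your own layout):

```python
import os

def find_saved_models(root):
    """Return the path of every saved_model.pb found under root.

    A successful Trainer run exports a TensorFlow SavedModel into its
    output artifact directory, so at least one hit here is a good sign
    that training finished despite the errors in the logs.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "saved_model.pb" in filenames:
            hits.append(os.path.join(dirpath, "saved_model.pb"))
    return hits

# Hypothetical root - point this at your notebook's pipeline directory.
print(find_saved_models("./pipeline"))
```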

Hi Shahin! As mentioned in the markdown, that stems from the TensorFlow version used by the current TFX version. The discussion here points out that it is harmless, but we've also flagged this to the TFX team so they can take a look. I'm sure this will be fixed in an upcoming release.

Great, thanks @chris.favila. I see the closed issue #44403 in the discussion you linked, which does indeed mention the `TF_GraphCopyFunction cannot be null` error.

Apologies for not seeing that in the markdown. On my copy, all I can see is:

> You can ignore the `Exception ignored in: <function CapturableResourceDeleter.__del__>`

I don't see anything in the markdown about TF_GraphCopyFunction or the other three errors I posted in my original message. I guess they may all be related to this, though.

By the way, is there a definitive output I should look for to confirm it's all still working in my Colab?
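In the meantime, the other marker I've been watching for is the Evaluator's blessing. As I understand it (this is my reading of the TFX artifact layout, so treat it as an assumption), when the metrics pass the configured thresholds the Evaluator writes an empty file named `BLESSED` into its blessing artifact directory, and `NOT_BLESSED` when they don't, so a quick filesystem scan gives a yes/no answer. A sketch, again assuming a `./pipeline` root:

```python
import os

def find_blessing_markers(root):
    """Scan root for the Evaluator's BLESSED / NOT_BLESSED marker files.

    Finding a BLESSED file suggests the Evaluator ran and the model
    passed validation; NOT_BLESSED means it ran but the thresholds
    were not met; no hits at all means the component never finished.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name in ("BLESSED", "NOT_BLESSED"):
                hits.append(os.path.join(dirpath, name))
    return hits

# Hypothetical root - adjust to wherever your notebook writes artifacts.
print(find_blessing_markers("./pipeline"))
```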