Week 1 General Questions on Lab 2

Hello,

I have a few general questions on the code for week 1, in particular Lab 2.

  1. In the Transform section, per my understanding, the code transforms the original image files into 32-bit float tensors for the model's input and output. The input becomes a 28x28x1 tensor, while the output (the label) becomes a single float32 value?

  2. The Tuner section says that in the previous section the Transform component had saved transformed examples into TFRecords in compressed .gz format. If the inputs are already transformed, why exactly do we still need the transform graph, tf_transform_output?

  3. Just want to make sure that I understand the data loading going on here. In the tuner_fn function, we load the images using _input_fn, passing in the file paths of the data as well as the transform graph. Within _input_fn, since the default batch_size is set to 32, are we basically loading 32 images at a time? And are those 32 images then processed through the transform graph afterward?

  4. Is there any kind of standardization or normalization happening in pre-processing before training? I'm not seeing it, and I would usually expect something like that when training on images.

  5. It looks like after the Trainer component is finished training, it saves the outputs in a .pb file. Was this specified anywhere or did this happen by default? Is there a way to set the filename? Finally, what format is this .pb file? Is it a SavedModel, frozen graph, or something else?

Thanks in advance.

  1. Answers 1 and 4: In the Transform section, each image is reshaped to (28, 28, 1) and all pixels are scaled to the [0, 1] range; that rescaling is the normalization step you were looking for. Each label is cast to a float32. A rough sketch of what this looks like in a preprocessing_fn is included after this list.
  2. Answer 2: We want to use the preprocessed version of the images (as highlighted in the previous answer), and the transformed examples on disk are serialized records, so tf_transform_output is still needed to supply the transformed feature spec used to parse them back into tensors; hence the dependency on the Transform output. See the _input_fn sketch after this list.
  3. Answer 5: .pb is the protocol buffer file format. The trained model is saved via a call to model.save, which writes a TensorFlow SavedModel directory; saved_model.pb is the serialized graph inside that directory, and its filename is fixed by the SavedModel format rather than set anywhere in the lab code. So it is a SavedModel, not a frozen graph. See the sketch after this list.
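
For Answers 1 and 4, here is a minimal sketch of a tf.Transform preprocessing_fn that does the reshape, rescale, and cast described above. The feature keys ('image', 'label') and the assumption that the raw feature arrives as flat pixel values are illustrative; the lab's actual key names and raw representation may differ (for example, it may decode image bytes first).

```python
import tensorflow as tf

# Hypothetical feature keys -- the lab's actual key names may differ.
_IMAGE_KEY = 'image'
_LABEL_KEY = 'label'

def _transformed_name(key):
    return key + '_xf'

def preprocessing_fn(inputs):
    """tf.Transform preprocessing: reshape pixels to (28, 28, 1),
    scale them into [0, 1], and cast the label to float32."""
    outputs = {}

    # Assume the raw feature is a batch of flat integer pixel values in [0, 255].
    image = tf.cast(inputs[_IMAGE_KEY], tf.float32)
    image = tf.reshape(image, [-1, 28, 28, 1])
    outputs[_transformed_name(_IMAGE_KEY)] = image / 255.0  # the normalization step

    # Cast the integer class label to float32.
    outputs[_transformed_name(_LABEL_KEY)] = tf.cast(inputs[_LABEL_KEY], tf.float32)
    return outputs
```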
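For Answer 2 (and the batching part of question 3): in the usual TFX pattern, _input_fn does not re-run the transform graph on the data; tf_transform_output only supplies the transformed feature spec used to parse the already-transformed, gzip-compressed TFRecords, and each step yields batch_size examples (32 by default). A minimal sketch under those assumptions follows; the transformed label key 'label_xf' is hypothetical, and newer labs may use the DataAccessor/TFXIO API instead.

```python
import tensorflow as tf

def _input_fn(file_pattern, tf_transform_output, batch_size=32):
    """Parse transformed TFRecords into batched (features, label) pairs."""
    # Spec describing the *transformed* features that Transform wrote out.
    feature_spec = tf_transform_output.transformed_feature_spec().copy()

    # Each iteration of this dataset yields `batch_size` (32 by default)
    # already-transformed examples; no preprocessing is re-run here.
    dataset = tf.data.experimental.make_batched_features_dataset(
        file_pattern=file_pattern,
        batch_size=batch_size,
        features=feature_spec,
        reader=lambda filenames: tf.data.TFRecordDataset(
            filenames, compression_type='GZIP'),  # the compressed .gz TFRecords
        label_key='label_xf')  # hypothetical transformed label key
    return dataset
```

Here tf_transform_output would be a tft.TFTransformOutput built from the Transform component's transform_graph artifact, which is where the feature spec comes from.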
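For Answer 5, a standalone illustration of what model.save produces, assuming the TF 2.x / Keras 2 stack used by TFX. The tiny model and the directory name are only for illustration; in the Trainer module the export path typically comes from fn_args.serving_model_dir.

```python
import tensorflow as tf

# Tiny stand-in model -- the lab's real model is built in the Trainer module.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Saving in the TensorFlow format writes a SavedModel *directory*:
#   serving_model_dir/
#     saved_model.pb   <- graph + metadata, serialized as a protocol buffer
#     variables/       <- the (trained) weights
#     assets/
model.save('serving_model_dir', save_format='tf')
```

So the .pb you see is the SavedModel's graph file, not a frozen graph; you choose the export directory, while the saved_model.pb filename is fixed by the format itself.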