Problem in my train_step_fn; W2_C3

My model.loss(prediction_dict, true_shape_tensor) does not work. The error is:
ValueError: Shapes must be equal rank, but are 3 and 1 for '{{node Loss/Loss/Select}} = Select[T=DT_FLOAT](Loss/Loss/IsNan, concat_1, Loss/stack_2)' with input shapes: [0], [4,51150,4], [0].

Note that interactive_eager_few_shot_od_training_colab.ipynb works, so I compared the tensors that are fed into model.loss. They are exactly the same, apart from the op they originate from. For example, the shape tensor of the working notebook vs. the true_shape_tensor of my problematic code:
working shape: Tensor("Const:0", shape=(4, 3), dtype=int32)
my true_shape_tensor: Tensor("Preprocessor/stack_1:0", shape=(4, 3), dtype=int32)

The same holds for the rest of the prediction dict:
working code: {'preprocessed_inputs': <tf.Tensor 'concat:0' shape=(4, 640, 640, 3) dtype=float32>, 'feature_maps': [<tf.Tensor 'ResNet50V1_FPN/FeatureMaps/top_down/smoothing_1/Relu6:0' shape=(4, 80, 80, 256) dtype=float32>, <tf.Tensor 'ResNet50V1_FPN/FeatureMaps/top_down/smoothing_2/Relu6:0' shape=(4, 40, 40, 256) dtype=float32>, <tf.Tensor 'ResNet50V1_FPN/FeatureMaps/top_down/projection_3/BiasAdd:0' shape=(4, 20, 20, 256) dtype=float32>, <tf.Tensor 'ResNet50V1_FPN/bottom_up_block5/Relu6:0' shape=(4, 10, 10, 256) dtype=float32>, <tf.Tensor 'ResNet50V1_FPN/bottom_up_block6/Relu6:0' shape=(4, 5, 5, 256) dtype=float32>], 'anchors': <tf.Tensor 'Concatenate/concat:0' shape=(51150, 4) dtype=float32>, 'final_anchors': <tf.Tensor 'Tile:0' shape=(4, 51150, 4) dtype=float32>, 'box_encodings': <tf.Tensor 'concat_1:0' shape=(4, 51150, 4) dtype=float32>, 'class_predictions_with_background': <tf.Tensor 'concat_2:0' shape=(4, 51150, 2) dtype=float32>}

my code: {'preprocessed_inputs': <tf.Tensor 'Preprocessor/stack:0' shape=(4, 640, 640, 3) dtype=float32>, 'feature_maps': [<tf.Tensor 'ResNet50V1_FPN/FeatureMaps/top_down/smoothing_1/Relu6:0' shape=(4, 80, 80, 256) dtype=float32>, <tf.Tensor 'ResNet50V1_FPN/FeatureMaps/top_down/smoothing_2/Relu6:0' shape=(4, 40, 40, 256) dtype=float32>, <tf.Tensor 'ResNet50V1_FPN/FeatureMaps/top_down/projection_3/BiasAdd:0' shape=(4, 20, 20, 256) dtype=float32>, <tf.Tensor 'ResNet50V1_FPN/bottom_up_block5/Relu6:0' shape=(4, 10, 10, 256) dtype=float32>, <tf.Tensor 'ResNet50V1_FPN/bottom_up_block6/Relu6:0' shape=(4, 5, 5, 256) dtype=float32>], 'anchors': <tf.Tensor 'Concatenate/concat:0' shape=(51150, 4) dtype=float32>, 'final_anchors': <tf.Tensor 'Tile:0' shape=(4, 51150, 4) dtype=float32>, 'box_encodings': <tf.Tensor 'concat_1:0' shape=(4, 51150, 4) dtype=float32>, 'class_predictions_with_background': <tf.Tensor 'concat_2:0' shape=(4, 51150, 2) dtype=float32>}

The only differences are "Const:0" vs. "Preprocessor/stack_1:0" for the shape tensor, and 'concat:0' vs. 'Preprocessor/stack:0' for preprocessed_inputs. Can anyone comment on what the issue with Preprocessor/stack:0 might be, and what to do about it?
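(Editor's note: a quick sketch, not taken from either notebook, showing that a tf.constant-built shape tensor and a tf.stack-built one agree in rank, dtype, and contents, so the op name alone should be harmless. The 640x640 size and batch of 4 are assumptions from the dumps above.)

```python
import tensorflow as tf

batch_size = 4

# The working notebook builds the true-shape tensor as a constant up front,
# which shows up in the graph as a "Const" node:
true_shape_const = tf.constant(batch_size * [[640, 640, 3]], dtype=tf.int32)

# Stacking per-image shapes instead produces the same values through a
# "stack" node (e.g. "Preprocessor/stack_1:0"):
per_image_shapes = [tf.constant([640, 640, 3], dtype=tf.int32)] * batch_size
true_shape_stacked = tf.stack(per_image_shapes, axis=0)

# Both are rank-2 (4, 3) int32 tensors with identical contents, so the node
# name difference by itself should not change what model.loss receives.
assert tuple(true_shape_const.shape) == tuple(true_shape_stacked.shape) == (4, 3)
assert bool(tf.reduce_all(true_shape_const == true_shape_stacked))
```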
Thanks!

Or, a more basic question: when I run my assignment script, I get the error below. Any suggestions as to what could be wrong? Thank you.

Start fine-tuning!


ValueError                                Traceback (most recent call last)
<ipython-input> in <cell line: 3>()
     18
     19         # Training step (forward pass + backwards pass)
---> 20         total_loss = train_step_fn(image_tensors,
     21                                    gt_boxes_list,
     22                                    gt_classes_list,

4 frames
/usr/local/lib/python3.10/dist-packages/tensorflow/python/util/traceback_utils.py in error_handler(*args, **kwargs)
    151     except Exception as e:
    152       filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153       raise e.with_traceback(filtered_tb) from None
    154     finally:
    155       del filtered_tb

/tmp/__autograph_generated_filezcypnxsi.py in tf__train_step_fn(image_list, groundtruth_boxes_list, groundtruth_classes_list, model, optimizer, vars_to_fine_tune)
     35         ag__.ld(print)('\n printing prediction_dict:')
     36         ag__.ld(print)(ag__.ld(prediction_dict))
---> 37         losses_dict = ag__.converted_call(ag__.ld(model).loss, (ag__.ld(prediction_dict), ag__.ld(true_shape_tensor)), None, fscope)
     38         total_loss = ag__.ld(losses_dict)['Loss/localization_loss'] + ag__.ld(losses_dict)['Loss/classification_loss']
     39         gradients = ag__.converted_call(ag__.ld(tape).gradient, (ag__.ld(total_loss), ag__.ld(vars_to_fine_tune)), None, fscope)

/usr/local/lib/python3.10/dist-packages/object_detection/meta_architectures/ssd_meta_arch.py in tf__loss(self, prediction_dict, true_image_shapes, scope)
    137             pass
    138         ag__.if_stmt(ag__.converted_call(ag__.ld(self).groundtruth_has_field, (ag__.ld(fields).InputDataFields.is_annotated,), None, fscope), if_body_5, else_body_5, get_state_5, set_state_5, ('losses_mask',), 1)
--> 139         location_losses = ag__.converted_call(ag__.ld(self).localization_loss, (ag__.ld(prediction_dict)['box_encodings'], ag__.ld(batch_reg_targets)), dict(ignore_nan_targets=True, weights=ag__.ld(batch_reg_weights), losses_mask=ag__.ld(losses_mask)), fscope)
    140         cls_losses = ag__.converted_call(ag__.ld(self).classification_loss, (ag__.ld(prediction_dict)['class_predictions_with_background'], ag__.ld(batch_cls_targets)), dict(weights=ag__.ld(batch_cls_weights), losses_mask=ag__.ld(losses_mask)), fscope)
    141

/usr/local/lib/python3.10/dist-packages/object_detection/core/losses.py in tf____call__(self, prediction_tensor, target_tensor, ignore_nan_targets, losses_mask, scope, **params)
     47             nonlocal target_tensor
     48             pass
---> 49         ag__.if_stmt(ag__.ld(ignore_nan_targets), if_body, else_body, get_state, set_state, ('target_tensor',), 1)
     50
     51         def get_state_2():

/usr/local/lib/python3.10/dist-packages/object_detection/core/losses.py in if_body()
     42         def if_body():
     43             nonlocal target_tensor
---> 44             target_tensor = ag__.converted_call(ag__.ld(tf).where, (ag__.converted_call(ag__.ld(tf).is_nan, (ag__.ld(target_tensor),), None, fscope), ag__.ld(prediction_tensor), ag__.ld(target_tensor)), None, fscope)
     45
     46         def else_body():

ValueError: in user code:

    File "<ipython-input-443-22da03bdfee6>", line 46, in train_step_fn  *
        losses_dict = model.loss(prediction_dict, true_shape_tensor)
    File "/usr/local/lib/python3.10/dist-packages/object_detection/meta_architectures/ssd_meta_arch.py", line 876, in loss  *
        location_losses = self._localization_loss(
    File "/usr/local/lib/python3.10/dist-packages/object_detection/core/losses.py", line 78, in __call__  *
        target_tensor = tf.where(tf.is_nan(target_tensor),

    ValueError: Shapes must be equal rank, but are 3 and 1 for '{{node Loss/Loss/Select}} = Select[T=DT_FLOAT](Loss/Loss/IsNan, concat_1, Loss/stack_2)' with input shapes: [0], [4,51150,4], [0].
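(Editor's note: for anyone debugging the same message, the [0] input shapes on the IsNan condition suggest the regression targets were empty when the loss ran, which can happen when groundtruth is not provided to the model before predict/loss. A minimal, hypothetical reproduction of the same rank mismatch, not taken from the assignment:)

```python
import tensorflow as tf

# The loss internally does tf.where(tf.is_nan(targets), predictions, targets).
# If targets come out empty and rank-1 while predictions are rank-3, the
# underlying Select op rejects the mismatch, much like the error above.
predictions = tf.zeros([4, 51150, 4])   # shaped like 'box_encodings'
empty_targets = tf.zeros([0])           # shaped like an empty target batch

raised = None
try:
    # tf.compat.v1.where has the same strict rank rules as the graph-mode
    # Select node named in the error message.
    tf.compat.v1.where(tf.math.is_nan(empty_targets), predictions, empty_targets)
except (ValueError, tf.errors.InvalidArgumentError) as err:
    raised = err

print(type(raised).__name__ if raised is not None else "no error")
```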

Hello @Dennis_Sinitsky

Based on the error you posted:

  1. Check that the path (string) for each image is defined and recalled correctly.

  2. Next, check the code in the section that runs a dummy image. Make sure you convert each image into a tensor with the correct shape; the shape should not be [4, 640, 640, 3], as the instructions clearly indicate passing a batch of 1.

  3. Next, in the gradient tape loss, make sure the true shape tensor has the correct shape and the preprocessed image is not hard-coded: you only need to apply tf.concat to the preprocessed image list. For the true shape tensor, use tf.constant.
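(Editor's note: putting that checklist together, a rough sketch of the train-step pattern from the public eager few-shot colab might look like the following. Names such as `detection_model`, `image_tensors`, `gt_boxes_list`, `gt_classes_list`, and the 640x640 size are assumptions from this thread, not a definitive implementation.)

```python
import tensorflow as tf

def train_step_sketch(detection_model, image_tensors, gt_boxes_list,
                      gt_classes_list, optimizer, vars_to_fine_tune):
  # 1. Each image tensor enters preprocess() as a batch of 1: shape (1, H, W, 3).
  preprocessed = [detection_model.preprocess(t)[0] for t in image_tensors]

  # 2. Concatenate the batch-of-1 results rather than hard-coding a tensor.
  preprocessed_images = tf.concat(preprocessed, axis=0)  # e.g. (4, 640, 640, 3)

  # 3. Build the true-shape tensor with tf.constant: one [640, 640, 3] row
  #    per image in the batch.
  true_shape_tensor = tf.constant(
      len(image_tensors) * [[640, 640, 3]], dtype=tf.int32)

  with tf.GradientTape() as tape:
    # Groundtruth must be provided before computing the loss, otherwise the
    # loss sees empty target tensors.
    detection_model.provide_groundtruth(
        groundtruth_boxes_list=gt_boxes_list,
        groundtruth_classes_list=gt_classes_list)
    prediction_dict = detection_model.predict(
        preprocessed_images, true_shape_tensor)
    losses_dict = detection_model.loss(prediction_dict, true_shape_tensor)
    total_loss = (losses_dict['Loss/localization_loss'] +
                  losses_dict['Loss/classification_loss'])

  gradients = tape.gradient(total_loss, vars_to_fine_tune)
  optimizer.apply_gradients(zip(gradients, vars_to_fine_tune))
  return total_loss
```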

Let me know if your issue is resolved or if any further help is required.

Regards
DP

Hi Deepti,
thanks for your message. I also think that something is not preprocessed right; I always have difficulty with tensor dimensions and keeping track of them. Let me poke at this problem for a couple more days before seeking more help.
Thank you
Dennis

Sure, @Dennis_Sinitsky, send your notebook if you are not able to find a solution; I have pointed out all the places you need to check for corrections. There are also threads related to this assignment which might help you, so please use the search tool.

I finally got it. :exploding_head:


I can understand that expression :joy: I had a similar experience when I was doing the course.