Error while testing semantic segmentation on a webcam (off topic)

Hey guys. I’m trying to run semantic segmentation in real time on my webcam, but somehow it isn’t working. I resized the input to (32, 640, 3) and still get an error. Can anyone help?


Remember to add the batch dimension. So, your input should have shape (1, 32, 640, 3). Use np.expand_dims to get this done.
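As a minimal sketch of that suggestion, here is what np.expand_dims does to the shape. The zero-filled array is only a stand-in for one resized frame, and the (32, 640, 3) shape is simply the one mentioned in the question.

```python
import numpy as np

# Stand-in for one resized webcam frame: (height, width, channels).
# The shape is the one quoted in the question, not a recommendation.
frame = np.zeros((32, 640, 3), dtype=np.float32)

# axis=0 inserts a new leading axis, which the model reads as the batch dimension.
batch = np.expand_dims(frame, axis=0)

print(frame.shape)  # (32, 640, 3)
print(batch.shape)  # (1, 32, 640, 3)
```

The pixel data is unchanged. Only a leading axis of length 1 is added, so the model sees a batch containing a single image.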

Hey, thanks for the reply! Is it correct to put np.expand_dims in the part of the code I’ve shown below?
It also gives me an error when I run the code below:

Nice try, but that’s incorrect. You can’t use np.expand_dims on a Keras layer.
I recommend leaving the unet_model function alone and calling unet_model.predict like this:
unet_model.predict(np.expand_dims(your_image_array, ...))

Hey, thanks for correcting me. I tried your technique, but now there is another error, and the dimension has changed.

Why are you feeding the model a (480, 640, 3) image when it’s expecting (32, 640, 3) as input?
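One way to catch this kind of mismatch before calling predict is a small shape check. This is only a sketch: to_model_batch is a hypothetical helper, not part of Keras or the assignment, and the expected shape is passed in by hand (on a Keras model it could be read from model.input_shape, dropping the leading None).

```python
import numpy as np

def to_model_batch(frame, expected_shape):
    """Return frame with a batch axis added, or raise if its shape is wrong.

    expected_shape is (height, width, channels), without the batch axis.
    Hypothetical helper for illustration only.
    """
    if frame.shape != tuple(expected_shape):
        raise ValueError(
            f"frame has shape {frame.shape}, model expects {tuple(expected_shape)}"
        )
    return np.expand_dims(frame, axis=0)

# Stand-in for a raw 480x640 webcam frame.
webcam_frame = np.zeros((480, 640, 3), dtype=np.uint8)

try:
    # Shapes differ, so this raises instead of failing later inside predict.
    to_model_batch(webcam_frame, (32, 640, 3))
except ValueError as err:
    message = str(err)
    print(message)
```

With a check like this the error message names both shapes, which makes it easier to see which step forgot to resize.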

The dimension changed when I added np.expand_dims. Since I can’t resize it, I’ll try training the model with that input shape instead. I’ll let you know how it goes later!

On your first try, you resized your data to (32, 640, 3). In that case, you need to add one dimension in front with np.expand_dims, as balaji suggested. Now you have removed the resizing and only add one dimension, which should fail.

But what I’m curious about is this: is your image really (32, 640, 3)? That aspect ratio is quite unusual.
On the other hand, your image data seems to have the shape (480, 640, 3), which is quite reasonable. Are you really planning to handle RGB images whose shape is 32x640?

I think you may need to control the image size throughout your logic.

Looking at your outputs, I think I understand what you intended to do…

In our exercise, we used a batch size of 32, and each image was 96x128x3.
So the input to the network was (32, 96, 128, 3). You are now handling a 480x640x3 image, so the input should be (32, 480, 640, 3) or (None, 480, 640, 3), not (32, 640, 3). If you want to use the U-Net as you did in the assignment, you may want to convert your image to (96, 128, 3).
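To make the shapes concrete, here is a rough sketch of turning one 480x640x3 frame into a (1, 96, 128, 3) batch. resize_nearest is a hypothetical NumPy-only stand-in so the example runs without extra libraries. In a real webcam loop you would more likely use cv2.resize, keeping in mind that its size argument is ordered (width, height).

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) array using index sampling.

    Hypothetical helper for illustration; a library resize would
    normally interpolate instead of just picking pixels.
    """
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return image[rows[:, None], cols]

# Stand-in for one webcam frame.
frame = np.zeros((480, 640, 3), dtype=np.uint8)

small = resize_nearest(frame, 96, 128)   # (96, 128, 3), the assignment's image size
batch = np.expand_dims(small, axis=0)    # (1, 96, 128, 3), a batch of one image

print(frame.shape, small.shape, batch.shape)
```

The 32 in (32, 96, 128, 3) is only the number of images per batch during training. At prediction time a batch of 1 is fine as long as height, width, and channels match what the model was built with.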

Hey. My image is really not (32, 640, 3). It should be (480, 640, 3), and the aspect ratio is like that because the image is a frame from my webcam.

My image is really not (32, 640, 3). It should be (480, 640, 3),

We know that. So, we found that your program was not correct… :wink:

Hey, thanks for reaching out. I started training with input (480, 640, 3). But when I run the code below, I get the error below. A window showing my webcam did pop up, but then it crashed because of the error.

I suppose that cv2 does not accept a 4D tensor. Throughout your program, you are not controlling dimensions well. I think it would be better for you to check how the dimensions of each object change at each step.
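Here is a rough sketch of that kind of step-by-step check, using random numbers in place of a real prediction. It assumes the last axis of the model output holds per-class scores for 3 classes, which is a guess based on the (None, 480, 640, 3) output shape mentioned in this thread. describe is a hypothetical helper.

```python
import numpy as np

def describe(name, array):
    """Print the shape and dtype of an array (hypothetical debugging helper)."""
    print(f"{name}: shape={array.shape}, dtype={array.dtype}")

# Stand-in for the model output on a batch of one frame.
prediction = np.random.rand(1, 480, 640, 3).astype(np.float32)
describe("prediction", prediction)   # 4D: (1, 480, 640, 3)

# Pick the highest-scoring class per pixel (assumes the last axis is class scores).
class_map = np.argmax(prediction, axis=-1)
describe("class_map", class_map)     # 3D: (1, 480, 640)

# Drop the batch axis so a single 2D image remains.
mask = class_map[0]
describe("mask", mask)               # 2D: (480, 640)

# Spread class ids 0, 1, 2 over 0..254 and convert to uint8 for display.
display = (mask * (255 // 2)).astype(np.uint8)
describe("display", display)
```

An array like display, 2D or (height, width, channels), is the kind of thing an image window function such as cv2.imshow expects, whereas the raw 4D prediction is not.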

Hey @Nobu_Asai, I think it’s a 4D tensor because of the first element. In the summary, I see that the output is a 4D tensor, (None, 480, 640, 3), and I think the None is what makes it 4D. Or it is because of np.expand_dims. Below is the picture.