Input shape of images in convnet

Med-akraou · May 15, 2023, 7:04pm

Hello community,
I would like to know why, in this course, the input shape of an RGB image is height * width * channels. Since an RGB image is three matrices (red, green, and blue), wouldn't the logical representation be channels * height * width?

rmwkwok · May 15, 2023, 8:33pm

Hi @Med-akraou,

Both channels-first and channels-last are valid representations of a multichannel image.

This article explains, in the context of PyTorch, why channels-last is preferred for better performance.

Unfortunately, I can't find a similar discussion for TensorFlow at the moment, so you may need to search yourself if you would like to check out some benchmarks.

This doesn’t really explain why the course chose channels-last, but it should give you a direction to consider: the choice depends more on performance than on which layout looks more logical.

Cheers,
Raymond

PS: TensorFlow supports both in many CNN operations, so you could test them out and see which runs faster.

paulinpaloalto · May 15, 2023, 8:34pm

You’re right that there are two ways to format images: “channels first” and “channels last”. In this course, they have chosen the “channels last” orientation, possibly because that is the default orientation used by TensorFlow. And when you have a batch of inputs, the first dimension is the “samples” dimension. So we have 4D tensors with dimensions

samples x height x width x channels

The orientation is a choice, but the choice has been made for us by the course staff in this case. When you are working on your own, you can make a different choice.
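To make the two layouts concrete, here is a minimal NumPy sketch. The sizes and the names `red`, `green`, `blue`, and `batch` are made up for illustration and do not come from the course notebooks.

```python
import numpy as np

height, width = 4, 5  # arbitrary small sizes, for illustration only

# Three separate colour planes: the "three matrices" view of an RGB image.
red = np.zeros((height, width), dtype=np.uint8)
green = np.ones((height, width), dtype=np.uint8)
blue = np.full((height, width), 2, dtype=np.uint8)

# Channels first: the planes are stacked along a new leading axis.
channels_first = np.stack([red, green, blue], axis=0)   # shape (3, 4, 5)

# Channels last: the planes are stacked along a new trailing axis,
# so each pixel position holds its own (R, G, B) triple.
channels_last = np.stack([red, green, blue], axis=-1)   # shape (4, 5, 3)

# A batch adds a leading "samples" dimension in front of the image dimensions.
batch = np.stack([channels_last, channels_last], axis=0)  # shape (2, 4, 5, 3)

print(channels_first.shape, channels_last.shape, batch.shape)
```

Both arrays hold exactly the same numbers. Only the order of the axes differs.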

Med-akraou · May 17, 2023, 8:56am

Therefore, should we transform images from channels * height * width to height * width * channels?

rmwkwok · May 17, 2023, 9:17am

Test which works better in your case, and justify your decision with the test results.
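If you do need to convert, a rough sketch with NumPy could look like the following. The array names are hypothetical, and it assumes your data starts out channels-first. Note that the axes are permuted with `np.transpose`; a plain `reshape` to the target shape would scramble the pixels.

```python
import numpy as np

# A hypothetical channels-first image: 3 channels, height 4, width 5.
chw = np.arange(3 * 4 * 5).reshape(3, 4, 5)

# Move the channel axis to the end: (C, H, W) -> (H, W, C).
hwc = np.transpose(chw, (1, 2, 0))

# The same idea for a batch: (N, C, H, W) -> (N, H, W, C).
nchw = np.stack([chw, chw], axis=0)
nhwc = np.transpose(nchw, (0, 2, 3, 1))

# Transposing back recovers the original array, so no data is lost.
restored = np.transpose(hwc, (2, 0, 1))

print(hwc.shape, nhwc.shape, np.array_equal(restored, chw))
```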

Cheers,
Raymond
