C3_W1_KMeans_Assignment: Image bits size calculation


I think I understand how we came up with the 65,920 bits calculation below, but I am not sure how this is tied to the size of the X_recovered array. How can I tell that X_recovered is actually taking 65,920 bits? Or is that determined dynamically in computer memory? Thanks!

Hi @Mo_Okasha, it’s likely not actually taking 4 bits per pixel. X_recovered is a NumPy array, so we can check its data type with X_recovered.dtype. For example, if it is np.int32, then each value in the array takes 32 bits rather than 4, and the total memory will be more than 65,920 bits.

If I am not mistaken, the smallest data types NumPy offers are 8-bit ones such as np.uint8, so even if we store just one value per pixel, $128 \times 128 \times 8$ bits in memory is the best we can do.
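Here is a quick way to check this yourself. It is only a sketch: the array below is a zero-filled stand-in with an assumed shape and dtype, so in the notebook you would inspect the real X_recovered instead, and its dtype may well be different.

```python
import numpy as np

# Stand-in for the assignment's X_recovered. The shape and dtype here are
# assumptions for illustration; check the real array in your notebook.
X_recovered = np.zeros((128, 128, 3), dtype=np.int32)

bits_per_value = X_recovered.dtype.itemsize * 8  # itemsize is in bytes
total_bits = X_recovered.nbytes * 8              # nbytes covers the whole buffer

print("dtype:", X_recovered.dtype)
print("bits per value:", bits_per_value)
print("total bits:", total_bits)

# One 8-bit value per pixel, the smallest NumPy layout mentioned above.
one_byte_per_pixel = np.zeros((128, 128), dtype=np.uint8)
uint8_bits = one_byte_per_pixel.nbytes * 8
print("total bits with one uint8 per pixel:", uint8_bits)
```

Both totals are larger than 65,920 bits, which is the point: the in-memory size follows from the dtype and the shape, not from the number of colors.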

However, when we save the compressed image as a file on disk, some image file formats allow us to encode each pixel with 4 bits and to store the 16 chosen colors in a custom color map.
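To make the 4-bits-per-pixel idea concrete, here is a minimal sketch that packs two 4-bit color indices into each byte using plain NumPy. It does not write any real image file format, and `palette` and `indices` are random, hypothetical stand-ins for the 16 centroid colors and the per-pixel assignments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 16 palette colors and one palette index per pixel.
palette = rng.integers(0, 256, size=(16, 3), dtype=np.uint8)
indices = rng.integers(0, 16, size=(128, 128), dtype=np.uint8)

# Pack two 4-bit indices into each byte: even positions go in the high
# nibble, odd positions in the low nibble.
flat = indices.ravel()
packed = (flat[0::2] << 4) | flat[1::2]

# Unpack again and look the colors up in the palette.
unpacked = np.empty(flat.size, dtype=np.uint8)
unpacked[0::2] = packed >> 4
unpacked[1::2] = packed & 0x0F
restored = palette[unpacked.reshape(indices.shape)]

print("packed index bits:", packed.nbytes * 8)
print("restored image shape:", restored.shape)
```

The packed buffer uses 4 bits per pixel, while `restored` is back to a full 128 x 128 x 3 array, which is the in-memory form we work with.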


So if it’s np.int32, does that mean every number in the X_recovered matrix is represented by 32 bits? If so, since X_recovered is a 3-dimensional array of shape (128, 128, 3), that means every pixel would consume $3 \times 32 = 96$ bits. Am I understanding this correctly? Sorry, I know this is beyond the Machine Learning course, but I was just curious. Thanks!

Hello @Mo_Okasha, let me give you a more complete picture of what’s going on.

Now we are facing two situations: (A) compressed image loaded in memory, and (B) compressed image stored on disk.

For (A), the total memory used depends on how you load the image into memory. In our assignment, it’s loaded as an array of size $128 \times 128 \times 3$. If we use np.int32 for the array, each array element uses 32 bits of memory, so we use $128 \times 128 \times 3 \times 32$ bits in total.

For (B), the total storage used depends on how you store the image on disk. The most compact way, though we didn’t actually do it in the assignment, is to store the 16 representative colors plus a mapping from each of the $128 \times 128$ pixels to one of those 16 colors. Since there are only 16 colors, each mapping entry is just an integer from 0 to 15, which takes only 4 bits. This is how we arrive at 65,920 bits in total.
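The arithmetic behind that total can be sketched as follows. The variable names are mine, and I am assuming each of the 16 colors is stored as 24 bits (8 bits per RGB channel), which is what makes the sum come out to 65,920.

```python
# Assumption: each palette color is stored as 3 channels of 8 bits each.
n_colors = 16
bits_per_channel = 8
height, width = 128, 128

palette_bits = n_colors * 3 * bits_per_channel   # the 16 representative colors
index_bits = (n_colors - 1).bit_length()         # bits needed for values 0..15
mapping_bits = height * width * index_bits       # one index per pixel
compressed_bits = palette_bits + mapping_bits

# For comparison: the uncompressed image at 24 bits per pixel.
original_bits = height * width * 3 * bits_per_channel

print("palette bits:", palette_bits)
print("mapping bits:", mapping_bits)
print("compressed total:", compressed_bits)
print("original total:", original_bits)
```

Under these assumptions the compressed representation is roughly a sixth of the original 24-bits-per-pixel size.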

So to answer your question, yes - if it’s np.int32, every number uses 32 bits and each pixel uses 96 bits. This does not save any memory, but that is simply a consequence of how we chose to load the image into memory. To see the benefit of compression, we focus on how the image is stored, which is (B).


Thanks @rmwkwok. That answers my question.