Great question! Here is how the reduction works. Initially, we can use up to 256 * 256 * 256 = 16,777,216 colors to represent the whole color space, since each of R, G, and B takes one of 256 values (0 to 255). Of course, the photo does not use each and every one of these colors, so the actual number of colors used is smaller.
Although it is smaller, it is still a very large number, so what if we could reduce it to something like 16, meaning we pick the 16 colors that best represent the photo among all 16,777,216 possible choices?
Achieving this would be awesome, because afterwards we only need to remember 16 colors (instead of 16,777,216), and at each pixel we no longer need three numbers in the range 0-255 to describe a color by its RGB values; instead we need only one number in the range 0-15. For example, 0 can represent red, 1 can represent white, 3 can represent brown, and so on. The numbers are no longer a continuous representation of how the intensity of Red, Green, or Blue changes; each number from 0 to 15 stands for one discrete color that is representative of the photo in question. Note that it takes only 4 bits (not 16 bits) to remember 16 distinct numbers, since 2^4 = 16.
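To make the savings concrete, here is a minimal, runnable sketch of the arithmetic. The 128 x 128 photo size is an assumption I am making just so the pixel count comes out to 16384, and the random pixel values are placeholders, not real image data:

```python
import numpy as np

# Hypothetical 128 x 128 RGB photo (128 * 128 = 16384 pixels), values 0-255.
photo = np.random.randint(0, 256, size=(128, 128, 3), dtype=np.uint8)

# Flatten to one row per pixel: shape (16384, 3).
pixels = photo.reshape(-1, 3)

# Before compression: 3 channels x 8 bits = 24 bits per pixel.
bits_before = pixels.shape[0] * 24

# After compression: one 4-bit index per pixel, plus the 16-color
# palette itself (16 colors x 24 bits).
bits_after = pixels.shape[0] * 4 + 16 * 24

print(bits_before, bits_after)  # 393216 vs 65920, roughly a 6x reduction
```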
The shape of centroids is (16, 3) because there are 16 centroids, and each centroid is a color represented by its R, G, and B values.
The shape of idx is (16384,) because the flattened photo has 16384 pixels, and idx contains, for each pixel, the ID (between 0 and 15) of the closest centroid. In other words, it records which of the 16 colors will be used to represent each pixel.
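In case it helps, here is a small sketch of how idx relates to centroids. The values here are random placeholders standing in for real K-means output (in the assignment, centroids would come out of the K-means loop), and the pixel values are assumed to be scaled to [0, 1]:

```python
import numpy as np

pixels = np.random.rand(16384, 3)   # flattened photo, one RGB row per pixel
centroids = np.random.rand(16, 3)   # the 16 chosen colors

# Distance from every pixel to every centroid, via broadcasting:
# (16384, 1, 3) - (1, 16, 3) -> (16384, 16, 3), then norm over the last axis.
dists = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)

# For each pixel, the ID (0-15) of the closest centroid.
idx = np.argmin(dists, axis=1)

print(centroids.shape)  # (16, 3)
print(idx.shape)        # (16384,)
```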
centroids[idx, :]

will return an array of shape (16384, 3), because for each pixel it takes the centroid whose ID is stored in idx out of the centroids array. (This is NumPy's integer, or "fancy", indexing: idx is simply used as a list of row indices.)
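Here is a tiny demonstration of that lookup, again with placeholder data (and the same assumed 128 x 128 photo size):

```python
import numpy as np

centroids = np.random.rand(16, 3)            # 16 colors
idx = np.random.randint(0, 16, size=16384)   # one color ID per pixel

# Row idx[i] of centroids is selected for pixel i, so the result
# has one RGB row per pixel.
recovered = centroids[idx, :]
print(recovered.shape)  # (16384, 3)

# Reshape back to the original image layout to display it.
photo_compressed = recovered.reshape(128, 128, 3)
```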
Raymond