Understanding One Hot Coding

Praveen_Titus_F · December 9, 2022, 4:43pm

Hi there,

When there are 3 different ear shapes, three columns are created using one-hot encoding.

But when there are two different face shapes, only one column is used. Why?

Thanks in advance !!!

Juan_Olano · December 9, 2022, 5:03pm

Hi @Praveen_Titus_F ,

When you only have 2 classes, like 2 different faces, you have basically a binary situation.

In the case of the 2 faces, say Happy face and Sad face:

When the column has ‘1’ in the Happy face, what happens with the sad face? If happy face is 1, then sad face must be zero, right?

Conversely, when the column has ‘0’ in the Happy face, what happens with the sad face? If happy face is 0, then sad face must be 1, right?

That’s why when you only have 2 classes, you can ‘optimize’ your One-Hot Encoding a bit, and instead of using 2 columns, you can use only one.
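As a minimal sketch of that idea in plain Python (the `faces` list is made-up sample data, not from the course), the second column can always be rebuilt from the first:

```python
# Hypothetical sample data: two face classes, as in the example above.
faces = ["happy", "sad", "sad", "happy"]

# Full one-hot encoding: one column per class.
happy_col = [1 if f == "happy" else 0 for f in faces]
sad_col = [1 if f == "sad" else 0 for f in faces]

# With only two classes, the second column is always 1 minus the first,
# so it carries no extra information and can be dropped.
sad_from_happy = [1 - h for h in happy_col]
print(sad_from_happy == sad_col)  # True
```

Because `sad_col` can be recovered exactly from `happy_col`, keeping only one of them loses nothing.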

And this can be generalized for the case of more classes. I invite you to try to solve a more efficient One-Hot encoding for 4 classes. Can you do it with 3 columns?

Juan

Praveen_Titus_F · December 9, 2022, 5:10pm

Hi @Juan_Olano ,
So if I have 4 categories (red, blue, green, yellow) in a variable, then in total there will be four dummy columns, one for each color.
Am I correct?

Juan_Olano · December 9, 2022, 5:11pm

Yes, you can certainly use 4 columns. But my challenge to you is: Can you do it with 3 columns?

Praveen_Titus_F · December 9, 2022, 5:14pm

For 4 categories in a variable, there would be 4 - 1 = 3 dummy columns.
So if there are k categories, then k - 1 dummy columns can be used, but this seems confusing; that’s the problem.

Juan_Olano · December 9, 2022, 5:17pm

Ok, let me explain. In your example of 4 colors, red, blue, green, and yellow:

Let’s say I use 3 columns: one for red, one for blue, and one for green. What happens if all 3 columns are zero?

If all 3 columns, red, blue, and green, are zero, then what’s left? Yellow, right?
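To make the mental exercise concrete, here is a small sketch in plain Python. The helper names `encode` and `decode` are made up for illustration; they are not part of any library or of the course material.

```python
# Hypothetical helpers; the colors come from the example above.
COLUMNS = ["red", "blue", "green"]  # the three columns we keep
LEFT_OUT = "yellow"                 # the class with no column of its own

def encode(color):
    """Return the 3-column encoding; yellow becomes all zeros."""
    if color != LEFT_OUT and color not in COLUMNS:
        raise ValueError(f"unknown color: {color}")
    return [1 if color == c else 0 for c in COLUMNS]

def decode(row):
    """Recover the color by reading the three columns together."""
    if sum(row) == 0:
        return LEFT_OUT
    return COLUMNS[row.index(1)]

for color in ["red", "blue", "green", "yellow"]:
    row = encode(color)
    print(color, row, decode(row))
```

Every color survives the round trip, which is the sense in which 3 columns are enough to tell 4 classes apart.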

I am not telling you to always use k-1 columns for k classes when one-hot encoding. I am just suggesting a mental exercise to understand your original question.

What do you think?

Praveen_Titus_F · December 9, 2022, 5:20pm

Does the algorithm automatically assume that if all 3 are 0, then the left-out one is 1?

Juan_Olano · December 9, 2022, 5:20pm

Just to make sure: the easiest way to do it is having the same number of columns as number of classes:

2 classes, 2 columns
3 classes, 3 columns
and so on…
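A rough sketch of this "one column per class" approach in plain Python follows. The function name `one_hot` and the ear-shape labels are assumptions made for illustration only.

```python
def one_hot(values):
    """Encode a list of labels with one column per distinct class."""
    classes = sorted(set(values))  # fixed, repeatable column order
    rows = [[1 if v == c else 0 for c in classes] for v in values]
    return classes, rows

# Hypothetical labels for a 3-class feature.
classes, rows = one_hot(["pointy", "floppy", "oval", "pointy"])
print(classes)  # ['floppy', 'oval', 'pointy']
for row in rows:
    print(row)
```

Each row contains exactly one 1, which is where the name "one-hot" comes from.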

Juan_Olano · December 9, 2022, 5:22pm

Going back to the original post of FACE SHAPE: Round/Not Round, yes, the algorithm will learn that automatically. It will know if a shape is ROUND or NOT ROUND.

Praveen_Titus_F · December 9, 2022, 5:22pm

Hi @Juan_Olano
I understand the logic, but how is Yellow assumed in our algorithm?

Juan_Olano · December 9, 2022, 5:24pm

Right, if you have 4 different classes but only 3 columns, then when the ground truth is not red, blue, or green, it falls through to ‘other’, which in this case is ‘yellow’.

And again, for simplicity’s sake and for clarity to any reader of the model, it may be clearer to have as many columns as there are classes being one-hot encoded.

And to reiterate, and sorry if I caused confusion, my intention was to help you expand your intuition on the answer to your original question.

Praveen_Titus_F · December 9, 2022, 5:26pm

Definitely, now it’s making sense. Thank you, @Juan_Olano, for helping me understand better.

Juan_Olano · December 9, 2022, 5:26pm

I am very glad!!! Thanks for following through!!!

TMosh · December 9, 2022, 6:00pm

With two shapes, you can either use two inputs with one-hot coding, or use a single feature and re-frame the data labels as a true/false condition. For example, it might be "Pointy ears?" with a true/false answer. False would imply the other ear shape.

There’s no difference in the information content, so the results should be the same.

rmwkwok · December 10, 2022, 1:44am

Hello all!

Thank you for this wonderful discussion! I would just like to make a summary of this topic.

In many cases (decision trees in particular, and I would also include neural networks, linear regression, and logistic regression), the one-hot encoded variables aren’t treated as a separate group of variables.

To a human’s eye, if I represent 4 categories with 3 columns, I can tell it’s the 4th category when all 3 columns equal zero. Here, I am treating the 3 columns as a related group.

To an ML algorithm’s eye, each of those encoded columns is just one more feature, and if an encoded feature (representing color = Red) is equal to zero, it simply means “Not Red” rather than the 4th color. “Not Red” can mean any of the 3 other colors, not just the 4th.

The case of a binary (this or that) category is special, because we don’t care about the difference between “Not this” and “that”; in other words, we assume “Not this = that”, and that assumption gets rid of the second column.
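The difference between reading one column and reading the group can be sketched in a few lines of plain Python. This assumes the 3-column layout from earlier in the thread, with yellow as the left-out color.

```python
COLORS = ["red", "blue", "green", "yellow"]
COLUMNS = ["red", "blue", "green"]  # assumption: yellow has no column

# Encoded row for every color.
rows = {c: [1 if c == col else 0 for col in COLUMNS] for c in COLORS}

# Looking at the "red" column alone: which colors have red == 0?
not_red = [c for c, row in rows.items() if row[0] == 0]
print(not_red)  # ['blue', 'green', 'yellow']

# Only when all three columns are read together does yellow stand out.
all_zero = [c for c, row in rows.items() if sum(row) == 0]
print(all_zero)  # ['yellow']
```

A single test on the red column, such as one decision-tree split, cannot separate yellow from blue or green; that takes further tests on the other columns.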

Raymond
