Decision boundary of ReLU activation function

Hi, I am working on the “Multi-Class Classification lab” of Week 2 of Advanced Learning Algorithms. When analysing the working of Layer 1 with the ReLU activation function, I can’t understand what the decision boundary of this function implies. In the case of the sigmoid function, the decision boundary was the line that separates a predicted probability above 0.5 from one below 0.5. Can someone explain what the decision boundary implies in this case (ReLU activation)?

ReLU doesn’t give you a decision boundary. It isn’t used for making predictions. It is only used in a hidden layer to provide a non-linear activation function.

Hey, Thanks for replying.
In the lab that I mentioned above (“Multi-Class Classification lab” of Week 2 of Advanced Learning Algorithms), layer 1 uses ReLU, and the solution shows that the two units of layer 1 separate the 4 coloured clusters of data in two distinct ways. How is this different from classifying the data?

I will check the details and reply again later.


The hidden layer L1 uses ReLU units. They aren’t really doing “classification” in the sense that the output of the entire model does.

Each ReLU unit is just drawing a line that splits the input data into two regions. Nothing in the model tells the L1 units exactly how to do this. Each unit just learns the weight and bias values that help to minimize the cost at the model’s output.
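For intuition, here is a minimal sketch (the weight and bias values are made up, not the ones the lab actually learns) showing that a single ReLU unit’s output, max(0, w·x + b), is zero on one side of the line w·x + b = 0 and grows linearly on the other:

```python
import numpy as np

# Hypothetical weights and bias for one hidden ReLU unit (not the lab's learned values)
w = np.array([1.0, -2.0])
b = 0.5

def relu_unit(x):
    """Output of a single ReLU unit: max(0, w.x + b)."""
    return np.maximum(0.0, np.dot(x, w) + b)

# Points on either side of the line w.x + b = 0
print(relu_unit(np.array([-2.0, 1.0])))  # 0.0 -> the "off" region
print(relu_unit(np.array([2.0, -1.0])))  # 4.5 -> the "on" region, grows with distance from the line
```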

It’s not shown clearly in the lab, but the true “decision boundaries” are at the output layer, where each unit learns to identify one class and reject the other three. That happens automatically because there are four output units, and the model is compiled to treat the linear outputs as logits (via `from_logits=True`) and to use “Sparse Categorical Crossentropy” as the cost (loss) function.
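In code, the setup looks roughly like this (the layer sizes match the lab; the optimizer and learning rate here are just placeholders, not the lab’s values):

```python
import tensorflow as tf

# Sketch of the lab's architecture: 2 ReLU hidden units, 4 linear output units
model = tf.keras.Sequential([
    tf.keras.layers.Dense(2, activation='relu', name='L1'),
    tf.keras.layers.Dense(4, activation='linear', name='L2'),  # outputs are logits, not probabilities
])

model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(0.01),  # optimizer and learning rate assumed
)
```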

During training, any errors in the output predictions are fed back into the hidden layer, where the two ReLU units each learn to identify a different pair of clusters - not because they’re specifically told to, but because that’s the solution that minimizes the cost.

Ok, so the 2 units in the ReLU layer are just segregating the area of the plot into two regions, depending on where the value of the activation function crosses zero. The shaded area in the plots for Layer 1 shows just that, i.e. the “line” where this transition occurs, though it’s still not a classification of any kind. To produce these regions, the units in layer 1 adjust their weights so as to ultimately minimise the output loss, and the real classification happens when the linear output of the second layer is passed through softmax. This gives the probability distribution for each unit of the output layer, as shown by the shaded probability areas in the 4 output plots.
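So the last step would be something like this (the logit values below are made up, just to illustrate converting the linear outputs into probabilities):

```python
import numpy as np
import tensorflow as tf

# Hypothetical logits from the linear output layer for one input example (4 classes)
logits = np.array([[2.1, -0.5, 0.3, -1.2]])

# Softmax turns the logits into a probability distribution over the 4 classes
probs = tf.nn.softmax(logits).numpy()
print(probs)             # four probabilities that sum to 1
print(np.argmax(probs))  # index of the most probable class
```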

Thanks a lot!!
