After plotting the decision boundary based on the values of w (the weight vector) and b (the bias) obtained from the gradient descent algorithm, there’s a markdown cell that states: “In the plot above, the decision boundary is the line where the probability = 0.5.”
Why is this probability set to 0.5? From my understanding, the threshold could be set to other values (e.g., 0.6 or 0.7). So why is 0.5 commonly chosen as the default threshold?
The 0.5 probability threshold is typically chosen by default because logistic regression is often used for binary classification tasks, where the model predicts the probability that the input belongs to class 1. At a probability of 0.5 the model considers both classes equally likely, and the sigmoid outputs exactly 0.5 when w·x + b = 0, which is the line drawn in the plot. In this case, the data points labeled 𝑦=1 are shown as red crosses, while the data points labeled 𝑦=0 are shown as blue circles.
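To make the link between 0.5 and the plotted line concrete, here is a minimal sketch in plain Python. The values of `w` and `b` and the three sample points are made up for illustration; they are not the values from the lab.

```python
import math

def sigmoid(z):
    """Logistic function: maps a real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """Predicted probability that y = 1 for the feature vector x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# Hypothetical parameters, chosen only for illustration
w = [1.0, 1.0]
b = -3.0

on_line = [1.0, 2.0]  # x0 + x1 - 3 = 0, so the point lies on the line
above = [2.0, 2.0]    # z = +1
below = [1.0, 1.0]    # z = -1

p_on = predict_proba(on_line, w, b)
p_above = predict_proba(above, w, b)
p_below = predict_proba(below, w, b)

print(p_on, p_above, p_below)
```

Points that satisfy w·x + b = 0 get a probability of 0.5, points with a positive score get more than 0.5, and points with a negative score get less.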
Thanks for answering. So just to make sure I understand this correctly: if the threshold were set to 0.7, does that mean the decision boundary is the line where the probability is equal to 0.7?
Not exactly. Changing the threshold to 0.7 doesn’t change the location of the decision boundary itself, but it does change the point at which you decide class membership. The decision boundary is the set of points where the model predicts a probability of exactly 0.5, that is, where the model is “on the fence” between classes. This line is determined by the weights w and bias b learned by the model.
By changing the threshold to 0.7, you’re adjusting the classification criterion so that only points with a predicted probability of 0.7 or greater are classified in the positive class (class 1), while those below 0.7 are classified in the negative class (class 0).
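As a rough illustration of that rule, the sketch below classifies the same point with two different thresholds. The `classify` helper, the parameters, and the point are all hypothetical and only meant to show the mechanics.

```python
import math

def sigmoid(z):
    """Logistic function: maps a real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def classify(x, w, b, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if sigmoid(z) >= threshold else 0

# Hypothetical parameters and point, for illustration only
w = [1.0, 1.0]
b = -3.0
x = [1.5, 2.0]  # z = 0.5, so the predicted probability is about 0.62

label_default = classify(x, w, b)                # threshold 0.5
label_strict = classify(x, w, b, threshold=0.7)  # threshold 0.7

print(label_default, label_strict)
```

The predicted probability for the point is the same in both calls; only the label assigned to it differs.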
So while the decision rule changes, the decision boundary itself (the line where the probability is 0.5) remains the same. What the threshold adjustment changes is how much of the region on either side of that line is classified as positive or negative, making the model more or less selective about the positive class.
@nadtriana, I think @Tera_Byte’s summary is right. The whole idea of the decision boundary is to show the boundary we use to decide whether we’re putting an item in one class or the other. Points on one side of the boundary are in one class and points on the other side are in the other class. So, if you change the predicted probability you’re using to decide which class an item falls into from 0.5 to 0.7, then you are, by definition, changing the decision boundary.
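One way to see this numerically: for a threshold t, the line separating the predicted classes is w·x + b = log(t / (1 - t)), because that is the score at which the sigmoid equals t. The sketch below assumes two features and made-up values of `w` and `b`; the helper names are hypothetical.

```python
import math

def logit(p):
    """Inverse of the sigmoid: the score z at which sigmoid(z) equals p."""
    return math.log(p / (1.0 - p))

def dividing_line_x1(x0, w, b, threshold):
    """x1 coordinate on the line w[0]*x0 + w[1]*x1 + b = logit(threshold).

    Assumes exactly two features and w[1] != 0.
    """
    return (logit(threshold) - b - w[0] * x0) / w[1]

# Hypothetical parameters, for illustration only
w = [1.0, 1.0]
b = -3.0

x1_at_05 = dividing_line_x1(0.0, w, b, 0.5)  # where the 0.5 line crosses x0 = 0
x1_at_07 = dividing_line_x1(0.0, w, b, 0.7)  # where the 0.7 line crosses x0 = 0

print(x1_at_05, x1_at_07)
```

With these values, the line for a threshold of 0.7 is parallel to the 0.5 line and shifted toward the positive side. The 0.5 contour itself depends only on w and b and is the same in both cases.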
@Wendy, yes, but there’s a slight difference in what exactly changes when we move the threshold from 0.5 to 0.7. The decision boundary in binary classification is defined as the line or region where the model predicts a probability of 0.5. Adjusting the threshold changes the sensitivity of the classification, but it does not change the position of that line, which is fixed by the model weights. The change shifts the decision criterion without physically moving the boundary set by the weights.