C2_W1_Lab02_CoffeeRoasting_TF - Logistic regression vs Neural Network

Hi - While looking at the coffee roasting neural network example, I was curious how simple logistic regression would behave on this dataset of 200 entries. So I added the code below:
from sklearn.linear_model import LogisticRegression

lr_model = LogisticRegression()
lr_model.fit(Xn, Y)  # Xn, Y: the normalized features and labels from the lab
y_pred = lr_model.predict(Xn)

But there are only 7 good-roast predictions (y_pred == 1) instead of the 43 in the original dataset. Since I am making predictions on the same dataset that was used for training, shouldn't I get 100% accuracy?

I used the code below to check the accuracy, and it shows 75%:

print("Accuracy on training set:", lr_model.score(Xn, Y))

What am I doing wrong here?

Also, when I try to plot the predictions using the code below, I don't get 'x' markers for the bad roast predictions (y_pred == 0):

X_t = X[:,0]  # temperature values for x axis
X_d = X[:,1]  # duration values for y axis
X_t_0 = X_t[y_pred == 0]
X_t_1 = X_t[y_pred == 1]
X_d_0 = X_d[y_pred == 0]
X_d_1 = X_d[y_pred == 1]
y_p_0 = y_pred[y_pred == 0]
y_p_1 = y_pred[y_pred == 1]
plt.scatter(X_t_0, X_d_0, y_p_0, marker='x', c='b')
plt.scatter(X_t_1, X_d_1, y_p_1, marker='o', c='r')

Hi @Madhav76,

Logistic Regression is a linear model, meaning it tries to fit a linear decision boundary in the feature space. Because the data is not linearly separable (which you can see on the plot), the model cannot perfectly classify all points, even on the training data. This explains the 75% accuracy and why y_pred doesn’t match the ground truth labels perfectly.
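To see this concretely, you can draw the boundary the plain model learned. For two features, logistic regression's decision boundary is the straight line w1*x1 + w2*x2 + b = 0, so you can plot it directly from the fitted coefficients. A minimal sketch, assuming the lr_model you fitted on Xn above (Y is flattened first in case it is a column vector):

import numpy as np
import matplotlib.pyplot as plt

w1, w2 = lr_model.coef_[0]   # learned weights for temperature and duration
b = lr_model.intercept_[0]   # learned bias
y = Y.ravel()                # flatten in case Y is a column vector

# Boundary: w1*x1 + w2*x2 + b = 0  =>  x2 = -(w1*x1 + b) / w2
x1 = np.linspace(Xn[:, 0].min(), Xn[:, 0].max(), 100)
x2 = -(w1 * x1 + b) / w2

plt.plot(x1, x2, 'k--', label='LR decision boundary')
plt.scatter(Xn[y == 0, 0], Xn[y == 0, 1], marker='x', c='b', label='Bad Roast')
plt.scatter(Xn[y == 1, 0], Xn[y == 1, 1], marker='o', c='r', label='Good Roast')
plt.legend()
plt.show()

A single straight line simply cannot wrap around the good-roast region, so some points end up on the wrong side no matter how the weights are chosen.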

You can try adding non-linear features, which can significantly improve the performance of the LR model:

from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(Xn)
lr_model = LogisticRegression()
lr_model.fit(X_poly, Y)
y_pred = lr_model.predict(X_poly)
print("Accuracy on training set with polynomial features:", lr_model.score(X_poly, Y))
Accuracy on training set with polynomial features: 0.93
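For what it's worth, PolynomialFeatures(degree=2, include_bias=False) expands the two inputs into five columns: x1, x2, x1^2, x1*x2, x2^2. You can confirm this with get_feature_names_out (available in scikit-learn >= 1.0):

poly = PolynomialFeatures(degree=2, include_bias=False)
poly.fit(Xn)
print(poly.get_feature_names_out(['temp', 'duration']))
# expected: ['temp' 'duration' 'temp^2' 'temp duration' 'duration^2']

The squared and cross terms are what let a linear classifier trace a curved boundary in the original (temperature, duration) plane.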

Thank you so much @conscell for the explanation. Really appreciate it!


@Madhav76

Here is how you can plot the decision boundary for the model trained with polynomial features:

import numpy as np
import matplotlib.pyplot as plt

# Create a grid of temperature (X_t) and duration (X_d) values
X_t = Xn[:,0]  # Temperature values for x axis
X_d = Xn[:,1]  # Duration values for y axis
y = Y.ravel()  # Flatten the labels in case Y is a column vector
x_min, x_max = X_t.min() - 1, X_t.max() + 1
y_min, y_max = X_d.min() - 1, X_d.max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 100), np.linspace(y_min, y_max, 100))

# Combine the grid points into a feature matrix
grid_points = np.c_[xx.ravel(), yy.ravel()]

# Transform the grid points with polynomial features
grid_poly = poly.transform(grid_points)

# Predict using the logistic regression model
Z = lr_model.predict(grid_poly)
Z = Z.reshape(xx.shape)  # Reshape to match the grid

# Plot the decision boundary
plt.contourf(xx, yy, Z, alpha=0.5, cmap='coolwarm')

# Scatter plot the original data
plt.scatter(X_t[y == 0], X_d[y == 0], marker='x', c='b', label='Bad Roast')
plt.scatter(X_t[y == 1], X_d[y == 1], marker='o', c='r', label='Good Roast')

# Labels and legend
plt.xlabel('Temperature (Normalized)')
plt.ylabel('Duration (Normalized)')
plt.legend()
plt.title('Decision Boundary with Polynomial Features')
plt.show()
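If you prefer a single crisp boundary curve instead of filled regions, you can also contour the model's predicted probability at the 0.5 decision threshold. A small variation on the code above, reusing the same grid_poly, xx, and yy:

# Probability of class 1 over the grid, contoured at the 0.5 threshold
Z_prob = lr_model.predict_proba(grid_poly)[:, 1].reshape(xx.shape)
plt.contour(xx, yy, Z_prob, levels=[0.5], colors='k', linestyles='--')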
