Sequential vs KerasClassifier


I have the following simple neural network:

# Neural Network
def create_nn_clf(number_of_features, first_layer_neurons=25, second_layer_neurons=15):
    nn_clf = Sequential([
        Dense(first_layer_neurons, activation='relu', input_shape=(number_of_features,)),  # input_shape seems necessary for KerasClassifier below
        Dense(second_layer_neurons, activation='relu'),
        Dense(1, activation='linear')
    ])
    return nn_clf

The easiest way of training is:

nn_clf_1 = create_nn_clf(number_of_features=number_of_features)
nn_clf_1.compile(
    optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=learning_rate),
    loss=BinaryCrossentropy(from_logits=True),
)
nn_clf_1.fit(X_train, y_train_5, epochs=epochs, batch_size=batch_size, verbose=1)
y_predict_1 = nn_clf_1.predict(X_train)

However, sometimes the above approach is not sufficient, for example if you want to do cross-validation via sklearn.model_selection. In that case (as I learned from the internet), we can use KerasClassifier as follows; a cross-validation sketch follows the snippet:

# Second estimator
nn_clf_2 = KerasClassifier(
    create_nn_clf,
    number_of_features=number_of_features,
    epochs=epochs,
    batch_size=batch_size,
    verbose=1,
    loss=BinaryCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=learning_rate),
)
nn_clf_2.fit(X_train, y_train_5)
y_predict_2 = nn_clf_2.predict(X_train)
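
With the wrapper in place, the cross-validation use case is then a one-liner (a minimal sketch using the nn_clf_2 above; the cv and scoring values are illustrative):

from sklearn.model_selection import cross_val_score

# cross_val_score clones and refits the wrapped estimator on each fold
cv_scores = cross_val_score(nn_clf_2, X_train, y_train_5, cv=3, scoring="accuracy")
print("fold accuracies:", cv_scores, "mean:", cv_scores.mean())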

However, y_predict_1 and y_predict_2 are noticeably different (I expected them to be the same). For example, if we plot precision vs. recall, we get the following.
[Figure: precision-recall curves, bare Sequential model vs. KerasClassifier wrapper]
Why? Should I trust KerasClassifier?

Entire code is:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.losses import BinaryCrossentropy 
from scikeras.wrappers import KerasClassifier
from sklearn.metrics import precision_recall_curve

from sklearn.datasets import fetch_openml

# mnist dataset
mnist = fetch_openml('mnist_784', as_frame=False)
X, y = mnist.data, mnist.target
X_train, y_train = X[:60000], y[:60000]
# binary classification if the digit is 5 or not
y_train_5 = (y_train == '5')

# Neural Network
def create_nn_clf(number_of_features, first_layer_neurons=25, second_layer_neurons=15):
    nn_clf = Sequential([
        Dense(first_layer_neurons, activation='relu', input_shape=(number_of_features,)),  # input_shape seems necessary for KerasClassifier below
        Dense(second_layer_neurons, activation='relu'),
        Dense(1, activation='linear')
    ])
    return nn_clf

# Common parameters
number_of_features = X_train.shape[1]
learning_rate = 1e-5
epochs = 20
batch_size = 50

# First estimator
nn_clf_1 = create_nn_clf(number_of_features=number_of_features)
nn_clf_1.compile(
    optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=learning_rate),
    loss=BinaryCrossentropy(from_logits=True),
)
nn_clf_1.fit(X_train, y_train_5, epochs=epochs, batch_size=batch_size, verbose=1)
y_predict_1 = nn_clf_1.predict(X_train)
precisions_1, recalls_1, thresholds_1 = precision_recall_curve(y_train_5, y_predict_1)

# Second estimator
nn_clf_2 = KerasClassifier(
    create_nn_clf,
    number_of_features=number_of_features,
    epochs=epochs,
    batch_size=batch_size,
    verbose=1,
    loss=BinaryCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=learning_rate),
)
nn_clf_2.fit(X_train, y_train_5)
y_predict_2 = nn_clf_2.predict(X_train)
precisions_2, recalls_2, thresholds_2 = precision_recall_curve(y_train_5, y_predict_2)

# Compare predictions
plt.figure(figsize=(6, 5))  # extra code – not needed, just formatting

plt.plot(recalls_1, precisions_1, "b-", linewidth=2,
         label="Bare sequential")
plt.plot(recalls_2, precisions_2, "--", linewidth=2, label="KerasClassifier")

plt.xlabel("Recall")
plt.ylabel("Precision")
plt.axis([0, 1, 0, 1])
plt.grid()
plt.legend(loc="lower left")
plt.savefig("sequential_vs_kerasclassifier.png")
plt.show()

Hello @Toyomu_Matsuda,

Interesting work!

Your graph does not just show the problem; it is also a strong lead to the cause of the problem.

To draw the precision-recall curve, you need to give it the predictions in terms of probabilities. The questions are:

  1. Why probabilities?

  2. Are you giving it the probabilities for both the first and the second estimators?

  3. How did you verify that they are probabilities? E.g. did you print them out? (A quick check is sketched below.)
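
A quick check along those lines, using the y_predict_1 and y_predict_2 arrays from the posted code (the exact values will differ from run to run):

# Print a few predictions from each estimator to see what they actually are
print(y_predict_1[:5].ravel())               # first estimator's raw outputs
print(y_predict_2[:5])                       # wrapper's .predict() outputs
print(y_predict_1.min(), y_predict_1.max())  # values outside [0, 1] cannot be probabilities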

Cheers,
Raymond

Note that we don’t do the investigation for you, so: what have you done to investigate? Your investigation work is something we can discuss.

For example, you expect the results to be the same; did you then make sure the networks (including their weights) started out the same? How did you verify that the weights were the same?

(If you don’t know how to print out the weights or how to make the weights the same, there is no need to delay and wait for someone’s answer, because we can google and research it ourselves :wink: )
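
For instance, a minimal sketch of seeding and comparing initial weights (assuming TensorFlow 2.7+ for tf.keras.utils.set_random_seed; the seed value is arbitrary):

import numpy as np
import tensorflow as tf

tf.keras.utils.set_random_seed(42)  # seeds Python, NumPy and TensorFlow at once
model_a = create_nn_clf(number_of_features)

tf.keras.utils.set_random_seed(42)  # reseed so the second build starts identically
model_b = create_nn_clf(number_of_features)

# Compare every weight array of the two freshly built models
same = all(np.array_equal(wa, wb)
           for wa, wb in zip(model_a.get_weights(), model_b.get_weights()))
print("identical initial weights:", same)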

You’re comparing two fundamentally different types of models.

  • Your first one has a linear (i.e. real-number) output.
  • Your second is a classifier.

It seems obvious you’re going to get different types of results.
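
A quick sanity check makes that type difference visible (a sketch reusing y_predict_1 and y_predict_2 from the posted code; the exact dtypes may vary with the SciKeras version):

import numpy as np

# The bare Sequential model returns continuous values (logits, given the linear
# output layer); the wrapper's .predict() returns discrete class labels.
print(y_predict_1.dtype, np.unique(y_predict_1).size)  # float, many distinct values
print(y_predict_2.dtype, np.unique(y_predict_2).size)  # label-like, very few distinct values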

Thank you very much. By changing KerasClassifier to KerasRegressor, both now yield similar precision-vs-recall curves (not exactly the same, though).
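
For later readers, a minimal sketch of that swap (assuming the same build function and parameters as in the posted code; KerasRegressor also lives in scikeras.wrappers, and its predict returns the model's raw continuous outputs, which precision_recall_curve can use as scores):

from scikeras.wrappers import KerasRegressor

# Same setup as nn_clf_2, but wrapped as a regressor so predict() returns
# continuous scores (here: logits) instead of hard class labels
nn_reg = KerasRegressor(
    create_nn_clf,
    number_of_features=number_of_features,
    epochs=epochs,
    batch_size=batch_size,
    verbose=1,
    loss=BinaryCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=learning_rate),
)
nn_reg.fit(X_train, y_train_5)
y_scores = nn_reg.predict(X_train)
precisions, recalls, thresholds = precision_recall_curve(y_train_5, y_scores)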