Binary classification - Simplest classification task

I’ve created for myself the most basic classification task.

The input has 2 training examples with one feature each, [[-1], [1]], and the outputs are [1, 0].
I've created a neural network with 1 neuron and a linear activation function, and the model uses binary crossentropy loss.

Here’s the code of the described example.

import numpy as np
import tensorflow as tf

features = np.array([[-1.], [1.]], dtype=np.float16)
labels = np.array([1., 0.], dtype=np.float16)

model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(1, activation='linear', dtype=tf.float16)
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-2),
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics='accuracy'
)

model.fit(features, labels, epochs=100000)  # batch_size is left at its default

If I'm correct, this task should be fairly easy to train to 100%, meaning I can get a loss of 0 (although the accuracy is already 100%). But the model is unable to overfit the training data to a loss of 0 (or to predict a probability of exactly 1 for '-1' and 0 for '1').
Am I missing something? Why doesn't the loss converge to 0?
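
For reference, here is how I check what the model actually predicts (a small sketch; since the Dense layer is linear and the loss uses from_logits=True, predict() returns logits that still need a sigmoid):

# Convert the raw logits from predict() into probabilities.
logits = model.predict(features)
probs = tf.sigmoid(logits)
print(probs.numpy())  # ideally close to [[1.], [0.]]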

If I turn this into a linear regression task by switching the loss from BinaryCrossentropy to MeanSquaredError, the model does converge to zero.
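
For completeness, the regression variant only differs in the compile call (a sketch reusing the same features, labels, and model as above):

# Same single-neuron model, compiled as a regression problem instead.
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-2),
    loss=tf.keras.losses.MeanSquaredError()
)
model.fit(features, labels, epochs=100000)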

A single dense layer with no hidden layer isn’t really an NN. It is just regression.

Try it without the logits, or try a different learning rate or more iterations.
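
Roughly something like this, if it helps (an untested sketch of what I mean by "without the logits"):

# Let the layer output a probability directly instead of a logit.
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(1, activation='sigmoid', dtype=tf.float16)
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-2),
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
    metrics='accuracy'
)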

Yeah, I did try a lot of things, but I wasn't able to get the logistic regression to learn.

Sorry, but I am currently on leave and don't have the ability to try your experiment.

@Michal_Majk_Ritcherd, were you able to make progress on this experiment?

Yes, I was.
I needed a much higher learning rate (2.55).
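
The only change was the optimizer's learning rate (a rough sketch):

# Same float16 model as before, just recompiled with a much larger learning rate.
model.compile(
    optimizer=tf.keras.optimizers.Adam(2.55),
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics='accuracy'
)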

But this did not work with tf.float32 and tf.float64; there I needed some changes, as follows:
I created a new model with 1 hidden layer of 6 neurons (ReLU activation), set a fairly high learning rate (1.29), and was able to train the NN, although it depends on the parameter initialization.
In one run I was able to train the NN in just 10 epochs; in another I wasn't (I tried 300,000 epochs, and possibly it would have learned with far more).
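
For reference, a sketch of the float32 setup I described (the names features32 and model32 are just for illustration):

# Two examples, one feature each, now in float32.
features32 = np.array([[-1.], [1.]], dtype=np.float32)
labels32 = np.array([1., 0.], dtype=np.float32)

# One hidden layer with 6 ReLU units, then a linear output (a logit).
model32 = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='linear')
])
model32.compile(
    optimizer=tf.keras.optimizers.Adam(1.29),
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics='accuracy'
)
model32.fit(features32, labels32, epochs=10)  # 10 epochs was enough in one run, but not in others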

But I'm still wondering why I can't train these 2 scenarios the same way as with tf.float16, i.e. with just the output layer…

That's not unusual for neural networks. Their cost function is not convex, so you can get stuck in a local minimum.

What weight and bias values did you get when it did converge?

Just FYI, I used your data set of two examples, using both logistic regression and an NN with one hidden layer, implemented in a different toolset, and it converged very quickly.

When it converged, I got weight = [[-70.1]] and bias = [-16.4]. Another run gives different parameters, which is expected.
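
I read them out like this (sketch):

# The Dense layer's kernel and bias after training.
weights, bias = model.layers[-1].get_weights()
print(weights, bias)  # e.g. [[-70.1]] and [-16.4]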

implemented in a different toolset, and it converged very quickly

What toolset did you use? I'm still not sure why I can't train the logistic regression with a different data type such as tf.float32 or tf.float64. I know the updates will be small, but in my case it gets stuck and doesn't change at all…

Anyway thanks for your time.

I don’t think the size of the float data has anything to do with your issue.

I agree with you, maybe I have a bug somewhere in my code :smiley: