I'm just curious: why does the prediction layer in alpaca_model use a linear activation instead of a sigmoid, given that it is a binary classifier?
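Because they use the binary cross-entropy loss function in “from_logits=True” mode. In that mode the loss applies the sigmoid internally, so the output layer is left linear and emits raw logits. Here’s the cell that defines that: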
base_learning_rate = 0.001
# `learning_rate` replaces the deprecated `lr` argument
model2.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
               loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
               metrics=['accuracy'])
Have a look at the documentation for that loss function and look at the meaning of the from_logits parameter.
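To make the from_logits idea concrete, here is a minimal pure-Python sketch. It is not the Keras implementation: the helper names (sigmoid, bce_from_probability, bce_from_logit) and the sample labels and logits are made up for illustration. The logit form uses the standard numerically stable formula max(z, 0) - z*y + log(1 + exp(-|z|)).

```python
import math

def sigmoid(z):
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def bce_from_probability(y_true, p):
    # Binary cross-entropy when the model already outputs a probability
    # (the situation that from_logits=False assumes).
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))

def bce_from_logit(y_true, z):
    # Binary cross-entropy computed directly from a raw logit
    # (the situation that from_logits=True assumes).
    return max(z, 0.0) - z * y_true + math.log1p(math.exp(-abs(z)))

# Hypothetical labels and raw outputs of a linear prediction layer.
labels = [1.0, 0.0, 1.0, 0.0]
logits = [2.0, -1.0, -0.5, 3.0]

loss_from_logits = [bce_from_logit(y, z) for y, z in zip(labels, logits)]
loss_from_probs = [bce_from_probability(y, sigmoid(z)) for y, z in zip(labels, logits)]

# At prediction time you apply the sigmoid yourself; thresholding the
# logit at 0 is the same as thresholding the probability at 0.5.
probabilities = [sigmoid(z) for z in logits]
predictions = [1 if z > 0 else 0 for z in logits]

print(loss_from_logits)
print(loss_from_probs)
print(predictions)
```

For moderate logits the two losses agree, so nothing is lost by dropping the sigmoid from the last layer. The logit form is also safer at the extremes: for a very large logit the sigmoid rounds to exactly 1.0 in floating point and log(1 - p) fails, whereas the logit form stays finite.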
Here’s another recent thread that addresses the same issue.