How to know the true class in binary classification?

Hello and greetings,
In the model prediction below, how do I determine that a prediction > 0.5 means the class is human and not horse?
Could someone help me?

If you look at your training data you will probably see that the labels are 1 for human and 0 for non-human. Training tries to learn parameters that reproduce those labels, so that the model also outputs them for test images. In real life, however, the predictions are unlikely to be exactly 1 or 0. Instead, each prediction is a value p such that 0 <= p <= 1. The closer p is to 1, the more confident the prediction of human. The closer p is to 0, the more confident the prediction of non-human. A value near the middle of the range means the model is less confident in either pick. It might be interesting to look at images with values at the two extremes and compare them to images with predictions near 0.5.
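A minimal sketch of that thresholding, assuming the model ends in a single sigmoid unit and that 1 means human. The prediction values here are made up for illustration; they are not output from the lab's model.

```python
import numpy as np

# Hypothetical sigmoid outputs, shaped like what model.predict() returns
# for four images with a single output unit.
predictions = np.array([[0.97], [0.03], [0.55], [0.48]])
p = predictions.ravel()

# Threshold at 0.5: label 1 (human) above it, label 0 (non-human) otherwise.
predicted_labels = (p > 0.5).astype(int)

# Distance from 0.5, rescaled to [0, 1]: 0 is a coin flip, 1 is fully confident.
confidence = np.abs(p - 0.5) * 2

for value, label, conf in zip(p, predicted_labels, confidence):
    name = "human" if label == 1 else "non-human"
    print(f"p={value:.2f} -> {name} (confidence {conf:.2f})")
```

The 0.97 and 0.03 predictions are the confident ones; 0.55 and 0.48 land on opposite sides of the threshold but are both close to a guess.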

You could set aside several training images with known labels and use them to test this out. Let us know if you see a pattern. (You wouldn’t do this in real life, but for simplicity, and just to improve understanding, you could also run a training image through .predict(). Hopefully the prediction matches the correct label.)


Thank you for clarifying.


You can actually use the ImageDataGenerator to determine your class labels. From the week 4, lab 1 notebook, go to the Data Preprocessing section and find the cell that begins with

from tensorflow.keras.preprocessing.image import ImageDataGenerator

then find the line that says

train_generator = train_datagen.flow_from_directory(

That line calls the flow_from_directory() method. The method returns an iterator that yields batches of images together with their labels, and we store it in the train_generator variable. To see which label has been assigned to which class, create a new cell after the ImageDataGenerator cell and run this code:

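%matplotlib inline
import matplotlib.pyplot as plt

# Indexing the generator returns one batch as an (images, labels) tuple
labeled_images = train_generator[0]
images = labeled_images[0]
labels = labeled_images[1]
print(labels[0:10], '\n')  # labels of the first ten images in the batch

plt.imshow(images[0])  # show the first image in the batch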

For me, the first label is 0 and the first image is a horse. The tenth label is 1 and the tenth image is a human.

I’ve noticed that horse comes before human alphabetically and horse = 0. In the Cats vs Dogs classifier from the next course, cat comes before dog and cat = 0. These may be coincidences, but flow_from_directory() may assign binary labels alphabetically.


I took the course in 2019 and forgot that it uses some of that fancy Keras voodoo. I’m old enough to remember when you had to have actual label files and assign values yourself :scream:

The part of my reply about the predicted value outputs is still relevant, but to understand how the labels are automagically associated with the training images, you need to read the Keras doc: https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator See the section on flow_from_directory(), in particular the classes argument:

  • Optional list of class subdirectories (e.g. ['dogs','cats']). Default: None. If not provided, the list of classes will be automatically inferred from the subdirectory names/structure under directory, where each subdirectory will be treated as a different class (and the order of the classes, which will map to the label indices, will be alphanumeric). The dictionary containing the mapping from class names to class indices can be obtained via the attribute class_indices.

(So no, it’s not coincidence)

That, together with the class_mode argument, is what determines the labels. Cheers.
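To make the alphanumeric rule concrete, here is a small sketch in plain Python. It is not Keras's actual implementation, just an illustration of the documented behaviour, and the directory names are assumed for the example.

```python
# Hypothetical subdirectory names found under the training directory,
# deliberately listed out of order.
subdirectories = ["humans", "horses"]

# Sort the names and number them in order, mirroring the documented rule
# that the class order (and so the label indices) is alphanumeric.
class_indices = {name: index for index, name in enumerate(sorted(subdirectories))}

print(class_indices)
```

With class_mode='binary' there are only two classes, so the first name in sorted order becomes label 0 and the second becomes label 1.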


Great reference @ai_curious. I skimmed the method documentation but didn’t read carefully.

You can specify the labels by passing the optional classes argument:

train_generator = train_datagen.flow_from_directory(
        'horse-or-human/',  # This is the source directory for training images
        target_size=(300, 300),  # All images will be resized to 300x300
        batch_size=128,
        class_mode='binary',
        classes=['horses', 'humans'])  # Explicit order: horses=0, humans=1

That call would set the labels as horses=0, humans=1. These labs specifically omit that argument to let Keras infer the class names from the directory names.

The easiest way to check the class labels now appears to be:

print( train_generator.class_indices )

{'horses': 0, 'humans': 1}
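Going the other way, from a prediction back to a class name, could look roughly like this. The dictionary is copied from the output above; predicted_class is a hypothetical helper of my own, not part of Keras, and it assumes a single sigmoid output with a 0.5 threshold.

```python
# Mapping as printed by train_generator.class_indices in this thread.
class_indices = {"horses": 0, "humans": 1}

# Invert it so a label index can be looked up as a class name.
index_to_class = {index: name for name, index in class_indices.items()}


def predicted_class(p, threshold=0.5):
    """Return the class name for a single sigmoid output p (hypothetical helper)."""
    return index_to_class[int(p > threshold)]


print(predicted_class(0.93))
print(predicted_class(0.08))
```

So a prediction above 0.5 means human only because humans was assigned index 1. If the directories sorted differently, the same threshold would point at a different class.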
