What are we predicting after training the model?

We have compiled the program for the BBC News and IMDB movie review datasets, but I am not getting what we are predicting. We have just found the accuracy and loss.

Hello @samin
Welcome to our Community! Thanks for reaching out. We are here to help you.

This is one of the most important tasks to do before coding: you should understand the data. In this case, we have the BBC News and the IMDB datasets. Let's take the IMDB dataset as an example. It is a large movie review dataset: each input is a user's review of a movie, and the target output is a binary classification, 0 if it is a negative review and 1 if it is a positive review. That means you are creating an algorithm that takes a review as input and automatically classifies whether it is positive or negative.
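As a toy illustration of that input/output mapping (this is a hypothetical, rule-based stand-in, not the course's actual model; a real classifier learns the mapping from data):

```python
# Toy stand-in for the IMDB task: each input is a review string,
# each target is 0 (negative) or 1 (positive).
labeled_reviews = [
    ("What a wonderful film, I loved it", 1),
    ("Dull plot and terrible acting", 0),
]

def classify(review):
    # Hypothetical classifier used only to illustrate the mapping;
    # a trained neural network learns this behavior from examples.
    positive = {"wonderful", "loved", "great", "hilarious"}
    negative = {"dull", "terrible", "awful", "worst"}
    words = set(review.lower().split())
    return 1 if len(words & positive) > len(words & negative) else 0

for review, label in labeled_reviews:
    print(f"{review!r} -> {classify(review)}")  # matches the label
```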

If you don’t see that information in the assignment, you can search for more information about the dataset here: IMDB reviews

Now you are getting a specific value, in this case accuracy. You should take that accuracy and see how likely it is that the review is positive or negative; you can understand the accuracy as the probability of the prediction being one of the two possible classes. If you have a high accuracy for class one (positive), that says it is a positive review, and that will be the output of your algorithm.

Hopefully this helps :muscle:

With regards,


Thank you very much for clearing this out. As you said, I will check out the code once more for clarification.

Since this is a binary classification (the review is either positive or negative), isn’t the probability of the prediction being one of the two possible cases actually 1.0?

According to the Keras documentation, the Accuracy class ‘Calculates how often predictions equals labels.’

https://keras.io/api/metrics/accuracy_metrics/

If you use a softmax to produce floating point values for the different outcomes that sum to 1.0, say P(positive) = 0.74 and P(negative) = 0.26, you can think of those numbers as probability or confidence, but they are not accuracy.
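A quick NumPy sketch of that distinction (the logit values here are made up so that the softmax lands near the 0.74 / 0.26 figures above):

```python
import numpy as np

# Two raw model scores (logits) for one review.
logits = np.array([1.0, -0.05])
probs = np.exp(logits) / np.exp(logits).sum()  # softmax: confidences, sum to 1.0
# probs is roughly [0.74, 0.26] -- confidence in each class, not accuracy.

# Accuracy is a different quantity: it is measured over many
# predictions compared against their labels.
predictions = np.array([1, 0, 1, 1])
labels      = np.array([1, 0, 0, 1])
accuracy = (predictions == labels).mean()      # 3 of 4 correct -> 0.75
```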


Yes, that’s correct @ai_curious; thank you for the clarification.

@samina_yasmin I didn’t find the previous responses particularly insightful, so hope you don’t mind if I add my own perspective to your question despite them having been already accepted as the ‘solution’.

Using a neural network containing an Embedding layer for sentiment classification has some similarities to, and some differences from, other networks you may have studied previously, say a CNN for images.

The first similarity is that it has one input layer, one output layer, and one or more intermediate layers. The input layer accepts one or more encoded text strings, the intermediate layers perform transformations on those encoded strings, and the output layer emits a single floating point value. Nothing special.

The second similarity is that during training, the emitted output is compared against a known value, a label, in order to compute loss. Since this is a binary classification, the labels for each training input are either 1 or 0. The accuracy value is computed as the fraction of forward propagations (predictions) that match their corresponding label. During back propagation, model parameters are modified based on the derivatives of the loss function. Again, nothing special.

However, here is where the first difference shows up: in many uses the Embedding layer is configured as not trainable. That is, the weights for the Embedding layer are set using a pre-trained embedding matrix for the corpus, like GloVe, when the layer is instantiated and don’t change thereafter.

  tf.keras.layers.Embedding(..., trainable=False),  # <== embedding weights frozen

In this case, only the trainable weights in the other layers are updated during training. For each epoch, accuracy and loss are computed, and the weights for any Convolutional or Dense layers (and any other trainable layer types) in the model are updated.
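A minimal sketch of such a model, assuming a pre-trained matrix called `embedding_matrix` (filled here with random values as a stand-in for real GloVe vectors; the layer sizes are illustrative, not the notebook's exact architecture):

```python
import numpy as np
import tensorflow as tf

vocab_size, embed_dim, max_len = 1000, 16, 20

# Stand-in for a pre-trained matrix such as GloVe; in practice these
# vectors would be loaded from file, one row per vocabulary word.
embedding_matrix = np.random.rand(vocab_size, embed_dim).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        vocab_size, embed_dim,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=False,   # <== embeddings stay frozen during training
    ),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(24, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary output
])
model.compile(loss="binary_crossentropy", optimizer="adam",
              metrics=["accuracy"])

# Two fake padded sequences produce one sigmoid score each.
dummy = np.zeros((2, max_len), dtype="int32")
print(model(dummy).shape)   # (2, 1)
```

Only the Dense (and pooling has no weights) layers contribute trainable parameters here; the Embedding weights are excluded from gradient updates.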

When the training completes, you have a model with the static corpus embeddings in the Embedding layer, and the learned weights in the others. In this exercise, the model isn’t used for anything else. As you correctly observe, no predictions are made using it. However, that doesn’t mean you can’t do so.

All you have to do is feed one or more encoded (and padded) strings into the model’s predict() function, and it will execute forward propagation and produce a floating point output for each provided input. For example, you could just feed it the training_sequences again…

print(model.predict(training_sequences))
[[0.92069566]
 [0.0220463 ]
 [0.63894075]
 ...
 [0.20007838]
 [0.8982831 ]
 [0.03547856]]

In practice, you could apply a threshold to force a binary output, i.e.

[[ True]
 [False]
 [ True]
 ...
 [False]
 [ True]
 [False]]
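That thresholding step can be sketched in NumPy (the values are copied from the first three rows of the predict() output above):

```python
import numpy as np

# First three sigmoid outputs from model.predict(...)
probs = np.array([[0.92069566],
                  [0.0220463],
                  [0.63894075]])

binary = probs > 0.5   # threshold at 0.5 to get hard positive/negative decisions
print(binary)          # [[ True] [False] [ True]]
```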

In other words, the model is capable of performing prediction just like any other trained Keras model. The difference is it expects its input in a very particular format that is related to the embedding matrix provided when the model was built (and trained).

Hope this helps.

EDIT
I built another model that includes training of the Embedding layer. This one is able to accept strings directly, not their encoded and padded versions. So it’s completely transparent what the model is doing when it runs predict(). Here is the tl;dr

examples = [
    "The movie was great. Hilarious. Beautiful!",
    "The movie was pretty good",
    "The movie was okay.",
    "The movie was terrible...",
    "The worst most awful ridiculous pathetic movie ever!"
]

export_model.predict(examples)
array([[0.75278544],
       [0.49039313],
       [0.40566325],
       [0.323757],
       [0.06831602]], dtype=float32)
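A model like that can be sketched by putting a `TextVectorization` layer in front of a trainable `Embedding` (the corpus, layer sizes, and names here are illustrative, not the actual notebook code):

```python
import tensorflow as tf

# Tiny illustrative corpus; the real model would adapt on the full
# training set of reviews.
corpus = ["the movie was great", "the movie was terrible"]

vectorize = tf.keras.layers.TextVectorization(
    max_tokens=1000, output_sequence_length=10)
vectorize.adapt(corpus)   # learn the vocabulary from raw strings

export_model = tf.keras.Sequential([
    vectorize,                                    # raw string -> token ids
    tf.keras.layers.Embedding(1000, 16),          # trainable embeddings
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Raw strings go straight in; no manual encoding or padding required.
scores = export_model.predict(["The movie was great. Hilarious!"])
print(scores.shape)   # (1, 1)
```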

Now I understand the concept. Thank you very much for the clear explanation of each layer used in NLP.
Actually, I wasn’t expecting any more answers to my silly questions!
