Backpropagation algorithm

As mentioned in the slide, we need the target vector Y to calculate dZ and then carry out the whole backpropagation. But when neural networks are applied to unstructured data — say, sentiment analysis on product reviews, or images and videos that have no labeled targets — how would backpropagation even work?

I think I read that in such cases with unlabeled data, the model can still make predictions based on its training, but these predictions may not be as accurate as they would be with labeled data. But how is it possible to make that prediction in the first place?

As far as I know, neural networks are supervised models, i.e. they need labeled data to be trained.

If you have no labeled data at all, you could use other techniques to semi-label it, e.g. create clusters and then use that data to train a supervised model.


Hi @Arisha_Prasain ,

Following on @gent.spah's answer: you mention in your question the case of unlabeled data.

Supervised learning definitely needs labeled data — the ground truth — to learn from. And in supervised learning, the loss function is the starting point of backpropagation.
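To make that concrete, here is a minimal NumPy sketch (with made-up numbers, and assuming a sigmoid output layer with binary cross-entropy loss, where the output-layer gradient simplifies to dZ = A - Y) showing why the labels Y are needed at the very first step of backprop:

```python
import numpy as np

# Predictions A (sigmoid activations of the output layer) for 3 examples
A = np.array([[0.9, 0.2, 0.6]])

# Ground-truth labels Y -- without these, dZ cannot be computed
Y = np.array([[1.0, 0.0, 1.0]])

# For sigmoid + binary cross-entropy, the output-layer gradient is:
dZ = A - Y

print(dZ)  # per-example error signal that backprop pushes to earlier layers
```

Every earlier-layer gradient is a function of this dZ, so with no Y there is nothing to propagate back.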

For unlabeled data we can use Unsupervised learning. Unsupervised learning models learn by looking for patterns in the data they are given. These models usually use some sort of distance or other similarity measure to determine how well a given input fits with the patterns it has identified in the data.

  • Clustering, for example, uses distance between data points to group data together.

  • Other types of unsupervised learning may use other similarity models, like a probabilistic model, to find patterns and group data.
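As a tiny illustration of the distance-based idea (assuming scikit-learn is available; the data points are made up), k-means groups points purely by distance to the cluster centers, with no labels involved:

```python
import numpy as np
from sklearn.cluster import KMeans

# Four unlabeled points: two near the origin, two near (5, 5)
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])

# K-means assigns each point to the nearest of 2 cluster centers
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(km.labels_)  # e.g. the first two points share one cluster id
```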

I am thinking that you could start with an unlabeled dataset, run it through an unsupervised learning algorithm to arrive at some ground truth, and then use that to train a supervised model with the now-labeled data.
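Sketching that two-step idea (a hypothetical pipeline on synthetic data, assuming scikit-learn): cluster first to derive pseudo-labels, then fit a supervised model on them:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Two unlabeled, well-separated blobs of points
X = np.vstack([rng.normal(0, 0.3, (50, 2)),
               rng.normal(3, 0.3, (50, 2))])

# Step 1: unsupervised -- derive pseudo-labels via clustering
pseudo_y = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Step 2: supervised -- train a classifier on the pseudo-labels
clf = LogisticRegression().fit(X, pseudo_y)

print(clf.score(X, pseudo_y))  # how well it fits the pseudo-labels
```

Of course, the supervised model is only as good as the pseudo-labels the clustering produced.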

What do you think about this?



Hi there,

An example of unsupervised learning with neural networks is the autoencoder, or the variational autoencoder. As mentioned in the previous answers, they can learn what the typical distribution of the data looks like. They are discussed in technical detail, for example, in this chapter:

You can utilize these models, for example, for the use case of anomaly detection. E.g. when training only on unlabelled normal data, the model can learn in an unsupervised way what "normal" looks like in the data. After deploying the model, if a metric (like the reconstruction loss or a distance measure) exceeds a certain threshold for new data, this can serve as an indicator within an early warning system or anomaly detection system, flagging a potential anomaly because the new data seems sufficiently different from the "normal" data.
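A minimal sketch of that reconstruction-error idea (not from the linked tutorial; here, as a simplifying assumption, a scikit-learn MLPRegressor trained to reconstruct its own input stands in for an autoencoder, with synthetic data):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
# "Normal" unlabeled training data: points near the origin
X_normal = rng.normal(0, 0.5, (200, 4))

# Autoencoder stand-in: an MLP with a 2-unit bottleneck,
# trained to reproduce its own input (no labels needed)
ae = MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000, random_state=0)
ae.fit(X_normal, X_normal)

# Set a threshold from the reconstruction errors on the training data
errors = np.mean((ae.predict(X_normal) - X_normal) ** 2, axis=1)
threshold = np.percentile(errors, 99)

# A new point far from everything seen during training
anomaly = np.array([10.0, -10.0, 10.0, -10.0])
err = np.mean((ae.predict(anomaly.reshape(1, -1)) - anomaly) ** 2)

print(err > threshold)  # reconstruction error exceeds the threshold
```

A real deployment would tune the architecture and threshold on held-out data, but the mechanism is the same: poor reconstruction signals "not like the training data".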

Feel free to take a look at this example for fitting a variational autoencoder model: TFP Probabilistic Layers: Variational Auto Encoder  |  TensorFlow Probability

Not a neural network, but very good for getting more familiar with unsupervised learning: PCA (principal component analysis) is also worth mentioning as an unsupervised learning method. Feel free to take a look at this example as inspiration: CRISP-DM-AI-tutorial/Classic_ML.ipynb at master · christiansimonis/CRISP-DM-AI-tutorial · GitHub
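For a quick feel of PCA as an unsupervised method (a sketch with synthetic data, assuming scikit-learn): no labels are used, yet PCA discovers that the data essentially varies along a single direction:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Unlabeled 3-D data that actually lies along one direction, plus tiny noise
t = rng.normal(size=(100, 1))
X = t @ np.array([[1.0, 2.0, 3.0]]) + rng.normal(0, 0.01, (100, 3))

# Fit PCA without any labels
pca = PCA(n_components=1).fit(X)

print(pca.explained_variance_ratio_)  # first component captures nearly all variance
```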

Best regards