Discussion of real-number evaluation

So Prof. Ng said that changing the parameters of a system is faster while it is being developed if we have a way to evaluate it, for example the way we check a supervised learning model.

Is he talking about supervised learning? I ask because when I fast-forwarded, he introduced labels in anomaly detection.

Please share which video, and roughly what time in it, you are quoting from.

This video, 0:00 to 2:00. Please watch that segment.

By real-number evaluation, is he talking about the loss function?

Oh wait, he explained here

Hello @tbhaxor,

I think you have found the answer yourself.

If we can evaluate, we can make new decisions based on the evaluation results. However, we can’t evaluate without any labels, right? We need labels to tell whether the predictions are good or bad. But we are talking about anomaly detection in an unsupervised manner, so how can we have labels?

It might seem contradictory, but the key is not to go to the extreme of having absolutely no labels at all. Instead, we can still try to collect a few labels just for the purpose of evaluation. We don’t have labels for all samples, and we don’t use the labels in the training process, which is why it is still unsupervised learning.

We need a function for the evaluation. It can be the loss function, or it can be any other metric function of your choice. The loss function is critical in the training process, but we don’t have to limit ourselves to the same loss function at evaluation.


Yes. In fact, in clustering, where we don’t use any CV set, the model still converges. The mean distance between each data point and its centroid is the real-number evaluation done there, and it is compared with the value from the previous iteration.

So without a real-number evaluation, I don’t see how a model can converge.
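To make the clustering point concrete, here is a minimal sketch of plain k-means that records one number per iteration. It assumes the evaluation is the mean squared distance between each point and its assigned centroid (the squared form is my assumption), and the function name `kmeans_with_distortion` and the toy data are made up for illustration.

```python
import numpy as np

def kmeans_with_distortion(X, k, n_iters=10, seed=0):
    """Plain k-means that records one real-number evaluation per iteration."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct training points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    history = []
    for _ in range(n_iters):
        # Assignment step: index of the nearest centroid for every point.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Evaluation: mean squared distance from each point to its centroid.
        sq_dist = np.sum((X - centroids[labels]) ** 2, axis=1)
        history.append(float(sq_dist.mean()))
    return centroids, history

# Toy data (hypothetical): two well-separated blobs in 2D, no labels anywhere.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(5.0, 0.5, (50, 2))])
centroids, history = kmeans_with_distortion(X, k=2)
print(history)  # the values should never increase from one iteration to the next
```

Comparing consecutive entries of `history` is one way to decide when to stop: once the number stops decreasing, further iterations are not changing the clustering.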

Actually, I don’t think it is contradictory, because the model is still learning from data that has no labels. The CV set, however, is labeled, so the value of \epsilon is indirectly influenced by the CV set.
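Here is a rough sketch of how the CV labels can influence \epsilon without ever entering training. It assumes an independent-Gaussian density model fitted on unlabeled data and an \epsilon chosen by F1 score on a small labeled CV set. All function names and the synthetic data are hypothetical, not taken from the course code.

```python
import numpy as np

def fit_gaussian(X):
    # Training uses only the features: no labels are involved here.
    return X.mean(axis=0), X.var(axis=0)

def density(X, mu, var):
    # Product of independent univariate Gaussian densities, one per feature.
    coef = 1.0 / np.sqrt(2.0 * np.pi * var)
    return np.prod(coef * np.exp(-((X - mu) ** 2) / (2.0 * var)), axis=1)

def f1_score(y_true, y_pred):
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))

def select_epsilon(p_cv, y_cv, n_steps=1000):
    # Try candidate thresholds and keep the one with the best F1 on the CV set.
    best_eps, best_f1 = 0.0, 0.0
    for eps in np.linspace(p_cv.min(), p_cv.max(), n_steps):
        y_pred = (p_cv < eps).astype(int)  # low density => flagged as anomaly
        f1 = f1_score(y_cv, y_pred)
        if f1 > best_f1:
            best_eps, best_f1 = float(eps), f1
    return best_eps, best_f1

# Synthetic example: unlabeled training set, small labeled CV set.
rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, (500, 2))
X_cv = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(6.0, 0.5, (5, 2))])
y_cv = np.concatenate([np.zeros(100, dtype=int), np.ones(5, dtype=int)])

mu, var = fit_gaussian(X_train)
best_eps, best_f1 = select_epsilon(density(X_cv, mu, var), y_cv)
print(best_eps, best_f1)
```

Note that `fit_gaussian` never sees `y_cv`. The labels only decide which threshold wins, which is the indirect influence described above.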

Makes sense. One question on this: do you think the F1 score is a loss function, or just a metric function for evaluation?

It’s a useful statistical metric.

We also want the loss function to be differentiable. We need to compute gradients.

Yeah, but the F1 score is not differentiable.
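A small numerical illustration of why, assuming F1 is computed from hard predictions obtained by thresholding the model scores at 0.5 (the helper `f1_at_threshold` and the toy numbers are made up): nudging the scores slightly leaves F1 unchanged, and a larger shift makes it jump, so there is no useful gradient to follow.

```python
import numpy as np

def f1_at_threshold(scores, y_true, threshold=0.5):
    # Hard predictions: this thresholding step is what makes F1 piecewise constant.
    y_pred = (scores >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0

y_true = np.array([1, 0, 1, 0, 1])
scores = np.array([0.9, 0.2, 0.6, 0.4, 0.3])

base = f1_at_threshold(scores, y_true)
nudged = f1_at_threshold(scores + 0.01, y_true)  # no prediction flips
jumped = f1_at_threshold(scores + 0.25, y_true)  # two predictions flip

print(base, nudged, jumped)
print((nudged - base) / 0.01)  # finite-difference slope is exactly zero
```

The slope is zero almost everywhere and undefined at the points where a prediction flips, which is why gradient-based training typically relies on a smooth loss and keeps F1 for evaluation.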


So I agree with Tom that it’s a useful metric.

That’s why we don’t use it as a loss function.
