If we can evaluate, we can make new decisions based on the evaluation results. But we can’t evaluate without labels, right? We need labels to tell whether the predictions are good or bad. Yet we are talking about anomaly detection in an unsupervised manner, so how can we have labels?
It might seem contradictory, but the key is not to go to the extreme of having absolutely no labels at all. Instead, we can still collect a few labels just for the purpose of evaluation. We don’t have labels for all samples, and we don’t use the labels in the training process, so it is still unsupervised learning.
We need a function for the evaluation. It can be the loss function, or it can be any other metric function of your choice. The loss function is critical in the training process, but we don’t have to limit ourselves to the same loss function at evaluation.
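To make this concrete, here is a minimal sketch of the idea, not a definitive implementation. It assumes a simple per-feature Gaussian density as the detector, synthetic data, and F1 as the evaluation metric; the names `log_density`, `f1_score` and `choose_epsilon` are made up for illustration. The model is fitted on unlabeled data only, and the small labeled set is used solely to pick the threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unlabeled training data (assumed to be mostly normal). No labels are used here.
X_train = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))

# Small labeled set kept only for evaluation: 50 normal points and 5 anomalies.
X_cv = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 2)),
    rng.normal(6.0, 0.5, size=(5, 2)),
])
y_cv = np.array([0] * 50 + [1] * 5)

# "Training": estimate mean and variance per feature from unlabeled data.
mu = X_train.mean(axis=0)
var = X_train.var(axis=0)


def log_density(X):
    """Log of an independent Gaussian density per feature (an assumed model)."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (X - mu) ** 2 / var, axis=1)


def f1_score(y_true, y_pred):
    """F1 for the anomaly class (label 1); returns 0.0 if nothing is caught."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def choose_epsilon(scores, y_true, n_candidates=200):
    """Scan candidate thresholds and keep the one with the best F1."""
    best_eps, best_f1 = None, -1.0
    for eps in np.linspace(scores.min(), scores.max(), n_candidates):
        y_pred = (scores < eps).astype(int)  # low density => flagged as anomaly
        f1 = f1_score(y_true, y_pred)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps, best_f1


cv_scores = log_density(X_cv)
epsilon, best_f1 = choose_epsilon(cv_scores, y_cv)
print("chosen log-density threshold:", epsilon, "F1 on labeled set:", best_f1)
```

Note that the threshold here is on the log-density, and the metric used to choose it (F1) is different from whatever objective fitted the model, which is exactly the freedom described above.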
Yes, in fact in clustering, where we don't use any CV set, it still converges well. The mean distance between each centroid and its data points is the real-number evaluation done there, which is then compared with the previous iteration.
So without a real-number evaluation, I don't see how the model can converge.
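A rough sketch of that clustering loop, assuming plain k-means (Lloyd's algorithm) on synthetic data. One substitution to flag: the code tracks the mean squared distance to the assigned centroid rather than the plain mean distance, because that is the quantity k-means is guaranteed not to increase. The function name `kmeans` and its parameters are just for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=100, tol=1e-6, seed=0):
    """Plain k-means that records a real-number score at every iteration."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points (fancy indexing returns a copy).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    history = []
    for _ in range(n_iter):
        # Squared distance from every point to every centroid: shape (n, k).
        sq_dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = sq_dists.argmin(axis=1)
        # The evaluation: mean squared distance to the assigned centroid.
        score = sq_dists[np.arange(len(X)), assign].mean()
        # Compare with the previous iteration and stop when it no longer improves.
        converged = bool(history) and history[-1] - score < tol
        history.append(score)
        if converged:
            break
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = X[assign == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, assign, history


rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 2)),
    rng.normal(5.0, 1.0, size=(100, 2)),
])
centroids, assign, history = kmeans(X, k=2)
print("score per iteration:", history)
```

No labels appear anywhere, yet there is still a number to compare from one iteration to the next.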
Actually, I don't think it is contradictory, because the model is still learning from data that does not have labels. But the CV set was labeled, so the \epsilon value is indirectly influenced by the CV set.