Hi @darthShana
great question and great application!
A good starting point, following CRISP-DM, is business understanding. Ask yourself what would actually help the end user. Someone who wants to buy or sell a vehicle would benefit from a predicted range of fair prices (using the available prices as labels), preferably with confidence intervals, because that reduces the uncertainty on their side. This speaks in favour of modelling it as a regression problem, e.g. with a probabilistic model if you want to provide uncertainty or confidence estimates.
In conclusion: I do not see how a classification problem would help solve a user's problem or contribute to business understanding in this example.
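To make the "price plus range" idea a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the data is synthetic, vehicle age is the only feature, and the range is simply a held-out quantile of the absolute residuals around a least-squares line. That is a crude stand-in for a proper probabilistic model, not a real confidence interval.

```python
import random
import statistics

# Synthetic, made-up listings: vehicle age in years and asking price.
random.seed(0)
ages = [random.uniform(0, 15) for _ in range(200)]
prices = [30000 - 1500 * a + random.gauss(0, 2000) for a in ages]

# Keep some listings aside to calibrate the width of the price range.
train_x, calib_x = ages[:150], ages[150:]
train_y, calib_y = prices[:150], prices[150:]

# Point prediction: ordinary least squares on a single feature.
mean_x = statistics.mean(train_x)
mean_y = statistics.mean(train_y)
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
    / sum((x - mean_x) ** 2 for x in train_x)
)
intercept = mean_y - slope * mean_x


def predict(age):
    """Return the point estimate of the fair price."""
    return intercept + slope * age


# Range: roughly 90% of the held-out errors are smaller than half_width.
abs_residuals = sorted(abs(y - predict(x)) for x, y in zip(calib_x, calib_y))
half_width = abs_residuals[int(0.9 * len(abs_residuals))]


def predict_range(age):
    """Return (low, point estimate, high) for a vehicle of the given age."""
    centre = predict(age)
    return centre - half_width, centre, centre + half_width


low, mid, high = predict_range(5.0)
print(f"5-year-old vehicle: about {mid:.0f} (range {low:.0f} to {high:.0f})")
```

With real listings you would use more features and probably a model that estimates the uncertainty directly, but the output the user sees would have the same shape: a price and a range around it.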
There are other examples where things might not be so clear. Let's assume you want to predict how a certain device or machine is doing. Here you would have several options (assuming sufficient data and labels are available):
- model the remaining useful life (regression)
- model a multi-class problem (e.g. with classes: normal, failure 1, failure 2, …) if you also have failure labels, which are often hard to get (classification)
- model anomaly detection in an unsupervised way, e.g. with an autoencoder, if you have tons of normal data only
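As a rough sketch of the unsupervised option: the code below learns what "normal" looks like from normal readings only and flags anything that deviates too much. Note that it deliberately swaps the autoencoder for a much simpler mean and standard deviation model, so that it runs without any deep learning library. The sensor values and the 99% threshold are made up for illustration.

```python
import random
import statistics

# Synthetic "normal" sensor readings only; no failure labels are needed.
random.seed(1)
normal_readings = [random.gauss(50.0, 2.0) for _ in range(500)]

# "Training": summarise normal behaviour.
mean = statistics.mean(normal_readings)
std = statistics.stdev(normal_readings)


def anomaly_score(reading):
    """Distance from normal behaviour, in standard deviations.

    This plays the role that reconstruction error plays for an autoencoder.
    """
    return abs(reading - mean) / std


# Threshold: a high quantile of the scores observed on normal data.
normal_scores = sorted(anomaly_score(r) for r in normal_readings)
threshold = normal_scores[int(0.99 * len(normal_scores))]


def is_anomaly(reading):
    """Flag readings that deviate more than almost all normal readings."""
    return anomaly_score(reading) > threshold


print(is_anomaly(50.5), is_anomaly(70.0))
```

The principle is the same as with an autoencoder: fit on normal data, score new data by how badly it fits, and pick the threshold from the scores of the normal data.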
I would suggest letting your business understanding, together with the technical constraints, guide your problem definition.
Note also that, depending on how you design your system, regression, classification and unsupervised methods can be combined technically; see also this application here.
Hope that helps! Please let me know what you think.
Best regards
Christian