First of all, process feedback means using customer feedback to help the model correct its outputs.
Say I have a translation model which takes an input i and produces o, the translated sentence. The user can then respond to the translation with a good/bad label on its quality. I can see how process feedback would work with a good label: add the training example (i, o) to the dataset. But if the user answers bad, can the model still learn from this? Or does process feedback normally require the user to also provide the correct sentence?
tl;dr: can a model learn from an input paired with an output it knows is incorrect? Thanks!
In some cases negative feedback can help the model directly. In GANs, for example, an output the user marked as incorrect can be fed to the discriminator as a negative example, making the generator's outputs more robust.
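To make the discriminator analogy concrete, here is a toy sketch (not the poster's actual setup): "bad" feedback becomes negative training data for a binary quality classifier. The length-ratio feature, perceptron learner, and sample data are all illustrative assumptions, not a real translation-quality model.

```python
# Toy sketch: user feedback labels (1 = good, 0 = bad) train a binary
# quality classifier, analogous to a GAN discriminator. The feature and
# the data are purely illustrative assumptions.

def features(src, out):
    # Hypothetical feature: a bias term plus the output/source token-length ratio.
    return [1.0, len(out.split()) / max(len(src.split()), 1)]

def train_quality_classifier(feedback, epochs=20, lr=0.1):
    # Simple perceptron over (source, output, label) triples.
    w = [0.0, 0.0]
    for _ in range(epochs):
        for src, out, label in feedback:
            x = features(src, out)
            pred = 1.0 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0.0
            w = [wi + lr * (label - pred) * xi for wi, xi in zip(w, x)]
    return w

def is_good(w, src, out):
    # Score a candidate translation with the learned weights.
    return sum(wi * xi for wi, xi in zip(w, features(src, out))) > 0

feedback = [
    ("le chat noir", "the black cat", 1),   # user said good
    ("le chat noir", "cat", 0),             # user said bad
    ("bonjour le monde", "hello world", 1),
    ("bonjour le monde", "hello", 0),
]
w = train_quality_classifier(feedback)
```

The point is only that a bad label is still a supervised signal for the discriminator, even though it cannot be used as a target for the translator itself.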
For other kinds of models, the negative examples are still useful: an SME (subject matter expert) can comb through the negative feedback and curate a new training set with rectified outputs. The end user doesn't have to provide the corrected answer; the SME can do it.
At least, I am not aware of a model that could use the negative example directly beyond that.
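A minimal sketch of that workflow, assuming a hypothetical (input, output, label) record format: good-labelled pairs go straight into the training set, while bad-labelled pairs land in a queue for the SME to rectify.

```python
# Sketch: split user feedback into direct training data and an SME review queue.
# The record format (input, output, label) is an assumption for illustration.

def route_feedback(records):
    """records: iterable of (input, output, label), label in {"good", "bad"}."""
    train_set, sme_queue = [], []
    for inp, out, label in records:
        if label == "good":
            train_set.append((inp, out))       # usable as-is
        else:
            sme_queue.append((inp, out))       # SME supplies the corrected output
    return train_set, sme_queue

train_set, sme_queue = route_feedback([
    ("le chat", "the cat", "good"),
    ("le chien", "the dog cat", "bad"),
])
```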
I personally think a sample of the user feedback should be audited from time to time, especially if the model is available to everyone (over the internet, for example), since some people may deliberately provide wrong feedback to degrade the model's performance. If the audience is more controlled (accessible only to customers, etc.), this is less of a problem.
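The periodic audit could be as simple as drawing a reproducible random sample of feedback records for a human to inspect; the record format and sample size below are assumptions.

```python
import random

# Sketch: pick a reproducible random subset of feedback records to audit,
# e.g. to catch users deliberately submitting wrong labels.

def audit_sample(feedback, k, seed=42):
    rng = random.Random(seed)  # fixed seed so the same audit can be re-drawn
    return rng.sample(feedback, min(k, len(feedback)))

records = [("i1", "o1", "good"), ("i2", "o2", "bad"), ("i3", "o3", "good")]
sample = audit_sample(records, k=2)
```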