Process Feedback with Negative Examples

andrew0 · December 6, 2021, 8:34pm

First of all, by process feedback I mean using customer feedback to help the model correct its predictions.

Say I have a translation model which takes an input i and outputs o, the translated sentence. The user can then rate the quality of the translation as good or bad. I can see how process feedback would work with a good label: add the training example (i, o) to the dataset. But if the user answers bad, is it still possible for the model to learn from this? Or would process feedback normally also need the user to input the correct sentence?

tl;dr can a model learn from an input and a purposely incorrect label? Thanks!

SomeshChatterjee · December 8, 2021, 3:02am

Hi Andrew,

In some cases negative feedback can help the model directly. With GANs, for example, an output that the user marked as incorrect can be fed to the discriminator as an input, which makes the model's outputs more robust.

For other models the negative examples are still useful: an SME (subject matter expert) can comb through the negative feedback and curate a new training set containing the rectified outputs. The end user doesn’t have to provide the corrected answer; the SME can do it.
Outside of cases like GANs, at least, I am not aware of a model that can use a negative example directly.
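As a rough illustration of that workflow, here is a minimal sketch in Python. The names `Feedback`, `FeedbackStore`, `process` and `resolve` are made up for this example and do not come from any library. The assumption is that good feedback adds (i, o) to the training set directly, while bad feedback waits in a queue until an SME supplies the corrected translation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Feedback:
    """One piece of user feedback (hypothetical structure)."""
    source: str       # the input sentence i
    translation: str  # the model output o
    is_good: bool     # the user's good/bad rating


@dataclass
class FeedbackStore:
    """Routes feedback to the training set or to an SME review queue."""
    training_pairs: List[Tuple[str, str]] = field(default_factory=list)
    review_queue: List[Feedback] = field(default_factory=list)

    def process(self, fb: Feedback) -> None:
        if fb.is_good:
            # A good rating confirms (i, o), so it can be used as-is.
            self.training_pairs.append((fb.source, fb.translation))
        else:
            # A bad rating only says o is wrong, not what the right
            # answer is, so hold it for a subject matter expert.
            self.review_queue.append(fb)

    def resolve(self, fb: Feedback, corrected: str) -> None:
        # The SME supplies the rectified output; the pair becomes usable.
        self.review_queue.remove(fb)
        self.training_pairs.append((fb.source, corrected))
```

For example, calling `store.process(Feedback("Merci", "Please", False))` puts the item in `review_queue`, and a later `store.resolve(item, "Thank you")` moves the corrected pair into `training_pairs`.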

I personally think that a sample of user feedback should be audited from time to time, especially if the model is available to everyone (over the internet, for example), since some people may deliberately provide wrong feedback to degrade the model's performance. If the audience is more controlled (for example, accessible only to customers), it may be a smaller problem.
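A periodic audit could start from something as simple as a random sample of the collected feedback. The sketch below is only an assumption about how that might look: the function name `sample_for_audit` and the 5% default are hypothetical, and it uses nothing beyond Python's standard `random` module.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_for_audit(
    feedback: Sequence[T],
    fraction: float = 0.05,  # hypothetical default, tune to reviewer capacity
    seed: Optional[int] = None,
) -> List[T]:
    """Pick a random subset of feedback items for manual review."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not feedback:
        return []
    # Audit at least one item, and never more than are available.
    k = min(len(feedback), max(1, round(len(feedback) * fraction)))
    # A dedicated Random instance keeps the sample reproducible per seed.
    return random.Random(seed).sample(list(feedback), k)
```

Sampling uniformly at random is the simplest choice. If abuse is suspected from particular users, the audit could instead be weighted toward them, but that goes beyond this sketch.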

andrew0 · December 9, 2021, 1:00am

Makes sense, thanks!
