Obtaining Labels for Fine-Tuning LLMs

In practice, it can be difficult and time-consuming to manually label a dataset of hundreds of task-specific examples for fine-tuning an LLM. What are some ways or best practices to obtain these labels when there is a shortage of labelers, or there are simply too many examples to label?


Yes, it is. That’s why crowd-sourcing became such a popular method.

I see. What about the typical company setting, where a team wishes to fine-tune a model using the data they have gathered for their task, but only perhaps 1–3 people are available to manually label the data? Let us also assume there are no existing datasets out there and no one else in the company has the time to help with this endeavor.

They’re going to have to work long hours, or recruit some new team members, or manage a crowd-sourced labeling project.

I think you may be already suspecting it: The company has to either have these 1-3 people work on this, or hire people to do it. Or both.

And since we are on the topic: labeling is both an art and a science, and requires a very clear set of instructions. Human labelers have different backgrounds, biases, understandings, and points of view, so the same sample can be one class for some labelers and another class for others. So it is very important to have very detailed instructions, and to cross-check labels (the same sample labeled by multiple labelers, so you can measure inter-annotator agreement).
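As a concrete sketch of that cross-check, agreement between two labelers on the same samples is commonly quantified with Cohen's kappa, which corrects raw agreement for chance. Below is a minimal, dependency-free version; the example labels are made up for illustration:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two labelers over the same samples."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of samples where both labelers agree.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each labeled at random with their own class rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)
    return (p_observed - p_expected) / (1 - p_expected)

# Two labelers disagree on one of four samples.
kappa = cohens_kappa(["pos", "neg", "pos", "pos"],
                     ["pos", "neg", "neg", "pos"])
print(kappa)  # 0.5
```

A kappa near 1 means the instructions are working; a low kappa on a pilot batch is a signal to rewrite the labeling guidelines before scaling up. (The labeling platforms mentioned below often compute this for you.)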

There are platforms that assist in the labeling process, and I think it is important to use one of them, because they already encode a lot of ‘knowledge’ about how labeling should be done.

One other factor to consider: there is currently a trend in data labeling to rely on chatbots to automate the labeling process.

This is fraught with peril, as chatbots are simply language models. Their results can pollute your data set with hallucinations or outright lies.

The industry is struggling with this dilemma, as it can also impact a crowd-sourcing project if the participants resort to chatbots to do their work.
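One common hedge against this failure mode, whether the votes come from crowd workers, in-house labelers, or model-assisted labeling, is to accept a label only when independent labelers agree, and route everything else to a trusted human reviewer. A minimal sketch (the threshold and label names here are illustrative assumptions, not a standard):

```python
from collections import Counter

def triage_label(votes, min_agreement=0.8):
    """Return the majority label if agreement clears the threshold,
    otherwise None to flag the sample for human review."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else None

print(triage_label(["spam", "spam", "spam", "ham"], min_agreement=0.7))  # spam
print(triage_label(["spam", "ham"], min_agreement=0.7))                  # None
```

The point of the design is that automated or low-trust labels never enter the fine-tuning set on their own: disagreement is treated as a signal that the sample is ambiguous and needs a human decision.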