I can see myself wanting to optimise the linguistic reasoning of an LLM using LoRA, with extra data that it hadn't seen during pretraining.
But what happens if I then want to tweak the HHH (helpful, honest, harmless) qualities of the outputs using RLHF, again with LoRA?
Can I stack the LoRAs at inference time, or should I merge the first LoRA's weights into the base weights (LLM*LoRA1) and then train the RLHF LoRA on top of this updated LLM?
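In code, this is roughly what I am imagining for the merging option, a minimal sketch assuming the Hugging Face `peft` API (the model name, adapter path, and LoRA hyperparameters below are placeholders I made up):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel, LoraConfig, get_peft_model

# Step 1: fold the first (domain) LoRA into the base weights, i.e. W' = W + B*A
# for every adapted layer -- this would be the "LLM*LoRA1" model.
base = AutoModelForCausalLM.from_pretrained("some-base-llm")        # placeholder model name
lora1 = PeftModel.from_pretrained(base, "path/to/domain-lora")      # placeholder adapter path
merged = lora1.merge_and_unload()

# Step 2: attach a fresh LoRA on top of the merged model for the RLHF stage,
# so only the new adapter's parameters are trainable.
lora2_cfg = LoraConfig(
    r=16, lora_alpha=32, task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],   # adjust to the base model's attention layer names
)
rlhf_model = get_peft_model(merged, lora2_cfg)
rlhf_model.print_trainable_parameters()    # only the second adapter should be trainable
```

The merge step would bake LoRA1 into the weights, so the second adapter trains against the already-adapted model rather than the original base.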
As far as I understand, RLHF is run during training with LoRA. Once training has finished you can use the LLM + LoRA weights, but those are all frozen now, and inference happens with these frozen weights.
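For example, a minimal inference-time sketch, assuming the Hugging Face `peft` API (the base model name and adapter path are placeholders for wherever your trained weights live):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# After training, both the base model and the LoRA adapter are frozen;
# inference simply loads them together.
base = AutoModelForCausalLM.from_pretrained("some-base-llm")             # placeholder model name
tokenizer = AutoTokenizer.from_pretrained("some-base-llm")
model = PeftModel.from_pretrained(base, "path/to/trained-lora").eval()   # placeholder adapter path

inputs = tokenizer("An example prompt", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```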
Hi, thanks for the quick response.
RLHF + LoRA makes sense to me, but what I meant is this: in Week 2 we learned that LoRA can be used for ordinary PEFT, for example when we want a pretrained model to learn new insights from our internal input-output training data.
What do we do if we want to:
- use FLAN-T5
- fine-tune it for our domain with PEFT LoRA
- and then fine-tune the outputs to reduce toxicity, again with PEFT LoRA? (I've sketched below what I mean.)
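To make it concrete, here is a rough sketch of the two-stage pipeline I have in mind, assuming the Hugging Face `transformers` and `peft` APIs; the adapter paths and LoRA hyperparameters are made up, and the actual training loops are elided:

```python
from transformers import AutoModelForSeq2SeqLM
from peft import PeftModel, LoraConfig, get_peft_model

# Stage 1: domain adaptation of FLAN-T5 with PEFT LoRA on our internal data.
base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
domain_cfg = LoraConfig(r=16, lora_alpha=32, target_modules=["q", "v"], task_type="SEQ_2_SEQ_LM")
domain_model = get_peft_model(base, domain_cfg)
# ... train domain_model on the internal input-output dataset ...
domain_model.save_pretrained("flan-t5-domain-lora")                 # placeholder path

# Stage 2: merge the domain adapter into the weights, then train a second LoRA
# to reduce toxicity (e.g. via RLHF/PPO against a toxicity reward model).
merged = PeftModel.from_pretrained(
    AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base"),   # fresh copy of the base
    "flan-t5-domain-lora",
).merge_and_unload()
detox_cfg = LoraConfig(r=16, lora_alpha=32, target_modules=["q", "v"], task_type="SEQ_2_SEQ_LM")
detox_model = get_peft_model(merged, detox_cfg)
# ... run the detoxification fine-tuning with only detox_model's new adapter trainable ...
```

Does that sequencing make sense, or would you keep the two adapters separate and combine them at inference time?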