Hi all,
I am a little confused about the code implementation/usage of the peft library in weeks 2 and 3.
In week 2 we use get_peft_model from the peft library to train our PEFT model. In particular, we define our lora_config and pass it, together with the original model, into get_peft_model:
from peft import LoraConfig, get_peft_model, TaskType
lora_config = LoraConfig(
    r=32,                            # Rank
    lora_alpha=32,
    target_modules=["q", "v"],
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.SEQ_2_SEQ_LM  # FLAN-T5
)

peft_model = get_peft_model(original_model, lora_config)
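For reference, I assume the adapter trained in week 2 is what later ends up in a checkpoint like the one loaded in week 3. A minimal sketch of that save step (the output path here is just a placeholder I made up, not the course's actual one):

# Save only the trained LoRA adapter weights and config (placeholder path)
peft_model.save_pretrained('./peft-dialogue-summary-checkpoint-local/')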
In week 3 we instead use PeftModel.from_pretrained to train our PEFT model, and get_peft_model is not used any more:
import torch
from transformers import AutoModelForSeq2SeqLM
from peft import PeftModel, LoraConfig, TaskType

lora_config = LoraConfig(
    r=32,                            # Rank
    lora_alpha=32,
    target_modules=["q", "v"],
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.SEQ_2_SEQ_LM  # FLAN-T5
)

model = AutoModelForSeq2SeqLM.from_pretrained(model_name,
                                               torch_dtype=torch.bfloat16)

peft_model = PeftModel.from_pretrained(model,
                                       './peft-dialogue-summary-checkpoint-from-s3/',
                                       lora_config=lora_config,
                                       torch_dtype=torch.bfloat16,
                                       device_map="auto",
                                       is_trainable=True)
Can anyone explain the difference to me? When should I choose which approach?
Any help is much appreciated!
Thank you!