Transfer Learning or Train from Scratch?


I was wondering when I should train a model from scratch versus using transfer learning. Given that GPT-3 cost around USD 5M to train, and that I’m nowhere near able to invest even 1% of that budget into training a model, is it a good idea to even try? Or should we use transfer learning all the time, given how LLMs are evolving, leaving all the Average Joes behind in the training era and making competition against them not even worth it?
Let’s scale it up even a little further. Suppose I want to focus on a specific subject, for example, giving responses the way Shakespeare would. Will a model I train on all of his books perform better than a T5 or GPT-4 in that area? Is it worth the time? Or should I use the books to fine-tune one of these LLMs and get better results instead?

Does anybody have an opinion on the topic? I’m just finishing these courses, so I still need to put my learnings into practice, but this question has been bugging me. Any idea, thought, or experience, I’m more than happy to hear!

Thanks for your time :wink:

Hi @Felipe_Rodriguez

Given the way I interpreted your sentiment about the topic, you should not be jumping into training huge models from scratch :slight_smile:

Most probably not.

It depends on how you value your time. Compared to what? Computer games? Friends and family? Etc.

For decent results you should go for that: fine-tuning open-source LLMs. Or you can use existing ones (the ChatGPT API, for example, does not cost a fortune); it depends on what you are going to do with the “Shakespeare responses” :slight_smile: (in other words, what’s your business plan?)

All in all I think there is enough space for everyone:

  • fine-tuning open source LLMs;
  • using existing LLMs through APIs;
  • searching for a new activation function (like ReLU);
  • you name it :slight_smile:
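To make the first option a bit more concrete, here is a minimal sketch of one small piece of a fine-tuning pipeline: chunking raw Shakespeare text into fixed-length training examples for a causal language model. Note the assumption: whitespace splitting stands in for a real tokenizer (a real pipeline would use the model’s own tokenizer, e.g. a BPE one), and the tiny block size is purely illustrative.

```python
# Minimal sketch: turn raw text into fixed-length training chunks for
# causal-LM fine-tuning. ASSUMPTION: whitespace "tokenization" stands in
# for a real tokenizer (e.g. the model's own BPE tokenizer).

def make_training_chunks(text, block_size=8):
    """Split text into non-overlapping blocks of `block_size` tokens."""
    tokens = text.split()  # stand-in for tokenizer.encode(text)
    chunks = []
    for start in range(0, len(tokens) - block_size + 1, block_size):
        chunks.append(tokens[start:start + block_size])
    return chunks

sonnet = (
    "Shall I compare thee to a summer's day? "
    "Thou art more lovely and more temperate:"
)
chunks = make_training_chunks(sonnet, block_size=4)
print(len(chunks))   # 3 full blocks of 4 tokens each; the remainder is dropped
print(chunks[0])     # ['Shall', 'I', 'compare', 'thee']
```

Real fine-tuning frameworks do essentially this (plus tokenization and batching) before the training loop ever sees the data, so the corpus-preparation step is the same whether you train from scratch or fine-tune.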

It seems daunting at first but if you like what you do you can succeed in any field. It’s true that the most “effective” path is not clear but I think you should concentrate not on the efficiency but on the things that interest you.

Just my thoughts :slight_smile:

Thanks for your reply @arvyzukai

I’ve been considering implementing NLP in my everyday tasks and in a new business I’m developing. That’s why I was intrigued about the best way to move forward.

What I got from your reply is that it’s better to leverage an LLM and fine-tune it rather than start from scratch, which makes sense to me too. I’d like to know if there’s a line for specific subjects that are better to train yourself, and that’s why I brought up the Shakespeare example (everybody can get Shakespeare’s literature and, hence, train a model to reply like him).

It does bother me, nevertheless, that I could get unexpected answers if I use an LLM instead of training from scratch. For example, if I want it to answer like Shakespeare, an LLM could quote Proust instead of Shakespeare, whereas if I train my model only on Shakespeare’s books and quotes, there’s no room for it to answer like any other author.

That’s where my confusion lies.

There is some truth to that, but if you fine-tune it well enough then there should be no problem. The question is how big the disadvantages of these edge cases are, and whether the advantages of LLMs are big enough. If you lose your leg when the model quotes Proust, then I think it’s better to stay safe :slight_smile: But if the occasional deviations from Shakespeare are barely noticeable and conversational capabilities are more important, then fine-tuning should not be a problem.
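One cheap guardrail, if stray quotations really matter to you: compare each generated reply against a reference vocabulary built from the fine-tuning corpus and flag replies whose out-of-vocabulary ratio is high. This is a hypothetical sketch; the function names and any threshold you would apply are made up for illustration, and a toy vocabulary stands in for the full Shakespeare corpus.

```python
# Hypothetical sketch: flag generations that drift from the fine-tuning
# corpus by measuring the share of words not seen in that corpus.
# The tiny vocabulary below stands in for the full Shakespeare corpus.

def build_vocabulary(corpus_text):
    """Collect the set of lowercased words in the reference corpus."""
    return {word.lower() for word in corpus_text.split()}

def oov_ratio(generated_text, vocabulary):
    """Fraction of generated words missing from the reference vocabulary."""
    words = generated_text.lower().split()
    if not words:
        return 0.0
    unseen = sum(1 for w in words if w not in vocabulary)
    return unseen / len(words)

vocab = build_vocabulary("shall i compare thee to a summer's day")
print(oov_ratio("compare thee to a day", vocab))        # 0.0 -> looks on-corpus
print(oov_ratio("longtemps je me suis couché", vocab))  # 1.0 -> flag it
```

It is crude (Shakespeare-like phrasing uses plenty of common English words, and punctuation handling is ignored here), but a check like this can catch the worst “Proust instead of Shakespeare” cases without retraining anything.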

Again, this is just my opinion and not career advice or anything similar :slight_smile: In other words, consult your doctor / lawyer, etc. :wink: Or, on a serious note, listen to and ponder other opinions too.