What is the value of learning RNNs in the world of LLMs?

Just completed the short course "ChatGPT Prompt Engineering for Developers", and suddenly I realize that the value of what I am learning in the DLS has been significantly impacted by the dawn of LLMs. Sentiment analysis, translation, etc., as mentioned by Andrew himself in this short course, used to require collecting a labeled dataset, training the model, and figuring out how to deploy the model somewhere in the cloud to make inferences, taking days or even weeks for a skilled machine learning developer. And now it's only a matter of prompt refinement.

While intellectual curiosity is always a good reason to learn, I feel like I am only learning "how things work", given that the most valuable things I could be building with this knowledge have already been built. It's like learning how the internal combustion engine works… interesting, but not that useful, except for the few world experts who will be working on improving the existing versions.

Here is my question: given this scenario, and the fact that in the not-too-distant future we might have versions of computer vision and reinforcement learning that are as significant a leap over today's models as LLMs were over previous sequence models, what is still the truly valuable knowledge to be gained in the field? Is it still MLOps, as mentioned by Andrew before? Or is that also being similarly impacted by these recent advancements (and other advancements to come in the near future)?

  1. LLMs don't come for free. You still need to pay for usage via API. If you develop your own model, there is no additional fee for using it.
  2. LLMs aren't easy to train from scratch and deploy. You can fine-tune LLMs on your dataset, but the training is expensive when done using a 3rd party.
  3. LLMs can't be deployed on devices like mobile phones or web browsers. Deploying on these devices reduces your server bill considerably.

I agree with @balaji.ambresh's excellent feedback. For ChatGPT and many other LLM-powered applications, this is also the case in my understanding.

Regarding the second point, I would also point to these two sources / threads, since they are relevant to the question of how cost-aware training of an LLM from scratch can look:

Best regards
Christian

In addition, here is a nice post from Yann LeCun, published recently on LinkedIn, which includes a quite cool overview of the evolutionary tree with tagging of closed-source vs. open-source LLMs:

(architecture nomenclature for LLMs seems a bit unfortunate, as mentioned in the post)…

Best regards
Christian

I appreciate your comments. Based on your reply, I understand that there is room for further development in the LLM space (light versions for mobile devices, additional training with private data, etc.).

Yet, it's clear that the economies of scale of big tech companies are changing the landscape, and it'd be interesting to hear opinions on which areas of knowledge will be more valuable in this changing landscape, for example MLOps.

Have you seen this?

Yes, that's the reason I am asking the question.

I'm not aware of any LLM-specific MLOps tools.

k8s should be able to handle deployments. Tools covered in the MLOps specialization should be a good starting point for scalable training and model analysis.

This thread might be interesting for you as well, since it's about the path from RNNs to the transformer architecture, referring to highly relevant and popular papers:

Best regards
Christian

In recent weeks, LLMOps seems to be becoming more and more of a topic. This article is nice since it also touches (among other things) upon human feedback and language-model-relevant evaluation metrics like:

  • bilingual evaluation understudy (BLEU)
  • recall-oriented understudy for gisting evaluation (ROUGE)
    …
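To make the two metrics above a bit more concrete: at their core, BLEU is built on clipped n-gram precision of the candidate against the reference, while ROUGE-N is n-gram recall. Here is a minimal, simplified sketch in plain Python (no brevity penalty, single reference, one n-gram order at a time; real implementations like the ones in `nltk` or `rouge-score` do more):

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(candidate, reference, n=1):
    """BLEU-style modified n-gram precision: each candidate n-gram's
    count is clipped by its count in the reference."""
    cand, ref = ngram_counts(candidate, n), ngram_counts(reference, n)
    overlap = sum(min(c, ref[g]) for g, c in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: fraction of reference n-grams that also
    appear in the candidate (with clipping)."""
    cand, ref = ngram_counts(candidate, n), ngram_counts(reference, n)
    overlap = sum(min(c, cand[g]) for g, c in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

reference = "the cat sat on the mat".split()
candidate = "the cat is on the mat".split()
print(round(clipped_precision(candidate, reference, n=1), 2))  # 0.83
print(round(rouge_n_recall(candidate, reference, n=1), 2))     # 0.83
```

Full BLEU combines several n-gram orders with a geometric mean plus a brevity penalty, and ROUGE is usually reported as precision/recall/F1 over multiple variants (ROUGE-1, ROUGE-2, ROUGE-L), but the counting logic above is the common core.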

Best regards
Christian