What is the value of learning RNNs in the world of LLMs?

Just completed the short course "ChatGPT Prompt Engineering for Developers", and suddenly I realize that the value of what I am learning in the DLS has been significantly impacted by the dawn of LLMs. Sentiment analysis, translation, etc., as mentioned by Andrew himself in this short course, used to require collecting a labeled dataset, training the model, and figuring out how to deploy the model somewhere in the cloud to make inferences, taking days or even weeks for a skilled machine learning developer. And now it's only a matter of prompt refinement.

While intellectual curiosity is always a good reason to learn, I feel like I am only learning "how things work", given that the most valuable things I could be building with this knowledge have already been built. It's like learning how the internal combustion engine works… interesting, but not that useful, except for the few world experts who will be working on improving the existing versions.

Here is my question: given this scenario, and the fact that in the not-too-distant future we might have versions of computer vision and reinforcement learning that are as significant a leap over today's models as LLMs were over previous sequence models, what is still the truly valuable knowledge to be gained in the field? Is it still MLOps, as mentioned by Andrew before? Or is that also being similarly impacted by these recent advancements (and other advancements to come in the near future)?

  1. LLMs don't come for free. You still need to pay for usage via API. If you develop your own model, there is no additional fee for using it.
  2. LLMs aren't easy to train from scratch and deploy. You can fine-tune LLMs on your dataset, but the training is expensive when done using a 3rd party.
  3. LLMs can't be deployed on devices like mobile phones or web browsers. Deploying on these devices reduces your server bill considerably.

I agree with @balaji.ambresh's excellent feedback. For ChatGPT and many other LLM-powered applications, this is also the case in my understanding.

Regarding the second point, I would also point to these two sources / threads, since they are relevant to the question of how cost-aware training of an LLM from scratch can look:

Best regards
Christian

In addition, here is a nice post from Yann LeCun, published recently on LinkedIn, which includes a quite cool overview of the evolutionary tree with tagging of closed-source vs. open-source LLMs:

(architecture nomenclature for LLMs seems a bit unfortunate, as mentioned in the post)…

Best regards
Christian

I appreciate your comments. Based on your reply, I understand that there is room for further development in the LLM space (light versions for mobile devices, additional training with private data, etc.).

Yet, it's clear that the economies of scale of big tech companies are changing the landscape, and it'd be interesting to hear opinions on which areas of knowledge will be more valuable in this changing landscape, for example MLOps.

Have you seen this?

Yes, that's the reason I am asking the question.

I'm not aware of any LLM-specific MLOps tools.

k8s should be able to handle deployments. Tools covered in the MLOps specialization should be a good starting point for scalable training and model analysis.

This thread might be interesting for you as well, since it's about the path from RNNs to the transformer architecture, referring to highly relevant and popular papers:

Best regards
Christian

In recent weeks, LLMOps seems to be becoming more and more of a topic. This article is nice since it also touches (among other things) upon human feedback and language-model-relevant evaluation metrics like:

  • bilingual evaluation understudy (BLEU)
  • recall-oriented understudy for gisting evaluation (ROUGE)
    …
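To make the two metrics above a bit more concrete: at their core, BLEU is built on clipped n-gram precision of the candidate against the reference, while ROUGE-N is n-gram recall. Here is a minimal, simplified sketch in plain Python (no brevity penalty, single reference, one n-gram order at a time; real implementations like the ones in `nltk` or `rouge-score` do more):

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(candidate, reference, n=1):
    """BLEU-style modified n-gram precision: each candidate n-gram's
    count is clipped by its count in the reference."""
    cand, ref = ngram_counts(candidate, n), ngram_counts(reference, n)
    overlap = sum(min(c, ref[g]) for g, c in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: fraction of reference n-grams that also
    appear in the candidate (with clipping)."""
    cand, ref = ngram_counts(candidate, n), ngram_counts(reference, n)
    overlap = sum(min(c, cand[g]) for g, c in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

reference = "the cat sat on the mat".split()
candidate = "the cat is on the mat".split()
print(round(clipped_precision(candidate, reference, n=1), 2))  # 0.83
print(round(rouge_n_recall(candidate, reference, n=1), 2))     # 0.83
```

Full BLEU combines several n-gram orders with a geometric mean plus a brevity penalty, and ROUGE is usually reported as precision/recall/F1 over multiple variants (ROUGE-1, ROUGE-2, ROUGE-L), but the counting logic above is the common core.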

Best regards
Christian