What makes deploying a GenAI model much easier?

In Week 2, video 1, "Using generative AI in software applications", Andrew says that the time to deploy traditional supervised learning models can be months, whereas for prompt-based development it can be hours or days.

What is the reason for this? I don’t quite understand how Generative AI differs in terms of deployment compared to traditional AI. Is it because the LLM is not hosted in your own server?


Thanks for this question, Tajinder. The differences between genAI and traditional AI lie not only in the hosting environment, but also in the training approach itself, the flexibility of the models across tasks, and the way they are deployed. Traditional AI applications involve extensive training and batch processing, i.e. data is processed in chunks of a previously collected and labelled dataset specific to the task at hand, rather than genAI’s real-time response generation. You also have to experiment with different architectures, hyperparameters, and optimization strategies.
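To make the "extensive training" point concrete, here is a minimal sketch of the batch-training workflow described above: a previously collected, labelled toy dataset, a hyperparameter you have to tune, and a training loop that must converge before you can even think about deployment. Everything here is illustrative (plain gradient descent on a one-parameter linear model), not any specific course code.

```python
# Toy "previously collected and labelled" dataset: y = 2x.
data = [(x, 2.0 * x) for x in range(1, 21)]

# Hyperparameters you would typically have to experiment with.
learning_rate = 0.0005
epochs = 100

# Batch gradient descent for a 1-D linear model y = w * x:
# every step processes the whole labelled dataset at once.
w = 0.0
for _ in range(epochs):
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= learning_rate * grad

print(round(w, 3))  # converges to the true slope, w ≈ 2.0
```

In a real supervised project this loop is preceded by data collection and labelling, and repeated across many architectures and hyperparameter settings, which is where the months go.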


LLMs are incredibly difficult and expensive to train, every bit as much as, and perhaps even more so than, the “traditional supervised learning models” that Prof Ng mentions. But the point is that when you develop the kind of prompt-based system these short courses are about, you don’t have to do any training of the base LLM you are using: that has already been done for you. You are basically putting a prompt interface layer on top of the underlying trained LLM and then optionally pointing it at your own dataset.
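A minimal sketch of that "prompt interface layer": the application code only formats a prompt and forwards it to an already-trained model. The `call_llm` function below is a stub so the example is self-contained; in practice it would be an HTTP request to a hosted model API, and all names here are illustrative.

```python
def call_llm(prompt: str) -> str:
    """Stand-in for a call to a hosted, already-trained LLM.
    Stubbed locally so this sketch runs without network access."""
    return f"[model answer to: {prompt!r}]"

def classify_sentiment(review: str) -> str:
    # The "development" work is just writing the prompt;
    # no data collection or training step is involved.
    prompt = (
        "Classify the sentiment of the following product review "
        "as positive or negative.\n\nReview: " + review
    )
    return call_llm(prompt)

print(classify_sentiment("The battery died after two days."))
```

This is why the build step can shrink from months to hours: the expensive part (training the base model) happened before you started.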


Thank you for your response 🙂
I was looking very strictly at the three steps Andrew mentioned: data collection, training, and deployment. So I bunched the topics you mentioned under training, and was then wondering why deployment would still take a lot of time for traditional AI compared to GenAI. But I suppose the deploy step would also include some monitoring and iteration on the design.

The point you made about flexibility also seems key. Something like RAG is fantastic, because you don’t need to train the model; you can simply provide it with task-specific context and let it reason over that.
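The RAG idea can be sketched in a few lines: retrieve the most relevant document for a question, then paste it into the prompt as context, with no training anywhere. The retrieval here is a deliberately crude word-overlap ranking (real systems use embeddings), and the names are illustrative.

```python
def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Rank documents by crude word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        documents,
        key=lambda d: -len(q_words & set(d.lower().split())),
    )
    return scored[:k]

def build_rag_prompt(question: str, documents: list[str]) -> str:
    # Retrieved text goes into the prompt; the model is never retrained.
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Returns are accepted within 30 days of purchase.",
    "Our headquarters are located in Amsterdam.",
]
prompt = build_rag_prompt("How many days do I have to return a purchase?", docs)
print(prompt)
```

Updating the system's knowledge then means editing `docs`, not rerunning a training pipeline, which is a big part of the deployment-speed difference.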


Yes, that part is clear, thanks 🙂
I was a bit confused about the differences in the deploy stage: why it takes months to deploy a traditional AI application compared to days for a GenAI application. I guess this is due to the iterations you might need after deploying and monitoring, which could take much longer for a traditional AI application, plus maintenance and so on, I suppose.


I haven’t listened to that specific lecture, but here are a couple of guesses:

  1. Maybe when Prof Ng said “deploy” in that context, he was really talking about the whole process, starting from the point you decide to build a solution to a particular problem.
  2. But if you really mean what we normally think of as just deploying an existing ML/DL solution, maybe that is somewhat simpler because ChatGPT and other such engines already offer a web-based interface and hosted APIs, whereas if you want to deploy a ConvNet solution or some other kind of DL model, you have to build and host that serving infrastructure yourself.