This course, developed in partnership with Intel, teaches you to build an interactive system for querying video content using multimodal AI. You’ll create a sophisticated question-answering system that processes, understands, and interacts with video.
Language models and AI applications can increasingly process images, audio, and video. In this course, you will learn more about these models and applications by implementing a multimodal RAG system. You will use a multimodal embedding model to embed images and captions in a shared multimodal semantic space. Using that common space, you will build a retrieval system that returns images in response to text prompts. You will then use a Large Vision Language Model (LVLM) to generate a response from the retrieved images and text.
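As a rough illustration of the retrieval step described above, here is a minimal sketch in plain Python. It is not the course's code: the frame names, the three-dimensional vectors, and the `build_lvlm_request` helper are all hypothetical placeholders. In a real system the vectors would come from a multimodal embedding model and the request would be sent to an LVLM.

```python
import math

# Hypothetical placeholder vectors standing in for the output of a
# multimodal embedding model (real embeddings have hundreds of dimensions).
FRAME_EMBEDDINGS = {
    "frame_001.jpg": [0.9, 0.1, 0.0],
    "frame_002.jpg": [0.1, 0.8, 0.3],
    "frame_003.jpg": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, index, top_k=1):
    """Return the names of the top_k frames closest to the query."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


def build_lvlm_request(question, frame_names):
    """Hypothetical helper: bundle the question with the retrieved frames.

    A real system would pass this to a Large Vision Language Model.
    """
    return {"question": question, "images": frame_names}


# Placeholder for the embedding of a text prompt in the same space.
query_embedding = [0.05, 0.25, 0.85]
top_frames = retrieve(query_embedding, FRAME_EMBEDDINGS, top_k=2)
request = build_lvlm_request("What is the person holding?", top_frames)
print(request)
```

Because text and images share one embedding space, a text query can be compared directly against image vectors. Cosine similarity is one common choice for that comparison, and it is an assumption here rather than something the course description specifies.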
By the end of this course, you’ll have the expertise to create AI systems that can intelligently interact with video content. This skill set opens up possibilities for developing advanced search engines that understand visual context, creating AI assistants capable of discussing video content, and building automated systems for video content analysis and summarization. Whether you’re looking to enhance content management systems, improve accessibility features, or push the boundaries of human-AI interaction, the techniques learned in this course will provide a solid foundation for innovation in multimodal AI applications.
In this course, you will make API calls to access multimodal models hosted by Prediction Guard on Intel’s cloud.
We’re sorry for the trouble you’ve had finding the course. Unfortunately, it is no longer available on our platform as we are in the process of updating our content.
We sincerely apologize for any disappointment this may cause. We encourage you to explore our other courses at Courses - DeepLearning.AI
It still appears in the list at the link you provided. I was midway through watching it before it went “under maintenance”. Could you post a notice like “leaving the platform in 30 days” for the courses you’re considering retiring, and remove them from the list of available courses?
This is absolutely unacceptable. It’s extremely disappointing to see that after investing a significant amount of hard work, a course is suddenly removed without any prior notification to the students. I believe many students like me are more than halfway through the course, making this sudden change particularly frustrating.
I’m looking for the course titled Multimodal RAG: Chat with Videos on the site, but I can’t find it anymore. Could someone please let me know if this video still exists or if it has been removed? Thanks!