This course, developed in partnership with Intel, teaches you to build an interactive system for querying video content using multimodal AI. You’ll create a sophisticated question-answering system that processes, understands, and interacts with video.
Language models and AI applications are increasingly able to process images, audio, and video. In this course, you will learn how these models work by implementing a multimodal RAG system: you will use a multimodal embedding model to embed images and their captions in a shared semantic space, build a retrieval system on that space that returns images in response to text prompts, and pass the retrieved images and text to a Large Vision Language Model (LVLM) to generate a response.
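The course builds this pipeline with its own embedding model and retrieval stack; purely as an illustration of the shared-space idea, here is a minimal sketch using CLIP via the Hugging Face `transformers` library. The model name, frame paths, and query below are illustrative assumptions, not the course's actual setup:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# CLIP embeds images and text into one shared semantic space.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Embed a few video frames (hypothetical file paths).
frames = [Image.open(p) for p in ["frame_000.jpg", "frame_001.jpg"]]
image_inputs = processor(images=frames, return_tensors="pt")
with torch.no_grad():
    image_embs = model.get_image_features(**image_inputs)
image_embs = image_embs / image_embs.norm(dim=-1, keepdim=True)

# Embed a text query into the same space.
text_inputs = processor(text=["a person cooking pasta"], return_tensors="pt", padding=True)
with torch.no_grad():
    text_emb = model.get_text_features(**text_inputs)
text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)

# Retrieval: cosine similarity ranks frames against the text prompt.
scores = (text_emb @ image_embs.T).squeeze(0)
best = scores.argmax().item()
print(f"Best-matching frame: index {best}, score {scores[best].item():.3f}")
```

In a full multimodal RAG system, the top-ranked frames (and their captions) would then be passed to the LVLM along with the user's question to generate a grounded answer.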
By the end of this course, you’ll have the expertise to create AI systems that can intelligently interact with video content. This skill set opens up possibilities for developing advanced search engines that understand visual context, creating AI assistants capable of discussing video content, and building automated systems for video content analysis and summarization. Whether you’re looking to enhance content management systems, improve accessibility features, or push the boundaries of human-AI interaction, the techniques learned in this course will provide a solid foundation for innovation in multimodal AI applications.
In this course, you will make API calls to access multimodal models hosted by Prediction Guard on Intel’s cloud.
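To give a sense of what such an API call can look like, here is a hedged sketch of an OpenAI-style chat request to a hosted multimodal model. The endpoint path, model identifier, and payload shape below are assumptions modeled on common chat-completion APIs, not Prediction Guard's documented interface; the course notebooks show the exact client usage:

```python
import os
import requests

# Assumed endpoint and payload shape, modeled on OpenAI-style chat APIs.
# See the course notebooks for Prediction Guard's actual client and models.
API_URL = "https://api.predictionguard.com/chat/completions"  # assumed
headers = {"Authorization": f"Bearer {os.environ['PREDICTIONGUARD_API_KEY']}"}

payload = {
    "model": "llava-1.5-7b-hf",  # assumed model identifier
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is happening in this video frame?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/frame.jpg"}},
            ],
        }
    ],
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```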