Looking for insights on RAG from a PDF

Hello Everyone,

I am trying to implement RAG on my own, based on the RAG course on Coursera. The course skips the step of actually vectorizing the document(s), and that is where I currently am. I am not planning to use LangGraph and want to do it with the Weaviate API directly. Has anyone done this? What should I do? I have a PDF file and plan to use text-embedding-3-small for the embeddings.

Also, why does the course need to run a Flask app to do the same thing?

Thanks in advance.

Hi! Thanks for your message. We have an Introduction to the Weaviate API, which shows a general example of how to vectorize documents.

Weaviate integrates with OpenAI, so you can use these models directly.
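For illustration, here is a minimal sketch of that flow for a single PDF. It is not the course's code. It assumes the v4 `weaviate-client` Python package, `pypdf` for text extraction, a Weaviate instance reachable on localhost, and an OpenAI API key. The collection name `PdfChunk`, the property names, and the chunk sizes are hypothetical choices, and parameter names such as `vectorizer_config` may differ between client versions, so check them against the version you have installed.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into overlapping character windows (sizes are illustrative)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size].strip()
        if piece:
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def read_pdf_text(path: str) -> str:
    """Extract plain text from every page of a PDF."""
    from pypdf import PdfReader  # third-party, assumed installed: pip install pypdf

    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def index_pdf(path: str, openai_api_key: str, collection_name: str = "PdfChunk") -> int:
    """Chunk a PDF and store it in Weaviate, letting Weaviate call OpenAI for vectors."""
    # Third-party, assumed installed: pip install weaviate-client (v4 API)
    import weaviate
    from weaviate.classes.config import Configure, DataType, Property

    chunks = chunk_text(read_pdf_text(path))
    # Assumes Weaviate is running locally on its default ports.
    client = weaviate.connect_to_local(headers={"X-OpenAI-Api-Key": openai_api_key})
    try:
        collection = client.collections.create(
            name=collection_name,
            # Weaviate embeds each object server-side with this OpenAI model.
            vectorizer_config=Configure.Vectorizer.text2vec_openai(
                model="text-embedding-3-small"
            ),
            properties=[
                Property(name="text", data_type=DataType.TEXT),
                Property(name="chunk_index", data_type=DataType.INT),
            ],
        )
        collection.data.insert_many(
            [{"text": c, "chunk_index": i} for i, c in enumerate(chunks)]
        )
    finally:
        client.close()
    return len(chunks)
```

Because the vectorizer is configured on the collection, the client only sends text; the embedding calls happen inside Weaviate. Calling something like `index_pdf("my_file.pdf", os.environ["OPENAI_API_KEY"])` would then make the chunks searchable with a near-text query.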

Unfortunately, due to limitations in the Coursera environment, we used a small Flask app to run a local model for document vectorization to demonstrate the whole process. In a production environment, you’d likely use a managed provider like OpenAI instead.

Let me know if you still have questions!

Cheers,
Lucas

Hello, thanks for your reply. I tried to replicate the project on my Windows system and realized that I have to use Docker to run the Weaviate service, and that the embedded option only works on Linux and Mac. Am I right in understanding that Docker is used as an intermediary to both create and maintain the DB, even in production? Also, how standard is it to use Docker in RAG applications?