How to work with PDF files that contain tables?

Hi! First of all, thank you for making this amazing course!

I have a question about building a chatbot with LangChain. Some of my files are PDFs, and they contain a lot of tables. How can I get it to “read” those tables and split them into chunks properly, so that the chatbot can find the right answer in the tables when asked?

I really appreciate your time!

Hello @Zaesar, you could simply try converting the file to a different format.
Try Python libraries like PyPDF2 for extracting the data.
Hope this helps.

But this doesn’t answer the question of whether it will read tables. What I’ve found so far is that LangChain doesn’t handle tables in PDFs well, and I guess the same goes for other formats. Do you or anyone else have knowledge about this? I’d appreciate any help.
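
One possible direction, as a rough sketch rather than a confirmed solution: extract each table as rows first, then turn the rows into text chunks that repeat the header, so every chunk still makes sense on its own when it is retrieved. The helper name `table_to_chunks` and the `rows_per_chunk` parameter are made up for illustration, and the code assumes the rows already come from a table-aware extractor as a list of lists (the first row being the header, empty cells possibly `None`).

```python
from typing import List, Optional

Row = List[Optional[str]]


def table_to_chunks(rows: List[Row], rows_per_chunk: int = 20) -> List[str]:
    """Turn one extracted table into Markdown chunks that each repeat the header.

    Hypothetical helper: assumes rows[0] is the header row.
    A table with only a header row yields no chunks.
    """
    if not rows:
        return []

    def clean(cell: Optional[str]) -> str:
        # Empty cells may be None; cells may also contain line breaks.
        return " ".join(str(cell).split()) if cell is not None else ""

    header, *body = rows
    width = len(header)
    header_line = "| " + " | ".join(clean(c) for c in header) + " |"
    divider = "| " + " | ".join("---" for _ in header) + " |"

    chunks = []
    for start in range(0, len(body), rows_per_chunk):
        lines = [header_line, divider]
        for row in body[start:start + rows_per_chunk]:
            # Pad or trim each row so it has as many cells as the header.
            padded = list(row[:width]) + [None] * (width - len(row))
            lines.append("| " + " | ".join(clean(c) for c in padded) + " |")
        chunks.append("\n".join(lines))
    return chunks
```

For the extraction step, a table-aware library such as pdfplumber (its `page.extract_tables()` method, as far as I know, returns tables in this list-of-rows shape) may work better than plain text extraction. Each returned string could then be wrapped in a LangChain document and embedded like any other chunk. Whether this improves answers will depend on how cleanly the tables are extracted from your particular PDFs.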