Can LangChain figure out PDF Page Numbers?

Hi guys,

Things are going really well with this - I was even able to demo a POC at work this week, and I’m planning to pursue a career in this area.

Just wondering - does anyone know a reliable way to use LangChain to keep track of the original PDF page number for each piece of extracted information? I’m finding it works sometimes, but it can also be off by one or two pages. My guess is that this happens because the chunks sometimes get cut off in odd places.
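
To make the chunking guess concrete, here is a rough, library-free sketch of one way to keep page numbers exact: split each page's text on its own, so no chunk can span two pages, and attach the page number to every chunk. The `Chunk` class, the `chunk_pages` helper, and the sample `pages` list are made up for illustration and are not LangChain APIs. As far as I know, LangChain's `PyPDFLoader` returns one document per page with a zero-based `page` value in its metadata, so a consistent off-by-one could also come from that rather than from the chunking.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    page: int  # 1-based page number in the original PDF


def chunk_pages(pages, chunk_size=200, overlap=20):
    """Split each page's text separately so no chunk crosses a page boundary.

    `pages` is a list of strings, one per PDF page, in page order.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    # enumerate from 1 so the stored number matches what a PDF viewer shows
    for page_number, text in enumerate(pages, start=1):
        for start in range(0, len(text), step):
            piece = text[start:start + chunk_size]
            if piece.strip():  # skip empty or whitespace-only pieces
                chunks.append(Chunk(text=piece, page=page_number))
    return chunks


# Hypothetical per-page text, standing in for whatever the PDF loader returns.
pages = ["Intro text on page one. " * 5, "Details on page two. " * 5]
chunks = chunk_pages(pages, chunk_size=50, overlap=10)
for chunk in chunks:
    print(chunk.page, repr(chunk.text[:20]))
```

The trade-off is that a sentence running over a page break ends up in two chunks, but every chunk then maps to exactly one page. If the numbers still disagree with what is printed on the pages, the PDF may have unnumbered front matter, so the printed number and the physical page index differ.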