✨ New course! Enroll in Document AI: From OCR to Agentic Doc Extraction

:arrow_forward: Enroll Now!

Join this new short course on Document AI, built with LandingAI and taught by David Park, Senior Director of Applied AI, and Andrea Kropp, Applied AI Engineer at LandingAI.

Much of the world’s data is locked in PDFs, JPEGs, and other document formats. Traditional OCR extracts text but loses critical information: the layout of tables with merged cells, the relationship between charts and captions, and the reading order of multi-column layouts. This course shows you how to build agentic workflows that process documents the way humans do: breaking them into parts, examining each piece carefully, and extracting information through multiple iterations.

You’ll start by exploring traditional OCR. After understanding its limitations, you’ll build agents equipped with additional document-processing tools such as layout detection, reading-order detection, and multimodal reasoning models. Next, you’ll learn to use the Agentic Document Extraction (ADE) framework from LandingAI to automate this workflow. ADE treats documents as visual objects: it uses custom models to parse complex elements and grounds extracted fields to precise locations on the page. Finally, you’ll integrate ADE into RAG applications and deploy them as production-ready pipelines on AWS.
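
To get a feel for why reading order matters on multi-column pages, here is a toy sketch in plain Python. It is not ADE and not code from the course: the `Block` type, the `naive_order` and `column_aware_order` helpers, the `column_gap` threshold, and the coordinates are all hypothetical, chosen only to show how sorting text blocks row by row can interleave two columns while a simple column-grouping heuristic keeps each column together.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A hypothetical text block with the position of its top-left corner."""
    text: str
    x: float  # left edge, in page units
    y: float  # top edge, in page units (y grows downward)


def naive_order(blocks):
    """Read top-to-bottom, then left-to-right, ignoring columns."""
    return [b.text for b in sorted(blocks, key=lambda b: (b.y, b.x))]


def column_aware_order(blocks, column_gap=100.0):
    """Group blocks into columns by left edge, then read each column downward.

    Assumption: a jump of at least `column_gap` between left edges
    marks the start of a new column. Real layouts need more robust logic.
    """
    columns = []
    for b in sorted(blocks, key=lambda b: b.x):
        if columns and b.x - columns[-1][-1].x < column_gap:
            columns[-1].append(b)
        else:
            columns.append([b])
    return [b.text for col in columns for b in sorted(col, key=lambda b: b.y)]


# A two-column page: column A on the left, column B on the right.
blocks = [
    Block("A1", 50, 100), Block("B1", 400, 100),
    Block("A2", 50, 200), Block("B2", 400, 200),
]

print(naive_order(blocks))         # ['A1', 'B1', 'A2', 'B2']
print(column_aware_order(blocks))  # ['A1', 'A2', 'B1', 'B2']
```

The row-by-row ordering mixes sentences from the two columns, which is the kind of loss the course description attributes to plain text extraction. The column-aware version is only a heuristic for clean two-column pages; tables, figures, and captions are where dedicated layout and reading-order models become necessary.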