Merging text from chunk window

Shawn_Chiao · June 11, 2024, 8:44pm

In the section “Adding Relationships to the SEC Knowledge Graph”, the Cypher query joins the text of the chunks within the window. However, when the document was originally split into chunks, a 200-character overlap was used. This means the merged text from a window will contain those overlapping characters twice, once at the end of one chunk and again at the start of the next. Is that correct?

With that in mind, when using a knowledge graph, is it still good practice to include an overlap when chunking documents, or is it better to rely on the window, which is adjustable at runtime?

(Or is this one of those “it depends on the use case” situations?)
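
To make the duplication concrete, here is a minimal Python sketch. It is not the course's code: it assumes a plain fixed-size character splitter with an exact overlap, and the helper names (`split_with_overlap`, `merge_naive`, `merge_dedup`) are hypothetical. Real splitters that break on separators may produce overlaps shorter than the nominal size, so the exact-slice trick below would need adjusting for them.

```python
def split_with_overlap(text, chunk_size, overlap):
    """Split text into fixed-size chunks; each chunk repeats the last
    `overlap` characters of the previous one (hypothetical helper)."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current chunk already reaches the end of the text
    return chunks


def merge_naive(chunks):
    """Concatenate chunks as-is, the way a simple join over a window would."""
    return "".join(chunks)


def merge_dedup(chunks, overlap):
    """Concatenate chunks, dropping the overlapping prefix of every chunk
    after the first. Assumes the overlap is exactly `overlap` characters."""
    if not chunks:
        return ""
    return chunks[0] + "".join(chunk[overlap:] for chunk in chunks[1:])


text = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(text, chunk_size=10, overlap=3)

naive = merge_naive(chunks)      # overlapping characters appear twice
clean = merge_dedup(chunks, 3)   # original text is recovered

print(chunks)
print(len(text), len(naive), len(clean))
```

With these toy values, the naive merge comes out longer than the original text by the overlap size for every chunk boundary, while the de-duplicating merge restores the original. Whether the extra characters matter in practice probably depends on how much context budget the merged window consumes.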
