In the section “Adding Relationships to the SEC Knowledge Graph”, the Cypher query joins the text chunks within the window. However, the chunks were originally split with an overlap of 200 characters, so the text merged from a window will contain those overlapping characters twice. Is that correct?
With that in mind, when using a knowledge graph, is it still good practice to include the overlap when chunking the document, or is it better to rely on the window, which is adjustable at runtime?
(Or is this one of those “it depends on the use case” situations?)
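
To make the duplication I'm asking about concrete, here is a minimal Python sketch. It assumes a plain fixed-size character splitter and a naive string join, not the course's actual splitter or Cypher query; `split_with_overlap` and `join_without_overlap` are hypothetical helpers, and the sizes are shrunk (10-character chunks, 3-character overlap) so the effect is easy to see.

```python
def split_with_overlap(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Hypothetical splitter: each chunk starts with the last `overlap`
    characters of the previous chunk. Assumes len(text) > overlap."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text) - overlap, step)]


def join_without_overlap(chunks: list[str], overlap: int) -> str:
    """Hypothetical merge that drops the repeated prefix of every chunk
    after the first, so the overlap appears only once."""
    if not chunks:
        return ""
    return chunks[0] + "".join(chunk[overlap:] for chunk in chunks[1:])


text = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(text, chunk_size=10, overlap=3)

naive = "".join(chunks)                       # what a plain join of a window would give
deduped = join_without_overlap(chunks, overlap=3)

print(chunks)        # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
print(len(naive))    # 35: 9 characters longer than the 26-character original
print(deduped)       # abcdefghijklmnopqrstuvwxyz
```

With four chunks there are three boundaries, and each one repeats the 3 overlapping characters, which accounts for the 9 extra characters in the naive join. Trimming a fixed prefix like this only works if the overlap is exactly the same length at every boundary, which may not hold for splitters that break on separators.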