Hi everyone,
I’m currently going through the RAG course and noticed something in the get_chunks_fixed_size_with_overlap function that I’d like to clarify.
In the current implementation, the loop steps by chunk_size, but the overlap is added by extending the start index backwards:
for i in range(0, len(text_words), chunk_size):
chunk_words = text_words[max(i - overlap_int, 0): i + chunk_size]
This means that from the second chunk onwards, each chunk actually contains chunk_size + overlap_int words instead of a fixed chunk_size. For example, with chunk_size=10 and overlap_fraction=0.2:
- Chunk 1: words[0:10] → 10 words
- Chunk 2: words[8:20] → 12 words
- Chunk 3: words[18:30] → 12 words
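To make this concrete, here is a minimal sketch reproducing the behavior (the variable names mirror the lab code, but the dummy word list is mine):

```python
# Toy input: 30 dummy "words"
text_words = [f"w{n}" for n in range(30)]
chunk_size = 10
overlap_int = int(chunk_size * 0.2)  # overlap_fraction = 0.2 -> 2 words

chunks = []
for i in range(0, len(text_words), chunk_size):
    # Overlap is added by extending the start index backwards, so every
    # chunk after the first picks up overlap_int extra words.
    chunks.append(text_words[max(i - overlap_int, 0): i + chunk_size])

print([len(c) for c in chunks])  # [10, 12, 12]
```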
However, the diagram in section 2.2 shows chunks of equal/fixed size, where the step size is reduced to maintain the overlap while keeping chunk size constant.
To match the diagram, I believe the implementation should be:
step = chunk_size - overlap_int
for i in range(0, len(text_words), step):
chunk_words = text_words[i : i + chunk_size]
This way, every chunk stays at exactly chunk_size words, and the overlap is achieved by reducing the step between chunks.
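A quick sketch of this variant on the same toy input (again, the dummy word list is mine, not the lab's):

```python
text_words = [f"w{n}" for n in range(30)]
chunk_size = 10
overlap_int = 2
step = chunk_size - overlap_int  # advance by 8, so consecutive chunks share 2 words

chunks = [text_words[i: i + chunk_size]
          for i in range(0, len(text_words), step)]

# Every chunk has exactly chunk_size words, except possibly the final one,
# which can be shorter when the text runs out.
print([len(c) for c in chunks])  # [10, 10, 10, 6]
```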
Is this intentional, or could it be a small bug? Would love to hear thoughts from the instructors or other learners.
Thanks!
Hi @ulyaaliyeva206,
Great focused learning is a sign of enjoying what you are learning. The instructor really did explain the RAG techniques in detail.
As far as I remember from when I took the course, the reason for including the end of the previous chunk in the next one is mainly to reduce semantic fragmentation, maintain contextual continuity, and prevent hallucinations, thereby improving overall retrieval accuracy.
The instructor, @Zain_Hassan, mentions this in one of the videos, but I cannot remember the particular video title.
Regards
Dr. Deepti
Hi @Deepti_Prasad, thanks for the response!
I totally agree that overlap is important for maintaining context and improving retrieval accuracy; that's not what I'm questioning.
My point is specifically about the chunk size growing. With the current implementation:
- Chunk 1: 10 words (correct)
- Chunk 2: 12 words (chunk_size + overlap)
- Chunk 3: 12 words (chunk_size + overlap)
The overlap is achieved by extending the chunk backwards, which makes it larger. But the diagram in section 2.2 shows fixed-size chunks where the overlap is achieved by reducing the step instead.
Both approaches give you overlap, but:
- Current code: chunks become bigger than chunk_size
- Diagram approach: chunks stay exactly chunk_size
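A side-by-side sketch makes the difference easy to see (toy data and names of my own, not the lab code):

```python
words = list(range(30))
chunk_size, overlap = 10, 2

# Current code: overlap by extending the chunk start backwards
extend_back = [words[max(i - overlap, 0): i + chunk_size]
               for i in range(0, len(words), chunk_size)]

# Diagram approach: overlap by reducing the step between chunks
reduce_step = [words[i: i + chunk_size]
               for i in range(0, len(words), chunk_size - overlap)]

print([len(c) for c in extend_back])  # [10, 12, 12]
print([len(c) for c in reduce_step])  # [10, 10, 10, 6]
```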
So the question isn’t “should we have overlap?” (yes!) but rather “should chunks stay at fixed size while overlapping?” — because the diagram shows one thing and the code does another.
Small difference, but wanted to flag it for accuracy!

I did see the diagram, but I read the instructions in that diagram as matching the lab code's implementation.
Probably the figure needs to be updated.
I will inform the staff. Thank you for flagging this.
Regards
Dr. Deepti