C1M2: Implementing Retriever Functions in a RAG System - Exercise 1

# Index the tokenized query with the retriever (as per assignment)
BM25_RETRIEVER.index(None)

The hint asks us to index the tokenized query with the retriever. I passed in “tokenized_query”, but it does not work.

I also tried using the global variable provided, but that does not work either.
BM25_RETRIEVER.index(TOKENIZED_DATA)

Any help?

Thanks

Hi! Your approach of indexing the TOKENIZED_DATA object is correct. As a next step, you should call the BM25_RETRIEVER.retrieve function with the appropriate parameters (here you pass the tokenized query and the appropriate value for the top k).
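To make the two steps concrete, here is a minimal sketch using a toy pure-Python BM25 class. `ToyBM25`, its parameters, and the sample documents are all assumptions made for illustration. It is not the retriever class used in the notebook, only something with the same index-then-retrieve shape.

```python
import math
from collections import Counter


class ToyBM25:
    """Tiny pure-Python BM25 stand-in (hypothetical; not the notebook's retriever)."""

    def __init__(self, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = []

    def index(self, tokenized_corpus):
        # Indexing stores the corpus and precomputes per-document statistics.
        self.docs = [Counter(doc) for doc in tokenized_corpus]
        self.lengths = [len(doc) for doc in tokenized_corpus]
        self.avg_len = sum(self.lengths) / len(self.lengths)
        # Document frequency: in how many documents each term appears.
        self.df = Counter(term for doc in self.docs for term in doc)

    def retrieve(self, tokenized_query, k):
        # Score every indexed document against the query, then keep the top k.
        n = len(self.docs)
        scores = []
        for tf, length in zip(self.docs, self.lengths):
            score = 0.0
            for term in tokenized_query:
                if term not in tf:
                    continue
                idf = math.log((n - self.df[term] + 0.5) / (self.df[term] + 0.5) + 1)
                norm = tf[term] + self.k1 * (1 - self.b + self.b * length / self.avg_len)
                score += idf * tf[term] * (self.k1 + 1) / norm
            scores.append(score)
        top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
        return top, [scores[i] for i in top]


# Step 1: index the tokenized corpus (the data), not the query.
TOKENIZED_DATA = [
    ["cats", "chase", "mice"],
    ["dogs", "chase", "cats"],
    ["stocks", "fell", "today"],
]
retriever = ToyBM25()
retriever.index(TOKENIZED_DATA)

# Step 2: pass the tokenized query and k to retrieve().
tokenized_query = ["dogs", "chase"]
top_indices, top_scores = retriever.retrieve(tokenized_query, k=2)
print(top_indices, top_scores)
```

The query only ever reaches `retrieve()`; `index()` sees the corpus and nothing else.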

Let me know if this helps.

Thanks
Lucas


The hint says:

“Index the tokenized query with the retriever (as per assignment)”

However, I don’t think we should index the tokenized query with the retriever—shouldn’t we only index the data, not the query?

Also, in the code, TOKENIZED_DATA is indexed globally, so it seems unnecessary to index TOKENIZED_DATA again locally within my function.

Could someone clarify:

  1. Are we actually supposed to index the tokenized query with the retriever, or just use it for retrieval/search?
  2. If TOKENIZED_DATA is already globally indexed, is there any need to index it again locally?

Thanks for any help!


Hi.

  1. No, we are not supposed to index the tokenized query with the retriever.
  2. TOKENIZED_DATA is already indexed globally, so ideally there should be no need to index it again locally. The requirement in the graded cell seems to be redundant.
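A rough sketch of the "index once globally, only search inside the function" pattern described in the two answers above. `OverlapRetriever`, `retrieve_top_k`, and the sample data are hypothetical stand-ins invented for this example, and the scoring is a simple shared-token count rather than BM25.

```python
class OverlapRetriever:
    """Hypothetical stand-in that scores documents by shared-token count."""

    def __init__(self):
        self.corpus = []
        self.index_calls = 0

    def index(self, tokenized_corpus):
        # Each call replaces whatever was indexed before.
        self.corpus = [set(doc) for doc in tokenized_corpus]
        self.index_calls += 1

    def retrieve(self, tokenized_query, k):
        query = set(tokenized_query)
        scores = [len(query & doc) for doc in self.corpus]
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return ranked[:k]


TOKENIZED_DATA = [
    ["rag", "uses", "retrieval"],
    ["bm25", "ranks", "documents"],
    ["llm", "generates", "text"],
]

# Indexed once, at module level.
RETRIEVER = OverlapRetriever()
RETRIEVER.index(TOKENIZED_DATA)


def retrieve_top_k(tokenized_query, top_k):
    # No index() call here: the function only searches the existing index.
    return RETRIEVER.retrieve(tokenized_query, k=top_k)


first = retrieve_top_k(["bm25", "documents"], top_k=1)
second = retrieve_top_k(["llm", "text"], top_k=1)
print(first, second, RETRIEVER.index_calls)
```

Both queries are answered from the same index, and the index is built a single time.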

Hi, this assignment is really giving me trouble… I tried to follow the steps exactly as instructed, but I run into errors. I tried commenting out this line:

# Index the tokenized chunks with the retriever
# BM25_RETRIEVER.index(tokenized_query)

I believe I am calling the retrieve method correctly:

# Use the ‘BM25_RETRIEVER’ to retrieve documents and their scores based on the tokenized query
# Retrieve the top ‘k’ documents
results, scores = BM25_RETRIEVER.retrieve(tokenized_query, k=top_k)

but I still encounter errors… I have been at it for a while now.

I get errors telling me that my corpus size is 1.
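One plausible reading of the "corpus size is 1" message: if the query is passed to index(), the index then holds a single document (the query itself), so asking for more than one result cannot succeed. The sketch below illustrates this with a hypothetical `CheckedRetriever` whose size check and error text are assumptions for the example, not the notebook retriever's actual implementation or wording.

```python
class CheckedRetriever:
    """Hypothetical retriever that rejects k larger than the indexed corpus."""

    def __init__(self):
        self.corpus = []

    def index(self, tokenized_corpus):
        # Replaces the previous index entirely.
        self.corpus = list(tokenized_corpus)

    def retrieve(self, tokenized_query, k):
        if k > len(self.corpus):
            raise ValueError(f"k={k} exceeds corpus size {len(self.corpus)}")
        query = set(tokenized_query)
        scores = [len(query & set(doc)) for doc in self.corpus]
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return order[:k]


retriever = CheckedRetriever()
retriever.index([["a", "b"], ["b", "c"], ["c", "d"]])  # three documents

tokenized_query = ["b", "c"]
retriever.index([tokenized_query])  # the mistake: the corpus is now the query alone

try:
    retriever.retrieve(tokenized_query, k=3)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Removing the second index() call leaves the three-document corpus in place, and the same retrieve() call then succeeds.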

It works! I commented out the problematic line and fixed how I was returning the indices…


I’m having a similar issue. Everything works fine when I execute my cells, and all the tests pass; however, when I submit my assignment I get the following error for all graded cells:

There was a problem compiling the code from your notebook, please check that you saved before submitting. Details: name 'BM25_RETRIEVER' is not defined

Is anyone familiar with it? Thanks in advance.
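The message quoted above is an ordinary Python NameError. A minimal sketch of how it can arise, assuming (hypothetically) that the graded function is executed somewhere the code defining BM25_RETRIEVER was never run; how the grader actually executes the notebook is not stated in this thread. The function name `retrieve_docs` is made up for the example.

```python
def retrieve_docs(tokenized_query, top_k):
    # BM25_RETRIEVER is looked up as a global only when the function runs,
    # so defining the function succeeds even though the name does not exist yet.
    return BM25_RETRIEVER.retrieve(tokenized_query, k=top_k)


try:
    retrieve_docs(["some", "query"], top_k=2)
    message = None
except NameError as exc:
    message = str(exc)

print(message)
```

This is why the cells can pass locally, where the global was created by an earlier cell, yet fail in an environment that lacks it.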

Hey @jahanzebnawaz, the code for this exercise has just been updated and fixed on Coursera; refreshing your workspace should give you access to the fixed copy of the notebook. Let me know if that solves your issue.

Would it be possible to get the raw copy of the code, from before my edits? Unfortunately I messed it up, and I am not sure what is wrong.

Hi @nazarb, yes, you can get a fresh new copy of all files in your lab by following the instructions here.