I was thinking of generalizing what I've learned in this course by using other models that can work with other languages. What would the limitations be? Only the OpenAI helper functions?
Yes, you can use any LLM from HuggingFace. For example, one from Stability AI:
from llama_index.llms import HuggingFaceLLM
from llama_index import ServiceContext
# import torch  # needed only if you enable the float16 model_kwargs below

# system_prompt and query_wrapper_prompt are assumed to be defined earlier
llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.7, "do_sample": False},
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name="StabilityAI/stablelm-tuned-alpha-3b",
    model_name="StabilityAI/stablelm-tuned-alpha-3b",
    device_map="auto",
    stopping_ids=[50278, 50279, 50277, 1, 0],
    tokenizer_kwargs={"max_length": 4096},
    # uncomment this if using CUDA to reduce memory usage
    # model_kwargs={"torch_dtype": torch.float16},
)

service_context = ServiceContext.from_defaults(
    chunk_size=1024,
    llm=llm,
)
Which reranker model do you suggest using?
Can I still use BAAI/bge-reranker-base?
Thank you
@cosma Sure, you can. Rerankers only re-evaluate and re-order the search results you get back from the retriever.
You can also use other rerankers, for example Cohere:
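The mechanism can be sketched in plain Python with toy scoring functions (illustrative only; `retrieve` stands in for vector retrieval and `score_fn` plays the role of a cross-encoder such as BAAI/bge-reranker-base):

```python
def retrieve(query, corpus, k):
    # stand-in for embedding retrieval: score by query-word occurrence
    scored = [(sum(w in text.lower() for w in query.lower().split()), text)
              for text in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

def rerank(query, candidates, top_n, score_fn):
    # re-score each (query, candidate) pair with a stronger model,
    # re-order, and keep only the top_n results
    rescored = sorted(candidates, key=lambda t: score_fn(query, t), reverse=True)
    return rescored[:top_n]

corpus = [
    "Rerankers re-order retrieved passages.",
    "LlamaIndex supports sentence-window retrieval.",
    "Cohere offers a hosted rerank API.",
]
hits = retrieve("rerank passages", corpus, k=3)
top = rerank("rerank passages", hits, top_n=2,
             score_fn=lambda q, t: len(set(q.split()) & set(t.lower().split())))
print(top)
```

The reranker never touches the corpus itself; it only re-scores whatever the retriever handed it, which is why you can swap reranker models freely.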
import os

from llama_index.postprocessor.cohere_rerank import CohereRerank

cohere_rerank = CohereRerank(api_key=os.environ["COHERE_API_KEY"], top_n=2)

sentence_window_engine = sentence_index.as_query_engine(
    similarity_top_k=similarity_top_k,
    node_postprocessors=[cohere_rerank],
)
Great
Many thanks!