Hello, while studying lesson 4 of week 3 I encountered the following hallucination by the LLM:
I asked the LLM to provide an HTML output for rio_de_janeiro.txt:
prompt = f"“”
Given the following journal entry from a food critic, identify the restaurants and their best dishes.
Highlight and bold each restaurant (in orange) and best dish (in blue) within the original text.
Highlight and bold each dessert (in green) within the original text.
Add a relevant emoji beside any ingredients within the original text.
Provide the output as HTML suitable for display in a Jupyter notebook.
Journal entry:
{journal_rio_de_janeiro}
“”"
At the very end of the response I get:
"…
For dessert, I couldn’t resist trying the torta de limão (lemon pie) at Confeitaria Colombo, which was a refreshing finish to a delightful meal."
Unfortunately this text is not in the rio_de_janeiro.txt file!
I found out that the instruction
Highlight and bold each dessert (in green) within the original text.
caused the LLM to invent some additional text.
Personally, I can understand dreaming about Rio de Janeiro, but … what can I do about it?
Hello, here is what I have put together on this topic.
To control hallucinations in your AI response, you can enhance your prompts to ground them specifically to a provided text file. Here are some techniques:
1. Clear and Explicit Instructions
- Specify Source Material: Clearly instruct the AI to use only the information from the provided file. For example:
  - "Use only the information from the attached file to answer the following questions."
  - "Refer exclusively to the content of the provided document when generating your response."
- Handle Unavailable Information: Direct the AI on how to respond when the information is not present in the file (both instructions are combined in the sketch below):
  - "If the answer is not available in the file, please respond with 'I do not know.'"
2. Prompting Techniques
3. Advanced Techniques for Real Grounding
While the above methods are effective for basic grounding and reducing hallucinations, achieving robust grounding often requires more advanced techniques that go beyond the prompt wording itself.
4. Practical Approach for Beginners
For now, focusing on modifying prompts to influence the AI model’s behavior is the most straightforward and often sufficient technique. Here’s how you can practice:
- Iterative Prompt Refinement: Experiment with different prompt phrasings to see which yields the most accurate and relevant responses.
- Feedback Loops: Assess the AI’s responses and provide corrective feedback to fine-tune its performance.
- Template Creation: Develop prompt templates that consistently enforce the use of the provided file as the sole information source (see the sketch below).
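As a small illustration of template creation, a reusable template could look like this (just a sketch; the exact wording is one of many options):

```python
# Sketch of a reusable prompt template that enforces the file as the only source.
GROUNDED_TEMPLATE = (
    "Use only the information from the provided text below.\n"
    "If something is not covered by the text, answer 'I do not know.'\n\n"
    "--- Text ---\n{context}\n--- End of Text ---\n\n"
    "Task: {task}"
)

def build_grounded_prompt(context: str, task: str) -> str:
    """Fill the template with the file content and the task description."""
    return GROUNDED_TEMPLATE.format(context=context, task=task)
```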
Code Example
Here’s an example of how you can modify your prompt to ground the AI model to a specific text file using Python:
```python
import openai


def load_context(file_path):
    """
    Reads the content of the provided file.

    Args:
        file_path (str): Path to the text file containing context.

    Returns:
        str: Content of the file.
    """
    with open(file_path, 'r') as file:
        return file.read()


def create_prompt(context, user_query):
    """
    Constructs a prompt that includes system messages, context, and user queries.

    Args:
        context (str): The contextual information from the file.
        user_query (str): The user's question or request.

    Returns:
        str: The complete prompt to be sent to the LLM.
    """
    system_message = (
        "You are an AI assistant specialized in prompt engineering techniques. "
        "Use the following context to answer the user's query accurately."
    )
    prompt = (
        f"{system_message}\n\n"
        f"--- Context ---\n"
        f"{context}\n"
        f"--- End of Context ---\n\n"
        f"User Query: {user_query}\n"
        f"Assistant:"
    )
    return prompt


def get_llm_response(prompt, api_key, model="gpt-4", max_tokens=150, temperature=0.7):
    """
    Sends the prompt to the OpenAI API and retrieves the response.

    Args:
        prompt (str): The prompt to send to the LLM.
        api_key (str): Your OpenAI API key.
        model (str): The model to use (default is "gpt-4").
        max_tokens (int): Maximum number of tokens in the response.
        temperature (float): Sampling temperature.

    Returns:
        str: The AI's response.
    """
    # gpt-4 is a chat model, so use the chat completions endpoint (openai>=1.0).
    client = openai.OpenAI(api_key=api_key)
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        temperature=temperature,
        n=1,
        stop=None,  # You can define stop sequences if needed
    )
    return response.choices[0].message.content.strip()


def main():
    # Path to your context file
    context_file = 'context.txt'

    # Load context from the file
    context = load_context(context_file)

    # Define the user's query
    user_query = "How can I improve the reliability of AI-generated content?"

    # Create the complete prompt
    prompt = create_prompt(context, user_query)

    # Your OpenAI API key
    api_key = 'your_openai_api_key_here'  # Replace with your actual API key

    # Get the LLM's response
    answer = get_llm_response(prompt, api_key)

    # Print the response
    print("Assistant:", answer)


if __name__ == "__main__":
    main()
```
Here is an example of how to improve the prompt you mentioned using these techniques:
prompt = f"""
You are an AI assistant specialized in analyzing food critic journal entries.
**Task:**
Given the following journal entry from a food critic, perform the following actions:
1. **Identify Restaurants and Their Best Dishes:**
- **Restaurants:** Highlight and bold each restaurant name in **orange**.
- **Best Dishes:** Highlight and bold each best dish in **blue**.
- **Desserts:** Highlight and bold each dessert in **green**.
2. **Annotate Ingredients:**
- Add a relevant emoji beside any ingredient mentioned within the text.
3. **Output Format:**
- Provide the output as **HTML** suitable for display in a **Jupyter Notebook**.
**Journal Entry:**
{journal_rio_de_janeiro}
**Formatting Guidelines:**
- Use HTML `<span>` tags with inline CSS for coloring and bolding text.
- Example for restaurant: `<span style="color: orange; font-weight: bold;">Restaurant Name</span>`
- Example for best dish: `<span style="color: blue; font-weight: bold;">Best Dish</span>`
- Example for dessert: `<span style="color: green; font-weight: bold;">Dessert</span>`
- Place emojis immediately after the ingredient they represent.
- Example: `tomato 🍅`
"""
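To run this prompt and render the result in a notebook, you could reuse the helper from the earlier example; a minimal sketch (it assumes the get_llm_response() function defined above and a placeholder API key):

```python
# Sketch: send the improved prompt and render the returned HTML in Jupyter.
from IPython.display import HTML, display

api_key = 'your_openai_api_key_here'  # replace with your actual key
html_answer = get_llm_response(prompt, api_key, max_tokens=1000)
display(HTML(html_answer))
```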
Hello Nicolas,
thank you very much for this valuable information.
I used your prompt for the “Highlight” exercise of lesson 4 (week 3) and it works very well. The LLM no longer hallucinates.
Thank you also for the overview of Grounding techniques.
You also provided some code for grounding the LLM using a context file. As I understand it, the user_query gets enriched with a system_message and that context file?
What would such a context file look like? Could I use your prompt example as a context file?
I got a bit confused by the user_query “How can I improve the reliability of AI-generated content?” in that code.
Was it meant as an example of an arbitrary user_query, or was the idea to first let the LLM work as a prompt engineer and optimize a non-optimized user prompt? For example, first let it optimize my prompt:
prompt = f"“” As a prompt engineer optimize the following prompt:
Prompt:
Given the following journal entry from a food critic, identify the restaurants and their best dishes.
Highlight and bold each restaurant (in orange) and best dish (in blue) within the original text.
Highlight and bold each dessert (in green) within the original text.
Add a relevant emoji beside any ingredients within the original text.
Provide the output as HTML suitable for display in a Jupyter notebook.
Journal entry:
{journal_rio_de_janeiro}
“”"
and then use that LLM-optimized prompt in a second step?
And what would a context file look like in this case? Or could I use your prompt example as a context file here as well?
I found that this highlighting exercise is not easy; tokyo.txt is also quite tricky.
Hello Jochen, glad it was useful. The context file is just a file with text that goes into the prompt of the model; as you said, it can go in the system message or simply in a user message (some models do not have a system message). In the end it is just text in the model input.
And your idea is great: you can add intermediate steps to review your prompt or to review the LLM's answer. For example, you can give the LLM's answer as context for the next question; that technique is usually called prompt chaining.
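A rough sketch of that two-step idea, reusing the get_llm_response() helper from my earlier example (raw_prompt and api_key are placeholders you would define yourself):

```python
# Sketch of prompt chaining with the get_llm_response() helper shown earlier.
# raw_prompt = the original (non-optimized) prompt string; api_key = your key.

# Step 1: let the model act as a prompt engineer and improve the raw prompt.
optimized_prompt = get_llm_response(
    f"As a prompt engineer, optimize the following prompt:\n\n{raw_prompt}",
    api_key,
)

# Step 2: run the optimized prompt; its answer can then become context for a review step.
answer = get_llm_response(optimized_prompt, api_key, max_tokens=1000)
review = get_llm_response(
    f"--- Context ---\n{answer}\n--- End of Context ---\n\n"
    "Check the HTML above and flag any text that is not in the journal entry.",
    api_key,
)
```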
In

**Journal Entry:**
{journal_rio_de_janeiro}

the placeholder journal_rio_de_janeiro simply holds the contents of the text file with the journal info, e.g.

journal_rio_de_janeiro = open('text_file.txt').read()  # the text file with the journal info