AP4B week3 lesson 3 - Classification into several categories

Hello, I am trying to write a prompt that classifies the lesson's journal files into several categories (exercise 2 of lesson 3).

My code looks like this:

files = ["cape_town.txt", "madrid.txt", "rio_de_janeiro.txt",
         "sydney.txt", "tokyo.txt"]

for file in files:
    # Read journal file for the city
    f = open(file, "r")
    journal = f.read()
    f.close()

    prompt = f"""Classify the journal entry by responding with one of the following: "Vegetarian", "Vegan", "Both", "Neither" or "No".
- Respond "No" if the journal does not describe restaurants and food dishes.
- Respond "Vegetarian" if the journal describes restaurants and food dishes and mentions a vegetarian dish.
- Respond "Vegan" if the journal describes restaurants and food dishes and mentions a vegan dish.
- Respond "Both" if the journal describes restaurants and food dishes and mentions both a vegetarian and a vegan dish.
- Respond "Neither" if the journal describes restaurants and food dishes but mentions neither a vegetarian nor a vegan dish.

Journal:
{journal}"""

    # Use LLM to classify the journal entry
    print(f"{file} -> {get_llm_response(prompt)}")

Unfortunately I always get the answer “No”.

I tried to rewrite the prompt like this:

prompt = f"""Classify the journal entry by responding with one of the following: "No relevance", "Relevance", "Vegetarian", "Vegan", "Both" or "Neither".
Go step by step: 
Step 1: In a first step check whether the journal is about restaurants and food dishes.
If it is not respond "No relevance" and do not take any further action.
If the journal is about restaurants and food dishes then perform a second step as follows:
Step 2:
- Respond "Relevance"

Journal:
{journal}"""

This works.

But as soon as I replace Step 2 with the actual classification rules:

# Refining the prompt
prompt = f"""Classify the journal entry by responding with one of the following: "No relevance", "Relevance", "Vegetarian", "Vegan", "Both" or "Neither".
Go step by step: 
Step 1: In a first step check whether the journal is about restaurants and food dishes.
If it is not respond "No relevance" and do not take any further action.
If the journal is about restaurants and food dishes then perform a second step as follows:
Step 2:
Respond "Vegetarian" if the journal mentions a vegetarian dish (without meat)
Respond "Vegan" if the journal mentions a vegan dish.
Respond "Both" if the journal mentions both a vegetarian and a vegan dish.
Respond "Neither" if the journal mentions neither a vegetarian nor a vegan dish.    

Journal:
{journal}"""

every file ends up as “No relevance”, except rio_de_janeiro.txt, which surprisingly gets “Relevance”. That label is no longer used in Step 2, but it is still in my list of allowed responses (“No relevance”, “Relevance”, “Vegetarian”, “Vegan”, “Both” or “Neither”). If I remove it from that list, all files end up as “No relevance”.

How can I overcome this problem in a structured way?

Would it not be better to run several for loops?

In loop 1, I sort out the non-relevant journals and collect the rest in a helper list relevant_journals.
In loops 2 and 3, I fill two more helper lists, journal_with_at_least_one_vegetarian_dish and journal_with_at_least_one_vegan_dish.
Finally, loop 4 classifies the journals based on which of these lists they appear in.
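
For reference, the multi-pass idea could be sketched roughly as follows. This is only an illustration under assumptions: the names ask_yes_no and classify_journal are made up, the three passes are folded into one function per journal instead of separate loops, and the LLM helper (for example the course's get_llm_response) is passed in as the llm parameter so the sketch can be tried with a stand-in.

```python
def ask_yes_no(llm, question, journal):
    """Ask one yes/no question about a journal; True if the reply starts with 'yes'."""
    prompt = f"""Answer with only "Yes" or "No".
{question}

Journal:
{journal}"""
    reply = llm(prompt)
    # Tolerate surrounding whitespace, quotes, and different capitalization.
    return reply.strip().strip('"').lower().startswith("yes")


def classify_journal(llm, journal):
    """Combine up to three independent yes/no checks into one label."""
    if not ask_yes_no(llm, "Does the journal describe restaurants and food dishes?", journal):
        return "No"
    vegetarian = ask_yes_no(llm, "Does the journal mention a vegetarian dish?", journal)
    vegan = ask_yes_no(llm, "Does the journal mention a vegan dish?", journal)
    if vegetarian and vegan:
        return "Both"
    if vegetarian:
        return "Vegetarian"
    if vegan:
        return "Vegan"
    return "Neither"
```

Each question is simple enough that the model has little room to mix up categories, but the mapping from answers to labels lives in Python, which is the inflexibility mentioned below.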

This looks more reliable to me, but it is not flexible. If I add or modify a category, I have to modify the code.

So, can anyone help me find a solution that lets the LLM do the job?

Hello Jochen, I put this together for you as a complement to my answer to your other question about grounding the AI model's responses in a text file. Both problems are about prompt design. There are a couple of good prompt engineering courses here on DeepLearning.AI; you can search for them in the courses section.

  1. Clear and Structured Steps:

    • Step 1: Determines the relevance of the journal entry.
    • Step 2: Classifies based on the presence of vegetarian and/or vegan dishes.
    • This structured approach ensures the AI follows a logical sequence, reducing the chances of incorrect classifications.
  2. Explicit Classification Criteria:

    • Defined clear criteria for each classification category.
    • This helps the AI understand the exact conditions for each category, minimizing ambiguity.
  3. Strict Response Format:

    • Instructed the AI to respond only with one of the specified phrases.
    • This prevents the AI from adding unnecessary explanations or deviating from the expected output.
  4. Use of Exact Phrases:

    • Specified the exact phrases (including capitalization and spacing) that the AI should use.
    • This ensures consistency and makes it easier to parse the results programmatically.
  5. Example Responses:

    • Provided examples of valid responses.
    • This serves as a reference for the AI, enhancing its understanding of the expected output.
  6. Removal of Redundant Categories:

    • Removed “Relevance” from the second step to avoid confusion since “Relevance” is only applicable when the journal mentions restaurants and food dishes but does not specify any dishes.
  7. Code Enhancements:

    • Used a with open(...) context manager so that each file is closed automatically, even if reading fails.
    • Ensured that the prompt sent to the LLM includes clear instructions and follows the specified format.
  8. Avoiding Conflicting Instructions:

    • Ensured that the categories in the step-by-step instructions align with the final classification list.
    • Removed any overlapping or conflicting instructions that might lead the AI to default to “No Relevance” incorrectly.
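
Point 7 could look roughly like the sketch below. It is a minimal illustration, not the exact code used: read_journal, classify_files and build_prompt are hypothetical names, and llm stands in for whatever helper sends the prompt to the model (for example the course's get_llm_response).

```python
def read_journal(path):
    """Return the full text of a journal file; the file is closed automatically."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def classify_files(files, build_prompt, llm):
    """Map each file name to the stripped LLM response for its journal.

    build_prompt: function that turns journal text into a full prompt (assumed).
    llm: function that takes a prompt string and returns the model's reply (assumed).
    """
    results = {}
    for file in files:
        journal = read_journal(file)
        results[file] = llm(build_prompt(journal)).strip()
    return results
```

Keeping the prompt builder and the LLM call as parameters makes it easy to swap in a new prompt without touching the file-handling loop.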

Additional Recommendations:

  • Validation of AI Responses:

    • Implement a validation step in your code to ensure that the AI’s response matches one of the expected categories. If it does not, prompt the AI again or log the anomaly for review. You can also add a step where another AI reviews the answer and, if it is not as expected, repeats the cycle.
  • Use of Few-Shot Learning:

    • Provide a few examples within the prompt with the desired output. This can enhance the AI’s understanding and improve accuracy.
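
The validation recommendation could be sketched like this. The names ALLOWED, normalize and classify_with_retry are hypothetical, llm again stands in for the function that calls the model, and the retry limit of 3 is an arbitrary choice.

```python
ALLOWED = {"No Relevance", "Relevance", "Vegetarian", "Vegan", "Both", "Neither"}


def normalize(response):
    """Trim whitespace, quotes, and periods from both ends of a reply."""
    return response.strip(" \t\r\n\"'.")


def classify_with_retry(llm, prompt, max_attempts=3):
    """Return the first valid label, or None if every attempt fails validation."""
    for attempt in range(1, max_attempts + 1):
        label = normalize(llm(prompt))
        if label in ALLOWED:
            return label
        # Log the anomaly so it can be reviewed later.
        print(f"Attempt {attempt}: unexpected response {label!r}")
    return None
```

Normalizing first matters because the model may copy the quotation marks from the examples in the prompt, so a reply like "Vegan" with quotes would otherwise fail an exact comparison.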

Example with Few-Shot Learning:

prompt = f"""
You are an AI assistant specialized in classifying journal entries related to restaurants and food dishes. Your task is to analyze the given journal entry and classify it into one of the following categories: "No Relevance", "Relevance", "Vegetarian", "Vegan", "Both", or "Neither".

**Classification Criteria:**
1. **No Relevance**: The journal does not describe restaurants and food dishes.
2. **Relevance**: The journal describes restaurants and food dishes but does not mention any specific dishes.
3. **Vegetarian**: The journal describes restaurants and food dishes and mentions at least one vegetarian dish (without meat).
4. **Vegan**: The journal describes restaurants and food dishes and mentions at least one vegan dish.
5. **Both**: The journal describes restaurants and food dishes and mentions both vegetarian and vegan dishes.
6. **Neither**: The journal describes restaurants and food dishes but mentions neither vegetarian nor vegan dishes.

**Instructions:**
1. **Step 1:** Determine if the journal entry is about restaurants and food dishes.
   - If **No**, respond with **"No Relevance"** and stop.
   - If **Yes**, proceed to Step 2.
2. **Step 2:** Identify the types of dishes mentioned.
   - Check for **vegetarian** dishes (dishes without meat).
   - Check for **vegan** dishes (dishes without any animal products).
3. **Step 3:** Classify based on the presence of dishes:
   - **Both**: If both vegetarian and vegan dishes are mentioned.
   - **Vegetarian**: If only vegetarian dishes are mentioned.
   - **Vegan**: If only vegan dishes are mentioned.
   - **Neither**: If dishes are mentioned but none are vegetarian or vegan.

**Response Format:**
- Respond **only** with one of the following exact phrases (including capitalization and spacing):
  - "No Relevance"
  - "Relevance"
  - "Vegetarian"
  - "Vegan"
  - "Both"
  - "Neither"

**Examples:**

**Journal Entry:**
"I recently visited Gourmet Haven in downtown. Their grilled chicken and beef steak are exceptional. The chocolate lava cake was a delightful end to the meal."

**Response:**
"Neither"

**Journal Entry:**
"Fresh Greens offers a variety of salads and plant-based bowls. Their quinoa salad is both hearty and delicious."

**Response:**
"Vegetarian"

**Journal Entry:**
"Vegan Delight serves only plant-based dishes. Their tofu stir-fry and vegetable sushi are must-tries."

**Response:**
"Vegan"

**Journal Entry:**
"Sunrise Bistro provides both vegetarian and vegan options. Their veggie burger and almond milkshake are popular choices."

**Response:**
"Both"

**Journal Entry:**
"I attended a tech conference last weekend."

**Response:**
"No Relevance"

**Journal Entry:**
{journal}

**Response:**"""
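
If the categories are likely to change, the few-shot part of a prompt like the one above can be generated from data instead of being typed by hand. The following is only a sketch under assumptions: EXAMPLES, format_examples and build_prompt are hypothetical names, and the instructions argument is expected to hold the criteria and steps as plain text.

```python
# (journal text, expected label) pairs; extend this list when categories change.
EXAMPLES = [
    ("I attended a tech conference last weekend.", "No Relevance"),
    ("Vegan Delight serves only plant-based dishes. Their tofu stir-fry "
     "and vegetable sushi are must-tries.", "Vegan"),
]


def format_examples(examples):
    """Render (journal, label) pairs in the same layout as the prompt above."""
    blocks = []
    for journal, label in examples:
        blocks.append(f'**Journal Entry:**\n"{journal}"\n\n**Response:**\n"{label}"')
    return "\n\n".join(blocks)


def build_prompt(instructions, examples, journal):
    """Assemble instructions, few-shot examples, and the journal to classify."""
    return (
        f"{instructions}\n\n**Examples:**\n\n{format_examples(examples)}\n\n"
        f"**Journal Entry:**\n{journal}\n\n**Response:**"
    )
```

With this layout, adding or changing a category means editing the instructions text and the EXAMPLES list, not the loop that calls the model.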

Hello Nicolas,

Thank you so much for this detailed and valuable information. I am very happy to receive such great support.
I will implement and experiment with your proposed solution.

Thank you also for the pointer to the prompt engineering courses. As I see it, both Python skills and prompt engineering are essential for getting reliable results.
