Hello, I am trying to write a prompt that classifies the lesson's files into several categories (exercise 2 of lesson 3).
My code looks like this:
files = ["cape_town.txt", "madrid.txt", "rio_de_janeiro.txt",
         "sydney.txt", "tokyo.txt"]
for file in files:
    # Read journal file for the city
    with open(file, "r") as f:
        journal = f.read()
    prompt = f"""Classify the journal entry by responding with one of the following: "Vegetarian", "Vegan", "Both", "Neither" or "No".
- Respond "No" if the journal does not describe restaurants and food dishes.
- Respond "Vegetarian" if the journal describes restaurants and food dishes and mentions a vegetarian dish.
- Respond "Vegan" if the journal describes restaurants and food dishes and mentions a vegan dish.
- Respond "Both" if the journal describes restaurants and food dishes and mentions both a vegetarian and a vegan dish.
- Respond "Neither" if the journal describes restaurants and food dishes but mentions neither a vegetarian nor a vegan dish.
Journal:
{journal}"""
    # Use LLM to classify the journal entry
    print(f"{file} -> {get_llm_response(prompt)}")
Unfortunately, I always get the answer “No”.
I tried to rewrite the prompt like this:
prompt = f"""Classify the journal entry by responding with one of the following: "No relevance", "Relevance", "Vegetarian", "Vegan", "Both" or "Neither".
Go step by step:
Step 1: In a first step check whether the journal is about restaurants and food dishes.
If it is not respond "No relevance" and do not take any further action.
If the journal is about restaurants and food dishes then perform a second step as follows:
Step 2:
- Respond "Relevance"
Journal:
{journal}"""
This works.
But as soon as I replace Step 2 with the actual classification rules:
# Refining the prompt
prompt = f"""Classify the journal entry by responding with one of the following: "No relevance", "Relevance", "Vegetarian", "Vegan", "Both" or "Neither".
Go step by step:
Step 1: In a first step check whether the journal is about restaurants and food dishes.
If it is not respond "No relevance" and do not take any further action.
If the journal is about restaurants and food dishes then perform a second step as follows:
Step 2:
Respond "Vegetarian" if the journal mentions a vegetarian dish (without meat)
Respond "Vegan" if the journal mentions a vegan dish.
Respond "Both" if the journal mentions both a vegetarian and a vegan dish.
Respond "Neither" if the journal mentions neither a vegetarian nor a vegan dish.
Journal:
{journal}"""
I end up with “No relevance” and, surprisingly, “Relevance” for rio_de_janeiro.txt. “Relevance” is no longer listed in Step 2, but it is still in my classification list (“No relevance”, “Relevance”, “Vegetarian”, “Vegan”, “Both” or “Neither”). If I remove it from that list, all files end up as “No relevance”.
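To at least detect answers like “Relevance” that Step 2 should never produce, I could check each response against the labels I actually expect. This is only a minimal sketch: ALLOWED_LABELS and normalize_label are names I made up for illustration, and it does not fix the prompt itself, it only makes unexpected answers visible.

```python
# Hypothetical helper, not part of the course code.
# Labels that the final prompt is supposed to return.
ALLOWED_LABELS = ["No relevance", "Vegetarian", "Vegan", "Both", "Neither"]


def normalize_label(response, allowed=ALLOWED_LABELS):
    """Return the matching allowed label, or None if the response is not one."""
    # Drop surrounding whitespace, quotes and a trailing period, ignore case.
    cleaned = response.strip().strip(' "\'.“”').lower()
    for label in allowed:
        if cleaned == label.lower():
            return label
    return None
```

In the loop I would then call normalize_label(get_llm_response(prompt)) and print a warning (or ask again) whenever it returns None.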
How can I overcome this problem in a structured way?
Would it not be better to run several for-loops?
In loop 1, I sort out the non-relevant journals and create a helper list relevant_journals.
In loops 2 and 3, I fill two helper lists, journal_with_at_least_one_vegetarian_dish and journal_with_at_least_one_vegan_dish.
Finally, loop 4 classifies the journals based on whether they are in these lists.
This looks more reliable to me, but it is not flexible: if I add or modify a category, I have to modify the code.
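To make the multi-step idea concrete, here is a rough sketch of what I have in mind. It asks one yes/no question at a time per journal (instead of building separate helper lists, which should give the same result) and combines the answers in plain Python. The names QUESTIONS, LABELS, ask_yes_no, classify and keyword_stub are all hypothetical; keyword_stub is only a stand-in so the example runs without the course environment, and get_llm_response would be passed in its place.

```python
# Sketch only; every name here is made up for illustration.
QUESTIONS = {
    "relevant": "Does the journal describe restaurants and food dishes?",
    "vegetarian": "Does the journal mention at least one vegetarian dish?",
    "vegan": "Does the journal mention at least one vegan dish?",
}

# Maps (has_vegetarian, has_vegan) to the final label.
LABELS = {
    (True, True): "Both",
    (True, False): "Vegetarian",
    (False, True): "Vegan",
    (False, False): "Neither",
}


def ask_yes_no(question, journal, ask):
    """Ask a single yes/no question about the journal; `ask` sends the prompt."""
    prompt = f"""Answer with exactly one word, "Yes" or "No".
Question: {question}
Journal:
{journal}"""
    return ask(prompt).strip().lower().startswith("yes")


def classify(journal, ask):
    """Step 1 filters out irrelevant journals, step 2 combines two answers."""
    if not ask_yes_no(QUESTIONS["relevant"], journal, ask):
        return "No"
    key = (
        ask_yes_no(QUESTIONS["vegetarian"], journal, ask),
        ask_yes_no(QUESTIONS["vegan"], journal, ask),
    )
    return LABELS[key]


def keyword_stub(prompt):
    """Stand-in for get_llm_response: a naive keyword match, for testing only."""
    question, _, journal = prompt.partition("Journal:")
    question, journal = question.lower(), journal.lower()
    if "restaurants" in question:
        return "Yes" if "restaurant" in journal else "No"
    if "vegetarian" in question:
        return "Yes" if "vegetarian" in journal else "No"
    if "vegan" in question:
        return "Yes" if "vegan" in journal else "No"
    return "No"


if __name__ == "__main__":
    print(classify("The restaurant had a vegan curry.", keyword_stub))
```

With the real helper the call would be classify(journal, get_llm_response). The questions and the label mapping live in two small tables, so I assume most changes to the categories would only touch those tables, but it costs up to three LLM calls per file.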
So, who can help me find a solution that lets the LLM do the job?