Evaluation

Manishankar · August 31, 2023, 11:24am

I tried the Evaluation lesson in my own Jupyter notebook, but after typing the list of products it just hangs and never gives any output. When I try it in your notebook, the list is displayed. Is this because of “Rate limit reached for default-gpt-3.5-turbo”? And is using time.sleep an efficient workaround in a local Jupyter notebook?

Nataraj · January 5, 2025, 9:00pm

Make sure you download or copy the utils.py and products.json files into the same folder as your notebook.

Update the import section:

import os
import openai
import sys
#sys.path.append('../..')
sys.path.append('.')
import utils

Update the function below so that it uses the openai>=1.0 call style (openai.chat.completions.create):

def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=0, max_tokens=500):
    response = openai.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature, 
        max_tokens=max_tokens, 
    )
    print ("In get completion from messages....")
    return response.choices[0].message.content

Make sure you update the utils.py as well

def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=0, max_tokens=500):
    response = openai.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature, 
        max_tokens=max_tokens, 
    )
    return response.choices[0].message.content

Below is the updated function for process_user_message

def process_user_message(user_input, all_messages, debug=True):
    delimiter = "```"
    
    # Step 1: Check input to see if it flags the Moderation API or is a prompt injection
    response = openai.moderations.create(input=user_input)
    # With openai>=1.0 the result is an object, so use attribute access instead of dict keys
    moderation_output = response.results[0].flagged

    if moderation_output:
        print("Step 1: Input flagged by Moderation API.")
        # Return a (response, all_messages) pair so callers can always unpack two values
        return "Sorry, we cannot process this request.", all_messages


    if debug: print("Step 1: Input passed moderation check.")
    
    category_and_product_response = utils.find_category_and_product_only(user_input, utils.get_products_and_category())
    if debug: print(category_and_product_response)  # print the value, not the variable name
    
    # Step 2: Extract the list of products
    category_and_product_list = utils.read_string_to_list(category_and_product_response)
    print(category_and_product_list)

    if debug: print("Step 2: Extracted list of products.")

    # Step 3: If products are found, look them up
    product_information = utils.generate_output_string(category_and_product_list)
    if debug: print("Step 3: Looked up product information.")

    # Step 4: Answer the user question
    system_message = f"""
    You are a customer service assistant for a large electronic store. \
    Respond in a friendly and helpful tone, with concise answers. \
    Make sure to ask the user relevant follow-up questions.
    """
    messages = [
        {'role': 'system', 'content': system_message},
        {'role': 'user', 'content': f"{delimiter}{user_input}{delimiter}"},
        {'role': 'assistant', 'content': f"Relevant product information:\n{product_information}"}
    ]

    final_response = get_completion_from_messages(all_messages + messages)
    if debug: print("Step 4: Generated response to user question.")
    all_messages = all_messages + messages[1:]
    
    # Step 5: Put the answer through the Moderation API
    response = openai.moderations.create(input=final_response)
    # With openai>=1.0 the result is an object, so use attribute access instead of dict keys
    moderation_output = response.results[0].flagged

    if moderation_output:
        if debug: print("Step 5: Response flagged by Moderation API.")
        # Return a (response, all_messages) pair so callers can always unpack two values
        return "Sorry, we cannot provide this information.", all_messages

    if debug: print("Step 5: Response passed moderation check.")

    # Step 6: Ask the model if the response answers the initial user query well
    user_message = f"""
    Customer message: {delimiter}{user_input}{delimiter}
    Agent response: {delimiter}{final_response}{delimiter}

    Does the response sufficiently answer the question?
    """
    messages = [
        {'role': 'system', 'content': system_message},
        {'role': 'user', 'content': user_message}
    ]
    evaluation_response = get_completion_from_messages(messages)
    if debug: print("Step 6: Model evaluated the response.")

    # Step 7: If yes, use this answer; if not, say that you will connect the user to a human
    if "Y" in evaluation_response:  # Using "in" instead of "==" to be safer for model output variation (e.g., "Y." or "Yes")
        if debug: print("Step 7: Model approved the response.")
        return final_response, all_messages
    else:
        if debug: print("Step 7: Model disapproved the response.")
        neg_str = "I'm unable to provide the information you're looking for. I'll connect you with a human representative for further assistance."
        return neg_str, all_messages
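
A note on Step 7: the substring check also passes any reply that merely contains a capital "Y" somewhere, for example "No. You should ...". Here is a stricter check as a minimal sketch. The helper name is_approved is hypothetical, and it assumes the evaluation prompt asks the model to start its reply with Y or N, which the prompt above does not state explicitly.

```python
def is_approved(evaluation_response: str) -> bool:
    """Return True only when the reply starts with 'y' or 'yes' (case-insensitive)."""
    words = evaluation_response.strip().lower().split()
    if not words:
        return False  # empty reply: treat as not approved
    # Drop punctuation attached to the first word, e.g. "Y." or "Yes,"
    first = words[0].strip(".,!:;\"'")
    return first in ("y", "yes")
```

If you want to try it, the condition in Step 7 would become `if is_approved(evaluation_response):`.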

user_input = "tell me about the smartx pro phone and the fotosnap camera, the dslr one. Also what tell me about your tvs"
response,_ = process_user_message(user_input,[])
print(response)
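
On the rate-limit part of the question: a fixed time.sleep between calls can work, but it either waits longer than needed or not long enough. A common alternative is to retry with exponential backoff. The sketch below is not from the course notebook; call_with_backoff and its parameters are hypothetical names, and it is written against a generic callable so that it runs without an API key.

```python
import random
import time

def call_with_backoff(func, *args, retry_on=(Exception,), max_retries=5,
                      base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call func, retrying with exponential backoff plus jitter on retry_on errors."""
    for attempt in range(max_retries):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the original error
            # Wait about 1s, 2s, 4s, ... plus a little random jitter
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

With openai>=1.0 you would probably pass retry_on=(openai.RateLimitError,), for example call_with_backoff(get_completion_from_messages, messages, retry_on=(openai.RateLimitError,)). If the notebook still hangs with no error at all, the rate limit may not be the cause, so check the debug prints first.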
