When we use format_instructions (StructuredOutputParser) in the template, we have to specify
name=“gift”, description=“Was the item purchased as a gift for someone else?
Answer True if yes, False if not or unknown.”
Then, at the beginning of the template, we write the same thing again:
…
gift: Was the item purchased as a gift for someone else?
Answer True if yes, False if not or unknown.
…
{format_instructions}
Isn’t this a duplication of the instruction regarding gift?
When we send this as a message to GPT, the whole prompt is counted in tokens.
So aren’t we sending the same tokens twice?
Hi Lalit_Somnathe,
Good question. This seems to be necessary in order to instruct clearly what should be extracted. When I cut out the part in review_template_2 that states
gift: Was the item purchased as a gift for someone else?
Answer True if yes, False if not or unknown.
delivery_days: How many days did it take for the product
to arrive? If this information is not found, output -1.
price_value: Extract any sentences about the value or price,
and output them as a comma separated Python list.
the output value for the “gift” key becomes incorrect:
{
"gift": false,
"delivery_days": "2",
"price_value": ["It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."]
}
This is probably because the descriptions in the ResponseSchema objects are not read as clear instructions, since they are embedded in the JSON blob. I guess one way to save on input tokens is to shorten the descriptions where they do not add much to the instructions.
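To get a feel for the trade-off, here is a minimal sketch in plain Python. It does not call LangChain: render_schema and render_instructions are hypothetical helpers that only approximate how the descriptions end up as comments inside a JSON-like schema, and rough_token_count is a crude word count, not a real tokenizer, so the numbers are only indicative.

```python
# Hypothetical stand-ins: they mimic the idea, not LangChain's exact output.
FIELDS = {
    "gift": "Was the item purchased as a gift for someone else? "
            "Answer True if yes, False if not or unknown.",
    "delivery_days": "How many days did it take for the product to arrive? "
                     "If this information is not found, output -1.",
    "price_value": "Extract any sentences about the value or price, "
                   "and output them as a comma separated Python list.",
}


def render_schema(fields):
    """Approximate a format-instructions block: each description
    becomes a trailing comment inside a JSON-like schema."""
    lines = [f'    "{name}": string  // {desc}' for name, desc in fields.items()]
    return "{\n" + "\n".join(lines) + "\n}"


def render_instructions(fields):
    """The explicit 'name: description' block written at the top of the template."""
    return "\n".join(f"{name}: {desc}" for name, desc in fields.items())


def rough_token_count(text):
    """Crude proxy: whitespace-separated words. A real tokenizer splits
    text into sub-word pieces, so actual counts will differ."""
    return len(text.split())


schema_only = render_schema(FIELDS)
with_explicit = render_instructions(FIELDS) + "\n\n" + schema_only

print("schema only:       ", rough_token_count(schema_only))
print("explicit + schema: ", rough_token_count(with_explicit))
```

The explicit block roughly doubles the description text in the prompt, which is the cost being asked about. In the schema the descriptions appear only as trailing comments on each key, which may be why the model treats them less like instructions.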