How can I optimize the cost of prompting ChatGPT?

This is an open question. I was wondering whether I could optimize, from a cost perspective, the answers ChatGPT gives me.


  • Imagine I ask ChatGPT to write me a review of what each of 20 fruits looks like.
    → What would happen if, instead of sending one prompt per fruit, I put all 20 fruits in a single prompt?

  • Are there general ways to approach this? What are the limitations? (e.g. length of text, …)

Thank you in advance,


You can ask ChatGPT in the format Give me a review of what each of the following fruits looks like: apple, banana... You will be limited by your token budget — currently 4096 tokens. When you make a call to the ChatGPT API, you'll see a usage section in the response that tells you how many tokens you used. If you want longer reviews, your total response may run over that 4k limit. If the responses you want are very long, you may be better off eating the cost of repeating the ~13 instruction tokens in each request; but if you want short responses, batching may be worth the savings in tokens and in per-call latency. The documentation also has a handy web tool, plus links to libraries for counting the tokens in your prompt, so you can plan ahead.
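To make the trade-off concrete, here is a back-of-the-envelope sketch of the savings from batching. It uses word counts as a stand-in for tokens (a real count would use the tokenizer tool linked in the docs); the fruit list and instruction text are my own illustrative examples, not from the thread.

```python
# Rough sketch: cost of N separate prompts vs. one batched prompt.
# Words approximate tokens here; use the documented tokenizer for real counts.
fruits = ["apple", "banana", "cherry", "mango", "pear"]

instruction = "Give me a review of what the following fruit looks like:"
overhead = len(instruction.split())  # the instruction tokens repeated per call

# Separate calls: pay the instruction once per fruit, plus each fruit name.
separate_total = len(fruits) * overhead + len(fruits)
# Batched call: pay the instruction once, then all fruit names.
batched_total = overhead + len(fruits)

print(separate_total, batched_total)
```

The savings grow linearly with the number of items, which is why batching pays off for short per-item outputs but matters less when each review is long.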


As for the behavior of the LLM, requesting more output per prompt will tend to give you less detailed answers. You will also see less detailed answers toward the end of a long response, due to how LLMs allocate their attention (front- and back-loading, with less attention paid to the middle).

Don’t use ChatGPT at all; use open-source LLMs.


I found a repo with a technique for compressing text while retaining the core information. Maybe it can help you.