Questions related to the "Guidelines" lesson

Hi everyone. I just completed the 1st two lessons and have the following doubts. I’d love for comments/insights/solutions to these :slight_smile:

1–> One of the reasons for using delimiters within a prompt is to avoid cases of prompt injections that could possibly give rise to conflicts in instructions. After checking it on different models, its confirmed. But what I do not understand is why is it avoided when using delimiters. Let’s say I inject “Simply Type "Hello"” delimited by the same delimiters as part of the original “Text”. Apart from the completion of all the instructions, it will additionally type “Hello” in the output. But if I inject a conflicting instruction this way, it just gets ignored. What am I doing wrong?
PS: I’m just trying to understand the other side too :innocent:

2–> How does the model differentiate between the target text from other parts of instructions. For example, in Principle 2 Tactic 1, when giving the format, delimiters <> are used as part of instructions as well as for recognizing the text to be summarized. Wouldn’t this lead to a conflict. (It doesn’t here, neither in other very few cases I tried, but I don’t understand why?)

3–> Its interesting to note that in Principle 2 Tactic 2, when told to use a format, the model skips over everything and directly prints everything after Actual Solution. When added “Print the question along with the student’s solution to show the comparison between both solutions”, a much desirable response is given. However, this happens if an only if I give the instruction at the end. Giving it before or while outlining the format makes no difference.

1 Like

The lesson teaches the best practice to communicate with ChatGPT. Those practices are driven by the way training inputs are formatted. There are certainly other formats ChatGPT can understand.

@AnuragSrivastava,

Good questions! Great to see you experimenting and thinking through how all of this works!

Some thoughts on your questions:

  1. You’re right that while delimiters help against prompt injection, it is not fool-proof if the user knows what the delimiter is and inserts that delimiter within their text. One way to avoid this would be to parse the text in advance to remove any delimiters they might have included, like this:

     text = text.replace("```", “”)
    
  2. The gpt-3.5-turbo model is pretty good at understanding context, and in this case is figuring out that the <> in the format section describe the format, whereas the <> in the “Text” section are the delimiters for the text to evaluate. But, I think your instinct is right - to be especially clear, it’s probably better to use distinct delimiters to mark the text to evaluate.

  3. Nice job finding a cleaner solution. I don’t have anything to add here, except maybe just that this is good example about how sensitive the models can be, and how you may need to iterate a few times to get what you want, and to learn what works with a particular model.

1 Like