Off-topic guardrails dataset

Off-topic guardrails help block malicious or playful user prompts that try to use an LLM application in unintended ways.

To train and benchmark such guardrails, I built this dataset using GPT-4o with structured outputs. The synthetic data was generated by seeding prompts with real examples and random words.
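
Below is a minimal sketch of what such a generation pipeline can look like, assuming the OpenAI Python SDK's structured-outputs API. The schema fields, the system prompt, and the `generate_batch` helper are illustrative assumptions, not the exact setup used to build this dataset.

```python
import random

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

# Schema for one synthetic example; field names are illustrative.
class GuardrailExample(BaseModel):
    prompt: str         # generated user prompt
    is_off_topic: bool  # label: does the prompt fall outside the app's scope?

class GuardrailBatch(BaseModel):
    examples: list[GuardrailExample]

def generate_batch(
    seed_examples: list[str], vocabulary: list[str], n: int = 10
) -> list[GuardrailExample]:
    """Ask GPT-4o for n labelled prompts, seeded with real examples
    and a few random words to push for diversity."""
    seed_words = random.sample(vocabulary, k=3)
    completion = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You generate training data for the off-topic guardrail "
                    "of an LLM application. Produce user prompts, some "
                    "on-topic and some off-topic, each with an "
                    "is_off_topic label."
                ),
            },
            {
                "role": "user",
                "content": (
                    f"Generate {n} examples. Real examples for reference:\n"
                    + "\n".join(seed_examples)
                    + "\nIncorporate these random words for variety: "
                    + ", ".join(seed_words)
                ),
            },
        ],
        # Structured outputs enforce the GuardrailBatch schema on the response.
        response_format=GuardrailBatch,
    )
    return completion.choices[0].message.parsed.examples
```

A full run would loop over many such batches, deduplicate the results, and split them into training and benchmark sets.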

Any feedback would be appreciated.