I built a system prompt architecture that forces LLMs (like Gemini 3.0 / Claude 4.5) to self-audit before responding.
It splits the model into two personas (a rough sketch of the idea follows below):
Agent: Generates the draft.
Auditor: Critiques the draft for logic errors and hallucinations.
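To make that concrete, here is a simplified sketch of the pattern as a single self-audit system prompt. The prompt wording and the build_messages helper are placeholders I'm using for illustration, not the actual contents of the Gist:

```python
# Simplified sketch of the Agent/Auditor idea as one system prompt.
# Placeholder instructions only -- the real prompts live in the Gist.

SELF_AUDIT_SYSTEM_PROMPT = """\
You operate in two personas and must complete both before answering.

1. AGENT: Write a draft answer to the user's question. Label it DRAFT.
2. AUDITOR: Re-read the draft and list any logic errors or claims you
   cannot verify (possible hallucinations). Label this AUDIT.
3. Rewrite the draft to fix every issue the Auditor found and output it
   as FINAL. Only the FINAL section is shown to the user.
"""


def build_messages(question: str) -> list[dict]:
    """Assemble a chat request that wraps the question in the self-audit prompt."""
    return [
        {"role": "system", "content": SELF_AUDIT_SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]
```

A thin client can then send build_messages(question) to whichever chat API you use (Gemini, Claude, a local model) and surface only the FINAL section to the user.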
The Code (Gist): [https://gist.github.com/ginsabo/9d99ee98068d904b214be9351f09…]
I also wrote a book documenting the theory behind it, but the Gist above is fully functional on its own and free to use. I’d love to hear your feedback on this “User-Side Safety” approach.