# Awesome Prompt Injection Defense
A curated list of tools, papers, datasets, and resources for defending Large Language Models against prompt injection and indirect prompt injection attacks.
Prompt injection is the top-ranked risk in the OWASP Top 10 for LLM Applications, yet the defense ecosystem is fragmented across academic preprints, vendor blogs, npm/PyPI utilities, and ad-hoc system prompts. This list is an attempt to bring it together in one place.
## Contents

- Detection libraries
- RAG-specific guardrails
- Evaluation datasets
- Live demos
- GitHub Actions for CI
- Research papers and preprints
- Background reading
- Contributing
- License
## Detection libraries

Drop-in checks you call before passing untrusted text to an LLM.
- prompt-injection-shield (npm) - Zero-dep detector for classic override, URL exfiltration, system-prompt impersonation, and tool-call hijack patterns.
- prompt-injection-shield-py (PyPI) - Python port of the above with the same rule set.
- Rebuff - Self-hardening prompt injection detector, originally by ProtectAI.
- LLM Guard - Comprehensive LLM input/output security suite, includes prompt injection scanner.
- PromptArmor - Hosted prompt injection detection API.
- Lakera Guard - Commercial guardrail with a generous free tier.
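The libraries above share one shape: a predicate you run over untrusted text before it reaches the model. A minimal sketch of that idea, using hypothetical regex rules rather than any listed package's actual API:

```python
import re

# Illustrative override patterns only -- real detectors such as the
# libraries listed above use far larger rule sets and/or ML classifiers.
OVERRIDE_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.I),
    re.compile(r"you\s+are\s+now\s+in\s+(developer|dan)\s+mode", re.I),
    re.compile(r"reveal\s+(your\s+)?(system\s+prompt|instructions)", re.I),
]

def looks_injected(text: str) -> bool:
    """Return True if the text matches a classic override pattern."""
    return any(p.search(text) for p in OVERRIDE_PATTERNS)
```

Pattern matching alone misses encoded and paraphrased attacks, which is why several of the tools above pair rules with a classifier.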
## RAG-specific guardrails

Prompt injection in RAG often hides inside retrieved documents (indirect injection) or poisoned vectors.
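A common mitigation is to screen each retrieved chunk with a detector and fence the survivors as inert data before they enter the prompt. A sketch, assuming a hypothetical `detector` predicate and `<doc>` delimiters of your choosing:

```python
def build_context(chunks: list[str], detector) -> str:
    """Drop chunks the detector flags, then fence the rest as data.

    `detector` is any callable str -> bool (e.g. a detection library
    from the section above); the delimiter scheme is illustrative.
    """
    safe = [c for c in chunks if not detector(c)]
    # Delimiting retrieved text signals to the model that it is
    # reference material, not instructions to follow.
    return "\n".join(f"<doc>{c}</doc>" for c in safe)
```

Filtering plus delimiting reduces, but does not eliminate, indirect injection risk; poisoned chunks that evade the detector still reach the model.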
## Evaluation datasets

Labeled corpora for benchmarking detectors.
- prompt-injection-eval - 74 hand-curated rows across 9 categories (classic override, URL exfil, system impersonation, tool hijack, role override, encoded, indirect RAG poison, borderline, benign). MIT.
- deepset/prompt-injections - Larger English/German prompt injection corpus.
- JailbreakBench - Standardised benchmark for jailbreaks (related but broader category).
- Lakera Gandalf prompts - Real attack prompts collected from the Gandalf game.
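Benchmarking against any of these corpora reduces to scoring a detector over labeled rows. A sketch, assuming rows of `(text, label)` pairs with `1` = injection and `0` = benign (adapt to each dataset's actual column names):

```python
def evaluate(detector, rows):
    """Return (precision, recall) for a str -> bool detector
    over (text, label) pairs. Row format is an assumption."""
    tp = fp = fn = tn = 0
    for text, label in rows:
        pred = detector(text)
        if pred and label:
            tp += 1
        elif pred and not label:
            fp += 1
        elif not pred and label:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Report both numbers: rule-based detectors tend toward high precision and low recall, and a single accuracy figure hides that trade-off, especially on corpora with deliberate borderline rows.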
## Live demos

Try detectors in the browser.
## GitHub Actions for CI

Plug into pull-request flows.

- rag-guardrails-action - Composite Action wrapping prompt-injection-shield + vector-poison-score. Fails on high severity, warns on medium.
## Research papers and preprints

## Related agent tooling

Not strictly prompt injection defense, but commonly composed with it.
- agentvet - Validate LLM tool-call arguments before execution.
- agentguard - Network egress firewall for tool-using agents.
- agentcast - Validate-and-retry loop for structured outputs.
- agentsnap - Snapshot tests for tool-call traces.
- agentfit - Token-aware message truncation.
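The validate-before-execute idea behind tools like agentvet can be sketched as a schema check that runs before any tool call fires. The schema format and function names here are hypothetical, not agentvet's actual API:

```python
# Hypothetical allow-list: tool name -> required argument types.
ALLOWED_TOOLS = {
    "get_weather": {"city": str},
    "send_email": {"to": str, "body": str},
}

def vet_call(name: str, args: dict) -> None:
    """Raise ValueError unless the tool call matches its declared schema."""
    schema = ALLOWED_TOOLS.get(name)
    if schema is None:
        raise ValueError(f"unknown tool: {name}")
    for key, typ in schema.items():
        if key not in args or not isinstance(args[key], typ):
            raise ValueError(f"bad argument {key!r} for {name}")
    extra = set(args) - set(schema)
    if extra:
        # Injected instructions often smuggle extra arguments in.
        raise ValueError(f"unexpected arguments: {sorted(extra)}")
```

This composes naturally with the detectors above: detection filters what the model sees, while call vetting limits what a compromised model can do.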
## Background reading
## Contributing

Send a pull request. Each entry should:
- Already exist (no vaporware or roadmaps).
- Be free or have a meaningful free tier.
- Have a one-line description of what makes it useful, not just a name.
Sort entries within each section alphabetically, except where the order is meaningful.
## License

To the extent possible under law, the maintainer has waived all copyright and related rights to this list.