In scope
- Prompt injection in direct user prompts.
- Indirect injection in retrieved documents, webpages, PDFs, email, tool output, and agent memory.
- Jailbreak attempts that ask the model to ignore or override policy.
- Data exfiltration attempts, including outbound actions that carry secrets.
- System prompt leakage and hidden-rule disclosure requests.
- Unsafe tool calls, including write, send, delete, execute, upload, webhook, and external API actions.
- Unsafe model outputs, including secret leakage and policy-violating completions.
- Obfuscated malicious instructions, including encoded, spaced, or disguised attack text.
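
To make the obfuscation item above concrete, here is a minimal sketch of how encoded, spaced, or disguised attack text might be normalized before matching. It is not Firefish's actual detection logic: the phrase list and every function name (`normalize`, `despace`, `decoded_candidates`, `looks_like_injection`) are hypothetical and chosen only for illustration.

```python
import base64
import binascii
import re
import unicodedata

# Hypothetical phrase list for illustration only; not Firefish's real rules.
SUSPICIOUS_PHRASES = ("ignore previous instructions", "reveal the system prompt")


def normalize(text: str) -> str:
    """Fold Unicode compatibility forms, lowercase, and collapse whitespace."""
    folded = unicodedata.normalize("NFKC", text).lower()
    return re.sub(r"\s+", " ", folded).strip()


def despace(text: str) -> str:
    """Remove common separators so 'i g n o r e' compares equal to 'ignore'."""
    return re.sub(r"[\s._\-]+", "", text)


def decoded_candidates(text: str):
    """Yield the text itself, plus any base64 runs that decode to UTF-8."""
    yield text
    for run in re.findall(r"[A-Za-z0-9+/]{16,}={0,2}", text):
        try:
            yield base64.b64decode(run, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64 or not text; skip this run


def looks_like_injection(text: str) -> bool:
    """Return True if any normalized or decoded form contains a listed phrase."""
    for candidate in decoded_candidates(text):
        norm = normalize(candidate)
        squeezed = despace(norm)
        for phrase in SUSPICIOUS_PHRASES:
            if phrase in norm or despace(phrase) in squeezed:
                return True
    return False
```

Substring matching on separator-stripped text can produce false positives, so a check like this would more plausibly feed a risk score than act as a sole block decision.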
Out of scope
- Guaranteed model safety. Firefish is a gateway control; it does not guarantee that every model behaves safely.
- Replacing AppSec. Teams still need threat modeling, least privilege, secrets management, testing, and incident response.
- Guaranteeing every model response is true. Firefish is not a factuality oracle.
- Protecting systems that bypass the gateway. Firefish can enforce policy only on traffic and tool calls that pass through it.
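
The last point can be illustrated with a small sketch of a single checkpoint that every tool call is routed through. This is an assumption about how an integrating application might structure its calls, not Firefish's API: `gated_call`, `ToolCallBlocked`, `RISKY_ACTIONS`, and the `approved` flag are all hypothetical names, and the action list simply mirrors the unsafe tool call categories listed under "In scope".

```python
from typing import Any, Callable

# Hypothetical action labels mirroring the in-scope list; the policy shape is assumed.
RISKY_ACTIONS = {"write", "send", "delete", "execute", "upload", "webhook", "external_api"}


class ToolCallBlocked(Exception):
    """Raised when a risky tool call is attempted without explicit approval."""


def gated_call(action: str, tool: Callable[..., Any], *, approved: bool = False, **kwargs: Any) -> Any:
    """Run `tool` only if the action is low risk or has been explicitly approved."""
    if action in RISKY_ACTIONS and not approved:
        raise ToolCallBlocked(f"{action!r} requires approval")
    return tool(**kwargs)
```

A tool invoked directly, without going through a checkpoint such as `gated_call`, never reaches the check, which is why bypassing traffic cannot be protected.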