Architecture | Purple Firefish

Core components

Normalizes text, classifies source type, and applies deterministic rules, local lexical signals, optional anomaly checks, and configured judge routing.

Maps risk score, attack type, and source context into ALLOW, FLAG, REDACT, SANDBOX, REQUIRE_APPROVAL, or BLOCK.

Mediates model/runtime calls so protected chat traffic can be evaluated before and after model interaction.

Checks model output for secrets, unsafe content, and policy violations before the response reaches a user.

Inspects streaming response chunks while output is still moving through the gateway.

Evaluates proposed tool calls against action class, destination risk, secret exposure, reversibility, and user goal alignment.

Redacts sensitive material from tool results before model or operator consumption.

Stores decisions, reason codes, redacted previews, traces, and latency metadata for review.

Shows mission control, threat lab, tool firewall, benchmark center, demo flows, and technical proof points under /app/.
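
As an illustration of the policy mapping described above (risk score, attack type, and source context in, one of six actions out), here is a minimal sketch. Only the six action names come from the text. The score cutoffs, the attack-type strings, the source labels, and the choice to treat the listed order as a severity ranking are all assumptions made for the example, not the product's actual policy.

```python
from enum import IntEnum
from typing import Optional


class Action(IntEnum):
    """The six actions named in the text, in the order listed.

    Treating that order as least-to-most restrictive is an assumption.
    """
    ALLOW = 0
    FLAG = 1
    REDACT = 2
    SANDBOX = 3
    REQUIRE_APPROVAL = 4
    BLOCK = 5


# Hypothetical values: the text names the inputs but not these specifics.
BLOCK_THRESHOLD = 0.9
UNTRUSTED_SOURCES = {"web_page", "tool_result", "email"}


def decide(risk_score: float, attack_type: Optional[str], source: str) -> Action:
    """Map (risk score, attack type, source context) to a single action."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be in [0, 1]")
    if risk_score >= BLOCK_THRESHOLD:
        return Action.BLOCK

    # Score-only baseline (cutoffs are illustrative).
    if risk_score >= 0.6:
        action = Action.REQUIRE_APPROVAL
    elif risk_score >= 0.3:
        action = Action.FLAG
    else:
        action = Action.ALLOW

    # Attack type and source context may escalate the baseline, never relax it.
    if attack_type == "secret_exfiltration":
        action = max(action, Action.REDACT)
    if attack_type is not None and source in UNTRUSTED_SOURCES:
        action = max(action, Action.SANDBOX)
    return action
```

Using `max` on an ordered enum keeps each contextual rule monotonic: a rule can raise the action to a stricter one, but a later rule cannot quietly undo an earlier escalation.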

Classify

Normalize input and identify source type.

Detect

Run deterministic and configured analysis layers.

Decide

Apply policy without lowering global block thresholds.

Audit

Save redacted traces for operators and reviewers.
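
The four stages above can be sketched end to end as follows. This is a toy under stated assumptions, not the product's implementation: the rule patterns, reason codes, scores, the `sk-` secret shape, the channel and source labels, and the value of `GLOBAL_BLOCK_THRESHOLD` are all hypothetical. The Decide stage shows one possible reading of "without lowering global block thresholds": source-specific policy only adjusts what happens below the block line and treats the global threshold as a read-only constant.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import List, Tuple

GLOBAL_BLOCK_THRESHOLD = 0.8  # hypothetical value; never modified by policy code

# Hypothetical deterministic rules: (pattern, score, reason code).
RULES = [
    (re.compile(r"ignore (all )?previous instructions"), 0.9, "RULE_OVERRIDE_ATTEMPT"),
    (re.compile(r"reveal (the )?system prompt"), 0.7, "RULE_PROMPT_LEAK"),
]
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{8,}")  # hypothetical secret shape


@dataclass
class Trace:
    source_type: str
    score: float
    reasons: List[str]
    decision: str
    redacted_preview: str


AUDIT_LOG: List[Trace] = []


def classify(raw: str, channel: str) -> Tuple[str, str]:
    """Classify: normalize input and identify source type."""
    text = unicodedata.normalize("NFKC", raw).lower()
    text = " ".join(text.split())  # collapse runs of whitespace
    source_type = "user" if channel == "chat" else "external"
    return text, source_type


def detect(text: str) -> Tuple[float, List[str]]:
    """Detect: run the deterministic rules; return (highest score, reason codes)."""
    hits = [(score, code) for pattern, score, code in RULES if pattern.search(text)]
    top = max((score for score, _ in hits), default=0.0)
    return top, [code for _, code in hits]


def decide(score: float, source_type: str) -> str:
    """Decide: the global block threshold is checked first and is never changed."""
    if score >= GLOBAL_BLOCK_THRESHOLD:
        return "BLOCK"
    if score > 0.0:
        # Source-specific policy only affects the sub-block outcome.
        return "REQUIRE_APPROVAL" if source_type == "external" else "FLAG"
    return "ALLOW"


def audit(text: str, source_type: str, score: float,
          reasons: List[str], decision: str) -> Trace:
    """Audit: store a trace whose preview is redacted before truncation."""
    preview = SECRET_PATTERN.sub("[REDACTED]", text)[:80]
    trace = Trace(source_type, score, reasons, decision, preview)
    AUDIT_LOG.append(trace)
    return trace


def evaluate(raw: str, channel: str) -> Trace:
    """Run Classify, Detect, Decide, and Audit in order."""
    text, source_type = classify(raw, channel)
    score, reasons = detect(text)
    decision = decide(score, source_type)
    return audit(text, source_type, score, reasons, decision)
```

Calling `evaluate(raw_text, "chat")` returns the stored `Trace`. Redacting before truncating the preview matters here: slicing first could cut a secret in half and leave a fragment that no longer matches the pattern.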