Policies
Turn a risk score into an enforceable verdict per agent, tool, and request attribute.
Content inspection produces a risk score and signals; a policy turns that into
a verdict (allow, deny, or require_approval) for a given tool, agent, and
set of request attributes. Policies are authored in the dashboard
(Outbound → Policies) or over the workspace API.
Not part of the public gateway contract
Policy CRUD is a workspace-authenticated control-plane API (session or an API
key with policies:read / policies:write), not part of the versioned
gateway contract. The shapes below are the source of truth.
Anatomy of a policy
| Field | Type | Notes |
|---|---|---|
| `name` | string | 1–120 chars. |
| `toolPattern` | string (glob) | Matches the tool name; `*` matches one or more characters (`github.*`, `aws.delete_*`, exact `gmail.send`). |
| `action` | `allow` · `deny` · `require_approval` | The verdict when the rule fires. |
| `riskThreshold` | int 0–100 · `null` | Fallback rule: deny when the call's risk ≥ threshold. `null` for a non-threshold rule. |
| `signalCategory` | `secret` · `pii` · `destructive` · `injection` · `egress` · `null` | Fire only when a detector flagged that category. |
| `context` | object · `null` | Attribute conditions (ABAC). See below. |
| `priority` | int | Lower is evaluated first. Default 100. |
| `enabled` | boolean | Default `true`. |
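As an illustration of the `toolPattern` semantics in the table, the following is a minimal sketch of a matcher. It assumes `*` is the only wildcard, that it matches one or more characters as stated above, and that every other character is literal; the function names are hypothetical and not part of the product.

```python
import re


def tool_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a toolPattern glob into a regex.

    Assumption: `*` is the only wildcard and stands for one or more
    characters; all other characters are matched literally.
    """
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".+".join(literal_parts))


def tool_matches(pattern: str, tool: str) -> bool:
    """Return True when the whole tool name matches the pattern."""
    return tool_pattern_to_regex(pattern).fullmatch(tool) is not None
```

Under these assumptions `github.*` matches `github.create_issue` but not the bare prefix `github.`, because the wildcard must consume at least one character.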
How a verdict is reached
Enabled rules whose toolPattern and context match the call are evaluated in
priority order (lowest number first) under deny-overrides:
- A matching `deny` wins: the call is blocked.
- Otherwise a matching `require_approval` holds the call for a human (the SDK waits the hold out and resolves to the final `allow`/`deny`).
- Otherwise a matching `allow` passes the call.
- If no rule sets a verdict, the highest-priority `riskThreshold` rule denies when the call's risk ≥ its threshold.
- Otherwise, allow.
Explicit rules (no signalCategory) take precedence over signal-aware ones, so a
blanket deny can't be undercut by a narrower signal rule.
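The evaluation order above can be sketched as follows. This is one plausible reading, not the gateway's implementation: it assumes the rules passed in have already matched on `toolPattern` and `context`, that explicit rules form a first tier and signal-aware rules a second, and that any rule with a `riskThreshold` acts only as the fallback. The `Rule` class and `decide` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class Rule:
    name: str
    action: str  # "allow" | "deny" | "require_approval"
    priority: int = 100
    enabled: bool = True
    risk_threshold: Optional[int] = None
    signal_category: Optional[str] = None


def _deny_overrides(rules: Iterable[Rule]) -> Optional[str]:
    """Pick the strictest action present: deny, then hold, then allow."""
    actions = {rule.action for rule in rules}
    for verdict in ("deny", "require_approval", "allow"):
        if verdict in actions:
            return verdict
    return None


def decide(rules: Iterable[Rule], risk: int, signals: Set[str]) -> str:
    """Resolve a verdict from rules that already matched tool and context."""
    live = sorted((r for r in rules if r.enabled), key=lambda r: r.priority)
    verdict_rules = [r for r in live if r.risk_threshold is None]
    # Assumed tiering: explicit rules are consulted before signal-aware ones.
    explicit = [r for r in verdict_rules if r.signal_category is None]
    signal_aware = [r for r in verdict_rules if r.signal_category in signals]
    for tier in (explicit, signal_aware):
        verdict = _deny_overrides(tier)
        if verdict is not None:
            return verdict
    # Fallback: only the highest-priority (lowest number) threshold rule counts.
    thresholds = [r for r in live if r.risk_threshold is not None]
    if thresholds and risk >= thresholds[0].risk_threshold:
        return "deny"
    return "allow"
```

For example, a matching `allow` and a matching `deny` together resolve to `deny`, and a lone threshold rule of 70 denies a call with risk 80 but lets a call with risk 50 through.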
Attribute conditions (ABAC)
A policy's optional context fires the rule only when the request satisfies
every constraint present (AND). Each set constraint supports negate.
| Constraint | Shape | Matches |
|---|---|---|
| `ip` | `{ anyOf: string[], negate? }` | Source IP in a CIDR list (or exact IPv4). |
| `time` | `{ windows: [{ days?, start, end }], tz?, negate? }` | Time-of-day windows in an IANA tz (default UTC); a window wraps past midnight when end ≤ start. |
| `resource.environment` | `{ anyOf: ["production"…], negate? }` | The target's environment, from the MCP server registry. |
| `resource.type` | `{ anyOf: ["database","http_api","filesystem","messaging","other"], negate? }` | The target's resource type. |
| `resource.host` | `{ anyOf: string[], negate? }` | Destination host; `*.corp.com` matches that suffix or below. |
| `agent.labels` | `{ anyOf: string[], negate? }` | The governing agent's identity labels (case-insensitive). |
| `mlThreatClass` | `{ anyOf: ["prompt_injection","jailbreak","data_exfiltration","malware","social_engineering","policy_violation"] }` | The agent's most recent ML threat assessment. Fail-open when no assessment exists. |
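To make the matching rules in the table concrete, here is a minimal sketch of three of the constraints (`ip`, `time`, `resource.host`) combined with AND and `negate`. Several details are assumptions rather than documented behavior: `days` is read as ISO weekdays (1 = Monday), window starts are inclusive and ends exclusive, `*.corp.com` is taken to match `corp.com` itself as well as subdomains, and the caller has already converted the timestamp into the policy's `tz`. The `request` dictionary and all function names are hypothetical.

```python
import ipaddress
from datetime import datetime, time


def ip_matches(entries: list, source_ip: str) -> bool:
    """True when source_ip falls in any CIDR block or equals an exact address."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(e, strict=False) for e in entries)


def in_window(local: datetime, window: dict) -> bool:
    """Check one window against a timestamp already in the policy's tz."""
    days = window.get("days")
    if days is not None and local.isoweekday() not in days:  # assumed ISO days
        return False
    start = time.fromisoformat(window["start"])
    end = time.fromisoformat(window["end"])
    now = local.time()
    if end <= start:  # window wraps past midnight
        return now >= start or now < end
    return start <= now < end


def host_matches(pattern: str, host: str) -> bool:
    """Exact match, or suffix match for patterns that start with '*.'."""
    pattern, host = pattern.lower(), host.lower()
    if pattern.startswith("*."):
        suffix = pattern[2:]
        return host == suffix or host.endswith("." + suffix)
    return host == pattern


def _negated(matched: bool, constraint: dict) -> bool:
    return not matched if constraint.get("negate") else matched


def context_matches(context: dict, request: dict) -> bool:
    """AND together every constraint that is present in the context."""
    results = []
    if "ip" in context:
        c = context["ip"]
        results.append(_negated(ip_matches(c["anyOf"], request["ip"]), c))
    if "time" in context:
        c = context["time"]
        hit = any(in_window(request["local_time"], w) for w in c["windows"])
        results.append(_negated(hit, c))
    host = (context.get("resource") or {}).get("host")
    if host:
        hit = any(host_matches(p, request["host"]) for p in host["anyOf"])
        results.append(_negated(hit, host))
    return all(results)
```

With a negated Monday to Friday 09:00 to 18:00 window, this sketch fires on a Saturday morning and stays silent on a Monday morning, which is the off-hours behavior a negated business-hours window is meant to express.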
A combined example that holds any production-database write outside business hours:
{
"name": "Approve prod DB writes off-hours",
"toolPattern": "db.*",
"action": "require_approval",
"context": {
"resource": {
"environment": { "anyOf": ["production"] },
"type": { "anyOf": ["database"] }
},
"time": {
"windows": [{ "days": [1, 2, 3, 4, 5], "start": "09:00", "end": "18:00" }],
"tz": "America/New_York",
"negate": true
}
},
"priority": 50,
"enabled": true
}
Test before you ship
- POST /api/policies/simulate: dry-run a tool call against your live policy set; returns the decision, reason, risk, signals, and which policy matched.
- POST /api/policies/backtest: replay a draft rule over recent audit logs and report how many decisions would flip, before you enable it.
- POST /api/policy-suggestions/generate: mine recent traffic for candidate rules; accept or dismiss each via PATCH /api/policy-suggestions/{id}.
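To illustrate what a backtest reports, the sketch below counts flipped decisions locally. It is not the endpoint's implementation and does not call the API: the audit-entry fields (`tool`, `decision`) and the `count_flips` and `draft` names are hypothetical, chosen only to show the comparison between a recorded decision and the one a draft rule would produce.

```python
from typing import Callable, Iterable, List, Tuple


def count_flips(
    audit_log: Iterable[dict],
    draft_decide: Callable[[dict], str],
) -> List[Tuple[str, str, str]]:
    """Return (tool, recorded decision, draft decision) for each flipped entry."""
    flips = []
    for entry in audit_log:
        new_decision = draft_decide(entry)
        if new_decision != entry["decision"]:
            flips.append((entry["tool"], entry["decision"], new_decision))
    return flips


# Hypothetical draft rule: deny every aws.delete_* call, keep the rest as recorded.
def draft(entry: dict) -> str:
    if entry["tool"].startswith("aws.delete_"):
        return "deny"
    return entry["decision"]


sample_log = [
    {"tool": "aws.delete_bucket", "decision": "allow"},
    {"tool": "gmail.send", "decision": "allow"},
    {"tool": "aws.delete_user", "decision": "deny"},
]
flipped = count_flips(sample_log, draft)
```

Here only the first entry flips, from `allow` to `deny`; the third was already denied, so the draft rule would not change it.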
Next steps
- Content-inspection engine: the signals and risk a policy acts on.
- Secrets broker: inject upstream credentials a policy never exposes.
- MCP gateway: where a server's environment and resource type are set.