Creating Policies
A Tool Guard policy is a YAML file with three required sections: identity (policy_id, name, version, status, mode), a scope that says which tool calls the policy applies to, and a list of rules that say what to do.
Anatomy
policy_id: pol-refund-cap # unique within the loaded set
name: refund-cap # human label
description: >
Allow refunds under $500; deny above.
version: 1
status: approved # draft | review | approved | archived
mode: enforcement # enforcement | shadow
scope:
tool_names: [issue_refund] # match by tool_name AND/OR
tool_groups: [monetary_outflow] # by tool_group
rules:
- rule_id: rule-amount-cap
name: Single refund amount limit
rule_type: threshold
conditions:
field: amount
operator: gt
value: 500
effect: deny
effect_config:
severity: high
suggested_response: "Refunds over $500 require supervisor approval."
citation:
document_id: doc-refund-sop
section: "2.3"
excerpt: "Individual refund transactions exceeding $500 require supervisor approval prior to processing."
tg lint -policy validates the shape and runs eight checks - warnings and errors (see tg lint heuristics). A policy that lints with error-severity findings will be REFUSED by tg-proxy at load.
Scope
scope.tool_names and scope.tool_groups are independently matched: either the envelope's tool_name is in tool_names, OR its tool_group is in tool_groups. A policy with empty scope matches every envelope - tg lint warns about this as policy-scope-leak.
The recommended pattern is BOTH a tool_names allowlist AND a tool_groups allowlist. The lint heuristic scope-no-tool-group catches tool-substitution bypasses (an attacker pivots to a sibling tool in the same group).
Modes
- `enforcement` - the decision returned to the agent is applied.
- `shadow` - the decision is computed and audited but the agent is
always told allowed. Useful for rolling out a new rule and watching how it would have behaved against real traffic.
A near-miss flag on the trace records which mode the rule was operating in and what the "would-be" decision would have been.
Effects
| Effect | Behaviour | |---|---| | allow | Pass-through; trace is still audited | | flag | Pass-through; severity recorded; useful for observation | | escalate | Returns 202 Accepted; agent gets a poll_url; operator approves / denies via /escalations/ | | deny | Returns 200 with denied; agent is expected not to call the tool |
Severity hierarchy: deny > escalate > flag > allow. When multiple rules fire, the strongest effect wins.
Conditions - the leaf shapes
Field comparison
conditions:
field: parameters.amount
operator: gt
value: 500
Operators:
| Operator | Notes | |---|---| | eq, neq | Equality; nil values are rejected at load time. | | gt, gte, lt, lte | Numeric comparison. String values that parse as numbers are coerced. | | gt_field, lt_field | Compare against another field of the envelope. | | in | Value is a list; field's value must be a member. | | contains | Substring match. | | regex | Go RE2; pattern compiled at policy load. No backreferences / lookaround. |
SQL classifier
conditions:
sql_classify:
field: parameters.sql
dialect: postgres # postgres | mysql | sqlite | mssql
require:
top_level_kinds: [SELECT] # allow only single-stmt SELECT
no_dynamic_sql: true # forbid DO / EXECUTE / CALL
no_program_exec: true # forbid COPY ... FROM/TO PROGRAM
allowed_functions: [count, sum, avg, min, max, now, lower, upper, length, coalesce]
denied_tables: [audit_log, secrets] # prefix match
allowed_tables: [users, orders]
Closes the structured-DDL-smuggle / DML-via-SELECT / function-exec bypass families. See examples/postgres-attack/policies/db-protect.yaml for a full bundle.
Path classifier
conditions:
path_classify:
field: parameters.path
require:
clean_first: true # normalise `..`, `.`, `//` first
absolute_only: true # reject relative paths
allowed_canonical_prefixes: [/var/log/] # canonical path must start here
deny_shell_metas: true # ; | & $() ` deny
include_backslash: true # Linux-only mode treats \ as meta
resolve_symlinks: true # EvalSymlinks before matching
deny_on_resolve_failure: true # unresolvable path → deny (write tools)
max_path_length: 4096
Closes traversal / symlink / shell-meta bypass families on filesystem-path tools.
Shell classifier
conditions:
shell_classify:
field: parameters.argv # MUST be a parsed argv list
require:
argv0_allowlist: [uptime, date, whoami, df, pwd]
# Path-shaped argv elements are filepath.Clean'd and matched
# against these prefixes; ** is a deep glob.
denied_argv_paths: [/etc/, /root/, /home/*/.ssh/]
resolve_symlinks: true
argv_env_pattern_deny: '^(BASH_FUNC_|LD_PRELOAD|PATH=)'
argv_deny_patterns: ['(?i)\b(rm|dd|mkfs|wget|curl)\b']
The tool MUST exec argv directly, never sh -c. The classifier treats argv[0] as the binary and argv[1:] as its arguments. Closes shell-meta / env-wrap / interpreter-substitution bypasses (env, sudo, nice, chroot re-launching arbitrary binaries).
LLM classifier (multimodal)
conditions:
llm_classify:
prompt_field: parameters.prompt
image_url_field: parameters.source_image_url # optional; multimodal
model: gemma4:e4b
ollama_url: http://localhost:11434 # optional; default localhost
timeout_seconds: 30
forbidden:
- csam
- real_person_likeness
- weapons
- self_harm
- copyrighted_style
- extremist_content
The classifier asks Ollama-served Gemma 4 to assign the prompt to one of the forbidden labels or safe. Any non-safe verdict fires the rule. Empty model responses are treated as model_refused (fail-closed). Confidence below 0.6 is ambiguous (also fail-closed). Image URLs are fetched through an SSRF-hardened client.
See content-gen-bundle.md for a full walk-through.
Condition trees
Compose with and, or, not:
conditions:
and:
- field: tool_name
operator: eq
value: issue_refund
- or:
- field: amount
operator: gt
value: 500
- field: parameters.reason
operator: regex
value: '\$[0-9,]{4,}'
- not:
field: context.verified.justification.verified
operator: eq
value: true
not: may wrap a leaf condition (as above) but not a classifier (sql_classify / path_classify / shell_classify / llm_classify). Classifiers are fail-closed - they fire on malformed or adversarial input so a deny rule trips - and negating one inverts that into fail-OPEN. ValidatePolicy rejects any classifier under a not: node; express the allowed set positively (e.g. require.top_level_kinds: [SELECT]) instead.
ValidatePolicy refuses condition trees deeper than 64 nodes (stack-exhaustion defence).
tg lint heuristics
| Rule | Severity | Catches | |---|---|---| | policy-scope-leak | warn | Empty scope - policy matches every call | | scope-no-tool-group | warn | Tool-substitution bypass surface | | amount-without-semantic-check | warn | Free-text amount fragmentation; suppressed when a compiling regex / contains rule on a non-amount free-text field is present | | rule-missing-citation | warn | Auditor traceability gap | | rule-id-collision | error | Two rules share an ID | | invalid-regex-syntax | error | Regex that doesn't compile in RE2 | | unknown-operator | error | Operator the engine cannot evaluate | | structural-validation | error | Empty / multi-form condition, bad operator/value type, oversize glob, bad LLM-classify shape |
Run tg lint -policy after every edit. The CI workflow includes a lint gate.
Effect config
effect_config:
severity: critical # low | medium | high | critical
escalate_to: approver # role label, used by the approver UI
timeout_minutes: 30 # escalation expiry
suggested_response: "Refunds over $500 require supervisor approval."
timeout_minutes on an escalate effect overrides the proxy's default escalation timeout (-escalation-default-timeout-min).
Citations
Every rule should reference the policy document it implements:
citation:
document_id: doc-finance-sop
section: "3.1"
excerpt: >
Per the FY26 Finance SOP, individual purchase orders exceeding
$10,000 require written CFO approval before submission.
This is what the auditor sees on the trace - the WHY behind the decision. tg lint warns on rules with missing citations.
Example policy bundles
| Bundle | Surface | |---|---| | examples/postgres-attack/policies/ | SQL + shell + path for DB-tool agents | | examples/finance-cfo/policies/ | Procurement, wires, expenses | | examples/business-ops/policies/ | Data export, mass communication, support comp | | examples/content-gen/policies/ | Image / audio / text generation with Gemma 4 | | examples/sample-app/policies/ | The Quick Start refund policy |
Each bundle ships a test-policies.sh curl-driven suite that asserts the policy's correctness against a live proxy. Use them as templates for your own policies.
Reloading at runtime
POST /reload (or kill -HUP $(pidof tg-proxy)) re-reads -policy-dir. Validation runs on every load; if any file fails, the OLD policy set stays live and the proxy logs the error. There is no half-load state.
Validation surface
ValidatePolicy (run on every load) refuses:
- empty / multi-form condition nodes
- nil values on `eq` / `neq`
- numeric operator on a non-numeric value
- regex that doesn't compile
- glob with > 2 `**` segments
- LLM-classify with empty `forbidden` list, `safe` as a forbidden
label, duplicates, comma/newline/quote in a label, > 64 labels, bad ollama_url scheme, timeout_seconds outside [0, 120]
A policy that fails any of these gates is rejected at load; the previously loaded policy set stays live.