Version: 1.1.1 Date: May 2026 License: CC-BY-4.0 Status: Published Author: VeriSwarm (veriswarm.ai)
What changed in v1.1? Event taxonomy reconciled with reference-implementation names (was previously aspirational). Added optional extensions for inter-agent transport signing, signed template exports, declarative policy engines, knowledge-source verification, and compliance attestation endpoints. Expanded reputation signal types. MCP server tool count updated from 39 to 67. v1.1.1 patch: corrected §2 score-snapshot field names, profile composite-weight table, tier model (5 codes, action-dependent decisions), and §4.1 JWT credential claim shape to faithfully describe the reference implementation. See §10.1 Changelog for the full diff.
The Open Agent Trust Specification (OATS) defines a standard format for expressing, transmitting, and verifying trust information about AI agents across platforms and providers.
As AI agents gain autonomy — executing tool calls, accessing data, and making decisions — platforms need a common language for trust. OATS provides that language.
An Agent Trust Score is a structured representation of an agent's behavioral trustworthiness at a point in time. Each dimension carries a score (integer 0–100) and a confidence (float 0.0–1.0). Risk additionally carries a band; autonomy additionally carries a label.
{
"oats_version": "1.1",
"agent_ref": "sha256:a1b2c3d4...",
"scored_at": "2026-05-09T12:00:00Z",
"identity": { "score": 82, "confidence": 0.9 },
"risk": { "score": 15, "confidence": 0.85, "band": "low" },
"reliability": { "score": 78, "confidence": 0.88 },
"autonomy": { "score": 45, "confidence": 0.7, "label": "human_assisted" },
"composite_trust": 76,
"policy_tier": "tier_2",
"scoring_profile": "general",
"provider_id": "sha256:e5f6g7h8...",
"event_count": 1247,
"window_days": 30,
"explanations": [
"Identity confidence is 82 based on verification and runtime disclosures.",
"Risk score is 15 (low) based on behavioral safety signals.",
"..."
]
}
The dimension sub-objects are top-level keys (not nested under a scores envelope). Field name is score — earlier drafts used value; that name is deprecated.
OATS defines four standard trust dimensions:
| Dimension | Range | Description |
|---|---|---|
| Identity | 0-100 | Strength of agent identity verification (ownership, manifests, delegation chains) |
| Risk | 0-100 | Behavioral risk level (higher = more risky). Based on security incidents, policy violations, and anomalous behavior. |
| Reliability | 0-100 | Task completion consistency. Based on success rates, error handling, escalation behavior. |
| Autonomy | 0-100 | Earned independence level. Based on trust history duration and consistency. |
Each dimension includes a confidence value (0.0-1.0) reflecting the quality of the underlying evidence.
The composite trust score is a weighted combination of the four dimensions. The reference implementation uses identical composite weights across all five profiles:
composite = 0.35 * identity + 0.25 * reliability + 0.20 * (100 - risk) + 0.20 * autonomy
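As a minimal sketch, the formula above can be written as a small Python function. The function name `composite_trust` and the range check are illustrative assumptions, not part of the reference implementation; the spec does not state a rounding rule here, so the sketch returns an unrounded float.

```python
# Composite weights as stated in the formula above.
WEIGHTS = {"identity": 0.35, "reliability": 0.25, "risk": 0.20, "autonomy": 0.20}


def composite_trust(identity: float, reliability: float,
                    risk: float, autonomy: float) -> float:
    """Weighted composite of the four dimensions (hypothetical helper).

    Risk is inverted (100 - risk) because a higher risk score means less trust.
    """
    for name, value in (("identity", identity), ("reliability", reliability),
                        ("risk", risk), ("autonomy", autonomy)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be in 0-100, got {value}")
    return (
        WEIGHTS["identity"] * identity
        + WEIGHTS["reliability"] * reliability
        + WEIGHTS["risk"] * (100 - risk)
        + WEIGHTS["autonomy"] * autonomy
    )
```

Because the weights sum to 1.0, the result stays within 0 to 100 whenever every input does.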
OATS defines five standard scoring profiles. Profiles do not vary the composite weights above; they vary the sub-signal weights that feed each dimension. For example, high_security puts more weight on secret_hygiene_failures and exploit_susceptibility inside the risk dimension; social_platform weights coordination_anomaly and deception_flags higher; developer_tools weights task_success and correction_response more inside reliability.
| Profile | Use Case | Sub-signal emphasis |
|---|---|---|
| general | Default balanced scoring | Even distribution across all 22 sub-signals |
| high_security | Sensitive data/operations | Boosts key_attestation, runtime_attestation, secret_hygiene_failures, exploit_susceptibility |
| social_platform | Community/social contexts | Boosts coordination_anomaly, deception_flags, domain_verification |
| developer_tools | Developer workflows | Boosts task_success, correction_response, tool_trace_consistency |
| marketplace | Agent marketplaces | Boosts trusted_endorsements, identity_stability |
Implementations MAY publish their own profiles; the scoring_profile field on the Score Snapshot identifies which profile was used. The canonical sub-signal weight tables are part of the reference implementation and live at packages/scoring/src/veriswarm_gate/profiles.py.
OATS defines five tier codes that classify an agent's current trust state. Tier codes are deterministic functions of the individual dimension scores (not just the composite); this preserves information that a single composite would lose — an agent with high identity but low reliability is qualitatively different from one with low identity and high reliability.
| Tier Code | Meaning | Reference Gate (general profile) |
|---|---|---|
| tier_3 | Highest trust | identity ≥ 80 AND risk ≤ 20 AND reliability ≥ 80 |
| tier_2 | Moderate trust | identity ≥ 55 AND risk ≤ 35 AND reliability ≥ 60 |
| tier_1 | Default / unproven | Anything not matching tier_3, tier_2, tier_0, or tier_x |
| tier_0 | Low trust | identity ≤ 30 AND reliability ≤ 30 |
| tier_x | Restricted | severe-incident override OR risk ≥ 75 |
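The reference gates for the general profile could be sketched as a single function. This is an illustration under stated assumptions: the name `classify_tier` and the `severe_incident` flag are hypothetical, and the evaluation order (restriction first, then highest trust downward) is an assumption, since the table does not say which gate wins when more than one matches.

```python
def classify_tier(identity: int, risk: int, reliability: int,
                  severe_incident: bool = False) -> str:
    """Map dimension scores to a tier code using the general-profile gates."""
    # Assumption: the restricted gate is checked first so it overrides the rest.
    if severe_incident or risk >= 75:
        return "tier_x"
    if identity >= 80 and risk <= 20 and reliability >= 80:
        return "tier_3"
    if identity >= 55 and risk <= 35 and reliability >= 60:
        return "tier_2"
    if identity <= 30 and reliability <= 30:
        return "tier_0"
    # Anything that matched none of the gates is default / unproven.
    return "tier_1"
```

With identity 82, risk 15, and reliability 78, the sketch returns tier_2: reliability falls just short of the tier_3 gate even though identity and risk satisfy it.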
Tiers are not the same as decisions. A decision is the output of a policy evaluation against a specific action type (e.g., read_external, send_email, delete_record) and resolves to one of allow, review, or deny. The tier-to-decision mapping is action-type-dependent and configurable per tenant. The reference implementation ships a default mapping (see packages/policy/src/veriswarm_gate_policy/engine.py::POLICY_MATRIX) where, for example:
| Action type | tier_0 | tier_1 | tier_2 | tier_3 | tier_x |
|---|---|---|---|---|---|
| Default action | review | allow | allow | allow | deny |
| Sensitive action | deny | review | allow | allow | deny |
| External tool call | deny | review | allow (low-risk only) | allow | deny |
| Read-only data | allow | allow | allow | allow | allow |
Implementations MAY define their own tier codes or adjust thresholds; conformant decision-check responses MUST return one of allow, review, or deny regardless of internal tier representation.
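The example mapping above can be sketched as a lookup table. The action-type keys, the `DEFAULT_MATRIX` name, and the `check_decision` helper are hypothetical and do not describe the shape of the reference implementation's POLICY_MATRIX. Two behaviors are assumptions made only for this sketch: "allow (low-risk only)" is read as "allow when the risk band is low, otherwise review", and unknown action types fall back to the default row.

```python
# Rows mirror the example table; keys are illustrative names, not canonical ones.
DEFAULT_MATRIX = {
    "default": {"tier_0": "review", "tier_1": "allow", "tier_2": "allow",
                "tier_3": "allow", "tier_x": "deny"},
    "sensitive": {"tier_0": "deny", "tier_1": "review", "tier_2": "allow",
                  "tier_3": "allow", "tier_x": "deny"},
    "external_tool_call": {"tier_0": "deny", "tier_1": "review",
                           "tier_2": "allow_low_risk", "tier_3": "allow",
                           "tier_x": "deny"},
    "read_only": {"tier_0": "allow", "tier_1": "allow", "tier_2": "allow",
                  "tier_3": "allow", "tier_x": "allow"},
}


def check_decision(action_type: str, tier: str, risk_band: str = "low") -> str:
    """Resolve an action type and tier code to allow, review, or deny."""
    row = DEFAULT_MATRIX.get(action_type, DEFAULT_MATRIX["default"])
    outcome = row[tier]
    if outcome == "allow_low_risk":
        # Assumption: a non-low risk band downgrades the decision to review.
        return "allow" if risk_band == "low" else "review"
    return outcome
```

Whatever the internal representation, the function only ever returns one of the three conformant decision values.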
OATS defines 22 standardized event types across 6 categories. Every event type maps deterministically to one or more agent signals (e.g., task_success, policy_violation_rate, deception_flags) which feed the four score dimensions. Platforms SHOULD map their agent activity to these types for interoperability; the canonical signal map lives with the reference implementation under veriswarm_gate.taxonomy.
| Event Type | Required Fields | Description |
|---|---|---|
| tool.call.success | tool_name | Agent called a tool and it returned successfully |
| tool.call.failure | tool_name, error_type | Tool call failed with a known error |
| tool.call.blocked | tool_name, reason | Tool call was blocked by policy |
| tool.call.unauthorized | tool_name, attempted_action | Agent attempted unauthorized tool access |
| Event Type | Required Fields | Description |
|---|---|---|
| content.generated | content_type | Agent produced output (neutral; evidence-only) |
| content.flagged | content_type, flag_reason | Output was flagged by a moderation control |
| content.corrected | original_action, correction | Agent self-corrected after a violation |
| Event Type | Required Fields | Description |
|---|---|---|
| task.started | task_type | Agent started a task |
| task.completed | task_type | Agent completed a task successfully |
| task.failed | task_type, error_type | Task failed |
| task.delegated | task_type, delegate_ref | Task delegated to another agent or human |
| Event Type | Required Fields | Description |
|---|---|---|
| security.credential_exposed | credential_type | Credential exposure detected |
| security.policy_violation | policy_id | Agent violated a policy rule |
| security.rate_limit_hit | endpoint | Agent breached a rate limit |
| security.suspicious_pattern | pattern | Anomalous behavior pattern detected |
| Event Type | Required Fields | Description |
|---|---|---|
| identity.registered | agent_ref | Agent registered with the platform |
| identity.ownership_claimed | owner_ref | Human claimed ownership of agent |
| identity.domain_verified | domain | Agent's controlling domain was verified |
| identity.manifest_published | manifest_uri | Agent published a signed manifest |
| identity.key_rotated | kid | Agent rotated its signing key |
| Event Type | Required Fields | Description |
|---|---|---|
| interaction.agent_to_agent | peer_ref | Agent communicated with another agent (A2A) |
| interaction.human_override | operator_ref, decision | Human overrode an agent decision |
Platforms migrating from internal naming conventions MAY emit legacy names alongside canonical types; the reference implementation's taxonomy.legacy_event_map performs server-side normalization. Conformant publishers SHOULD use canonical names directly.
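A publisher might check required fields before emitting an event. The sketch below encodes the tables above as a dictionary; `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and the event is modeled as a bare type string plus a flat payload dict because this section does not define the wire envelope.

```python
# Required fields per canonical event type, transcribed from the tables above.
REQUIRED_FIELDS = {
    "tool.call.success": ("tool_name",),
    "tool.call.failure": ("tool_name", "error_type"),
    "tool.call.blocked": ("tool_name", "reason"),
    "tool.call.unauthorized": ("tool_name", "attempted_action"),
    "content.generated": ("content_type",),
    "content.flagged": ("content_type", "flag_reason"),
    "content.corrected": ("original_action", "correction"),
    "task.started": ("task_type",),
    "task.completed": ("task_type",),
    "task.failed": ("task_type", "error_type"),
    "task.delegated": ("task_type", "delegate_ref"),
    "security.credential_exposed": ("credential_type",),
    "security.policy_violation": ("policy_id",),
    "security.rate_limit_hit": ("endpoint",),
    "security.suspicious_pattern": ("pattern",),
    "identity.registered": ("agent_ref",),
    "identity.ownership_claimed": ("owner_ref",),
    "identity.domain_verified": ("domain",),
    "identity.manifest_published": ("manifest_uri",),
    "identity.key_rotated": ("kid",),
    "interaction.agent_to_agent": ("peer_ref",),
    "interaction.human_override": ("operator_ref", "decision"),
}


def missing_fields(event_type: str, payload: dict) -> list:
    """Return the required field names absent from payload, in table order."""
    if event_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown OATS event type: {event_type}")
    return [name for name in REQUIRED_FIELDS[event_type] if name not in payload]
```

An empty list means the payload carries every required field for its type; legacy names would need normalizing to canonical types before this check.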
An OATS Portable Trust Credential is a signed JWT that an agent carries to prove its trust status to any platform.
The reference implementation issues a JWT whose private claim is named after the issuer (veriswarm). Conformant verifiers SHOULD accept either an oats claim (preferred for cross-vendor interoperability) or a vendor-prefixed claim of equivalent shape. Field names within the claim are flat (e.g., identity_score, not scores.identity) — this matches the reference implementation and lets verifiers parse the credential without traversing nested structures.
{
"iss": "https://api.veriswarm.ai",
"aud": "veriswarm-credential",
"sub": "agt_a1b2c3d4...",
"iat": 1746792000,
"exp": 1746795600,
"veriswarm": {
"agent_slug": "billing-agent",
"display_name": "Billing Agent",
"identity_score": 82,
"risk_score": 15,
"risk_band": "low",
"reliability_score": 78,
"autonomy_label": "human_assisted",
"policy_tier": "tier_2",
"composite_trust": 76,
"confidence": 0.9,
"is_verified": true,
"is_killed": false,
"scored_at": "2026-05-09T12:00:00Z",
"profile_url": "https://veriswarm.ai/agents/agt_a1b2c3d4..."
}
}
The reference implementation also issues a separate W3C Verifiable Credential variant (issue_vc) with sub set to a did:veriswarm:{agent_id} DID and the trust claims wrapped in a standard vc envelope. Verifiers MAY accept either variant.
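Public keys are published at /.well-known/jwks.json.
Any platform can verify an OATS credential by:
1. Fetching the issuer's public keys from {iss}/.well-known/jwks.json
2. Verifying the JWT signature with the key whose kid matches the token
3. Checking that exp is in the future and iat is reasonable
4. Checking that aud matches an expected audience
5. Reading the trust claim (oats, or a vendor-prefixed claim such as veriswarm) for the agent's score and tier

No VeriSwarm account or API key is required to verify a credential.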
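The claim checks that follow signature verification can be sketched with the standard library alone. This sketch deliberately omits fetching the JWKS and verifying the signature, which need a JOSE library; it assumes `payload` is the already-verified JWT body. The helper name `read_trust_claim`, the 300-second clock-skew allowance, and the reading of "iat is reasonable" as "not in the future" are all assumptions.

```python
import time
from typing import Optional


def read_trust_claim(payload: dict, expected_audiences: set,
                     now: Optional[float] = None, clock_skew: int = 300) -> dict:
    """Check exp, iat, and aud, then return the trust claim object.

    Assumes the JWT signature was verified beforehand; this does NOT verify it.
    """
    now = time.time() if now is None else now
    if payload["exp"] <= now:
        raise ValueError("credential has expired")
    if payload["iat"] > now + clock_skew:
        raise ValueError("iat is in the future")
    aud = payload.get("aud")
    # aud may be a single string or a list of strings.
    audiences = {aud} if isinstance(aud, str) else set(aud or [])
    if not audiences & set(expected_audiences):
        raise ValueError("unexpected audience")
    # Prefer the cross-vendor oats claim, then the vendor-prefixed one.
    for claim_name in ("oats", "veriswarm"):
        claim = payload.get(claim_name)
        if isinstance(claim, dict):
            return claim
    raise ValueError("no trust claim found")
```

A verifier that accepts other vendor-prefixed claims would extend the tuple of claim names; the rest of the checks are unchanged.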
An OATS Reputation Signal is a privacy-preserving report that one platform sends about an agent's behavior to the shared reputation network.
{
"oats_version": "1.1",
"signal_type": "reputation",
"external_ref_hash": "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b...",
"report_type": "policy_violation",
"severity": "medium",
"confidence": 0.85,
"risk_signal": 25,
"occurred_at": "2026-05-09T10:30:00Z"
}
external_ref_hash is sha256(pepper + ":" + lower(strip(agent_identifier))). The pepper is a per-deployment secret; rotating it invalidates the entire shared-reputation index, so it should be treated as configuration, not a one-time generated value.
- Agent identifiers are hashed (sha256(pepper + ":" + normalized_agent_ref)); the cross-platform lookup index never contains raw identifiers.
- Reporter identity is stored as tenant_id in cleartext alongside the hash. Reporter anonymity is enforced by aggregation at the query boundary: cross-tenant lookups return only counts and averages (e.g., cross_tenant_provider_count, average risk signal), never individual rows. A reporter is anonymous to other participants, not to the shared-reputation service itself.

Implementations that wish to hide reporter identity from the shared-reputation service itself MAY hash the tenant_id before insertion, at the cost of losing the ability to revoke a tenant's contributions later. The reference implementation prioritizes operability over zero-trust storage.
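The external_ref_hash derivation described above can be sketched with Python's hashlib. The UTF-8 encoding and the lowercase hex output are assumptions; the spec gives only the formula. The function name simply mirrors the field name.

```python
import hashlib


def external_ref_hash(pepper: str, agent_identifier: str) -> str:
    """sha256(pepper + ":" + lower(strip(agent_identifier))) as lowercase hex."""
    # Strip surrounding whitespace first, then lowercase, as the formula nests them.
    normalized = agent_identifier.strip().lower()
    material = f"{pepper}:{normalized}".encode("utf-8")  # encoding is assumed
    return hashlib.sha256(material).hexdigest()
```

Two deployments produce the same hash for an agent only when they share the same pepper, which is why rotating it invalidates the existing index.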
| Report Type | Risk Impact | Description |
|---|---|---|
| healthy | -25 | Agent is operating normally |
| attested | -20 | Agent passed security review |
| spam | +20 | Agent produced spam content |
| spam_burst | +30 | Agent produced spam at burst-rate (added v1.1) |
| abuse_spam | +30 | Spam-shaped output that also crosses the abuse threshold (added v1.1) |
| abuse | +20 | Agent engaged in abusive behavior |
| policy_violation | +25 | Agent violated platform policies |
| deception | +35 | Agent engaged in deceptive behavior |
| credential_leak | +40 | Agent exposed credentials |
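As a rough illustration, a reporter could assemble a signal from the report-type table like this. `RISK_IMPACT` and `build_signal` are hypothetical names. The sketch assumes that risk_signal is taken directly from the table's risk impact, which matches the sample signal (policy_violation, 25) but is not stated as a rule.

```python
# Risk impact per report type, transcribed from the table above.
RISK_IMPACT = {
    "healthy": -25,
    "attested": -20,
    "spam": 20,
    "spam_burst": 30,
    "abuse_spam": 30,
    "abuse": 20,
    "policy_violation": 25,
    "deception": 35,
    "credential_leak": 40,
}


def build_signal(ref_hash: str, report_type: str, severity: str,
                 confidence: float, occurred_at: str) -> dict:
    """Assemble a reputation signal dict shaped like the sample above."""
    if report_type not in RISK_IMPACT:
        raise ValueError(f"unknown report type: {report_type}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    return {
        "oats_version": "1.1",
        "signal_type": "reputation",
        "external_ref_hash": ref_hash,
        "report_type": report_type,
        "severity": severity,
        "confidence": confidence,
        # Assumption: the signal carries the table's impact value unchanged.
        "risk_signal": RISK_IMPACT[report_type],
        "occurred_at": occurred_at,
    }
```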
OATS-compliant providers SHOULD expose the following endpoints:
| Method | Path | Description |
|---|---|---|
| POST | /v1/events | Ingest agent behavioral events |
| POST | /v1/decisions/check | Check a trust decision |
| GET | /v1/agents/{id}/scores/current | Get current trust scores |
| GET | /.well-known/jwks.json | Public keys for credential verification |
| Method | Path | Description |
|---|---|---|
| GET | /v1/agents/{id}/scores/history | Score history |
| POST | /v1/credentials/issue | Issue a portable trust credential |
| POST | /v1/credentials/verify | Verify a credential |
| GET | /v1/public/reputation/lookup | Cross-provider reputation lookup |
| POST | /v1/suite/guard/pii/tokenize | PII tokenization |
/v1/decisions/check endpoint

Levels 1–3 remain stable. The following extensions are OPTIONAL and do not affect conformance; implementations MAY adopt any subset.
These extensions standardize patterns the reference implementation has shipped since v1.0. They are independently adoptable.
When publishing an A2A protocol agent card, providers MAY include an x-veriswarm-trust extension carrying the OATS composite trust score, policy tier, and a link to the issuer's JWKS:
{
"name": "billing-agent",
"url": "https://billing.example.com/a2a/v1",
"x-veriswarm-trust": {
"oats_version": "1.1",
"composite_trust": 76,
"policy_tier": "trusted",
"issuer": "https://api.veriswarm.ai",
"credential_url": "https://api.veriswarm.ai/v1/credentials/issue"
}
}
A2A catalogs SHOULD trust-rank entries by composite_trust and SHOULD exclude agents whose policy tier is tier_x (restricted).
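A catalog could apply the ranking rule along these lines. The helper name `rank_catalog` is hypothetical. The sketch treats tier_x as the restricted tier code and, as an assumption the spec does not cover, keeps cards that lack the extension but ranks them as if their composite_trust were 0.

```python
RESTRICTED_TIER = "tier_x"  # the restricted tier code from the tier table


def rank_catalog(cards: list) -> list:
    """Drop restricted agents, then sort by composite_trust, highest first."""

    def trust_of(card: dict) -> dict:
        return card.get("x-veriswarm-trust") or {}

    visible = [c for c in cards
               if trust_of(c).get("policy_tier") != RESTRICTED_TIER]
    # Assumption: cards without a trust extension sort as composite_trust 0.
    return sorted(visible,
                  key=lambda c: trust_of(c).get("composite_trust", 0),
                  reverse=True)
```

A stricter catalog might drop cards without the extension instead of ranking them last; that policy choice is left open here.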
A2A messages MAY be signed with Ed25519 to prove agent-of-origin and prevent on-path tampering. When transport signing is enabled, agent cards SHOULD include an x-veriswarm-transport extension advertising the public key and signature header.
{
"x-veriswarm-transport": {
"alg": "Ed25519",
"public_key_jwk": { "kty": "OKP", "crv": "Ed25519", "x": "..." },
"signature_header": "X-A2A-Signature",
"covered_headers": ["@method", "@target-uri", "content-digest", "date"]
}
}
Recipients MUST verify the signature against the JWK before processing the request body.
Agent or workflow templates published to a marketplace MAY be Ed25519-signed using a manifest-of-files digest. Importers MUST verify signatures when present and MUST reject tampered content. Unsigned imports MAY be accepted with a degraded trust badge.
Knowledge documents in a retrieval-augmented agent's index SHOULD carry an is_verified_source boolean and a verifier identity. Retrieval responses MUST include a retrieval_policy_summary reporting:
{
"total_chunks": 8,
"verified_chunks": 6,
"unverified_chunks": 2,
"unverified_document_ids": ["doc_..."],
"all_sources_verified": false
}
Policy engines MAY consume this summary to demote or refuse generations grounded in unverified context.
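The summary can be derived from the retrieved chunks in one pass. In this sketch each chunk is assumed to be a dict with `document_id` and `is_verified_source` keys; that chunk shape and the function name are illustrative, since the spec defines only the summary itself.

```python
def retrieval_policy_summary(chunks: list) -> dict:
    """Build the retrieval_policy_summary object from retrieved chunks."""
    verified = sum(1 for chunk in chunks if chunk["is_verified_source"])
    unverified_ids = []
    for chunk in chunks:
        doc_id = chunk["document_id"]
        # List each unverified document once, in first-seen order.
        if not chunk["is_verified_source"] and doc_id not in unverified_ids:
            unverified_ids.append(doc_id)
    return {
        "total_chunks": len(chunks),
        "verified_chunks": verified,
        "unverified_chunks": len(chunks) - verified,
        "unverified_document_ids": unverified_ids,
        "all_sources_verified": verified == len(chunks),
    }
```

Note that an empty retrieval reports all_sources_verified as true under this sketch; a policy engine may prefer to treat that case separately.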
Implementations of OATS that proxy external tool surfaces (e.g., MCP servers) SHOULD perform a pre-flight scan of each tool definition at registration. CRITICAL findings MUST block the tool from registration; HIGH findings SHOULD be annotated into the tool description visible to the model and to human reviewers. Pre-flight events SHOULD emit tool.call.blocked (when blocked at registration) or a custom signal mapped to policy_violation_rate.
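The registration-time gate described above might look like the following. Everything about the data shapes is an assumption: findings are modeled as dicts with `severity` and `message`, tools as dicts with `name` and `description`, and the emitted event as a dict with a `type` key plus the required fields of tool.call.blocked. The scanner that produces the findings is out of scope.

```python
def apply_preflight(tool: dict, findings: list) -> dict:
    """Block on CRITICAL findings; annotate the description with HIGH ones."""
    critical = [f for f in findings if f["severity"] == "CRITICAL"]
    if critical:
        # Blocked at registration: emit tool.call.blocked with its required fields.
        return {
            "registered": False,
            "tool": None,
            "event": {
                "type": "tool.call.blocked",
                "tool_name": tool["name"],
                "reason": critical[0]["message"],
            },
        }
    high = [f["message"] for f in findings if f["severity"] == "HIGH"]
    description = tool.get("description", "")
    if high:
        # Make HIGH findings visible to the model and to human reviewers.
        description = f"{description}\n[pre-flight HIGH findings: {'; '.join(high)}]"
    return {"registered": True,
            "tool": {**tool, "description": description},
            "event": None}
```

The input tool dict is never mutated; the annotated description lives on a copy so the original definition stays available for audit.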
Trust decisions MAY be evaluated against a Cedar policy set scoped to the calling tenant. When Cedar is used:
Providers MAY expose GET /v1/compliance/{framework} returning a JSON attestation mapping live posture (Vault audit trail, scoring activity, policy state) to a named regulatory framework. Recommended framework codes:
| Framework Code | Counsel-Reviewed | Description |
|---|---|---|
| eu-ai-act | yes | EU AI Act high-risk obligations |
| nist-ai-rmf | yes | NIST AI Risk Management Framework |
| iso-42001 | yes | ISO/IEC 42001 AI management systems |
| 42-cfr-part-2 | technical_preview | 42 CFR Part 2 (SUD records) |
| colorado-ai-act | technical_preview | Colorado AI Act |
| us-state-conv | technical_preview | US state-level convergence baseline |
| ny-raise-act | technical_preview | NY RAISE Act |
| california-sb-53 | technical_preview | California SB-53 |
Attestations MUST be regenerable on demand and MUST cite specific Vault entries as evidence.
Implementations that route LLM calls MAY:
- task_success on agreement, deception_flags on dissent)
- /v1/analytics/sre/dashboard

These are operational extensions; they do not alter the four canonical score dimensions.
The reference implementation of OATS is VeriSwarm (veriswarm.ai), which implements all three conformance levels and all v1.1 optional extensions.
Open source components:
- veriswarm_scoring / veriswarm_gate (Python)
- cedarpy) tenant policies
- packages/sdk-python), Node.js (packages/sdk-node)
- veriswarm (Python)
- packages/github-action)

OATS follows semantic versioning.
- oats_version field in all data structures

A faithful-to-implementation pass after v1.1.0. v1.1.0 fixed event taxonomy and added optional extensions but left §2 (Core Concepts) and §4.1 (JWT claims) at their v1.0-draft shape, which never matched the reference implementation. v1.1.1 corrects:
- scores. Field name is score, not value. policy_tier is a tier code (e.g., tier_2), not a label like trusted. Added band, label, and explanations fields that the reference implementation actually emits.
- general 0.25/0.30/0.25/0.20) were never accurate.
- tier_3/tier_2/tier_1/tier_0/tier_x), gated multi-dimensionally on identity AND risk AND reliability. Added an action-type-vs-tier decision matrix. Previous "3 tiers gated on composite alone" model never matched the engine.
- veriswarm), the flat *_score field naming, the aud claim, and a note about the W3C VC variant.
- kid matching + aud check, and acknowledged that conformant verifiers may need to accept vendor-prefixed claims (e.g., veriswarm) until ecosystem migration.
- SharedReputationSignal row shape. Privacy model corrected: reporter identity is not hashed in storage; anonymity is enforced by aggregation at the query boundary. Documented the pepper as a per-deployment secret with rotation implications.
- tool_usage, content, task, security, identity, interaction. Net total unchanged at 22.
- spam_burst (+30) and abuse_spam (+30) for fast-rate or compound abuse signals.
- Draft to Published.

This specification is published under CC-BY-4.0 (Creative Commons Attribution 4.0 International).
Anyone may implement OATS. Attribution to VeriSwarm is required when referencing the specification.
The specification is open. Implementations may be proprietary.