Security & Trust Architecture

How SwarmSpace addresses the OWASP Top 10 for Agentic Applications and related agentic-risk work (including the OWASP Agentic Skills Top 10 project). This page is for developers and security reviewers evaluating the platform.

1. The problem

Protocols such as MCP standardize how agents invoke tools. They do not, by themselves, establish whether a tool is safe to call or whether its operator is accountable.

The OWASP agentic-security line of work identifies recurring failure modes, including:

OWASP describes a lethal trifecta: the combination of access to private data, exposure to untrusted content, and external communication. An agent that invokes an unreviewed plugin with all three channels available operates inside that high-risk zone. SwarmSpace is designed to narrow each channel where the architecture can do so structurally; what cannot be fully closed is covered honestly below.
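The trifecta condition can be expressed as a simple predicate. The sketch below is illustrative only: the `PluginCapabilities` class and its three flags are hypothetical names chosen for this example, not SwarmSpace manifest fields or platform APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PluginCapabilities:
    """Hypothetical capability flags; not actual SwarmSpace manifest fields."""
    reads_private_data: bool
    ingests_untrusted_content: bool
    communicates_externally: bool


def trifecta_channels(caps: PluginCapabilities) -> int:
    """Count how many of the three risk channels are open."""
    return sum([
        caps.reads_private_data,
        caps.ingests_untrusted_content,
        caps.communicates_externally,
    ])


def in_lethal_trifecta(caps: PluginCapabilities) -> bool:
    """True only when all three channels are open at the same time."""
    return trifecta_channels(caps) == 3
```

Closing any one channel makes the predicate false, which is why the mitigations target each channel separately rather than relying on a single control.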

2. SwarmSpace mitigations

The following rows map common AST-style risks to concrete controls. Wording follows product architecture, not a third-party audit certificate.

| Risk | SwarmSpace mitigation | Status | Details |
| --- | --- | --- | --- |
| Untrusted plugin execution | V8 isolate sandboxing | Partially implemented | Plugin workers run on Cloudflare Workers, which isolates each worker's memory and execution at the V8 isolate level rather than in a separate OS process. Custom API surface restriction for third-party plugin code is planned. |
| Credential exposure | Credential injection | Planned | Credential injection at the network boundary is a planned architecture goal. Currently, platform-managed keys are provided to first-party workers via environment variables. |
| Excessive data access | PRISM context minimization | Partially implemented | The router logs privacy metadata, and manifests declare privacy_data_required fields. Consent-based blocking is planned. |
| Undeclared network access | Network domain restriction | Planned | Restriction via manifest-declared allowlists is a designed security control, planned for the third-party plugin execution layer. |
| Unverified developer identity | Submission review | In development | A submission form and admin review interface exist. The Developer Agreement is presented during submission. Review enforcement is currently manual. |
| Indirect prompt injection | Behavioral monitoring | Planned | Plugin call metadata is logged for auditing. Runtime behavioral monitoring within the sandbox is planned. Manifest disclosure for external content sources and contractual terms in the Developer Agreement provide additional coverage. Residual risk remains; see Honest limitations. |
| No quality assurance | Merit-based Verified tier | Planned | A merit-based Verified tier with composite scoring is planned. Scoring criteria are designed but not yet implemented. |
| Plugin integrity drift | Manifest integrity | Planned | Manifest content hashing and cryptographic signing are planned for third-party plugin submissions. |
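Three of the manifest-driven controls above (context minimization, domain allowlisting, and content hashing) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the platform's implementation: `privacy_data_required` and `network_domains` are field names from this page, but the manifest shape, the example values, the exact-match allowlist rule, and the canonical-JSON hashing scheme are all assumptions.

```python
import hashlib
import json
from urllib.parse import urlsplit

# Illustrative manifest. The two list-valued field names appear on this page;
# the values and the overall shape are assumptions made for the example.
MANIFEST = {
    "name": "example-plugin",
    "privacy_data_required": ["city"],
    "network_domains": ["api.example.com"],
}


def minimize_context(context: dict, manifest: dict) -> dict:
    """Keep only the context keys the manifest declares it needs."""
    allowed = set(manifest.get("privacy_data_required", []))
    return {k: v for k, v in context.items() if k in allowed}


def is_domain_allowed(url: str, manifest: dict) -> bool:
    """Exact-match the URL's host against the declared allowlist (no wildcards)."""
    host = (urlsplit(url).hostname or "").lower()
    allowed = {d.lower() for d in manifest.get("network_domains", [])}
    return host in allowed


def manifest_digest(manifest: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Comparing the parsed hostname exactly, rather than testing for a substring or prefix, means a lookalike host such as `api.example.com.evil.test` does not pass. Hashing a canonical encoding keeps the digest stable when keys are reordered, so only a real content change produces a different value.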

3. Trust tiers

Status: Partially implemented

SwarmSpace currently implements subscription-based access tiers (Free, Standard, Premium) that gate which plugins a user can invoke. Trust-based quality tiers, such as the Community and Verified tiers referenced in the review process, are planned as a separate dimension.

Once trust tiers are implemented, a cautious integration will be able to restrict execution to Verified-only plugins regardless of the user’s subscription tier.
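A minimal sketch of how the two dimensions could combine, assuming the subscription tiers are ordered Free < Standard < Premium and the trust tiers Community < Verified. The function name `may_invoke`, its parameters, and both orderings are hypothetical; trust tiers are not yet implemented.

```python
from enum import IntEnum


class SubscriptionTier(IntEnum):
    # Tier names from this page; the numeric ordering is an assumption.
    FREE = 0
    STANDARD = 1
    PREMIUM = 2


class TrustTier(IntEnum):
    # Tier names from the review process; the ordering is an assumption.
    COMMUNITY = 0
    VERIFIED = 1


def may_invoke(
    user_tier: SubscriptionTier,
    plugin_min_subscription: SubscriptionTier,
    plugin_trust: TrustTier,
    min_trust: TrustTier = TrustTier.COMMUNITY,
) -> bool:
    """Both gates must pass: subscription access and the integration's trust floor."""
    return user_tier >= plugin_min_subscription and plugin_trust >= min_trust
```

With `min_trust=TrustTier.VERIFIED`, a Premium subscriber is still refused a Community-tier plugin, because the trust floor is evaluated independently of what the subscription unlocks.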

4. The review process

  1. Manifest submission: JSON schema validation, field consistency checks, HTTPS endpoint_url, reachability probes.
  2. Sandbox testing: prompt-injection probes where applicable, PRISM minimization checks against privacy_data_required, latency against latency_class, and outbound traffic against network_domains (planned).
  3. Safety review: human review for abuse, third-party ToS alignment, privacy proportionality, developer identity.
  4. Listing: Community tier publication and discovery indexing.
  5. Monthly merit review: promotion to Verified for sustained quality; demotion or removal if signals fail.
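The static part of step 1 might look like the sketch below. It covers only field presence, the HTTPS requirement on `endpoint_url`, and basic type checks; reachability probes and full JSON schema validation are out of scope. The four field names appear on this page, but treating all of them as required, the `validate_manifest` function, the example `latency_class` value, and the error messages are assumptions.

```python
from urllib.parse import urlsplit

# Field names come from this page; treating all four as required is an assumption.
REQUIRED_FIELDS = (
    "endpoint_url",
    "latency_class",
    "network_domains",
    "privacy_data_required",
)


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means the static checks passed."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in manifest]

    # endpoint_url must parse as an absolute URL with the https scheme.
    endpoint = manifest.get("endpoint_url")
    if endpoint is not None:
        parts = urlsplit(endpoint) if isinstance(endpoint, str) else None
        if parts is None or parts.scheme != "https" or not parts.hostname:
            problems.append("endpoint_url must be an absolute HTTPS URL")

    # Declared lists must really be lists of strings.
    for name in ("network_domains", "privacy_data_required"):
        value = manifest.get(name)
        if value is not None and not (
            isinstance(value, list) and all(isinstance(v, str) for v in value)
        ):
            problems.append(f"{name} must be a list of strings")

    return problems


# Example manifest; all values are illustrative.
example = {
    "endpoint_url": "https://api.example.com/run",
    "latency_class": "standard",
    "network_domains": ["api.example.com"],
    "privacy_data_required": [],
}
```

Returning every problem at once, instead of raising on the first, lets a submission form show the developer a complete list in a single round trip.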

5. Honest limitations

What SwarmSpace does not claim to solve

6. Architecture references

Documentation