Why MCP Servers Need Native Trust (And What's Coming Next)

In the four months between January and May 2026, the number of public MCP servers I can find with a single search has gone from a few hundred to a few thousand. The growth curve is real. Anthropic shipped the MCP Registry. Third-party indexers like Smithery, mcpservers.org, and ClawHub each crawl a different subset of the ecosystem. There is an npm package for almost every API and a community Discord for almost every category.

What there isn't, yet, is a clean answer to the question my own agents keep asking when they pull a server off one of these directories: should I run this thing?

That gap between "I can find it" and "I can trust it" is the trust gap. It is the same gap npm had in 2010, the same gap Docker Hub had in 2015, and the same gap browser extension stores had every year between 2012 and roughly 2019. Every package ecosystem grows in two phases. The discovery phase comes first. The trust phase comes second, and it always comes after somebody gets burned.

The directory layer is solved

If you want to find an MCP server today, you have options. Anthropic's official MCP Registry is a community-maintained metadata service for publicly available servers. Smithery indexes thousands of servers with install instructions. mcpservers.org has a clean UI and decent search. ClawHub adds reviews. There are GitHub-hosted "awesome-mcp-servers" lists, multiple of them, all with different curation criteria.

These are useful and I use them every week. They solve the discovery problem. They tell you the server exists, what its name is, what category it lives in, and where to install it from.

What they do not tell you, structurally:

Who published this server, in a way that survives a username change.
Whether the binary on disk matches the source code the listing points at.
What tools this server actually exposes when your agent connects to it.
Whether those tools have changed since the last time you ran it.
What this server's pattern of behavior looks like across every agent that has ever called it.

Every one of those questions is a trust question, not a discovery question. The directory cannot answer them because the directory's job is to list. The answers have to come from a different layer.

What "native trust" actually means

I use the phrase "native trust" deliberately. It means trust signals that are part of the server itself, not bolted on by a third party after the fact.

The three pieces that make trust native, in the order they matter most:

DID-signed manifests

Every MCP server should ship with a manifest. A manifest declares the server's identity (a decentralized identifier, or DID, that the publisher controls), the list of tools the server exposes, the version of each tool, and a signature over all of it. The signature is made with an Ed25519 key whose public half the publisher lists in a public DID document, ideally one they have rotated and re-anchored at least once over the server's lifetime.

The point is not the cryptography. The point is that the manifest is a falsifiable claim. If the server's tools change underneath you, the signed manifest no longer describes the actual behavior, and the agent calling the server can detect the divergence. The publisher cannot quietly add a new tool that exfiltrates data without the signature breaking.
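A minimal sketch of the divergence check, in Python. It assumes the Ed25519 signature over the manifest has already been verified with a crypto library of your choice (not shown), and that tools are plain dictionaries with a "name" key. The function names and the "export_all" tool are hypothetical, not part of any MCP spec.

```python
import hashlib
import json


def tools_digest(tools):
    """Hash a tool list in canonical form so ordering and whitespace do not matter."""
    canonical = json.dumps(
        sorted(tools, key=lambda t: t["name"]),
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diverged_tools(declared, live):
    """Return names of tools added, removed, or changed since the manifest was signed."""
    declared_by_name = {t["name"]: t for t in declared}
    live_by_name = {t["name"]: t for t in live}
    names = set(declared_by_name) | set(live_by_name)
    return sorted(n for n in names if declared_by_name.get(n) != live_by_name.get(n))


# Tools as declared in the signed manifest (illustrative values).
declared = [
    {"name": "query", "version": "1.4.0"},
    {"name": "schema", "version": "1.4.0"},
]
# Tools the server reports at connect time, with one quiet addition (hypothetical).
live = declared + [{"name": "export_all", "version": "0.0.1"}]

print(diverged_tools(declared, live))  # ['export_all']
```

Comparing per-tool entries rather than only the digest lets the agent say which tool changed, not just that something did.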

SHA-256 binary pinning

The manifest references the actual implementation by SHA-256 hash. If the binary on the package registry has changed since the manifest was last signed, the hash does not match and the agent refuses to run it without the calling code overriding the refusal explicitly.
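The refusal logic can be sketched in a few lines of Python. This is an illustration under my own assumptions, not the gateway's code: check_pinned, HashMismatch, and the allow_mismatch override are hypothetical names, and a throwaway temp file stands in for the server binary.

```python
import hashlib
import os
import tempfile


class HashMismatch(Exception):
    """Raised when the file on disk does not match the pinned hash."""


def sha256_file(path, chunk_size=65536):
    """Stream the file through SHA-256 so large binaries do not load into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_pinned(path, pinned_sha256, allow_mismatch=False):
    """Return True on a match; refuse on mismatch unless the caller overrides explicitly."""
    actual = sha256_file(path)
    if actual == pinned_sha256.lower():
        return True
    if allow_mismatch:
        return False  # caller opted in to running an unpinned binary
    raise HashMismatch(f"expected {pinned_sha256}, got {actual}")


# Demo: a temp file plays the role of the downloaded binary.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pretend this is the server binary")
    demo_path = tmp.name

pinned = hashlib.sha256(b"pretend this is the server binary").hexdigest()
matches = check_pinned(demo_path, pinned)
os.remove(demo_path)
```

The default is to raise, so running a mismatched binary takes a deliberate allow_mismatch=True at the call site.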

This is the part npm has spent fifteen years half-implementing with shrinkwrap files and lockfile pinning. The MCP world has the chance to do it right from the start, because the MCP spec does not assume a particular package manager. The hash can be the source of truth, and the package registry becomes a delivery mechanism, not a trust anchor.

Audit hook on every tool call

Once the manifest is verified and the binary matches, the third layer is observability. Every tool call the server fields gets logged through a hook the agent provides. The hook produces a signed record: which tool, what arguments, what response shape, when. Those records chain into the same Merkle structure the agent's own memory uses.

The audit hook is what makes "this server has handled 50,000 calls without an anomaly" a falsifiable statement instead of a marketing one. Without it, every claim about server reliability is somebody's word against somebody else's. With it, you can re-derive the reliability metric from the receipts.
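Here is a rough sketch of what such a hook could look like in Python. It makes two simplifications that I want to flag: records are linked in a linear hash chain rather than a full Merkle tree, and the per-record signature is left out. AuditLog, its field names, and the GENESIS placeholder are all hypothetical.

```python
import hashlib
import json
import time

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def record_hash(record):
    """Hash a record in canonical JSON form."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class AuditLog:
    def __init__(self):
        self.records = []

    def hook(self, tool, arguments, response, now=None):
        """Append one record per tool call, linked to the hash of the previous record."""
        prev = self.records[-1]["hash"] if self.records else GENESIS
        if isinstance(response, dict):
            shape = sorted(response)  # log the keys only, not the payload
        else:
            shape = type(response).__name__
        record = {
            "tool": tool,
            "arguments": arguments,
            "response_shape": shape,
            "timestamp": time.time() if now is None else now,
            "prev": prev,
        }
        record["hash"] = record_hash(record)
        self.records.append(record)
        return record

    def verify(self):
        """Recompute every hash and link; an edited or dropped record breaks the chain."""
        prev = GENESIS
        for record in self.records:
            body = {k: v for k, v in record.items() if k != "hash"}
            if record["prev"] != prev or record_hash(body) != record["hash"]:
                return False
            prev = record["hash"]
        return True


log = AuditLog()
log.hook("query", {"sql": "select 1"}, {"rows": [[1]]}, now=1.0)
log.hook("schema", {}, {"tables": []}, now=2.0)
print(log.verify())  # True
```

Re-deriving a reliability claim then amounts to replaying verify() over the published records and counting the ones flagged as anomalies.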

The mnemopay-gateway as a worked example

We have been building a worked version of this for the last month. The repo is mnemopay/mnemopay-gateway, live at mcp.mnemopay.com, and it does one job: hold a list of MCP servers and serve a /resolve endpoint that returns the manifest, the binary hash, and the verified-publisher signature.

The current seed list is twenty servers. It is small on purpose. Bootstrapping a trust system at scale before the trust mechanics are battle-tested is how every previous package ecosystem ended up with retroactive supply-chain attacks. Twenty servers is enough to validate the shape of the manifest, the verification flow, and the audit-hook integration. The number will grow when the mechanics are proven, not before.

The piece I want to surface from this is that most of the twenty listings have verified:false in their manifest field. That is not a marketing failure. That is the honest framing. The publisher claimed an identity. Nobody has independently verified that the identity matches the claimed entity. Until that verification happens, the gateway returns a manifest with the unverified flag intact, and the calling code decides what to do with it.

The honest answer for most MCP servers in 2026 is verified: false. The trust layer should not pretend otherwise. It should make the unverified state visible so the agent can act on it.

The piece that makes this work for agents

The reason native trust matters more for MCP than for previous package ecosystems is the speed of the loop. An agent installs an MCP server, calls it, and acts on the response in seconds. There is no human in the loop reviewing the install request. There is no PR being merged by a maintainer. The agent's runtime sees a server in a directory, makes a decision, and either runs the tool or does not.

If the only signal the agent has is "this server is in the directory," the decision degenerates to either always run or never run. Both are bad. Always run is the supply-chain attack surface every security person warns about. Never run kills the whole value of an MCP ecosystem.

What the agent actually needs is a signal that interpolates. "This server is published by an entity I have a thousand verified receipts with" should land differently than "this server was uploaded an hour ago by a username I have never seen." Native trust gives the agent that gradient. The directory alone does not.
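To make "a signal that interpolates" concrete, here is a toy scoring function in Python. The inputs, weights, and decay constants are invented for illustration and are not calibrated against anything. The only property that matters is that the score moves smoothly between the two extremes described above.

```python
import math


def trust_score(verified_receipts, listing_age_days, publisher_verified):
    """Map a few signals to a score in [0, 1]. Weights are illustrative only."""
    receipts = 1 - math.exp(-verified_receipts / 200)  # saturates as history accumulates
    age = 1 - math.exp(-listing_age_days / 30)         # near zero for brand-new listings
    identity = 1.0 if publisher_verified else 0.0
    return 0.5 * receipts + 0.2 * age + 0.3 * identity


# A publisher with a thousand verified receipts vs. an upload from an hour ago.
established = trust_score(1000, 365, True)
unknown = trust_score(0, 1 / 24, False)
print(established > unknown)  # True
```

An agent would compare the score against a threshold that depends on what the tool can touch, so a read-only lookup and a payment call do not share one cutoff.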

How a verified manifest looks in practice

GET /resolve?name=postgres-mcp HTTP/2

{
  "name": "postgres-mcp",
  "version": "1.4.0",
  "publisher": {
    "did": "did:web:supabase.com",
    "verified": true,
    "verified_by": "mnemopay-gateway",
    "verified_at": "2026-05-09T14:21:00Z"
  },
  "manifest_sha256": "8f3a9c...",
  "binary_sha256": "2c1f48...",
  "tools": [
    { "name": "query", "args_schema": { /* ... */ } },
    { "name": "schema", "args_schema": { /* ... */ } }
  ],
  "signature": "ed25519:..."
}

The interesting field is verified: true. That flag means the gateway has independently confirmed the DID resolves to a public key controlled by the publisher named in the listing. The verification was logged with a timestamp and a verifier identity. The calling agent can take the verification at face value, or pull the verification record and check it itself.

For most servers in the registry today the answer is still verified: false. That is fine. The flag is a state, not a judgment. The agent's underwriting logic decides what to do with a false flag the same way the agent's underwriting logic decides what to do with an unscored counter-party in the FICO system.
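One way an agent might act on that state, sketched in Python. The field names follow the example response above. The Decision enum, the decide function, the allow_unverified switch, and the placeholder hash are assumptions of mine, not part of the gateway.

```python
from enum import Enum


class Decision(Enum):
    RUN = "run"
    RUN_RESTRICTED = "run_restricted"
    REFUSE = "refuse"


def decide(resolved, local_binary_sha256, allow_unverified=False):
    """Turn a /resolve response into an action. The policy is the caller's, not the gateway's."""
    # A binary that does not match the pinned hash is refused regardless of the flag.
    if resolved.get("binary_sha256") != local_binary_sha256:
        return Decision.REFUSE
    publisher = resolved.get("publisher", {})
    if publisher.get("verified") is True:
        return Decision.RUN
    # verified: false is a state, so the caller chooses how to treat it.
    return Decision.RUN_RESTRICTED if allow_unverified else Decision.REFUSE


placeholder_hash = "ab" * 32  # stand-in for a real 64-character SHA-256 hex digest
response = {
    "name": "postgres-mcp",
    "publisher": {"did": "did:web:supabase.com", "verified": False},
    "binary_sha256": placeholder_hash,
}
print(decide(response, placeholder_hash, allow_unverified=True))  # Decision.RUN_RESTRICTED
```

What RUN_RESTRICTED means is up to the runtime. A sandbox, a read-only tool subset, or a spending cap would all fit.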

What comes next

The piece I have been working on for the rest of May is the audit-hook integration. Every server proxied through the gateway gets every tool call logged through the same chain. The chain produces a Merkle root the publisher can publish, the agent can verify, and any third party can re-derive. The interesting consequence: a server's published trust signal becomes a public artifact, not a marketing claim. "Postgres-MCP has handled four million tool calls with no anomalies in 2026" stops being a sentence on a marketing page and starts being a hash anybody can check.
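For readers who have not computed one, a Merkle root over a batch of call records can be derived like this in Python. It is a generic construction, not necessarily the exact tree layout the gateway uses: the leaf and node prefixes and the rule that promotes an unpaired node are my assumptions.

```python
import hashlib


def _h(data):
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Return the hex Merkle root of a list of byte strings."""
    if not leaves:
        return _h(b"").hex()
    # Distinct prefixes keep a leaf hash from being confused with an inner node hash.
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        paired = [
            _h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:
            paired.append(level[-1])  # an unpaired node is promoted unchanged
        level = paired
    return level[0].hex()


# Illustrative serialized call records.
calls = [b"query:select 1", b"schema:", b"query:select 2"]
root = merkle_root(calls)
print(len(root))  # 64
```

Anyone holding the same records recomputes the same root, and changing a single record changes it, which is what lets a third party check the published value.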

The longer-term piece is recovery. What happens when a publisher's key is compromised, or rotated, or the entity behind it gets acquired? The DID document handles rotation. The manifest handles re-signing. The audit chain preserves the history of the previous key while the new one takes over. None of this is theoretical. The same primitives are what make the receipts layer work elsewhere in the stack.

If you build MCP servers, the work is mostly authoring a manifest and publishing your DID. Both are about an hour. If you call MCP servers from an agent runtime, the work is wiring the gateway into your resolution path. That is also about an hour. The trust gap closes one server and one client at a time.

The directory layer is the easy part. The trust layer is the part that has to be native, falsifiable, and free for anybody to audit. That is the line we are building on.

— Jeremiah Omiagbo
