Universal Authentication for A2A and MCP using DIDs, JWT, and HTTPS

Abstract

A2A and MCP are two emerging standards that let AI-based systems communicate. Early work on both focused on services that were under the developer's control, or for which the developer could easily acquire access keys. An Agentic Web is forming that will flourish when any two agents, anywhere in the world, can establish trust and communicate with each other.

Currently, the most popular way of establishing trust is to use well-known OAuth servers from Google, Facebook, and Microsoft. This is practical for some use cases, but it is far from universal. In most cases, OAuth requires a third party for authentication, which increases the attack surface and adds unnecessary complexity.

The cryptographic community has been working on decentralized solutions that don't require both agents to agree on, and rely on, a third-party authentication service. Cryptographic solutions allow each party to self-authenticate using HTTPS, challenges, and public key cryptography.

W3C Decentralized Identifiers (DIDs)

The DID specification started development in 2019, was funded by the Department of Homeland Security, and eventually became a W3C standard. Although the DID specification predates A2A, it provides a foundation for global identity scoped to individuals, businesses, and governments, and for fine-grained authentication of the services (e.g. agents) that represent those entities.

This document proposes a lightweight solution that uses DIDs, JWT, and HTTP to authenticate A2A agents, MCP servers, and any other networked service.

Flexible Network Architecture

The main components of the network architecture are:

Network Architecture

The above diagram shows the many different ways agentic services can interact:

Terminology

DID and Agent Discovery

Before any two agents can communicate, they must discover each other. There are many different efforts underway to index agents and agentic services such as MCP. For universal authentication, the only requirement is that an agent represented by one DID discovers the DID of another agent.

For example, the Matchwise iPhone app can share a person's geolocation (DID+geo) with a presence service, which returns a list of DIDs that represent other people who are nearby. Another example is an MCP matching service, where people submit their DID and a brief description, and the matching service returns a list of DIDs that should be considered for matching.

Resolving an Agent DID from a DID Document

  1. A (user) agent discovers the DID of a peer agent and wants to communicate with the peer.
  2. The user agent fetches the DID document of the peer from a DID document repository.
  3. The DID document lists agents in its "service" array. The DID from step 1 includes the "id" of the service to communicate with as a fragment. For example, did:web:example.com:mike#friends identifies the "friends" agent in the DID document at https://example.com/mike/did.json.
  4. The user agent evaluates the service record to discover the HTTP endpoint of the peer agent and the protocol to use, such as MCP or A2A.
  5. The user agent establishes a connection using the discovered endpoint and protocol to begin HTTP communication.
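
The resolution steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Agentic Profile implementation: it follows the did:web convention of mapping colons to path segments, and the function names, the "A2A/friends" type, and the endpoint URL are hypothetical. Fetching the document over HTTPS is left out so the sketch stays self-contained.

```python
from typing import Optional, Tuple
from urllib.parse import unquote


def split_did_url(did_url: str) -> Tuple[str, Optional[str]]:
    """Split 'did:web:example.com:mike#friends' into the DID and its fragment."""
    did, _, fragment = did_url.partition("#")
    return did, (fragment or None)


def did_web_to_url(did: str) -> str:
    """Map a did:web identifier to the HTTPS URL of its DID document."""
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError(f"not a did:web identifier: {did}")
    # Colons separate the host from optional path segments; a port in the
    # host is percent-encoded (for example 'localhost%3A8443').
    host, *path = [unquote(part) for part in did[len(prefix):].split(":")]
    if not host:
        raise ValueError(f"missing host in: {did}")
    if path:
        return f"https://{host}/{'/'.join(path)}/did.json"
    return f"https://{host}/.well-known/did.json"


def find_service(did_document: dict, fragment: str) -> dict:
    """Return the service entry whose id matches the fragment."""
    did = did_document.get("id", "")
    for service in did_document.get("service", []):
        # Service ids may be relative ('#friends') or absolute (DID + '#friends').
        if service.get("id") in (f"#{fragment}", f"{did}#{fragment}"):
            return service
    raise LookupError(f"no service '#{fragment}' in {did}")


# Hypothetical DID document, shaped like the example later in this document.
example_document = {
    "id": "did:web:example.com:mike",
    "service": [
        {
            "id": "#friends",
            "type": "A2A/friends",  # hypothetical type
            "serviceEndpoint": "https://agents.example.com/friends",  # hypothetical
        }
    ],
}

did, fragment = split_did_url("did:web:example.com:mike#friends")
print(did_web_to_url(did))
service = find_service(example_document, fragment)
print(service["type"], service["serviceEndpoint"])
```

In a real agent, the URL returned by did_web_to_url would be fetched over HTTPS and the parsed JSON passed to find_service; the "type" and "serviceEndpoint" fields then tell the user agent which protocol to speak and where.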

Authentication Flow

  1. The user agent sends an HTTP request to the peer agent that does not include an Authorization header, or includes an invalid Authorization header.
  2. The peer agent requires authentication, and responds with an HTTP 401 and a WWW-Authenticate header that includes an Agentic challenge.
  3. The user agent signs the challenge with the private key of the verification method that is associated with the user agent in its DID document. The signed challenge is encapsulated in a JWT and sent as the Authorization header in a new HTTP request to the peer agent.
  4. The peer agent validates the signed challenge and responds with an HTTP 200 and a JWT in the response body.
  5. The user agent validates the JWT and uses the public key in the JWT to decrypt the message.

For details on the schema of the agentic challenge, agent key resolution, and generation of the JWT, please review the Agentic Profile Auth GitHub repository.
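
As a rough sketch of step 3 from the user agent's side, the following Python parses a challenge from a WWW-Authenticate header, wraps it in a compact JWT, and formats the Authorization header. Everything specific here is an assumption made for illustration and is not taken from the Agentic Profile Auth repository: the "Agentic <challenge>" header layout, the "challenge" claim name, the key id, and the function names. The signer is pluggable; the demo uses an HMAC stand-in only so the example runs with the standard library, whereas a real agent would sign with the private key of its verification method.

```python
import base64
import hashlib
import hmac
import json
from typing import Callable


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as compact JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def parse_challenge(www_authenticate: str) -> str:
    """Extract the opaque challenge from 'Agentic <challenge>' (assumed layout)."""
    scheme, _, challenge = www_authenticate.strip().partition(" ")
    if scheme.lower() != "agentic" or not challenge.strip():
        raise ValueError("expected an Agentic challenge")
    return challenge.strip()


def build_authorization(challenge: str, key_id: str, alg: str,
                        sign: Callable[[bytes], bytes]) -> str:
    """Wrap the challenge in a signed compact JWT and format the header value."""
    # 'kid' names the verification method the peer should use to check the signature.
    header = {"alg": alg, "typ": "JWT", "kid": key_id}
    payload = {"challenge": challenge}  # hypothetical claim name
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = sign(signing_input.encode("ascii"))
    return f"Agentic {signing_input}.{b64url(signature)}"


# Stand-in signer (symmetric HMAC) so the sketch needs no third-party library.
# This is NOT the asymmetric signature the flow above calls for.
DEMO_SECRET = b"demo-only-secret"


def demo_sign(data: bytes) -> bytes:
    return hmac.new(DEMO_SECRET, data, hashlib.sha256).digest()


challenge = parse_challenge("Agentic abc123")
authorization = build_authorization(
    challenge,
    "did:web:example.com:mike#agent-key-1",  # hypothetical verification method id
    "HS256",
    demo_sign,
)
print(authorization)
```

Keeping the signer as a callable lets the same code work with whichever key type the verification method declares; only the "alg" value and the signing function change.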

DID Document Example

{
    "@context": [
        "https://www.w3.org/ns/did/v1",
        "https://w3id.org/security/suites/jws-2020/v1",
        "https://agenticprofile.org/ns/agentic-profile/v1"
    ],
    "id": "did:web:iamagentic.ai:15",
    "name": "Mike Prince",
    "media": {
        "images": [
            "512x512",
            "96x96"
        ]
    },
    "verificationMethod": [],
    "service": [
        {
            "id": "#connect",
            "name": "Connect",
            "type": "A2A/connect",
            "serviceEndpoint": "https://api.matchwise.ai/agents/connect",
            "capabilityInvocation": [
                "did:web:api.matchwise.ai#system-key"
            ]
        },
        {
            "id": "#presence",
            "name": "Presence",
            "type": "A2A/presence",
            "serviceEndpoint": "https://api.matchwise.ai/agents/presence",
            "capabilityInvocation": [
                "did:web:api.matchwise.ai#system-key"
            ]
        }
    ]
}

The above is a DID document for the individual/entity "Mike Prince" and has the following properties:

An important takeaway: when a DID document describes multiple agents, the DID for that document serves as a single globally unique identifier that can be used to build reputation and trust.

References