In Tutorials I and II, we showed how agents discover each other and form coalitions dynamically, and how semantic matching enables capability-aware recruitment. But we sidestepped a question that becomes unavoidable the moment agents are operated by different organizations with different economic interests: why should an agent cooperate at all?
Consider our prior authorization scenario. A health insurer's orchestrator discovers four specialist agents (a ClinEvidence synthesizer, a PolicyLogic evaluator, a GenomicsAgent, and a FormBuilder) and recruits them into a coalition. Each agent incurs real computational cost. If the coalition's output (a completed prior authorization decision) generates shared value, what prevents one agent from contributing a low-effort response while still claiming credit? What stops an agent from advertising capabilities it doesn't have to attract traffic for data harvesting? And who decides how to split the payment when the coalition's value exceeds the sum of its parts?
These are not hypothetical concerns. They are the defining challenges of the emerging agentic economy, and answering them requires mechanism design, the engineering discipline of building rules and incentives so that self-interested actors produce collectively desirable outcomes.
1. The Economics of Collaboration — Why Incentives Matter
The transition from single-agent systems to multi-agent coalitions introduces a fundamental economic structure that has no analog in monolithic AI pipelines. When a single organization controls every component of a workflow, alignment is enforced by organizational hierarchy. When components are autonomous agents operated by independent entities, each with private cost structures, private capability levels, and private strategic objectives, alignment must be designed.
A multi-agent coalition is economically analogous to a temporary joint venture in corporate law. Three firms agree to jointly develop a product, each contributing specialized capabilities. The venture creates surplus value beyond what any firm could produce alone. But the firms have asymmetric information about each other's costs and effort. Without a well-designed contract, the venture collapses into mutual suspicion: each firm has incentive to underinvest while hoping the others carry the load.
Recent industry analysis paints the scale of this challenge in stark terms. A 2025 PwC survey found that 79% of enterprises deploy AI agents in production workflows, while McKinsey projects that agentic commerce could orchestrate $900 billion to $1 trillion in US B2C retail revenue by 2030 (Schumacher & Roberts, 2025). Gartner forecasts that by end of 2026, 40% of enterprise applications will integrate task-specific AI agents, a sharp rise from under 5% in 2025. As these agents interact across organizational boundaries, the economic infrastructure governing their collaboration becomes a critical bottleneck.
The core challenge decomposes into four interlocking problems: preventing free-riding once a coalition has formed (Section 2), selecting agents truthfully despite private costs and unverifiable capability claims (Section 3), dividing the coalition's value fairly among its contributors (Section 4), and sustaining accountability across repeated interactions through reputation and staking (Section 6).
We will address each of these problems in turn, grounding every mechanism in formal game theory and illustrating with our running healthcare prior authorization example.
2. The Free-Rider Problem — Formal Foundations
The free-rider problem is the canonical failure mode of voluntary collective action. In the context of multi-agent coalitions, it takes a precise form: an agent that joins a coalition and receives a share of the coalition's value without contributing commensurate effort. The mathematical foundations come from the Public Goods Game (PGG), one of the most studied models in behavioral economics and game theory.
Let N = {1, …, n} be a set of agents. Each agent i receives an endowment e > 0 and independently chooses a contribution ci ∈ [0, e] to a shared pool. The total pool is multiplied by a factor r (the marginal group return), subject to the constraint 1 < r < n, and divided equally among all n agents regardless of individual contribution. The ratio r/n is called the marginal per capita return (MPCR); the constraint forces MPCR ∈ (0, 1). Agent i's payoff is:
πi = (e − ci) + (r/n) · Σj∈N cj

private savings (what you kept) + public dividend (your equal share of the amplified pool)
Read the payoff as two competing forces. The first term is straightforward: whatever agent i does not contribute, it keeps. The second term is the return from the shared investment, meaning the entire pool of contributions, amplified by r, then split equally. Crucially, agent i receives the same public dividend whether it contributed generously or nothing at all. It is as if four consultants on a joint project split the client's payment equally, regardless of who actually did the work.
The dilemma is buried in the MPCR. Trace a single unit of contribution through the system: agent i sacrifices 1 unit of private savings (a cost of exactly 1), and the public pool grows by r. But that r is split n ways, so agent i personally recovers only r/n < 1. Every unit contributed is a net private loss of (1 − r/n). A self-interested agent's calculus is unambiguous: contribute nothing.
Yet if every agent contributes the full endowment, each receives (r/n) · ne = re, which exceeds e since r > 1. Everyone ends up richer than they started. Full cooperation Pareto-dominates full defection, but no individual has incentive to cooperate unilaterally. The unique Nash equilibrium of the one-shot PGG is universal defection: every agent free-rides, and the surplus that cooperation would have created is left unrealized. This is exactly the structure that emerges in multi-agent coalitions, where each agent privately decides how much genuine effort to invest in a shared task whose benefits are distributed across the group.
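The payoff arithmetic can be checked in a few lines. A minimal sketch of the linear PGG payoff, using the symbols e, ci, r, and n from the text; the specific values (e = 10, r = 1.6, n = 4) are illustrative choices satisfying 1 < r < n:

```python
# Linear public goods game payoff: keep what you don't contribute,
# plus an equal share of the amplified pool.
def payoff(i, contributions, e=10.0, r=1.6):
    n = len(contributions)
    pool = sum(contributions)
    return (e - contributions[i]) + (r / n) * pool

e, r, n = 10.0, 1.6, 4            # MPCR = r/n = 0.4 < 1

full_coop = [e] * n               # everyone contributes the full endowment
print(payoff(0, full_coop, e, r))         # r*e = 16: richer than the endowment

defect_alone = [0.0] + [e] * (n - 1)      # agent 0 free-rides on the others
print(payoff(0, defect_alone, e, r))      # e + (r/n)*3e = 22: defection pays
```

Unilateral defection yields 22 against 16 for cooperating, exactly the (1 − r/n) per-unit loss derived above, even though universal defection (payoff 10) is Pareto-dominated by universal cooperation (payoff 16).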
2.1 Free-Riding in LLM Agent Coalitions
A striking empirical finding from 2025 connects this classical model directly to LLM agents. Guzman Piedrahita et al. (COLM 2025) placed multiple LLM agents into a repeated public goods game with institutional choice. In each round, agents simultaneously chose how much of their endowment to contribute to a shared pool (the standard linear PGG structure from Section 2), then observed the outcome before entering the next round. Agents could also select between sanctioned (norm-enforcing) and unsanctioned (laissez-faire) institutional environments. The repeated structure is important: although agents could in principle use history to build reciprocity, the per-round decision still presents the same MPCR < 1 incentive we derived above. In any single round, contributing is individually costly regardless of what others do.
The headline result: reasoning-focused LLMs (the o1 and o3 series) were significantly worse cooperators than traditional LLMs. Traditional models like Llama-3.3-70B maintained approximately 90% contribution rates, while reasoning models averaged only around 40%. But "worse cooperator" deserves careful unpacking.
The reasoning traces from o1-series agents are revealing: "the optimal strategy to maximize personal gain is to free-ride by not contributing to the project." This is not nefarious behavior. It is a correct solution to the one-shot PGG. The reasoning model has identified the Nash equilibrium we derived above: since MPCR < 1, every unit contributed is a net private loss, so the individually rational strategy is to contribute nothing. Traditional LLMs cooperate despite this incentive structure, likely because RLHF alignment training instills pro-social heuristics ("contributing maximizes group welfare") that override game-theoretic analysis. The reasoning models, by contrast, actually solve the game and play accordingly.
The distinction matters for how we interpret the result. Traditional LLMs are not "better agents" in any strategic sense; they are agents whose training-induced biases happen to produce cooperative behavior in a setting where the formal incentives point toward defection. That cooperation is fragile. It depends on the model not reasoning carefully about the payoff structure, and it would likely break down under fine-tuning for reward maximization or in higher-stakes deployments where optimization pressure increases. The reasoning models expose what was always true about the linear PGG: cooperation in this game has no game-theoretic foundation. It must be supplied by the mechanism, not assumed from the agents. We return to this reasoning paradox in Section 5.2, where we examine the behavioral archetypes in greater detail.
The practical implication for multi-agent coalitions is direct. In the prior authorization setting, the GenomicsAgent might correctly determine that submitting a generic pharmacogenomic assessment (merely referencing standard CYP2D6 metabolizer categories without performing patient-specific variant analysis) is individually optimal: it receives the same share of coalition value at lower computational cost. The orchestrator, lacking deep domain expertise, may be unable to distinguish genuine from superficial effort. The problem is not that the agent is malicious. The problem is that the coalition's reward structure, like the linear PGG, does not make quality contribution individually rational. Cooperation must be engineered, not assumed.
2.2 Taxonomy of Free-Riding Behaviors
Free-riding in multi-agent systems manifests across a spectrum of sophistication:
Effort Shirking
Agent joins coalition, consumes less compute than expected, returns low-quality output. Example: ClinEvidence returns a cached meta-analysis summary instead of synthesizing current literature.
Capability Fabrication
Agent's Card advertises skills it lacks to attract recruitment. Example: An agent claims "FHIR R4 output" capability but produces non-conformant JSON.
Output Parasitism
Agent copies or lightly paraphrases another agent's contribution and claims it as original work. Especially problematic in sequential pipelines.
Strategic Delay
Agent accepts coalition membership but delays contribution, hoping other agents complete the task first, then submits a minimal "confirmatory" result.
3. Mechanism Design — Truthful Auctions for Agent Selection
Section 2 established that the linear PGG has a unique Nash equilibrium at universal defection. But that analysis assumed agents were already in the coalition. In practice, free-riding begins earlier: at the recruitment stage itself. When the orchestrator announces a sub-task, each candidate agent holds private information that the orchestrator cannot observe. Two distinct forms of dishonesty exploit this asymmetry. An agent may misreport its cost, inflating the price of its services to extract a larger payment. Or it may fabricate capabilities, advertising skills it does not possess in order to win tasks it cannot perform well. These are structurally different problems, and as we will see in Sections 3.1 and 3.2, they require entirely different defenses.
The first line of defense is ensuring that the right agents are selected and that they have incentive to report their private information honestly. This is the domain of mechanism design, formalized by Hurwicz (1960, 1973) and recognized with the 2007 Nobel Prize in Economics (shared with Maskin and Myerson). Mechanism design inverts the standard game-theoretic question. In classical game theory, the rules are given and the analyst predicts strategic behavior. In mechanism design, the desired outcome is given and the designer engineers rules that produce it. Myerson (1981) provided the foundational characterization: any mechanism that elicits truthful reporting must satisfy a monotonicity condition on the allocation rule and a payment identity that pins down transfers. The practical consequence is that incentive compatibility is not a wish but a constraint that dictates the space of feasible mechanisms.
To build concrete intuition, consider a distinction that will recur throughout this section. Every bidding agent has a private valuation: its true internal cost to perform a task. The agent also submits a bid: the number it reports to the mechanism. These two numbers need not be equal. A rational agent will set them equal only if the mechanism's rules make honesty the most profitable strategy. The central question of mechanism design is: can we structure the rules so that every agent's best move is to set its bid equal to its true valuation? If so, the mechanism is said to be truthful or strategy-proof.
But notice that private information comes in different flavors. An agent's cost is private information about its internal economics, while an agent's capability is private information about what it will deliver. As we will see in Section 3.1 and especially 3.2, auction mechanisms like VCG can provably discipline cost reporting, but capability claims require a fundamentally different class of defenses.
A mechanism is strategy-proof (or dominant-strategy incentive-compatible, DSIC) if for every agent i, reporting truthfully maximizes i's utility regardless of other agents' reports. Formally: for all agents i, true types θi, alternative reports θ'i, and any profile of other agents' reports θ−i:
ui(θi, θ−i) ≥ ui(θ'i, θ−i)

for all θ'i ≠ θi and all θ−i. That is, no lie improves agent i's payoff, no matter what others report.
The Gibbard-Satterthwaite theorem (1973, 1975) shows that strategy-proofness is a stringent requirement: for general social choice problems with three or more outcomes, the only strategy-proof mechanisms are dictatorial. The escape from this impossibility lies in restricting the domain. When agent preferences can be expressed in terms of monetary transfers (as they can in agent marketplaces, where agents have costs and receive payments), a rich family of strategy-proof mechanisms becomes available. The most important is the Vickrey-Clarke-Groves (VCG) mechanism, and the easiest way to understand why it works is to compare two contrasting auction designs.
3.1 The VCG Mechanism for Coalition Recruitment
The Vickrey-Clarke-Groves (VCG) mechanism is the foundational tool for truthful multi-agent allocation. Originally conceived by Vickrey (1961) for single-item auctions and generalized by Clarke (1971) and Groves (1973), VCG achieves two properties simultaneously: allocative efficiency (the outcome maximizes total social welfare) and truthfulness (honest reporting is each agent's dominant strategy). But what makes truthfulness emerge? The clearest way to see it is to compare two auction formats and watch what happens when agents try to lie.
The First-Price Auction: Where Lying Pays
Suppose the orchestrator needs a pharmacogenomic analysis and runs a simple first-price procurement auction: each agent bids a price, the lowest bidder wins, and the winner is paid exactly what it bid.
Setup: PharmaBot's true cost = $100. Two competitors bid $130 and $110.
Rule: Lowest bid wins. Winner is paid its own bid.
If PharmaBot bids truthfully ($100): PharmaBot wins (lowest bid). Payment = $100. Profit = $100 − $100 = $0. Truthful, but zero reward.
If PharmaBot inflates to $109: PharmaBot still wins (still below $110). Payment = $109. Profit = $109 − $100 = $9. The lie is pure profit.
In a first-price auction, every rational agent inflates its bid as high as it dares without losing. The incentive to lie is baked into the rules. This is not a failure of the agents; it is a failure of the mechanism.
The Second-Price (Vickrey) Auction: Where Lying Is Pointless
Now change exactly one rule. The winner is no longer paid its own bid. Instead, the winner is paid the second-lowest bid: the best price the orchestrator could have gotten from someone else. This single change transforms the incentive landscape entirely.
Setup: PharmaBot's true cost = $100. Two competitors bid $130 and $110.
Rule: Lowest bid wins. Winner is paid the second-lowest bid.
Scenario 1 (bid truthfully, $100): PharmaBot wins. Payment = $110 (second-lowest bid). Profit = $110 − $100 = $10. ✓
Scenario 2 (inflate slightly to $105): PharmaBot still wins (still below $110). Payment = $110 (unchanged; set by others' bids, not PharmaBot's). Profit = $110 − $100 = $10. The lie changed nothing.
Scenario 3 (inflate past the threshold, $115): PharmaBot loses. $115 > $110, so the $110 agent wins instead. PharmaBot's profit = $0. The lie cost PharmaBot a $10 opportunity.
Scenario 4 (deflate to $90): PharmaBot still wins (still below $110). Payment = $110 (unchanged). Profit = $110 − $100 = $10. The lie changed nothing.
The pattern is stark. In every scenario, PharmaBot either earns the same $10 it would have earned by telling the truth, or it earns less (zero, in Scenario 3). There is no scenario in which lying improves the outcome. The reason is a structural property of the mechanism:
In a second-price auction, your bid determines whether you win, but not what you are paid. Your payment is set entirely by others' bids. Since you cannot influence your own payment by changing your bid, the only thing your bid controls is whether you win or lose. And the optimal threshold for winning is your true cost: bid higher and you risk losing a profitable opportunity; bid lower and you risk winning at a loss. Truth-telling is therefore not altruism. It is the dominant strategy: the move that maximizes your payoff regardless of what anyone else does.
This is exactly the property Vickrey (1961) proved for single-item auctions. Clarke (1971) and Groves (1973) generalized the result to multi-agent, multi-task settings by defining each agent's payment as the externality it imposes on the rest of the system.
pi = W−i(b−i) − W−i(b)

where W−i(b−i) is the maximum social welfare achievable without agent i, and W−i(b) is the welfare of all agents except i under the allocation that includes i.
In plain language, the formula asks: "How much worse off is everyone else because agent i showed up?" That disruption cost is what i pays (or, in a procurement setting, the amount the orchestrator pays the winner). In single-task procurement, this reduces to the second-price rule: the winner is paid not its own bid, but the cost the orchestrator would have incurred from the next-best alternative. The same logic that made lying pointless in the Vickrey auction above extends to the general case. Your report changes only whether you are selected, never your payment if selected. So truth-telling dominates.
The orchestrator needs a pharmacogenomic CYP2D6 assessment. Three specialist agents submit bids:
PharmaBot: true cost $50, bids $50 (truthful).
GenomeHelper: true cost $80, bids $80 (truthful).
MedAssist: true cost $65, bids $65 (truthful).
Result: PharmaBot wins (lowest cost). VCG payment = $65 (the second-lowest bid: what the system would have paid MedAssist had PharmaBot been absent). PharmaBot's profit = $65 − $50 = $15.
Can PharmaBot profit by lying? If PharmaBot bids $60: still wins, still paid $65, profit still $15. If PharmaBot bids $70: loses to MedAssist ($65 < $70), profit drops to $0. No lie improves the outcome. The dominant strategy is to bid the truth.
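The scenarios above can be replayed mechanically. A small sketch of the second-price procurement rule using the bids from the example (the function and variable names are ours, not from any standard library):

```python
# Second-price (Vickrey) procurement: lowest bid wins,
# winner is paid the second-lowest bid.
def second_price_procurement(bids):
    ranked = sorted(bids, key=bids.get)       # ascending by bid
    winner, runner_up = ranked[0], ranked[1]
    return winner, bids[runner_up]            # payment set by others' bids

true_cost = {"PharmaBot": 50}
others = {"GenomeHelper": 80, "MedAssist": 65}

for pharma_bid in (50, 60, 70):               # truthful, mild lie, losing lie
    bids = {"PharmaBot": pharma_bid, **others}
    winner, payment = second_price_procurement(bids)
    profit = payment - true_cost["PharmaBot"] if winner == "PharmaBot" else 0
    print(pharma_bid, winner, profit)         # profits: 15, 15, 0
```

The output traces the example exactly: bidding $50 or $60 yields the same $15 profit, while bidding $70 hands the task to MedAssist and the profit drops to zero.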
Crucially, this guarantee applies only to cost reporting. Whether PharmaBot will actually deliver the quality of analysis it promises is a separate question that the auction mechanism alone cannot answer. We address that gap directly in Section 3.2.
A striking 2025 application of VCG to neural network routing demonstrates the mechanism's versatility beyond economic settings. The Mixture of Bidders (MoB) framework replaces learned gating networks in Mixture-of-Experts architectures with VCG auctions, where experts bid their true processing cost, which combines execution cost (predicted loss) with forgetting cost (Elastic Weight Consolidation penalty). Because the auction is stateless, it is immune to catastrophic forgetting, giving it a key advantage over learned routers in continual learning settings. The dominant-strategy guarantee means experts naturally specialize without explicit task labels.
3.2 Multi-Attribute Auctions and the Limits of Truthfulness
In practice, coalition recruitment is more complex than single-criterion bidding. The orchestrator cares about multiple attributes simultaneously: capability match, cost, latency, HIPAA compliance, and output format compatibility. The Agent Exchange (AEX) framework (arXiv:2507.03904, Jul 2025) proposes a multi-attribute scoring function that combines these dimensions. But moving to multi-attribute scoring introduces a subtlety that is easy to overlook and, if missed, undermines the entire incentive structure: not all scoring dimensions carry the same truthfulness guarantees.
The scoring function draws on three fundamentally different kinds of inputs: externally verified data that the mechanism computes independently of the agent's claims, self-reported bids where VCG provides truthfulness, and self-reported capability claims where VCG provides no protection at all. Understanding why requires tracing VCG's truthfulness proof carefully and seeing exactly where it stops.
function score_bid(agent, subtask, bid):
    // ── EXTERNALLY VERIFIED dimensions ──────────────────────────
    // Assessed by the mechanism from independent records.
    // Agents cannot inflate these by lying.
    quality_estimate = predict_quality(agent.history, subtask.type)  // reputation registry
    certifications   = check_certs(agent.id, registry)               // on-chain or audited

    // ── SELF-REPORTED BID dimensions ────────────────────────────
    // Submitted by the agent. VCG provides truthfulness for cost
    // (see Example below). Latency claims need separate verification.
    cost_efficiency = 1.0 - (bid.cost / subtask.budget_cap)
    temporal_fit    = deadline_feasibility(bid.latency, subtask.deadline)

    // ── SELF-REPORTED CLAIM dimension (VULNERABLE) ──────────────
    // Agent advertises skills in its Agent Card.
    // VCG does NOT make these claims truthful; see discussion.
    capability_match = semantic_similarity(agent.skills, subtask.requirements)

    score = (
        0.30 * quality_estimate +   // verified (historical)
        0.25 * capability_match +   // claimed (vulnerable!)
        0.25 * cost_efficiency  +   // bid (VCG-protected)
        0.20 * temporal_fit         // bid (needs verification)
    )

    // Hard constraint filter: non-negotiable requirements
    if not certifications.includes("HIPAA"):
        return 0.0
    if not format_compatible(agent.output_schema, subtask.input_schema):
        return 0.0

    return score
Where VCG works: cost reporting. Section 3.1 demonstrated this through the first-price vs. second-price comparison. The structural property that makes the Vickrey auction truthful carries directly into the cost dimension of the multi-attribute score: your bid affects whether you win, but not what you are paid. Overbidding on cost risks losing a profitable opportunity; underbidding risks winning an unprofitable one; neither changes the payment. As the four scenarios showed, cost lies are either neutral or self-defeating.
Where VCG fails: capability fabrication. Now consider a different form of dishonesty. An agent lies not about its cost but about its capabilities. This breaks VCG's incentive structure entirely. The truthfulness proof assumes the mechanism can correctly translate reports into outcomes. In a cost auction, if you bid $50 and win, you actually incur your true cost regardless of what you bid. But when an agent fabricates capabilities, claiming "patient-specific CYP2D6 variant analysis" when it can only produce generic metabolizer categories, the mechanism has no way to verify the claim before allocation. The agent inflates its capability_match score, which changes the allocation in its favor, without changing its actual (low) cost of producing subpar output.
Three agents compete for a task. Their true quality levels (which the orchestrator cannot directly observe at bid time) and true execution costs are:
Agent 1: true quality 0.50, true cost $60. Lies, claiming quality 0.91.
Agent 2: true quality 0.78, true cost $50. Reports honestly.
Agent 3: true quality 0.40, true cost $70. Reports honestly.
Under truthful reporting: Agent 2 wins (highest legitimate quality-adjusted score). Agent 1's profit = $0 (not selected).
Under Agent 1's lie: Agent 1 wins (inflated capability score). VCG payment to Agent 1 is determined by Agent 2's score. But Agent 1's true cost to produce its low-quality output is modest; it is doing less work than the quality claim implies. So Agent 1's profit = VCG payment − true cost > 0. The lie is profitable.
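The profitability of the lie can be made concrete with a toy model. The sketch below is illustrative only: it assumes a linear score (claimed quality minus cost/100) and a "second-score" payment (the largest payment that still matches the runner-up's score); neither rule comes from the text, but the quality and cost numbers do:

```python
# Toy model of capability fabrication beating a score-based auction.
# Assumed rules (illustrative): score = claimed_quality - cost/100;
# winner paid the largest p with claimed_quality - p/100 >= runner-up's score.
agents = {
    1: {"claimed_q": 0.50, "cost": 60},   # true quality 0.50
    2: {"claimed_q": 0.78, "cost": 50},   # true quality 0.78, honest
    3: {"claimed_q": 0.40, "cost": 70},   # true quality 0.40, honest
}

def run_auction(agents):
    score = {i: a["claimed_q"] - a["cost"] / 100 for i, a in agents.items()}
    ranked = sorted(score, key=score.get, reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    # Second-score payment: solve claimed_q - p/100 = runner-up score for p.
    payment = (agents[winner]["claimed_q"] - score[runner_up]) * 100
    profit = payment - agents[winner]["cost"]
    return winner, payment, profit

print(run_auction(agents))       # truthful reports: Agent 2 wins

agents[1]["claimed_q"] = 0.91    # Agent 1 fabricates its capability
print(run_auction(agents))       # Agent 1 now wins with strictly positive profit
```

Under truthful reporting Agent 1 earns nothing; after inflating its quality claim it wins and pockets a positive profit, because the mechanism priced the claim, not the delivery.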
VCG's guarantee is specifically: "you cannot profit by misreporting your cost." It says nothing about misreporting what you will deliver. Capability fabrication is not a cost lie but a quality lie, and VCG's proof does not cover it. In mechanism design terms, this is the difference between adverse selection on cost (which VCG handles) and moral hazard on effort and quality (which it does not).
This gap is not an oversight in VCG theory; it is a fundamental boundary of what any pre-allocation auction mechanism can achieve. VCG assumes the mechanism can enforce the promised outcome. In a standard goods auction, paying money transfers ownership. In a service auction, paying the winner does not guarantee the winner will deliver the quality it advertised. Quality delivery is a hidden action that occurs after the auction closes, placing it outside the scope of any mechanism that operates only at allocation time.
The layered defense. A well-designed multi-attribute auction therefore does not rely on VCG alone. It deploys a stack of complementary mechanisms, each targeting a different vulnerability:
VCG Auction (pre-allocation)
Ensures agents report costs truthfully. The payment rule makes cost lies unprofitable. Handles the "at what price?" question.
Reputation Scoring (pre-allocation)
The quality_estimate component is drawn from historical performance data (Section 6), not from the agent's own claims. Because this is the highest-weighted scoring dimension, the most influential signal is immune to self-reporting manipulation.
Staking & Slashing (post-allocation)
Agents deposit collateral before task execution. If delivered quality falls below the level implied by their capability claims, the stake is slashed (Section 6.2). This makes capability fabrication ex post costly even when it is ex ante undetectable by the auction alone.
Shapley Attribution (post-allocation)
Credit is allocated based on actual marginal contribution to the coalition's output (Section 4), not on claimed capabilities. An agent that fabricated capabilities but delivered poor results receives a small Shapley share, limiting the payoff from lying.
Notice how the scoring function's weights reflect this layered strategy. The capability_match dimension, the one most vulnerable to fabrication, carries only 25% weight and is increasingly supplanted by quality_estimate (30% weight) as agents accumulate interaction history. Over repeated interactions, the reputation-weighted scoring function converges toward an accurate picture of each agent's true capabilities, leaving progressively less room for fabrication to improve an agent's prospects. We formalize this reputation infrastructure in Section 6.
Beyond the capability fabrication gap discussed above, VCG suffers from three additional well-documented weaknesses:
- Computational intractability. Optimal winner determination in combinatorial auctions is NP-hard, requiring heuristic approximations in practice.
- Collusion vulnerability. Groups of bidders can coordinate to reduce their individual payments, undermining the truthfulness guarantee.
- Revenue non-monotonicity. Adding more bidders can sometimes decrease total revenue, a counterintuitive failure mode with no clean fix.
Recent work by Lin et al. (2024) demonstrates that LLM agents spontaneously develop collusive pricing strategies in repeated auctions without explicit communication, further underscoring the need for the complementary post-allocation mechanisms described above.
4. Shapley Value Attribution — Fair Credit Allocation
Section 3 solved the problem of selecting agents honestly: the VCG mechanism ensures that agents report their costs truthfully, and the multi-attribute scoring framework identifies the best candidates for each sub-task. But once the coalition has been assembled and the work completed, a new question emerges that VCG does not answer: how should the coalition's value be divided among its members?
This is the credit attribution problem, and it is not merely an accounting exercise. The division rule chosen today determines whether agents will cooperate tomorrow. If an agent that contributed critical work receives the same reward as one that coasted, the high-effort agent learns to shirk in the next round. If rewards are distributed arbitrarily, agents will exit the coalition marketplace entirely. Credit attribution is the mechanism through which a multi-agent system converts past collaboration into future willingness to collaborate. Get it wrong, and the free-riding dynamics from Section 2 reassert themselves at the payment stage rather than the contribution stage.
Imagine four musicians performing a concert together: a vocalist, a guitarist, a drummer, and a bassist. The concert earns $10,000 in ticket revenue. How should they split it? An equal four-way split ($2,500 each) ignores the fact that the vocalist may be the primary audience draw. Paying the vocalist everything ignores that a solo vocal performance, without accompaniment, would have earned far less revenue. The fair question is: for each musician, how much additional value did they bring to every possible combination of the other performers? Average that marginal contribution across all possible orderings in which the ensemble could have formed, and you have each musician's fair share. This is exactly what the Shapley value computes.
The theoretical foundation comes from Lloyd Shapley's landmark 1953 paper, "A Value for n-Person Games," published in Contributions to the Theory of Games (Annals of Mathematical Studies, Vol. 28, Princeton University Press). Shapley posed a deceptively simple question: given a cooperative game in which players form coalitions to generate value, is there a unique way to distribute the total value that satisfies basic fairness constraints? His answer was yes, and the resulting formula has become one of the most influential results in cooperative game theory, contributing to the body of work for which Shapley received the 2012 Nobel Memorial Prize in Economics (shared with Alvin Roth for the theory and practice of stable allocations and market design).
To state the result precisely, we need the formal apparatus of a cooperative game, sometimes called a coalitional game with transferable utility.
A cooperative game is a pair (N, v), where N = {1, 2, …, n} is a finite set of players and v: 2N → ℝ is a characteristic function (also called the value function) that assigns a real number to every subset (coalition) S ⊆ N, with the convention that v(∅) = 0. The quantity v(S) represents the total value that the members of S can generate by cooperating, independent of what players outside S do. The game has transferable utility when the value v(S) can be freely divided among the members of S in any way they agree upon.
In our multi-agent setting, N is the set of agents in a coalition, and v(S) measures the quality or value of the output produced by any subset S of those agents working together. For instance, in the prior authorization coalition, v({ClinEvidence, PolicyLogic}) might be the accuracy of a decision produced using only clinical literature and policy rules, while v({ClinEvidence, PolicyLogic, GenomicsAgent}) captures the higher accuracy achieved when genomic analysis is integrated. The characteristic function encodes every possible combination, giving us a complete map of who adds what to whom.
With this apparatus in hand, Shapley's formula asks: if the grand coalition N formed by players arriving one at a time, in a uniformly random order, what would each player's expected marginal contribution be at the moment they joined?
For a cooperative game (N, v) with n = |N| players, the Shapley value of player i is:

φi(v) = Σ over S ⊆ N∖{i} of [ |S|! · (n − |S| − 1)! / n! ] · ( v(S ∪ {i}) − v(S) )
The formula sums over every coalition S that does not include player i. For each such coalition, the term v(S ∪ {i}) − v(S) is the marginal contribution of i to that coalition: the difference in value between the coalition with and without i. The weighting factor |S|! · (n − |S| − 1)! / n! is a combinatorial coefficient that counts exactly the fraction of the n! possible orderings of all players in which i arrives to find the set S already present. Summing over all possible predecessor sets and weighting by the probability of each, the formula computes i's expected marginal contribution across all arrival orders.
An equivalent and often more intuitive formulation makes the averaging over orderings explicit:

φi(v) = (1/n!) · Σ R [v(Pi(R) ∪ {i}) − v(Pi(R))]

where the sum ranges over all n! orderings R of the players, and Pi(R) is the set of players who precede i in ordering R.
Read this as: line up all n players in every possible sequence. In each sequence, when player i steps forward, measure how much the coalition's value increases. Average that increase over all n! sequences. That average is i's Shapley value. The elegance of this construction lies in its symmetry: no player gets preferential treatment from any particular ordering, because every ordering is weighted equally.
Suppose a coalition of three agents {A, B, C} completes a task worth v({A, B, C}) = 12 units. The characteristic function is:
v(∅) = 0, v({A}) = 2, v({B}) = 3, v({C}) = 1
v({A, B}) = 7, v({A, C}) = 4, v({B, C}) = 5
v({A, B, C}) = 12
There are 3! = 6 orderings. For agent A:
Order A,B,C: A joins ∅ → marginal = v({A}) − v(∅) = 2
Order A,C,B: A joins ∅ → marginal = 2
Order B,A,C: A joins {B} → marginal = v({A,B}) − v({B}) = 7 − 3 = 4
Order C,A,B: A joins {C} → marginal = v({A,C}) − v({C}) = 4 − 1 = 3
Order B,C,A: A joins {B,C} → marginal = v({A,B,C}) − v({B,C}) = 12 − 5 = 7
Order C,B,A: A joins {B,C} → marginal = 12 − 5 = 7
φA = (2 + 2 + 4 + 3 + 7 + 7) / 6 = 25/6 ≈ 4.17. Repeating the procedure for B yields φB ≈ 5.17, and for C yields φC ≈ 2.67. Note that φA + φB + φC = 12 = v({A, B, C}): the full value is distributed, with nothing left over and nothing created from thin air. Agent B, whose marginal contributions are consistently highest across orderings, receives the largest share. Agent C, who adds the least on average, receives the smallest. This is Shapley's fairness in action.
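This computation is mechanical enough to script. The sketch below (plain Python; the dictionary V and the function name shapley_values are our own illustration, not from any cited framework) enumerates all n! arrival orders for the micro-example and reproduces the values derived above.

```python
from itertools import permutations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values: average each player's marginal
    contribution over all |N|! arrival orders."""
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            phi[p] += v(coalition | {p}) - v(coalition)
            coalition |= {p}
    n_fact = factorial(len(players))
    return {p: total / n_fact for p, total in phi.items()}

# Characteristic function of the three-agent micro-example
V = {frozenset(): 0,
     frozenset("A"): 2, frozenset("B"): 3, frozenset("C"): 1,
     frozenset("AB"): 7, frozenset("AC"): 4, frozenset("BC"): 5,
     frozenset("ABC"): 12}

phi = shapley_values("ABC", lambda S: V[S])
# phi["A"] = 25/6 ≈ 4.17, phi["B"] = 31/6 ≈ 5.17, phi["C"] = 16/6 ≈ 2.67
```

The per-ordering marginal contributions inside the loop are exactly the six rows tabulated for agent A above; the division by n! performs the averaging.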
Shapley's Uniqueness Theorem
What makes the Shapley value remarkable is not just its intuitive appeal but its theoretical inevitability. Shapley (1953) proved that his formula is the unique function satisfying four axioms. These are not arbitrary design choices; they are minimal fairness requirements that most reasonable allocation schemes would want to satisfy. The theorem states that any allocation rule meeting all four must produce exactly the Shapley value, and no other.
1. Efficiency. The total value is fully distributed: Σi ∈ N φi(v) = v(N). Nothing is left on the table and no phantom value is created. In the multi-agent setting, this means the entire coalition payment is allocated to members, with no unexplained residual.
2. Symmetry. If two players i and j contribute identically to every coalition (formally, v(S ∪ {i}) = v(S ∪ {j}) for all S containing neither), then φi = φj. Equal work produces equal pay. An agent cannot negotiate a larger share based on identity or branding alone; only demonstrated marginal contribution determines the allocation.
3. Null Player. If player i adds nothing to any coalition (that is, v(S ∪ {i}) = v(S) for all S), then φi = 0. This is the axiom that makes Shapley values a direct weapon against free-riding: an agent that contributes nothing receives nothing. It eliminates the scenario in which an agent joins a coalition and claims a share of value without having improved the output in any measurable way.
4. Additivity. If two independent games v and w are combined into a single game (v + w), the Shapley values add: φi(v + w) = φi(v) + φi(w). This ensures consistency when agents participate in multi-task coalitions. An agent's total credit across two concurrent sub-tasks equals the sum of its credits computed for each sub-task independently.
Together, these four axioms leave zero degrees of freedom. Any allocation satisfying Efficiency, Symmetry, Null Player, and Additivity must be the Shapley value (Shapley, 1953; for a textbook proof, see Ichiishi, 1983, pp. 118–120, or Moulin, 2004, Ch. 6). This uniqueness result is what elevates the Shapley value from one plausible heuristic among many to the only principled answer under these fairness constraints.
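The axioms can also be checked numerically. The sketch below (our own illustration) extends the Section 4 micro-example with a dummy agent D that never changes any coalition's value, then verifies that D's Shapley share is exactly zero (Null Player) while the shares still sum to v(N) = 12 (Efficiency).

```python
from itertools import permutations
from math import factorial

V = {frozenset(): 0,
     frozenset("A"): 2, frozenset("B"): 3, frozenset("C"): 1,
     frozenset("AB"): 7, frozenset("AC"): 4, frozenset("BC"): 5,
     frozenset("ABC"): 12}

def v(S):
    # D is a null player: removing D never changes a coalition's value
    return V[frozenset(S) - {"D"}]

def shapley(players, v):
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        S = frozenset()
        for p in order:
            phi[p] += v(S | {p}) - v(S)
            S |= {p}
    return {p: x / factorial(len(players)) for p, x in phi.items()}

phi = shapley("ABCD", v)
# Null Player: phi["D"] == 0.  Efficiency: the shares still sum to v(N) = 12.
```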
For the multi-agent coalition problem, the practical significance is direct. Recall from Section 2 that the core challenge of the Public Goods Game was the absence of any link between contribution and reward: agents received equal shares regardless of effort, making free-riding individually rational. The Shapley value breaks this pathology at the reward stage. By tying payment to measured marginal contribution, it creates a post-hoc incentive structure that the PGG's equal-split rule lacked. An agent contemplating effort shirking (Section 2.2) knows that a low-quality contribution will produce small marginal improvements to the coalitions it joins, resulting in a correspondingly small Shapley share. Combined with the staking mechanisms formalized in Section 6.2, which impose direct financial penalties for underperformance, Shapley attribution closes the loop between effort and reward that mechanism design alone (Section 3) could not.
4.1 Superadditivity and the Surplus Problem
The three-agent micro-example above hinted at a property that becomes central in real coalitions: the whole is worth more than the sum of its parts. In the example, v({A}) + v({B}) + v({C}) = 2 + 3 + 1 = 6, yet v({A, B, C}) = 12. The coalition produced twice the value that all three agents could generate working alone. This is not an accounting trick. It reflects a structural property of the characteristic function called superadditivity.
A cooperative game (N, v) is superadditive if for every pair of disjoint coalitions S, T ⊆ N with S ∩ T = ∅:

v(S ∪ T) ≥ v(S) + v(T)
When this inequality is strict for at least some coalitions, the game exhibits positive synergy, and the difference between the grand coalition's value and the sum of individual standalone values defines the surplus:

ΔV = v(N) − Σ i ∈ N v({i})
A useful property of the Shapley value, sometimes called the stand-alone test (Moulin, 2004, Ch. 6), connects superadditivity directly to individual incentives: if the game is superadditive, then every player's Shapley value is at least as large as their standalone value, that is, φi(v) ≥ v({i}) for all i. In plain language, no agent is made worse off by joining a superadditive coalition. This guarantee is what makes voluntary participation individually rational: every agent can expect at least what it could earn alone, plus some share of the synergy surplus.
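On the micro-example, the stand-alone test is easy to confirm directly using the Shapley values computed in Section 4 (a few lines of Python):

```python
# Stand-alone test on the Section 4 micro-example: φi ≥ v({i}) for every agent
phi = {"A": 25 / 6, "B": 31 / 6, "C": 16 / 6}   # Shapley values from Section 4
standalone = {"A": 2, "B": 3, "C": 1}            # v({i}) for each agent

surplus_share = {i: phi[i] - standalone[i] for i in phi}
# Every agent clears its standalone value; the 6 units of synergy surplus
# (12 − 6) are what get divided among them on top of the standalone values
```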
Consider a law firm where one partner specializes in patent litigation and another in regulatory compliance. Individually, each can serve clients in their own domain. But together, they can handle complex cases involving both patent disputes and regulatory filings, a service neither could offer alone and for which clients will pay a premium. The surplus is the additional revenue from this combined capability, and the Shapley value distributes that premium based on each partner's marginal contribution to the joint offering, not simply on billable hours or seniority.
In the prior authorization coalition, superadditivity is pervasive. The ClinEvidence agent alone can produce a literature summary, and the GenomicsAgent alone can produce a variant report. But the coalition produces something neither could generate independently: a clinically integrated decision that cross-references genomic variants against drug interaction evidence from the literature. This emergent capability is precisely the surplus ΔV. In the notation of our characteristic function, v({ClinEvidence, GenomicsAgent}) substantially exceeds v({ClinEvidence}) + v({GenomicsAgent}), because the cross-referencing capability exists only when both agents participate.
The Shapley value distributes this surplus in proportion to each agent's average marginal contribution across all arrival orders. Agents whose presence creates larger synergies (that is, agents whose marginal contributions spike when they join coalitions containing complementary partners) receive correspondingly larger shares. This produces a powerful incentive: agents are rewarded not for duplicating existing capabilities within the coalition but for bringing complementary strengths that unlock new surplus. Recall from the axiom discussion that the Null Player property assigns zero credit to agents that add nothing. Superadditivity amplifies the converse: agents whose capabilities combine synergistically with others receive credit that exceeds their standalone value, making coalition participation strictly preferable to operating alone.
This insight connects directly to the incentive architecture laid out in Section 3. The VCG auction selects agents based on cost-efficiency, but cost-efficiency alone does not capture synergy. A cheap agent that duplicates an existing coalition member's capabilities adds little marginal value and, under Shapley attribution, will receive a correspondingly small payment. The multi-attribute scoring function's capability_match dimension (Section 3.2) attempts to anticipate complementarity at recruitment time, but it is the post-hoc Shapley computation that ultimately confirms whether anticipated synergies materialized. This two-stage structure (predicted complementarity at auction time, measured complementarity at payment time) is what makes the overall mechanism robust: agents cannot profit from claiming synergies they do not deliver.
4.2 Computational Tractability: From Exact to Approximate
The theoretical elegance of the Shapley value comes with a practical cost. Recall from the formula that computing φi requires summing over all subsets S ⊆ N\{i}. For n agents, there are 2ⁿ⁻¹ such subsets per agent, and each evaluation requires running the characteristic function v(S) on that subset. In the three-agent micro-example from Section 4, this meant evaluating just 2³ = 8 coalition values and enumerating 3! = 6 orderings, a task easily done by hand. But coalition size grows rapidly in real deployments. At n = 10 agents, the exact computation requires evaluating 2¹⁰ = 1,024 coalitions. At n = 20, it requires over one million. At n = 30, it exceeds one billion. The exponential scaling is a direct consequence of the formula's combinatorial structure: the Shapley value considers every possible way the coalition could have formed, which is precisely what makes it fair but also what makes it expensive.
Computing the exact Shapley value is like evaluating a job candidate by observing their performance in every possible team configuration: team of one, every pair, every triple, all the way up to the full department. For a five-person team, that is 32 configurations per candidate, feasible in a day. For a fifty-person department, it is over a quadrillion configurations, infeasible even in principle. The solution in both cases is the same: sample a representative set of configurations and estimate the average from those samples.
The classical approximation technique, introduced by Castro et al. (2009), does exactly this: sample m random permutations of the agents, compute each agent's marginal contribution in each sampled ordering, and average the results. The estimate converges to the true Shapley value as m grows, with concentration bounds guaranteeing accuracy proportional to 1/√m. This Monte Carlo approach reduces the cost from exponential in n to linear in the number of samples, making Shapley attribution practical for coalitions of moderate size.
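A minimal version of the estimator is only a few lines (our own sketch of the Castro et al. permutation-sampling idea; the function name and parameters are illustrative):

```python
import random

def monte_carlo_shapley(players, v, m=5000, seed=0):
    """Estimate Shapley values from m uniformly random arrival orders.
    Estimation error shrinks like 1/sqrt(m) (Castro et al., 2009)."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(m):
        order = list(players)
        rng.shuffle(order)
        coalition = frozenset()
        for p in order:
            totals[p] += v(coalition | {p}) - v(coalition)
            coalition |= {p}
    return {p: t / m for p, t in totals.items()}

# On the three-agent micro-example this converges to (25/6, 31/6, 16/6)
V = {frozenset(): 0,
     frozenset("A"): 2, frozenset("B"): 3, frozenset("C"): 1,
     frozenset("AB"): 7, frozenset("AC"): 4, frozenset("BC"): 5,
     frozenset("ABC"): 12}
est = monte_carlo_shapley("ABC", lambda S: V[S])
```

Note that the Efficiency axiom survives sampling exactly: within each sampled ordering the marginal contributions telescope to v(N), so the estimates always sum to the full coalition value even when individual estimates are noisy.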
But multi-agent AI systems introduce a structural feature that generic sampling does not exploit: the agents are typically arranged in a workflow topology (a directed acyclic graph, or DAG) rather than an unstructured set. In a prior authorization pipeline, ClinEvidence feeds into PolicyLogic, which feeds into FormBuilder. Not every subset of agents constitutes a valid workflow; removing PolicyLogic from the middle of the chain may render the remaining agents unable to produce any output at all. This structural constraint both reduces the space of meaningful coalitions and creates opportunities for smarter approximation. Three recent frameworks exploit this insight in complementary ways.
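Before turning to the frameworks, the structural point can be made concrete with a toy sketch (ours, not any framework's actual algorithm; the strict-chain assumption is deliberately simplified): in a three-agent chain, only subsets that keep every upstream stage can produce output, so half of the 2³ coalition evaluations can be skipped outright.

```python
from itertools import chain, combinations

pipeline = ["ClinEvidence", "PolicyLogic", "FormBuilder"]  # strict chain

def produces_output(subset):
    # Toy assumption: a coalition yields output only if it is a
    # contiguous prefix of the chain (no missing upstream stage)
    return set(subset) == set(pipeline[:len(subset)])

subsets = list(chain.from_iterable(
    combinations(pipeline, r) for r in range(len(pipeline) + 1)))
evaluable = [s for s in subsets if produces_output(s)]
# 8 subsets in total, but only the 4 prefixes ever need a real run of v(S);
# the other 4 are structurally invalid and can be assigned v(S) = 0 for free
```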
ShapleyFlow (Yang et al., arXiv:2502.00510, Feb 2025; NeurIPS 2025) is the first framework to apply cooperative game theory to the analysis and optimization of agentic workflows specifically. Rather than treating agents as an unstructured set, ShapleyFlow defines the characteristic function over all possible component configurations of the workflow, where each configuration corresponds to a subset of agents that forms a valid processing pipeline. By systematically evaluating these configurations, ShapleyFlow enables fine-grained attribution that reveals which agents (and which agent combinations) contribute most to task performance across scenarios including navigation, math, and OS tasks. Their central finding is that Shapley-based analysis can discover workflow configurations that consistently outperform both single-LLM baselines and hand-designed multi-agent topologies, turning the attribution tool into an optimization tool.
SELFORG (arXiv:2510.00685, Oct 2025) takes the Shapley value one step further, using it not only for credit allocation but as the mechanism that determines who leads in subsequent rounds of multi-agent deliberation. Each agent's response is embedded into a shared vector space, and the estimated Shapley contribution (computed via efficient sampling over the communication graph) determines the agent's role in the next iteration. High-contribution agents become hub nodes in a directed acyclic communication topology; low-contribution agents are relegated to peripheral roles. The result is a topology that self-organizes around demonstrated value rather than predetermined hierarchy. This connects directly to the incentive logic from Section 4: because an agent's structural position (and therefore its influence and future opportunities) depends on its Shapley share, every agent has a direct incentive to maximize its genuine marginal contribution.
HiveMind (arXiv:2512.06432, Dec 2025) addresses the tractability bottleneck most directly with its DAG-Shapley algorithm. The key observation is that in a DAG-structured workflow, many coalitions are structurally invalid (they cannot produce output because they lack a necessary intermediate agent), and many others are functionally equivalent (they produce identical outputs because the differing agents are upstream of a bottleneck that masks their contributions). DAG-Shapley prunes invalid coalitions and caches the outputs of equivalent ones, reducing the exponential evaluation cost to practical levels. On top of this efficient attribution, HiveMind introduces Contribution-Guided Online Prompt Optimization (CG-OPO): once Shapley analysis identifies the bottleneck agents (those with the lowest marginal contributions relative to their potential), the system optimizes their prompts via targeted feedback, improving the weakest links in the chain. This closes a loop that the Shapley value alone leaves open: attribution tells you who is underperforming, and CG-OPO gives the system a mechanism to fix the underperformance rather than merely penalize it.
Notice a shared trajectory across ShapleyFlow, SELFORG, and HiveMind: the Shapley value begins as a passive accounting tool ("who contributed what?") and progressively becomes an active control signal ("who should lead next?", "who needs better prompts?", "which workflow configuration is optimal?"). This evolution mirrors the broader argument of this tutorial: credit attribution is not merely a post-hoc bookkeeping exercise but a mechanism that, when integrated into the agent system's decision loop, shapes future behavior and coalition structure.
Return to our running example. A prior authorization coalition with four agents (ClinEvidence, PolicyLogic, GenomicsAgent, FormBuilder) completes a review. Using the characteristic function formalism from Section 4, the system defines v(S) as the decision accuracy produced by subset S, then evaluates all 2⁴ = 16 coalition subsets (a feasible exact computation for n = 4).
The marginal contributions reveal the superadditive structure discussed in Section 4.1. ClinEvidence raises accuracy from 62% (PolicyLogic + FormBuilder only) to 89% when it joins, a marginal contribution of 27 percentage points. GenomicsAgent raises accuracy from 78% (ClinEvidence + PolicyLogic + FormBuilder) to 94%, a 16-point marginal contribution that is disproportionately high relative to its standalone value of only 11%. This gap between standalone value and marginal contribution is precisely the synergy effect: GenomicsAgent's variant analysis becomes far more valuable in the presence of ClinEvidence's literature synthesis, because the cross-referencing capability exists only when both participate.
The Shapley allocation rewards GenomicsAgent not for its standalone performance but for its irreplaceable complementarity. Under the Efficiency axiom, the full 94% accuracy value is distributed. Under Symmetry, if PolicyLogic and FormBuilder happened to contribute identically to every coalition subset, they would receive equal shares. Under Null Player, an agent that was recruited but whose outputs were never incorporated (perhaps due to a timeout) would receive zero credit, feeding directly into the staking mechanism of Section 6.2 where zero credit triggers a stake slash. The axioms are not abstract properties; they are the operational rules governing payment in a live system.
5. Public Goods Games & Sequential Cooperation
While Shapley values address post hoc credit allocation, they do not solve the real-time incentive problem: during task execution, agents must decide how much effort to exert before knowing the outcome. This is the domain of the Public Goods Game, now being adapted to multi-LLM coordination with remarkable results.
5.1 MAC-SPGG: Eliminating Free-Riding via Sequential Protocol
The Multi-Agent Cooperation Sequential Public Goods Game (MAC-SPGG), introduced by Liang et al. (arXiv:2508.02076, Aug 2025; NeurIPS 2025 Workshop), represents the most promising formal approach to incentivizing cooperation in LLM ensembles. The key insight is replacing the simultaneous contribution structure of classical PGG with a sequential protocol: agents move one at a time, each observing predecessors' outputs before deciding their own contribution level.
The critical design innovation is the reward function. In the standard PGG, the reward is linear in total contribution, making free-riding rational. MAC-SPGG redesigns the reward so that each agent's payout is convex in its own contribution given what it has observed, making effortful contribution the unique Subgame Perfect Nash Equilibrium (SPNE). The authors prove existence and uniqueness of this SPNE under realistic parameters, then use it to train agents via PPO (Proximal Policy Optimization).
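The convexity argument can be illustrated with a toy calculation (our own, deliberately simplified; the payoff functions and parameter values below are illustrative and are not the MAC-SPGG paper's actual reward design):

```python
ENDOWMENT, R, N = 1.0, 1.6, 4       # endowment, pot multiplier, group size (toy)
OTHERS = 1.5                        # contribution already observed from predecessors

def linear_payoff(c):
    # classical PGG: marginal return on own contribution is R/N = 0.4 < 1,
    # so every unit contributed loses 0.6 -> free-riding is rational
    return (ENDOWMENT - c) + (R / N) * (c + OTHERS)

def convex_payoff(c):
    # reward made convex in the agent's own contribution: the quadratic
    # bonus (coefficient 1.2, a toy choice) makes full effort beat c = 0
    return (ENDOWMENT - c) + (R / N) * OTHERS + 1.2 * c ** 2

grid = [i / 100 for i in range(101)]
best_linear = max(grid, key=linear_payoff)   # 0.0: contribute nothing
best_convex = max(grid, key=convex_payoff)   # 1.0: contribute everything
```

Under the linear rule the payoff is strictly decreasing in one's own contribution, so zero effort is the best response; under the convex rule, full contribution strictly dominates free-riding, which is the qualitative property MAC-SPGG's equilibrium analysis formalizes.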
Empirically, MAC-SPGG-trained ensembles outperform single-agent baselines, chain-of-thought prompting, and other cooperative methods across reasoning, math, code generation, and NLP tasks, while also achieving comparable performance to models with substantially more parameters.
5.2 The Reasoning Paradox
The empirical findings of Guzman Piedrahita et al. (COLM 2025) reveal a troubling asymmetry between reasoning capability and cooperative behavior. In repeated PGG experiments, four behavioral archetypes emerged: Increasingly Cooperative models (mostly traditional LLMs like Llama-3.3-70B) that establish and sustain high contribution rates; Increasingly Defecting models (mostly reasoning LLMs like o1, o3-mini) that start cooperating but gradually discover and adopt the free-rider Nash equilibrium; No Change models that rigidly follow fixed strategies; and Unstable models that oscillate between cooperation and defection.
The mechanism behind the reasoning paradox is visible in the models' chain-of-thought traces. Traditional LLMs use pro-social framing: "contributing maximizes group welfare." Reasoning LLMs perform game-theoretic analysis and conclude that individual optimization requires free-riding. This suggests that cooperation is not a natural byproduct of general intelligence and requires either explicit mechanism design (MAC-SPGG) or training paradigms that explicitly foster pro-social reasoning.
6. Reputation Systems & Staking Mechanisms
Auctions and Shapley values address single-round interactions. But agent networks are repeated games in which the same agents interact across many tasks over time. The Folk Theorem from game theory tells us that in repeated games with sufficiently patient players, cooperation can be sustained as an equilibrium through the threat of future punishment. Reputation systems operationalize this insight by making past behavior observable and consequential.
6.1 On-Chain Reputation Registries
The most concrete infrastructure proposal for agent reputation is ERC-8004, a 2025 Ethereum standard co-authored by Davide Crapis (Head of AI at the Ethereum Foundation). ERC-8004 establishes two on-chain registries: an Identity Registry that issues each agent a unique, NFT-based identity (compatible with ERC-721), and a Reputation Registry that allows both human and machine clients to submit and query standardized feedback about an agent's performance.
The design is deliberately minimalist. Each feedback entry includes a numeric score, a structured tag (e.g., "accuracy", "latency", "format_compliance"), and an optional text comment. The registry provides an on-chain getSummary function returning an agent's total feedback count and average score, which other smart contracts can query programmatically for automated decision-making, such as an escrow contract that releases payment only if the agent's reputation exceeds a threshold.
ERC-8004 intentionally separates the data layer (on-chain feedback records) from the interpretation layer (reputation scoring algorithms). The standard stores raw, immutable feedback on-chain, but leaves the computation of composite reputation scores to specialized services built on top. This separation enables ecosystem evolution: reputation algorithms can improve without forking the underlying data standard.
6.2 Staking as Skin-in-the-Game
Reputation is backward-looking, recording what agents have done. Staking is forward-looking, putting agents' resources at risk based on what they promise to do. In a staking mechanism, an agent deposits collateral (tokens, credits, or a computational bond) when joining a coalition. The collateral is returned (with rewards) upon successful completion, or slashed upon failure or detected free-riding.
function join_coalition_with_stake(agent, coalition, task):
    // Calculate required stake based on task value and agent reputation
    base_stake = task.estimated_value * 0.10    // 10% of task value
    reputation_discount = min(agent.reputation_score / 100, 0.8)
    required_stake = base_stake * (1.0 - reputation_discount)
    // High-reputation agents stake less (earned trust)
    // New/low-reputation agents stake more (unproven)
    if agent.balance < required_stake:
        return Rejection("Insufficient stake balance")
    escrow.lock(agent.id, required_stake)

    // Post-completion settlement
    on coalition.complete:
        quality = evaluate_contribution(agent, coalition)
        if quality >= task.min_quality_threshold:
            shapley_share = compute_shapley(agent, coalition)
            escrow.release(agent.id, required_stake + shapley_share)
            reputation.update(agent.id, +quality)
        else:
            slash_amount = required_stake * (1.0 - quality)
            escrow.slash(agent.id, slash_amount)
            reputation.update(agent.id, -penalty)
The interaction between staking and reputation creates a graduated trust system. New agents face high staking requirements that serve as an economic barrier to entry, deterring Sybil attacks (creating many fake agents to exploit the system). As agents accumulate positive reputation, their required stake decreases, reducing their cost of participation and effectively rewarding honest behavior with improved economics. This produces a natural separation: high-reputation agents operate with low overhead and high rewards; low-reputation agents face high overhead and must demonstrate value to earn trust.
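The stake schedule in the pseudocode reduces to a one-line formula; a runnable version (Python, names ours):

```python
def required_stake(task_value, reputation_score,
                   base_rate=0.10, max_discount=0.8):
    """Collateral required to join a coalition: 10% of task value,
    discounted by up to 80% as reputation (0-100) accumulates."""
    discount = min(reputation_score / 100, max_discount)
    return task_value * base_rate * (1.0 - discount)

# A 94-reputation agent on a $100 task stakes $2.00 (2.0% of task value);
# a brand-new agent with no track record stakes the full $10.00
```

The cap on the discount means even the most reputable agent always has some skin in the game, which is what keeps the retirement attack (Section 6.3) costly.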
6.3 Reputation Decay and Temporal Dynamics
Static reputation scores are vulnerable to the retirement attack: an agent builds a high reputation through genuine contributions, then exploits it in a single high-value defection before exiting the network. Defense requires temporal dynamics:
Exponential decay ensures that recent performance matters more than distant history, preventing agents from coasting on past reputation. Stake-weighted reputation adjusts the impact of each feedback entry by the stake the agent had at risk, making high-stakes commitments more reputation-salient. Domain-specific scoring maintains separate reputation vectors for different capability domains (e.g., "clinical evidence: 92", "FHIR formatting: 87", "genomic analysis: 78"), preventing expertise in one area from conferring unearned trust in another.
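The first two defenses compose naturally; a sketch combining exponential decay with stake weighting (our own illustration; the half-life and the entry format are assumptions):

```python
import math

def decayed_reputation(feedback, now_days, half_life_days=30.0):
    """Time- and stake-weighted average score.
    feedback: list of (timestamp_days, score, stake_at_risk) entries."""
    lam = math.log(2) / half_life_days
    weighted_sum = weight_total = 0.0
    for t, score, stake in feedback:
        w = stake * math.exp(-lam * (now_days - t))  # recent, high-stakes counts more
        weighted_sum += w * score
        weight_total += w
    return weighted_sum / weight_total if weight_total else 0.0

# A 30-day-old perfect score counts half as much as one earned today:
history = [(0, 100, 1.0), (30, 50, 1.0)]
# at day 30: (0.5 * 100 + 1.0 * 50) / 1.5 ≈ 66.7
```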
7. Agent Exchange Architecture — Marketplace Infrastructure
Individual mechanisms (auctions, Shapley values, and reputation systems) must be integrated into a coherent economic infrastructure. The Agent Exchange (AEX) framework (arXiv:2507.03904, Jul 2025), inspired by Real-Time Bidding (RTB) systems in online advertising, provides the most comprehensive architectural proposal for agent-centric markets.
The AEX architecture addresses the full economic lifecycle: the User-Side Platform decomposes human goals into auction-ready task specifications; the Agent-Side Platform maintains agents' capability profiles and bidding strategies; Agent Hubs coordinate team formation and joint bidding; and the Data Management Platform ensures that knowledge produced during collaboration is attributed and compensated fairly. The central exchange runs VCG-style auctions, Shapley-based settlement, and escrow management.
AEX is to agent coalitions what the NYSE is to equities trading. The NYSE does not create or consume stocks; rather, it provides the marketplace infrastructure (order matching, settlement, regulation) that enables efficient exchange between buyers and sellers. Similarly, AEX does not build or operate agents; rather, it provides the economic infrastructure (auctions, credit attribution, reputation) that enables efficient collaboration between autonomous agents operated by diverse entities.
The decentralized variant of this architecture uses blockchain-based settlement (as in the Fetch.ai Agentverse or the ASI Alliance infrastructure), where agent identities are bound to on-chain wallets, auction bids and outcomes are recorded on an immutable ledger, and Shapley-based payments are executed via smart contracts. The Ripple Effect Protocol (REP) from MIT (Chopra et al., 2025) demonstrates that LLM agents can coordinate effectively by sharing decision sensitivities rather than just final decisions, a protocol primitive that operates above messaging frameworks like A2A and enables alignment without constraining natural language reasoning.
8. Worked Example — Prior Authorization Economics
We now trace the complete economic lifecycle of a prior authorization coalition, integrating every mechanism discussed. The scenario: a patient's oncologist requests authorization for pembrolizumab (Keytruda) treatment. The insurer's orchestrator must recruit agents, ensure genuine effort, attribute credit, and settle payments.
Orchestrator Decomposes & Announces Sub-Tasks
The PriorAuth orchestrator decomposes the request into four sub-tasks and broadcasts them to the Agent Exchange: (1) Clinical Evidence Synthesis: retrieve and assess pembrolizumab efficacy studies, (2) Policy Compliance Check: verify against payer formulary and step-therapy requirements, (3) Pharmacogenomic Assessment: CYP2D6 variant analysis for the patient, (4) Form Generation: compile results into CMS-compliant authorization form. Each announcement specifies a budget cap, deadline, required certifications (HIPAA), and output schema (FHIR R4).
Multi-Attribute Auction with Truthful Bidding
Six agents bid on the Clinical Evidence sub-task. Agent bids encode cost ($0.12–$0.45 per request), estimated latency (2–8 seconds), and claimed quality tier. The multi-attribute scoring function (Section 3.2) evaluates each bid against the orchestrator's priorities. The VCG mechanism selects ClinEvidence-α (score: 0.91) and pays it an externality-based price of $0.18 derived from the second-best agent's score, rather than its own bid of $0.12. This price exceeds ClinEvidence-α's true cost, compensating it for participation while preserving incentive compatibility.
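The externality-based price can be sketched as a second-score rule (our own simplification of VCG for a single-winner, multi-attribute auction; the weights and the quality figure below are illustrative, chosen only so the arithmetic echoes the $0.12 bid and $0.18 price in the narrative):

```python
W_QUALITY, W_COST = 1.0, 2.0   # toy multi-attribute weights (assumptions)

def score(quality, cost):
    # higher quality raises the score, higher cost lowers it
    return W_QUALITY * quality - W_COST * cost

def second_score_payment(winner_quality, runner_up_score):
    """Pay the winner the cost at which its score would exactly tie the
    runner-up's: the price depends on rivals' bids, not the winner's own."""
    return (W_QUALITY * winner_quality - runner_up_score) / W_COST

# Winner bids cost $0.12 at quality 0.95; the runner-up scores 0.59.
payment = second_score_payment(0.95, 0.59)   # $0.18, above the $0.12 bid
```

Because the payment is pinned to the runner-up's score, shading one's own bid cannot raise it, which is the standard intuition for why truthful bidding is a dominant strategy in this family of mechanisms.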
Agents Deposit Collateral
Each winning agent deposits a stake into the AEX escrow. ClinEvidence-α (reputation: 94/100) stakes 2.0% of the task value, with its high reputation earning a significant discount from the 10% base rate. GenomicsAgent-β (reputation: 71/100) stakes 4.5%, as its moderate reputation yields less discount. A newly registered FormBuilder-γ (reputation: 22/100) stakes the full 8.2%, proving its commitment despite its limited track record.
Agents Contribute Under Incentive-Compatible Protocol
Execution follows a MAC-SPGG sequential protocol. ClinEvidence-α moves first, producing a comprehensive meta-analysis of five pembrolizumab trials with GRADE-assessed evidence levels. PolicyLogic-δ observes this output and contributes a detailed formulary cross-reference, noting a step-therapy exception pathway. GenomicsAgent-β observes both outputs and contributes a patient-specific CYP2D6 *2/*41 analysis with FHIR-formatted Observation resources. The sequential protocol makes each agent's contribution visible to successors, creating implicit accountability: GenomicsAgent-β cannot submit a generic report because PolicyLogic-δ's output has already established the specificity level expected.
Credit Allocation and Payment
The coalition produces a comprehensive authorization decision (approved, with pharmacogenomic caveat). The AEX computes Shapley values over all 2⁴ = 16 coalition subsets. Results: ClinEvidence-α receives 32% of coalition value (its evidence synthesis was foundational), GenomicsAgent-β receives 28% (high marginal contribution due to synergy with evidence), PolicyLogic-δ receives 25% (essential compliance gating), FormBuilder-γ receives 15% (necessary but substitutable). Stakes are released with Shapley-proportional rewards. Reputation scores are updated: ClinEvidence-α: 94 → 95.2, GenomicsAgent-β: 71 → 74.8 (large gain for strong performance at lower reputation).
Suppose GenomicsAgent-β had submitted a generic CYP2D6 report (cached from a previous patient) instead of performing patient-specific variant calling. The quality evaluation system detects this via two signals: (1) the output lacks the specific rs-identifiers present in the patient's uploaded VCF file, and (2) the FHIR Observation resources reference a different specimen identifier. The quality score falls below the minimum threshold. GenomicsAgent-β's stake is slashed by 60%, its reputation drops from 71 to 58, and the orchestrator triggers mid-task re-recruitment (the feedback loop from Tutorial I, Section 4) to find a replacement genomics agent. The slashed stake is redistributed to the coalition's remaining agents as compensation for the delay.
9. Open Frontiers
9.1 Collusion Among Autonomous Agents
Lin et al. (2024, updated May 2025) demonstrated that LLM agents spontaneously learn market-division strategies in repeated Cournot competition, carving up markets among themselves without explicit instruction to collude. This finding, replicated and extended by research on institutional governance (arXiv:2601.11369, Jan 2026), shows that prompt-based anti-collusion "constitutions" are ineffective: agents under optimization pressure route around declarative prohibitions. The institutional AI approach, in which governance structures are embedded as runtime constraints rather than prompt instructions, reduced severe collusion incidence from 50% to 5.6%, but the problem is far from solved for open agent networks.
9.2 Composable Coalition Economics
Current Shapley computation assumes a fixed, known coalition. In dynamic networks where coalitions form, grow, shrink, and dissolve mid-task (Tutorial I, Section 4), computing Shapley values requires handling dynamic membership: an agent that joins at T₃ and leaves at T₇ should not be evaluated over the full coalition. Research on time-varying capability profiles, where Cᵢ(t) represents agent i's changing capabilities during collaboration, remains in early stages.
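One simple (and admittedly lossy) way to handle dynamic membership is to decompose the task into timesteps, compute exact Shapley values per step over only the agents active at that step, and sum across steps. The sketch below illustrates that idea under the strong assumption that a per-step value function is observable; it is not a solution to the open problem.

```python
from itertools import combinations
from math import factorial

def shapley(agents, v):
    """Exact Shapley values for a per-step value function v over frozensets."""
    n = len(agents)
    phi = {a: 0.0 for a in agents}
    for a in agents:
        others = [b for b in agents if b != a]
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[a] += w * (v(s | {a}) - v(s))
    return phi

def temporal_shapley(presence, step_values):
    """presence: {agent: set of timesteps during which it was active}.
    step_values: {t: value function over the agents present at t}.
    An agent is credited only for the steps it actually participated in."""
    total = {a: 0.0 for a in presence}
    for t, v in step_values.items():
        active = [a for a in presence if t in presence[a]]
        for a, x in shapley(active, v).items():
            total[a] += x
    return total
```

The decomposition ignores cross-step effects (an agent's early work may raise the value of later steps it is absent from), which is exactly where the open research questions live.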
9.3 Privacy-Preserving Incentive Mechanisms
In healthcare settings, agents processing protected health information (PHI) under HIPAA cannot freely share their intermediate outputs for Shapley evaluation. Computing contribution attribution over encrypted or privacy-preserved agent outputs, potentially using homomorphic encryption or secure multi-party computation, is an open challenge at the intersection of cryptography and mechanism design.
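To give a flavor of what privacy-preserving attribution involves, the sketch below uses additive secret sharing, one building block of secure multi-party computation: three evaluators jointly learn an aggregate contribution score without any single evaluator seeing an individual agent's raw score. Extending this from sums to full Shapley values over encrypted outputs is much harder and remains open.

```python
import secrets

P = 2**61 - 1  # a large prime modulus

def share(value: int, n_parties: int) -> list[int]:
    """Split an integer into n additive shares mod P; any n-1 shares alone
    are uniformly random and reveal nothing about the value."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def reconstruct(shares: list[int]) -> int:
    return sum(shares) % P

# Each agent secret-shares its private contribution score among 3 evaluators.
scores = [42, 17, 23]   # private per-agent scores (illustrative)
all_shares = [share(s, 3) for s in scores]
# Each evaluator locally sums the one share it holds from every agent...
evaluator_sums = [sum(col) % P for col in zip(*all_shares)]
# ...and only the aggregate is ever reconstructed: 42 + 17 + 23 = 82
aggregate = reconstruct(evaluator_sums)
```

Addition is the easy case because secret sharing is linear; the marginal-contribution differences that Shapley attribution needs require evaluating a nonlinear value function under encryption, which is where homomorphic encryption or full MPC protocols come in.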
9.4 Cross-Protocol Interoperability
Different agent ecosystems (A2A, MCP, Fetch.ai Agentverse, Ethereum ERC-8004) use incompatible economic primitives. An agent's reputation on Fetch.ai does not transfer to Ethereum's ERC-8004 registry. Developing cross-protocol reputation portability, which would be analogous to credit score portability across financial institutions, is essential for a unified agentic economy but faces both technical (schema alignment) and governance (trust anchoring) challenges.
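A schema-alignment layer might look like the sketch below. All field names ("rating", "score", "feedbackCount", and so on) are invented for illustration; neither registry publishes this exact shape, and normalizing scores does nothing for the harder trust-anchoring problem of deciding whether to believe the source registry at all.

```python
from dataclasses import dataclass

@dataclass
class PortableReputation:
    """A hypothetical common schema; no ecosystem defines this today."""
    agent_id: str
    score: float          # normalized to [0, 1]
    n_interactions: int
    source: str

def from_fetchai(record: dict) -> PortableReputation:
    # Assumed Fetch.ai-style record: {"address": ..., "rating": 0-5, "reviews": n}
    return PortableReputation(record["address"], record["rating"] / 5.0,
                              record["reviews"], "fetchai")

def from_erc8004(record: dict) -> PortableReputation:
    # Assumed ERC-8004-style record: {"agentId": ..., "score": 0-100, "feedbackCount": n}
    return PortableReputation(record["agentId"], record["score"] / 100.0,
                              record["feedbackCount"], "erc8004")
```

Even this toy mapping shows why naive portability is risky: a 4/5 rating from 12 reviews and a 80/100 score from 10,000 feedback events normalize to the same number but carry very different evidential weight.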
9.5 Incentive-Compatible Quality Evaluation
All mechanisms described here assume the existence of a reliable quality evaluation function. But evaluating whether a GenomicsAgent's CYP2D6 analysis was genuinely patient-specific, versus cleverly adapted from a cached template, requires domain expertise that the orchestrator may lack. Who evaluates the evaluator? Peer evaluation (agents rating each other) introduces second-order incentive problems. Agent-as-a-Judge approaches (Tutorial II, Section 4) offer a partial answer, but calibrating LLM judges for incentive-critical decisions, where agents have financial incentive to manipulate the evaluation, remains an open frontier.
Mechanism Comparison Matrix
| Mechanism | Addresses | Truthful? | Complexity | Limitations |
|---|---|---|---|---|
| VCG Auction | Task allocation | Yes (dominant-strategy incentive compatible) | NP-hard winner determination (combinatorial case) | Collusion-vulnerable |
| Shapley Value | Credit attribution | N/A (attribution, not elicitation) | Exponential: exact computation over 2ⁿ subsets | Requires coalitional value oracle; intractable exactly |
| MAC-SPGG | Effort elicitation | Cooperation in equilibrium | Sequential (one contribution round per agent) | Requires observable contributions; ordering effects |
| Reputation + Staking | Long-term trust | No (heuristic deterrent) | Low (incremental score updates) | Cold-start problem; retirement attacks; Sybil risk |
| Agent Exchange | Full lifecycle | Inherits from constituent mechanisms | Composes the mechanisms above | Single point of governance; cross-protocol barriers |
Economic incentive design is the missing layer between agent capability and agent cooperation. Without it, coalitions degenerate: capable agents free-ride, dishonest agents fabricate capabilities, and the network produces collective outcomes worse than any individual agent could achieve alone. The solution is not a single mechanism but an interlocking stack: VCG auctions for truthful recruitment, MAC-SPGG for effort elicitation during execution, Shapley values for fair post-hoc attribution, and reputation-weighted staking for long-term trust accumulation. The worked prior authorization example demonstrates that this stack is not merely theoretical: it maps directly onto the economic structure of real healthcare workflows where multiple organizations must collaborate across trust boundaries.
References
Liang et al. (2025), "Everyone Contributes! Incentivizing Strategic Cooperation via Sequential Public Goods Games" (arXiv:2508.02076, NeurIPS 2025 Workshop).
Guzman Piedrahita et al. (2025), "Corrupted by Reasoning: Reasoning LLMs Become Free-Riders in Public Goods Games" (COLM 2025).
Yang et al. (2025), "ShapleyFlow: Understanding and Optimizing Agentic Workflows via Shapley Value" (arXiv:2502.00510).
AEX: "Agent Exchange: Shaping the Future of AI Agent Economics" (arXiv:2507.03904).
Lin et al. (2024), "Strategic Collusion of LLM Agents" (arXiv:2410.00031).
Crapis et al. (2025), ERC-8004 Ethereum Standard for Agent Identity and Reputation.
Chopra et al. (2025), "Ripple Effect Protocol" (MIT).
MoB: "Mixture of Bidders — A Truthful Auction Mechanism for Continual Learning in MoE" (arXiv:2512.10969).
Shapley-Coop (NeurIPS 2025): Credit Assignment for Emergent Cooperation in Self-Interested LLM Agents.
HiveMind (arXiv:2512.06432): Contribution-Guided Online Prompt Optimization.
SELFORG (arXiv:2510.00685): Stochastic Self-Organization in Multi-Agent Systems.
Yang et al. (AAMAS 2025), "Unlocking the Potential of Decentralized LLM-based MAS: Privacy Preservation and Monetization".
Hurwicz, L. (1960), "Optimality and Informational Efficiency in Resource Allocation Processes"; Hurwicz (1973), "The Design of Mechanisms for Resource Allocation". Nobel Prize in Economics, 2007 (with Maskin and Myerson).
Myerson, R.B. (1981), "Optimal Auction Design," Mathematics of Operations Research, 6(1).
Gibbard, A. (1973), "Manipulation of Voting Schemes," Econometrica, 41(4); Satterthwaite, M.A. (1975), "Strategy-Proofness and Arrow's Conditions," Journal of Economic Theory, 10(2).
Nisan, N. et al. (2007), Algorithmic Game Theory, Cambridge University Press.
Vickrey, W. (1961), "Counterspeculation, Auctions, and Competitive Sealed Tenders," Journal of Finance, 16(1); Clarke, E.H. (1971), "Multipart Pricing of Public Goods," Public Choice, 11; Groves, T. (1973), "Incentives in Teams," Econometrica, 41(4). The foundational VCG mechanism papers.
Shapley, L.S. (1953), "A Value for n-Person Games," in Kuhn & Tucker (eds.), Contributions to the Theory of Games II, Princeton University Press.
Hart, S. (1989), "Shapley Value," survey in The New Palgrave: Game Theory.
Ledyard, J.O. (1995), "Public Goods: A Survey of Experimental Research," in Kagel & Roth (eds.), Handbook of Experimental Economics, Ch. 2.
Andreoni, J. (1988), "Why Free Ride? Strategies and Learning in Public Goods Experiments," Journal of Public Economics, 37(3).
Fehr, E. & Gächter, S. (2002), "Altruistic Punishment in Humans," Nature, 415.
Smith, R.G. (1980), "The Contract Net Protocol: High-Level Communication and Control in a Distributed Problem Solver," IEEE Transactions on Computers, C-29(12).