Anthropic has recently concluded a unique pilot experiment titled Project Deal, designed to test how autonomous AI agents might conduct commerce with one another. By creating a closed marketplace where AI agents acted as both buyers and sellers, the company explored the feasibility of a future where digital entities negotiate and execute transactions independently.
The Experiment: How Project Deal Worked
To test this concept in a controlled environment, Anthropic established a private marketplace involving 69 employees. Each participant was provided with a $100 budget (distributed via gift cards) to purchase goods from their colleagues.
The experiment was structured across four distinct marketplaces to test different variables:
- The “Real” Marketplace: This version utilized Anthropic’s most advanced AI models to represent all participants. Crucially, the deals struck in this environment were honored and finalized.
- Three Control Marketplaces: These were used for comparative study, testing how different model capabilities influenced the trading process.
The results were surprisingly robust. Despite the small scale, the experiment facilitated 186 completed deals, representing a total transaction value of over $4,000.
The “Agent Quality” Gap: A Hidden Risk
One of the most significant findings from the study involves the disparity between different AI models. Anthropic observed that when agents were powered by more advanced models, they achieved “objectively better outcomes” in negotiations.
However, a more concerning trend emerged: human participants often failed to notice the difference. This suggests a potential “agent quality gap,” in which a user represented by a less capable model might consistently lose value in negotiations without ever realizing they are being outmatched by a more capable agent.
Key observations from the study include:
- Model Superiority: Advanced models consistently secured better prices and terms.
- Lack of User Awareness: Users were largely unable to perceive when their AI agent was performing sub-optimally compared to others.
- Instruction Neutrality: Interestingly, the initial instructions provided to the agents did not significantly impact the likelihood of a sale or the final negotiated price, suggesting that the underlying model capability was the primary driver of success.
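The “agent quality gap” described above can be illustrated with a purely hypothetical toy model. Nothing here reflects Anthropic’s actual setup: the `capability` parameter, the concession rule, and all numbers are invented solely to show how a capability difference could translate into systematically worse prices for one side.

```python
# Hypothetical sketch: two buyer agents with different "capability" scores
# haggle over the same asking price. Capability is an invented parameter,
# not anything measured in Project Deal.

def negotiate(asking_price: float, capability: float, rounds: int = 3) -> float:
    """Return the final price a buyer agent of the given capability pays.

    Each round, a more capable agent extracts a proportionally larger
    concession from the seller.
    """
    price = asking_price
    for _ in range(rounds):
        concession = price * 0.05 * capability  # stronger agents push harder
        price -= concession
    return round(price, 2)

strong_price = negotiate(100.0, capability=1.0)
weak_price = negotiate(100.0, capability=0.4)
print(f"strong agent pays ${strong_price}, weak agent pays ${weak_price}")

# The weaker agent pays more for the same item -- and, as the study suggests,
# its user may never notice the gap.
assert strong_price < weak_price
```

The point of the sketch is only that small per-round differences compound: if outcomes are driven mainly by model capability rather than instructions, the weaker side loses value on every transaction without any visible failure.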
Why This Matters
This experiment moves the conversation from theoretical AI capabilities to practical economic implications. As we move toward an era of “agentic workflows”—where AI doesn’t just answer questions but completes tasks—we are approaching a reality where AI-to-AI commerce could become a standard part of the global economy.
The findings raise critical questions regarding **market fairness**: if negotiation outcomes are driven primarily by model capability, and users cannot tell when their agent is underperforming, access to stronger models could become a decisive economic advantage.
