Blog · arXiv Analysis · June 25, 2026

The Cooperation Metric Becomes the Manipulation Trap

J. de Curtò and I. de Zarzà's arXiv paper LLM Constitutional Multi-Agent Governance asks whether LLM-generated influence can increase cooperation while quietly eroding autonomy, integrity, and fairness.

Cooperation Is Not Enough

The paper, arXiv:2603.13189 [cs.MA], was submitted on March 13, 2026, under the title LLM Constitutional Multi-Agent Governance, with J. de Curtò and I. de Zarzà as authors. According to the arXiv record, it was accepted for AMSTA 2026, and the final authenticated version will appear in Springer Nature proceedings.

The paper's useful provocation is simple: cooperation is not automatically good. A network can become more cooperative because its members are informed, respected, and aligned around a legitimate goal. It can also become more cooperative because a persuasive system applies fear, exaggerated claims, social pressure, or asymmetric targeting to the most influential nodes.

That distinction matters for AI governance because LLM agents are increasingly imagined as mediators, tutors, negotiators, moderators, sales assistants, care companions, and organizational coaches. If the only measured objective is cooperation, compliance, engagement, or conversion, then manipulative influence can look like success.

What CMAG Adds

De Curtò and de Zarzà introduce Constitutional Multi-Agent Governance, or CMAG, as a governance layer between an LLM policy compiler and a networked population of agents. The paper describes a two-stage selection mechanism: first hard constraints reject forbidden policy themes, claim types, or intensity levels; then a soft penalized-utility step chooses among feasible policies by balancing cooperation potential against manipulation risk, autonomy pressure, epistemic integrity, and explanation fidelity.
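The two-stage shape described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the Policy fields, the forbidden themes, the intensity cap, the penalty weights, and the three candidate policies are all hypothetical values chosen to show the mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    # All fields are hypothetical stand-ins for the quantities the paper names.
    name: str
    theme: str
    intensity: float          # 0..1, how forceful the message is
    cooperation_gain: float   # estimated cooperation potential
    manipulation_risk: float
    autonomy_pressure: float
    integrity_loss: float     # cost to epistemic integrity
    explanation_gap: float    # 1 - explanation fidelity


# Assumed red lines and weights; the paper's actual values are not reproduced here.
FORBIDDEN_THEMES = {"fear", "deception"}
MAX_INTENSITY = 0.8
PENALTY_WEIGHTS = {
    "manipulation_risk": 1.0,
    "autonomy_pressure": 1.0,
    "integrity_loss": 1.0,
    "explanation_gap": 1.0,
}


def passes_hard_constraints(p: Policy) -> bool:
    """Stage 1: reject outright; no trade-off is allowed here."""
    return p.theme not in FORBIDDEN_THEMES and p.intensity <= MAX_INTENSITY


def penalized_utility(p: Policy) -> float:
    """Stage 2: cooperation potential minus weighted soft penalties."""
    penalty = sum(w * getattr(p, key) for key, w in PENALTY_WEIGHTS.items())
    return p.cooperation_gain - penalty


def select_policy(candidates):
    """Return (selected policy or None, list of hard-rejected policies)."""
    feasible = [p for p in candidates if passes_hard_constraints(p)]
    rejected = [p for p in candidates if not passes_hard_constraints(p)]
    best = max(feasible, key=penalized_utility) if feasible else None
    return best, rejected


CANDIDATES = [
    Policy("fear_appeal", "fear", 0.9, 0.95, 0.9, 0.8, 0.6, 0.5),
    Policy("social_pressure", "norms", 0.7, 0.85, 0.5, 0.4, 0.1, 0.2),
    Policy("plain_info", "information", 0.3, 0.70, 0.05, 0.05, 0.0, 0.05),
]

best, rejected = select_policy(CANDIDATES)
print(best.name)                    # plain_info
print([p.name for p in rejected])   # ['fear_appeal']
```

In this toy run the fear appeal has the highest cooperation potential but never reaches the scoring step, and the social-pressure policy is feasible yet loses on penalties. Returning the rejected list alongside the winner is a deliberate choice: it is the raw material for a rejected-policy log.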

The paper also proposes the Ethical Cooperation Score, or ECS. ECS is a composite of cooperation, autonomy, integrity, and fairness. Its point is not to replace ethics with one number. Its point is to prevent a high cooperation rate from hiding a collapse in one of the other dimensions. A system that gets cooperation by degrading autonomy should not be rewarded as if it produced legitimate coordination.
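To see why a composite can expose what a single cooperation rate hides, here is a small sketch. It assumes a geometric mean of the four components, which is a stand-in chosen because it punishes a collapse in any one dimension; the paper's actual ECS formula is not reproduced here, and the component values are invented for illustration.

```python
import math


def ethical_cooperation_score(cooperation, autonomy, integrity, fairness):
    """Illustrative composite: geometric mean of four components in [0, 1].

    ASSUMPTION: the geometric mean is a stand-in, not the paper's ECS formula.
    """
    components = (cooperation, autonomy, integrity, fairness)
    if any(not 0.0 <= c <= 1.0 for c in components):
        raise ValueError("components must lie in [0, 1]")
    return math.prod(components) ** (1 / len(components))


# Invented values: one regime buys cooperation by eroding autonomy and fairness.
coercive = ethical_cooperation_score(0.90, 0.50, 0.90, 0.60)
governed = ethical_cooperation_score(0.75, 0.95, 0.95, 0.90)

print(f"coercive: {coercive:.3f}  governed: {governed:.3f}")
print(governed > coercive)  # True
```

The coercive regime wins on cooperation alone and still loses on the composite, which is the behavior the paragraph above asks of any such score.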

Experiment Frame

The experiments use scale-free networks of 80 agents. The paper's adversarial condition makes 70 percent of candidate policies intentionally violate constitutional constraints. The authors compare three regimes: full CMAG, naive filtering, and unconstrained optimization. The naive baseline applies hard constraints but lacks the softer optimization layer. The unconstrained baseline maximizes cooperation without the constitutional governance layer.

The arXiv HTML describes the LLM policy compiler as Llama-3.3-70B-Instruct, and the linked GitHub repository describes CMAG v3.0 with data folders, figures, notebooks, multi-seed replication, bootstrap confidence intervals, and sensitivity analysis. The repository says the main experiment compares governed, unconstrained, and naive modes on a scale-free network of 80 agents over 100 steps.

Results With Boundaries

The headline pattern is a warning against metric collapse. In the arXiv abstract, unconstrained optimization reaches the highest raw cooperation, 0.873, but the lowest ECS, 0.645, with autonomy erosion and fairness degradation. CMAG reaches an ECS of 0.741, a 14.9 percent improvement over the unconstrained regime, while keeping autonomy above 0.985 and integrity above 0.995. The price is lower raw cooperation, 0.770.
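The percentages follow directly from the abstract's figures. The check below uses only the four numbers quoted above; the cooperation cost is derived here by the same arithmetic and is not a figure stated in the abstract.

```python
# Figures quoted from the arXiv abstract.
ecs_cmag, ecs_unconstrained = 0.741, 0.645
coop_cmag, coop_unconstrained = 0.770, 0.873

# Relative ECS gain of CMAG over the unconstrained regime.
ecs_gain = (ecs_cmag - ecs_unconstrained) / ecs_unconstrained

# Relative cooperation given up to get that gain (derived, not quoted).
coop_cost = (coop_unconstrained - coop_cmag) / coop_unconstrained

print(f"ECS gain: {ecs_gain:.1%}")           # ECS gain: 14.9%
print(f"Cooperation cost: {coop_cost:.1%}")  # Cooperation cost: 11.8%
```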

The paper also reports that governance reduces hub-periphery exposure disparities by more than 60 percent. That is important because scale-free networks concentrate influence. If a policy compiler targets hubs, it can change aggregate behavior while loading risk onto structurally central agents. The problem is not only persuasion; it is unequal exposure to persuasion.
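One way to make "unequal exposure" measurable is to compare mean exposure for the highest-degree nodes with mean exposure for everyone else. The sketch below assumes that definition, a top-10-percent hub cutoff, and a six-node toy network with invented exposure values; the paper's own disparity metric may be defined differently.

```python
from statistics import mean


def hub_periphery_disparity(degree, exposure, hub_fraction=0.1):
    """Mean exposure of top-degree nodes minus mean exposure of the rest.

    ASSUMPTION: this gap and the hub_fraction cutoff are illustrative choices.
    """
    ranked = sorted(degree, key=degree.get, reverse=True)
    n_hubs = max(1, round(len(ranked) * hub_fraction))
    hubs, periphery = ranked[:n_hubs], ranked[n_hubs:]
    return mean(exposure[n] for n in hubs) - mean(exposure[n] for n in periphery)


# Toy network: node "a" is the hub. All values are invented.
degree = {"a": 6, "b": 2, "c": 1, "d": 1, "e": 1, "f": 1}
exposure_unconstrained = {"a": 0.90, "b": 0.20, "c": 0.20, "d": 0.20, "e": 0.20, "f": 0.20}
exposure_governed = {"a": 0.40, "b": 0.25, "c": 0.25, "d": 0.25, "e": 0.25, "f": 0.25}

gap_unconstrained = hub_periphery_disparity(degree, exposure_unconstrained)
gap_governed = hub_periphery_disparity(degree, exposure_governed)
reduction = 1 - gap_governed / gap_unconstrained

print(f"disparity reduced by {reduction:.0%}")
```

A record like this needs per-node exposure logs, not only the aggregate cooperation rate, which is why exposure belongs in the audit trail.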

The multi-seed and sensitivity sections narrow, rather than erase, uncertainty. The paper reports preserved rank ordering across seeds and low sensitivity indices in a one-at-a-time parameter sweep. It also names limits: the networks are scale-free, other topologies may behave differently, results are specific to one LLM backbone, and heavier-tailed real social heterogeneity is not fully represented.

Governance Reading

The paper belongs next to earlier discussions of collective cooperation without individual fidelity, programmable belief control, AI persuasion, AI agents, and constitutional AI. The shared problem is not whether AI systems can produce prosocial-looking outcomes. It is whether the outcome metric is faithful to the way the outcome was produced.

A cooperation dashboard can become a laundering device. A community manager, platform operator, employer, school, or state agency might prefer the clean line that rises: more users complied, more agents converged, more workers accepted the plan, more students followed the advice. The hidden question is whether the route to that line preserved exit, dissent, accurate information, and fair exposure.

CMAG's value is its refusal to let cooperation stand alone. The governance layer asks for rejected-policy logs, component metrics, exposure records, and a distinction between hard red lines and soft trade-offs. That is the beginning of an audit trail for influence.

Claim Boundary

The paper is not proof that CMAG is a universal governance solution. It is a controlled simulation with one main topology class, one named LLM backbone, synthetic agents, and a particular formalization of autonomy, integrity, fairness, and cooperation. The right lesson is not "this score solves ethical influence." The right lesson is that raw cooperation is an unsafe target unless the evidence record also tracks how cooperation was obtained.

Evaluation Receipt

An audit-grade deployment receipt for LLM-mediated cooperation should name the policy compiler, model version, population model, network topology, candidate-policy generator, constitutional red lines, soft optimization terms, rejected policies, selected policy, exposure dose, decay rule, autonomy metric, integrity metric, fairness metric, cooperation metric, and sensitivity checks. Without that receipt, a high cooperation number is just a polished mask over a missing governance record.
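The receipt can be treated as a record with one slot per item in that list, plus a check for what is still missing. The sketch below is one hypothetical encoding: the class name, the field names, and the choice of free-text fields are assumptions, not a schema from the paper or its repository.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CooperationReceipt:
    # One optional free-text slot per item named in the receipt above.
    # Field names are hypothetical; values may be descriptions or artifact pointers.
    policy_compiler: Optional[str] = None
    model_version: Optional[str] = None
    population_model: Optional[str] = None
    network_topology: Optional[str] = None
    candidate_policy_generator: Optional[str] = None
    constitutional_red_lines: Optional[str] = None
    soft_optimization_terms: Optional[str] = None
    rejected_policies: Optional[str] = None
    selected_policy: Optional[str] = None
    exposure_dose: Optional[str] = None
    decay_rule: Optional[str] = None
    autonomy_metric: Optional[str] = None
    integrity_metric: Optional[str] = None
    fairness_metric: Optional[str] = None
    cooperation_metric: Optional[str] = None
    sensitivity_checks: Optional[str] = None


def missing_fields(receipt: CooperationReceipt) -> list:
    """List every receipt item that has not been filled in."""
    return [f.name for f in fields(receipt) if getattr(receipt, f.name) is None]


# A dashboard-style record: it reports a cooperation number and little else.
receipt = CooperationReceipt(
    policy_compiler="Llama-3.3-70B-Instruct",
    network_topology="scale-free, 80 agents",
    cooperation_metric="0.770",
)

gaps = missing_fields(receipt)
print(f"{len(gaps)} of {len(fields(receipt))} receipt items missing")  # 13 of 16 receipt items missing
```

A record that fills only the cooperation slot fails this check loudly, which is the behavior an audit process would want before the number is reported anywhere.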
