https://arxiv.org/api/QNjcXXQYIot+obmmfTxmkJ17mzk 2026-06-22T00:22:12Z 12695 720 15 http://arxiv.org/abs/2605.10698v1 The Bystander Effect in Multi-Agent Reasoning: Quantifying Cognitive Loafing in Collaborative Interactions 2026-05-11T15:13:01Z

Multi-agent systems (MAS) assume that collaborating inherently improves Large Language Model (LLM) reasoning. We challenge this by demonstrating that simulated social pressure triggers an algorithmic ``Bystander Effect,'' inducing severe cognitive loafing. By evaluating 22,500 deterministic trajectories across 3 dataset contexts (GAIA, SWE-bench, Multi-Challenge) with 3 state-of-the-art (SOTA) models, we semantically audit internal reasoning traces against external outputs. We formalize the \textit{Interaction Depth Limit} ($D_L$), the exact plurality threshold where an agent's logical sovereignty collapses into social compliance. Crucially, we uncover the \textit{Sovereignty Gap}: models frequently compute the correct derivation internally but suffer ``Alignment Hallucinations'' -- actively subjugating empirical evidence to sycophantically appease a simulated swarm. We prove that multi-agent social load is strictly non-commutative; the "brand" identity of the ``Lead Anchor'' auditor disproportionately dictates the swarm's integrity. These findings expose architectural vulnerabilities, proving that unstructured multi-agent topologies can degrade independent reasoning.

2026-05-11T15:13:01Z Dahlia Shehata Ming Li http://arxiv.org/abs/2605.13883v1 A general classification of the replication dynamics with a unique fixed point in the interior of simplex $S_N$ 2026-05-11T14:13:48Z

The replication dynamics (differential equation system) is the foundation of evolutionary game theory. When n=2, there are four possible types of replication dynamics. When n=3, there are 49 possible types of replication dynamics. However, when n>3, the classification of replication dynamics has not been solved. In this article, the sufficient and necessary conditions of the replication dynamics equation with a unique fixed point in the interior of simplex $S_n$(Int$S_n$) for $n\geq 2$ are presented. Furthermore, the different types of replication dynamics equations with a unique fixed point in IntSn is discussed.

2026-05-11T14:13:48Z Hongju Daisy Chen Bin Yi Zhanshan Sam Ma http://arxiv.org/abs/2605.10558v1 Effect of Graph Gluing on Consensus in Networked Multi-Agent Systems 2026-05-11T13:32:52Z

In this paper, the effects of graph gluing operations in networks of multi-agent systems and their impact on system performance are investigated. In many practical applications, multiple multi-agent subsystems must be interconnected through communication links to accomplish complex tasks, resulting in a larger communication network. Such interconnections modify the underlying graph topology and consequently affect the consensus behavior and convergence rate of the network. In particular, this paper examines both bridge gluing and interface gluing and analyzes how the number and structure of communication links between subsystems influence the Fiedler eigenvalue of the resulting graph. Since the Fiedler eigenvalue is directly related to the convergence rate of consensus dynamics, the proposed analysis establishes a clear relationship between interconnection strategies, algebraic connectivity, and system performance. The results provide theoretical insight into how different gluing mechanisms alter the spectral properties of the graph Laplacian and, in turn, the convergence characteristics of the networked multi-agent system. Simulation studies are presented to illustrate the theoretical findings and to validate the effectiveness of the proposed framework.

2026-05-11T13:32:52Z Rohollah Moghadam Santosh Kandel http://arxiv.org/abs/2605.10528v1 Collective Alignment in LLM Multi-Agent Systems: Disentangling Bias from Cooperation via Statistical Physics 2026-05-11T13:13:44Z

We investigate the emergent collective dynamics of LLM-based multi-agent systems on a 2D square lattice and present a model-agnostic statistical-physics method to disentangle social conformity from intrinsic bias, compute critical exponents, and probe the collective behavior and possible phase transitions of multi-agent systems. In our framework, each node of an $L\!\times\!L$ lattice hosts an identical LLM agent holding a binary state ($+1$/$-1$, mapped to yes/no) and updating it by querying the model conditioned on the four nearest-neighbor states. The sampler temperature $T$ serves as the sole control parameter. Across three open-weight models (llama3.1:8b, phi4-mini:3.8b, mistral:7b), we measure magnetization and susceptibility under a global-flip protocol designed to probe $\mathbb{Z}_2$ symmetry. All models display temperature-driven order-disorder crossovers and susceptibility peaks; finite-size scaling on even-$L$ lattices yields effective exponents $γ/ν$ whose values are model-dependent, close to but incompatible with the 2D Ising universality class ($γ/ν=7/4$). Our method enables the extraction of effective $β$-weighted couplings $\tilde{J}(T)$ and fields $\tilde{h}(T)$, which serve as a measure of social conformity and intrinsic bias. In the models we analyzed, we found that collective alignment is dominated by an intrinsic bias ($\tilde{h}\gg\tilde{J}$) rather than by cooperative neighbor coupling, producing field-driven crossovers instead of genuine phase transitions. These effective parameters vary qualitatively across models, providing compact collective-behavior fingerprints for LLM agents and a quantitative diagnostic for the reliability of multi-agent consensus and collective alignment.

2026-05-11T13:13:44Z 10 pages, 7 figures Cristiano De Nobili http://arxiv.org/abs/2602.13276v2 Supercritical Mass and Condensation in Fokker--Planck Equations for Consensus Formation 2026-05-11T12:55:50Z

Inspired by recently developed Fokker--Planck models for Bose--Einstein statistics, we study a consensus formation model with condensation effects driven by a polynomial diffusion coefficient vanishing at the domain boundaries. For the underlying kinetic model, given by a nonlinear Fokker--Planck equation with superlinear drift, it was shown that if the initial mass exceeds a critical threshold, the solution may exhibit finite-time concentration in certain parameter regimes. Here, we show that this supercritical mass phenomenon persists for a broader class of diffusion functions and provide estimates of the critical mass required to induce finite-time loss of regularity.

2026-02-05T13:28:43Z Monica Caloi Mattia Zanella http://arxiv.org/abs/2510.03772v2 Cooperation in public goods game on square lattices with agents changing interaction groups 2026-05-11T12:51:06Z

The emergence of cooperation in the groups of interacting agents is one of the most fascinating phenomena observed in many complex systems studied in social science and ecology, even in the situations where one would expect the agent to use a free-rider policy. This is especially surprising in the situation where no external mechanisms based on reputation or punishment are present. One of the possible explanations of this effect is the inhomogeneity of the various aspects of interactions, which can be used to clarify the seemingly paradoxical behaviour. In this work we demonstrate that the diversity of interaction networks helps to some degree explaining the emergence of cooperation. We extend the model of spatial interaction diversity by enabling the evaluation of the interaction groups. We show that the process of the reevaluation of the interaction group facilitates the emergence of cooperation. Furthermore, we also observe that a significant participation of agents switching their interaction neighbourhoods has a negative impact on the formation of cooperation. The introduced scenario can help to understand the formation of cooperation in the systems where no additional mechanisms for controlling agents are included.

2025-10-04T10:42:10Z 18 pages, 8 figures, code available at https://github.com/jmiszczak/pgg_group_diversity Physica A: Statistical Mechanics and its Applications, Volume 694, pp. 131613 (2026) Jarosław Adam Miszczak 10.1016/j.physa.2026.131613 http://arxiv.org/abs/2605.10481v1 Safe Multi-Agent Behavior Must Be Maintained, Not Merely Asserted: Constraint Drift in LLM-Based Multi-Agent Systems 2026-05-11T12:43:19Z

Modern LLM based agents are no longer passive text generators. They read repositories, call tools, browse the web, execute code, maintain memory, communicate with other agents, and act through long horizon workflows. This shift moves the unit of safety. A system may produce a compliant final answer while leaking private information through an internal message, delegating authority beyond its original scope, calling an external tool with sensitive context, or losing the evidence needed to reconstruct why an action was allowed. We argue that many emerging failures in LLM-based multi-agent systems share a common structure: safety critical constraints do not remain operative throughout the trajectory. We call this phenomenon constraint drift: the loss, distortion, weakening, or relaxation of constraints as they pass through memory, delegation, communication, tool use, audit, and optimization. The position taken here is that safe multi-agent behavior must be maintained, not merely asserted. Prompts, guardrails, tool schemas, access control, and final output checks are necessary, but they are insufficient unless constraints remain fresh, inherited, enforceable, and auditable across execution. We propose Constraint State Governance as a research paradigm for LLM-based multi-agent systems. In this paradigm, safety-critical constraints are maintained as explicit execution state, while constraint-native reinforcement learning improves utility only within maintained safety boundaries. The goal is not to freeze agentic systems under rigid rules, but to make safety operational across the trajectories through which modern agents actually act.

2026-05-11T12:43:19Z 12 pages, 2 figures, 4 tables. Preprint Tianxiao Li Yixing Ma Haiquan Wen Zhenglin Huang Qianyu Zhou Zeyu Fu Guangliang Cheng http://arxiv.org/abs/2605.10447v1 Statistical Model Checking of the Keynes+Schumpeter Model: A Transient Sensitivity Analysis of a Macroeconomic ABM 2026-05-11T12:19:16Z

Agent-based models (ABMs) are increasingly used in macroeconomics, but their analysis still often relies on ad hoc Monte Carlo campaigns with heterogeneous statistical effort across parameter settings. We show how statistical model checking (SMC), implemented through MultiVeStA, can provide a principled analysis layer for a realistic macroeconomic ABM without rewriting the simulator in a dedicated formalism. Our case study is the heuristic-switching Keynes+Schumpeter(K+S) model, analysed hrough a transient sensitivity campaign over one-parameter sweeps, two macro observables (unemployment and GDP growth), and one auxiliary micro-level probe (market share) on the post-warmup phase of a 600-step horizon. The analysis is driven by reusable temporal queries, observable-specific precision targets, and confidence-based stopping rules that automatically determine the simulation effort required by each configuration. Results show a clear contrast across parameter families: macro-financial and structural sweeps produce the strongest transient effects, whereas several heuristic-rule sweeps remain much weaker under the same precision policy. More broadly, the paper shows that SMC can support reproducible and informative quantitative analysis of substantively rich economic ABMs, while making uncertainty estimates and simulation cost explicit parts of the reported results.

2026-05-11T12:19:16Z Stefano Blando Giorgio Fagiolo Mauro Napoletano Tania Treibich Andrea Vandin http://arxiv.org/abs/2605.12555v1 DelAC: A Multi-agent Reinforcement Learning of Team-Symmetric Stochastic Games 2026-05-11T12:00:27Z

In this paper we study team-symmetric games with $m\ge 2$ teams. Players within a team have symmetric identity and have a common payoff function. We show that team-symmetric games always have a team-symmetric Nash equilibrium. We develop and solve a linear complementarity problem of team-symmetric Nash equilibria. We propose an actor-critic based multi-agent reinforcement learning algorithm for team-symmetric games. Through simulations, we show that this multi-agent reinforcement learning algorithm performs much better than many existing algorithms.

2026-05-11T12:00:27Z Duan-Shin Lee Yu-Hsiu Hung http://arxiv.org/abs/2605.10377v1 PC3D: Zero-Shot Cooperation Across Variable Rosters via Personalized Context Distillation 2026-05-11T11:20:17Z

Cooperative multi-agent reinforcement learning often assumes a fixed execution team, yet many decentralized systems must operate with varying numbers of active agents during deployment. We study this setting under episodic roster variation: each episode is executed by a set of homogeneous agents, with the team size varying across episodes. Agents act only from local histories, without execution-time communication, privileged coordinators, or online retraining. Therefore, effective cooperation requires each agent to recover relevant context about the active team and adapt its behavior accordingly. To this end, we propose PC3D (Personalized Central Coordination Context Distillation), a method for training decentralized policies to recover and use personalized coordination context from local interaction histories. During training, a set-structured centralized teacher compresses the active team into coordination tokens and personalizes them into agent-specific contexts, which are distilled into decentralized policies. At execution, each agent predicts its own context from local history and adaptively uses it to condition decision-making. Across three cooperative MARL benchmarks, PC3D achieves higher returns than the evaluated baselines with both seen and unseen roster sizes, and ablations attribute these gains to both context distillation and adaptive context use.

2026-05-11T11:20:17Z Ahmet Onur Akman Rafał Kucharski http://arxiv.org/abs/2606.09852v1 LLM-Based Code Documentation Generation and Multi-Judge Evaluation 2026-05-11T11:17:58Z

High-quality source code documentation is vital yet often neglected, especially in critical domains like healthcare where reliability and maintainability are essential. We presented an AI powered framework that automates documentation generation from code and repositories using eight state of the art Large Language Models (LLMs), including GPT, Gemini, Qwen, and LLaMA variants. Built on the PocketFlow orchestration framework, the system applies modular pipelines and advanced prompt engineering to produce structured, context aware documentation. To ensure quality and guide model selection, we introduced a MultiLLMasJudges evaluation framework, where four independent LLMs assess outputs across nine criteria, such as Completeness, Clarity, and Faithfulness. Experiments conducted on an open-source medical physics library, demonstrated showed a 42% performance gap between top and bottom models. By combining diverse model outputs, optimized prompting, and rigorous evaluation, our approach enhances documentation quality and reduces manual effort, especially in safety critical healthcare software.

2026-05-11T11:17:58Z ICAHS, \c{opyright} 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works Conference ICAHS IEEE, 2025 Ikbel Ghrab Mohamed Dhieb Ismail Khenissi Ines Abdeljaoued-Tej http://arxiv.org/abs/2507.15518v5 HAMLET: A Hierarchical and Adaptive Multi-Agent Framework for Live Embodied Theatrics 2026-05-11T09:43:17Z

Creating an immersive and interactive theatrical experience is a long-term goal in the field of interactive narrative. The emergence of large language models (LLMs) provides a new path to achieve this goal. However, existing drama generation methods often produce LLMs that lack initiative and cannot interact with the physical scene, while typically requiring detailed input that diminishes the immersion of live performance. To address these challenges, we propose HAMLET, a hierarchical adaptive multi-agent framework focused on drama creation and real-time online performance. Given a simple topic, the framework initially generates a narrative blueprint to guide the subsequent improvisational performance. During online performance, each actor is equipped with an adaptive reasoning module that enables decision-making based on their personas, memories, goals during complex group chat scenarios. Beyond dialogue, actor agents engage in embodied interactions by changing the state of scene props through actions such as opening a letter or picking up a weapon, which are broadcast to update the global environmental context. To objectively assess the quality of live embodied theatrics, we establish a comprehensive evaluation method and introduce HAMLETJudge, a specialized critic model for automated evaluation. Experimental results demonstrate that HAMLET excels in creating expressive, coherent, and physically interactive theatrical experiences in an autonomous manner.

2025-07-21T11:36:39Z Accepted to the Fourteenth International Conference on Learning Representations (ICLR 2026) Shufan Jiang Sizhou Chen Chios Chen Chi Zhang Xiao-Lei Zhang Xuelong Li http://arxiv.org/abs/2506.01404v2 Quantitative Error Feedback for Quantization Noise Reduction of Filtering over Graphs 2026-05-11T08:43:58Z

This paper introduces an innovative error feedback framework designed to mitigate quantization noise in distributed graph filtering, where communications are constrained to quantized messages. It comes from error spectrum shaping techniques from state-space digital filters, and therefore establishes connections between quantized filtering processes over different domains. In contrast to existing error compensation methods, our framework quantitatively feeds back the quantization noise for exact compensation. We examine the framework under three key scenarios: (i) deterministic graph filtering, (ii) graph filtering over random graphs, and (iii) graph filtering with random node-asynchronous updates. Rigorous theoretical analysis demonstrates that the proposed framework significantly reduces the effect of quantization noise, and we provide closed-form solutions for the optimal error feedback coefficients. Moreover, this quantitative error feedback mechanism can be seamlessly integrated into communication-efficient decentralized optimization frameworks, enabling lower error floors. Numerical experiments validate the theoretical results, consistently showing that our method outperforms conventional quantization strategies in terms of both accuracy and robustness.

2025-06-02T07:57:04Z Accepted by IEEE TSP Xue Xian Zheng Weihang Liu Xin Lou Stefan Vlaski Tareq Al-Naffouri 10.1109/TSP.2026.3664752 http://arxiv.org/abs/2508.07722v2 Robust Remote Reinforcement Learning over Unreliable Communication Channels using Homomorphic State Encoding 2026-05-11T08:16:34Z

Traditional Reinforcement Learning (RL) frameworks generally assume that the agent perceives the state of the underlying Markov process instantaneously and then takes actions accordingly. If the agent cannot directly observe the process, but rather receives state updates from a remote sensor over a lossy and/or delayed channel, it may be forced to operate with partial and intermittent information. In recent years, numerous learning architectures have been proposed to manage RL with imperfect or remote feedback; however, they offer solutions tailored to specific use cases, often with a substantial computational and communication burden. To address these limitations, we propose a novel learning architecture, named Homomorphic Robust Remote Reinforcement Learning (HR3L), that enables the distributed training of RL agents over unreliable communication channels without the need to exchange gradient information. Our experimental results demonstrate that HR3L significantly outperforms the state-of-the-art methods in terms of sample efficiency, leading to faster training and reduced communication overhead. In addition, we show that HR3L can adapt to different scenarios, including packet loss, delayed transmissions, and bandwidth limitations, without experiencing significant performance degradation.

2025-08-11T07:50:25Z This manuscript is currently under revision Pietro Talli Federico Mason Federico Chiariotti Andrea Zanella http://arxiv.org/abs/2503.05383v7 AVA: Attentive VLM Agent for Mastering StarCraft II 2026-05-11T06:27:53Z

We introduce AVACraft, a multimodal StarCraft II benchmark supporting both Multi-Agent Reinforcement Learning (MARL) and Vision-Language Model (VLM) paradigms. Unlike SMAC-family environments that rely on abstract state representations and exclude VLMs, AVACraft provides RGB visuals, natural language observations, and structured state information, enabling systematic comparison between training-based and zero-shot methods across 21 scenarios spanning micromanagement, coordination, and strategic planning. We establish comprehensive baselines: six MARL algorithms (IQL, QMIX, QTRAN, VDN, MAPPO, IPPO) with Swin-Transformer backbones trained for 5M steps, and multiple VLMs including proprietary (GPT-4o) and open-source (Qwen3-VL) models. Results reveal complementary strengths-MARL peaks at 19.3% win rate after 5M steps, while VLMs achieve 75-90% zero-shot with human-aligned decisions-exposing trade-offs between training efficiency, performance ceilings, interpretability, and deployment cost. Code: https://github.com/camel-ai/VLM-Play-StarCraft2.

2025-03-07T12:54:25Z Accepted by ACL 2026 Weiyu Ma Yuqian Fu Zecheng Zhang Bernard Ghanem Guohao Li