http://arxiv.org/abs/2602.13276v2 Supercritical Mass and Condensation in Fokker--Planck Equations for Consensus Formation 2026-05-11T12:55:50Z

Inspired by recently developed Fokker--Planck models for Bose--Einstein statistics, we study a consensus formation model with condensation effects driven by a polynomial diffusion coefficient vanishing at the domain boundaries. For the underlying kinetic model, given by a nonlinear Fokker--Planck equation with superlinear drift, it was shown that if the initial mass exceeds a critical threshold, the solution may exhibit finite-time concentration in certain parameter regimes. Here, we show that this supercritical mass phenomenon persists for a broader class of diffusion functions and provide estimates of the critical mass required to induce finite-time loss of regularity.

2026-02-05T13:28:43Z Monica Caloi Mattia Zanella http://arxiv.org/abs/2510.03772v2 Cooperation in public goods game on square lattices with agents changing interaction groups 2026-05-11T12:51:06Z

The emergence of cooperation in groups of interacting agents is one of the most fascinating phenomena observed in complex systems studied in social science and ecology, arising even in situations where one would expect agents to adopt a free-rider policy. This is especially surprising when no external mechanisms based on reputation or punishment are present. One possible explanation is the inhomogeneity of various aspects of the interactions, which can help clarify this seemingly paradoxical behaviour. In this work we demonstrate that the diversity of interaction networks helps, to some degree, to explain the emergence of cooperation. We extend the model of spatial interaction diversity by enabling agents to evaluate their interaction groups. We show that the reevaluation of the interaction group facilitates the emergence of cooperation. Furthermore, we observe that a large share of agents switching their interaction neighbourhoods has a negative impact on the formation of cooperation. The introduced scenario can help to understand the formation of cooperation in systems where no additional mechanisms for controlling agents are included.

2025-10-04T10:42:10Z 18 pages, 8 figures, code available at https://github.com/jmiszczak/pgg_group_diversity Physica A: Statistical Mechanics and its Applications, Volume 694, pp. 131613 (2026) Jarosław Adam Miszczak 10.1016/j.physa.2026.131613 http://arxiv.org/abs/2605.10481v1 Safe Multi-Agent Behavior Must Be Maintained, Not Merely Asserted: Constraint Drift in LLM-Based Multi-Agent Systems 2026-05-11T12:43:19Z

Modern LLM-based agents are no longer passive text generators. They read repositories, call tools, browse the web, execute code, maintain memory, communicate with other agents, and act through long-horizon workflows. This shift moves the unit of safety. A system may produce a compliant final answer while leaking private information through an internal message, delegating authority beyond its original scope, calling an external tool with sensitive context, or losing the evidence needed to reconstruct why an action was allowed. We argue that many emerging failures in LLM-based multi-agent systems share a common structure: safety-critical constraints do not remain operative throughout the trajectory. We call this phenomenon constraint drift: the loss, distortion, weakening, or relaxation of constraints as they pass through memory, delegation, communication, tool use, audit, and optimization. The position taken here is that safe multi-agent behavior must be maintained, not merely asserted. Prompts, guardrails, tool schemas, access control, and final output checks are necessary, but they are insufficient unless constraints remain fresh, inherited, enforceable, and auditable across execution. We propose Constraint State Governance as a research paradigm for LLM-based multi-agent systems. In this paradigm, safety-critical constraints are maintained as explicit execution state, while constraint-native reinforcement learning improves utility only within maintained safety boundaries. The goal is not to freeze agentic systems under rigid rules, but to make safety operational across the trajectories through which modern agents actually act.

2026-05-11T12:43:19Z 12 pages, 2 figures, 4 tables. Preprint Tianxiao Li Yixing Ma Haiquan Wen Zhenglin Huang Qianyu Zhou Zeyu Fu Guangliang Cheng http://arxiv.org/abs/2605.10447v1 Statistical Model Checking of the Keynes+Schumpeter Model: A Transient Sensitivity Analysis of a Macroeconomic ABM 2026-05-11T12:19:16Z

Agent-based models (ABMs) are increasingly used in macroeconomics, but their analysis still often relies on ad hoc Monte Carlo campaigns with heterogeneous statistical effort across parameter settings. We show how statistical model checking (SMC), implemented through MultiVeStA, can provide a principled analysis layer for a realistic macroeconomic ABM without rewriting the simulator in a dedicated formalism. Our case study is the heuristic-switching Keynes+Schumpeter (K+S) model, analysed through a transient sensitivity campaign over one-parameter sweeps, two macro observables (unemployment and GDP growth), and one auxiliary micro-level probe (market share) on the post-warmup phase of a 600-step horizon. The analysis is driven by reusable temporal queries, observable-specific precision targets, and confidence-based stopping rules that automatically determine the simulation effort required by each configuration. Results show a clear contrast across parameter families: macro-financial and structural sweeps produce the strongest transient effects, whereas several heuristic-rule sweeps remain much weaker under the same precision policy. More broadly, the paper shows that SMC can support reproducible and informative quantitative analysis of substantively rich economic ABMs, while making uncertainty estimates and simulation cost explicit parts of the reported results.
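
The confidence-based stopping rule mentioned in this abstract can be illustrated with a minimal sketch. It is not MultiVeStA's API or the authors' procedure: the function `estimate_until_precise`, the normal-approximation interval, and the toy observable `toy_unemployment` are all assumptions introduced here for illustration. The idea is simply to keep simulating until the confidence interval half-width for an observable's mean falls below a precision target `delta`, so that simulation effort is determined by the data rather than fixed in advance.

```python
import math
import random
import statistics


def estimate_until_precise(simulate, delta, z=1.96, min_runs=30, max_runs=100_000):
    """Call simulate() until the CI half-width of the mean is at most delta.

    Uses a normal-approximation interval (z = 1.96 for roughly 95% confidence).
    Returns (mean, half_width, number_of_runs).
    """
    samples = [simulate() for _ in range(min_runs)]
    while True:
        half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
        if half_width <= delta or len(samples) >= max_runs:
            return statistics.fmean(samples), half_width, len(samples)
        samples.append(simulate())


rng = random.Random(0)


def toy_unemployment():
    # Hypothetical stand-in for one simulator run returning an observable.
    return rng.gauss(0.08, 0.02)


mean, half_width, runs = estimate_until_precise(toy_unemployment, delta=0.002)
print(f"mean={mean:.4f} half_width={half_width:.4f} runs={runs}")
```

A noisier observable or a tighter `delta` automatically costs more runs, which is how a per-observable precision target turns into per-configuration simulation effort.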

2026-05-11T12:19:16Z Stefano Blando Giorgio Fagiolo Mauro Napoletano Tania Treibich Andrea Vandin http://arxiv.org/abs/2605.12555v1 DelAC: A Multi-agent Reinforcement Learning of Team-Symmetric Stochastic Games 2026-05-11T12:00:27Z

In this paper we study team-symmetric games with $m\ge 2$ teams. Players within a team have symmetric identity and have a common payoff function. We show that team-symmetric games always have a team-symmetric Nash equilibrium. We develop and solve a linear complementarity problem of team-symmetric Nash equilibria. We propose an actor-critic based multi-agent reinforcement learning algorithm for team-symmetric games. Through simulations, we show that this multi-agent reinforcement learning algorithm performs much better than many existing algorithms.

2026-05-11T12:00:27Z Duan-Shin Lee Yu-Hsiu Hung http://arxiv.org/abs/2605.10377v1 PC3D: Zero-Shot Cooperation Across Variable Rosters via Personalized Context Distillation 2026-05-11T11:20:17Z

Cooperative multi-agent reinforcement learning often assumes a fixed execution team, yet many decentralized systems must operate with varying numbers of active agents during deployment. We study this setting under episodic roster variation: each episode is executed by a set of homogeneous agents, with the team size varying across episodes. Agents act only from local histories, without execution-time communication, privileged coordinators, or online retraining. Therefore, effective cooperation requires each agent to recover relevant context about the active team and adapt its behavior accordingly. To this end, we propose PC3D (Personalized Central Coordination Context Distillation), a method for training decentralized policies to recover and use personalized coordination context from local interaction histories. During training, a set-structured centralized teacher compresses the active team into coordination tokens and personalizes them into agent-specific contexts, which are distilled into decentralized policies. At execution, each agent predicts its own context from local history and adaptively uses it to condition decision-making. Across three cooperative MARL benchmarks, PC3D achieves higher returns than the evaluated baselines with both seen and unseen roster sizes, and ablations attribute these gains to both context distillation and adaptive context use.

2026-05-11T11:20:17Z Ahmet Onur Akman Rafał Kucharski http://arxiv.org/abs/2606.09852v1 LLM-Based Code Documentation Generation and Multi-Judge Evaluation 2026-05-11T11:17:58Z

High-quality source code documentation is vital yet often neglected, especially in critical domains like healthcare where reliability and maintainability are essential. We present an AI-powered framework that automates documentation generation from code and repositories using eight state-of-the-art Large Language Models (LLMs), including GPT, Gemini, Qwen, and LLaMA variants. Built on the PocketFlow orchestration framework, the system applies modular pipelines and advanced prompt engineering to produce structured, context-aware documentation. To ensure quality and guide model selection, we introduce a MultiLLMasJudges evaluation framework, where four independent LLMs assess outputs across nine criteria, such as Completeness, Clarity, and Faithfulness. Experiments conducted on an open-source medical physics library showed a 42% performance gap between top and bottom models. By combining diverse model outputs, optimized prompting, and rigorous evaluation, our approach enhances documentation quality and reduces manual effort, especially in safety-critical healthcare software.

2026-05-11T11:17:58Z ICAHS, © 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works Conference ICAHS IEEE, 2025 Ikbel Ghrab Mohamed Dhieb Ismail Khenissi Ines Abdeljaoued-Tej http://arxiv.org/abs/2507.15518v5 HAMLET: A Hierarchical and Adaptive Multi-Agent Framework for Live Embodied Theatrics 2026-05-11T09:43:17Z

Creating an immersive and interactive theatrical experience is a long-term goal in the field of interactive narrative. The emergence of large language models (LLMs) provides a new path to achieve this goal. However, existing drama generation methods often produce LLMs that lack initiative and cannot interact with the physical scene, while typically requiring detailed input that diminishes the immersion of live performance. To address these challenges, we propose HAMLET, a hierarchical adaptive multi-agent framework focused on drama creation and real-time online performance. Given a simple topic, the framework initially generates a narrative blueprint to guide the subsequent improvisational performance. During online performance, each actor is equipped with an adaptive reasoning module that enables decision-making based on their personas, memories, and goals in complex group chat scenarios. Beyond dialogue, actor agents engage in embodied interactions by changing the state of scene props through actions such as opening a letter or picking up a weapon, which are broadcast to update the global environmental context. To objectively assess the quality of live embodied theatrics, we establish a comprehensive evaluation method and introduce HAMLETJudge, a specialized critic model for automated evaluation. Experimental results demonstrate that HAMLET excels in creating expressive, coherent, and physically interactive theatrical experiences in an autonomous manner.

2025-07-21T11:36:39Z Accepted to the Fourteenth International Conference on Learning Representations (ICLR 2026) Shufan Jiang Sizhou Chen Chios Chen Chi Zhang Xiao-Lei Zhang Xuelong Li http://arxiv.org/abs/2506.01404v2 Quantitative Error Feedback for Quantization Noise Reduction of Filtering over Graphs 2026-05-11T08:43:58Z

This paper introduces an innovative error feedback framework designed to mitigate quantization noise in distributed graph filtering, where communications are constrained to quantized messages. It draws on error spectrum shaping techniques from state-space digital filters, and therefore establishes connections between quantized filtering processes over different domains. In contrast to existing error compensation methods, our framework quantitatively feeds back the quantization noise for exact compensation. We examine the framework under three key scenarios: (i) deterministic graph filtering, (ii) graph filtering over random graphs, and (iii) graph filtering with random node-asynchronous updates. Rigorous theoretical analysis demonstrates that the proposed framework significantly reduces the effect of quantization noise, and we provide closed-form solutions for the optimal error feedback coefficients. Moreover, this quantitative error feedback mechanism can be seamlessly integrated into communication-efficient decentralized optimization frameworks, enabling lower error floors. Numerical experiments validate the theoretical results, consistently showing that our method outperforms conventional quantization strategies in terms of both accuracy and robustness.
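
To make the general idea of error feedback concrete, the sketch below shows a generic first-order scheme on a scalar stream: the quantization error from one message is added to the next value before it is quantized. This is a textbook illustration under stated assumptions, not the graph-filtering framework or the optimal feedback coefficients of the paper; the function names and the uniform quantizer with step size `step` are hypothetical.

```python
def quantize(x, step):
    """Uniform quantizer: round x to the nearest multiple of step."""
    return step * round(x / step)


def send_plain(values, step):
    """Quantize each value independently, discarding the error."""
    return [quantize(v, step) for v in values]


def send_with_error_feedback(values, step):
    """Quantize each value after adding back the previous quantization error."""
    sent, carried_error = [], 0.0
    for v in values:
        corrected = v + carried_error   # add back what was lost last time
        q = quantize(corrected, step)
        carried_error = corrected - q   # remember this round's error
        sent.append(q)
    return sent


values = [0.3] * 20
step = 1.0
plain = send_plain(values, step)
fed_back = send_with_error_feedback(values, step)

# Gap between the true running total and the total of the quantized messages.
plain_gap = abs(sum(values) - sum(plain))
feedback_gap = abs(sum(values) - sum(fed_back))
print(plain_gap, feedback_gap)
```

Without feedback every 0.3 is rounded to 0, so the accumulated gap grows with the number of messages. With feedback the gap equals the last carried error, which stays within half a quantization step.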

2025-06-02T07:57:04Z Accepted by IEEE TSP Xue Xian Zheng Weihang Liu Xin Lou Stefan Vlaski Tareq Al-Naffouri 10.1109/TSP.2026.3664752 http://arxiv.org/abs/2508.07722v2 Robust Remote Reinforcement Learning over Unreliable Communication Channels using Homomorphic State Encoding 2026-05-11T08:16:34Z

Traditional Reinforcement Learning (RL) frameworks generally assume that the agent perceives the state of the underlying Markov process instantaneously and then takes actions accordingly. If the agent cannot directly observe the process, but rather receives state updates from a remote sensor over a lossy and/or delayed channel, it may be forced to operate with partial and intermittent information. In recent years, numerous learning architectures have been proposed to manage RL with imperfect or remote feedback; however, they offer solutions tailored to specific use cases, often with a substantial computational and communication burden. To address these limitations, we propose a novel learning architecture, named Homomorphic Robust Remote Reinforcement Learning (HR3L), that enables the distributed training of RL agents over unreliable communication channels without the need to exchange gradient information. Our experimental results demonstrate that HR3L significantly outperforms the state-of-the-art methods in terms of sample efficiency, leading to faster training and reduced communication overhead. In addition, we show that HR3L can adapt to different scenarios, including packet loss, delayed transmissions, and bandwidth limitations, without experiencing significant performance degradation.

2025-08-11T07:50:25Z This manuscript is currently under revision Pietro Talli Federico Mason Federico Chiariotti Andrea Zanella http://arxiv.org/abs/2503.05383v7 AVA: Attentive VLM Agent for Mastering StarCraft II 2026-05-11T06:27:53Z

We introduce AVACraft, a multimodal StarCraft II benchmark supporting both Multi-Agent Reinforcement Learning (MARL) and Vision-Language Model (VLM) paradigms. Unlike SMAC-family environments that rely on abstract state representations and exclude VLMs, AVACraft provides RGB visuals, natural language observations, and structured state information, enabling systematic comparison between training-based and zero-shot methods across 21 scenarios spanning micromanagement, coordination, and strategic planning. We establish comprehensive baselines: six MARL algorithms (IQL, QMIX, QTRAN, VDN, MAPPO, IPPO) with Swin-Transformer backbones trained for 5M steps, and multiple VLMs including proprietary (GPT-4o) and open-source (Qwen3-VL) models. Results reveal complementary strengths: MARL peaks at a 19.3% win rate after 5M steps, while VLMs achieve 75-90% zero-shot with human-aligned decisions, exposing trade-offs between training efficiency, performance ceilings, interpretability, and deployment cost. Code: https://github.com/camel-ai/VLM-Play-StarCraft2.

2025-03-07T12:54:25Z Accepted by ACL 2026 Weiyu Ma Yuqian Fu Zecheng Zhang Bernard Ghanem Guohao Li http://arxiv.org/abs/2605.10046v1 PixelFlowCast: Latent-Free Precipitation Nowcasting via Pixel Mean Flows 2026-05-11T06:16:02Z

Precipitation nowcasting aims to forecast short-term radar echo sequences for extreme weather warning, where both prediction fidelity and inference efficiency are critical for real-world deployment. However, diffusion-based models, despite their strong generative capability, suffer from slow inference due to multi-step sampling trajectories, limiting their practical usability. Conditional Flow Matching (CFM) improves efficiency via straightened trajectories, but relies on latent space compression, which inevitably discards high-frequency physical details and degrades fine-grained prediction quality. To address these limitations, we propose PixelFlowCast, a two-stage probabilistic forecasting framework that achieves both high-efficiency and high-fidelity prediction without latent compression. Specifically, in the first stage, a deterministic model first produces coarse forecasts to capture global evolution trends. In the subsequent stage, the proposed KANCondNet extracts deep spatiotemporal evolution features to provide accurate conditional guidance. Based on this, a latent-free, few-step Pixel Mean Flows (PMF) predictor employs an $x$-prediction mechanism to generate high-quality predictions, effectively preserving fine-grained structures while maintaining fast inference. Experiments on the publicly available SEVIR dataset demonstrate that PixelFlowCast outperforms existing mainstream methods in both prediction accuracy and inference efficiency, particularly for long sequence forecasting, highlighting its strong potential for real-world operational deployment.

2026-05-11T06:16:02Z 26 pages, 7 figures Yufeng Zhu Chunlei Shi Yongchao Feng Dan Niu http://arxiv.org/abs/2603.03759v2 Learning Approximate Nash Equilibria in Cooperative Multi-Agent Reinforcement Learning via Mean-Field Subsampling 2026-05-11T04:03:08Z

Many large-scale platforms and networked control systems have a centralized decision maker interacting with a massive population of agents under strict observability constraints. Motivated by such applications, we study a cooperative Markov game with a global agent and $n$ homogeneous local agents in a communication-constrained regime, where the global agent only observes a subset of $k$ local agent states per time step. We propose an alternating learning framework $(\texttt{ALTERNATING-MARL})$, where the global agent performs subsampled mean-field $Q$-learning against a fixed local policy, and local agents update by optimizing in an induced MDP. We prove that these approximate best-response dynamics converge to an $\widetilde{O}(1/\sqrt{k})$-approximate Nash Equilibrium, while separating the sample complexities between the joint state and action spaces. Finally, we validate our results in numerical simulations for multi-robot control.
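
The $1/\sqrt{k}$ scaling in this abstract has a simple statistical intuition that can be checked numerically: the empirical state distribution of $k$ sampled agents approximates that of all $n$ agents with error shrinking roughly like $1/\sqrt{k}$. The sketch below illustrates only that intuition. It is not the ALTERNATING-MARL algorithm or the paper's Nash equilibrium bound, and the helper names, the four-state toy population, and the use of total-variation distance are assumptions made for the example.

```python
import random


def empirical_distribution(states, num_states):
    """Fraction of agents in each discrete state."""
    counts = [0] * num_states
    for s in states:
        counts[s] += 1
    return [c / len(states) for c in counts]


def subsample_error(states, num_states, k, rng, trials=200):
    """Average total-variation distance between the full-population
    distribution and the distribution of k agents sampled without replacement."""
    full = empirical_distribution(states, num_states)
    total = 0.0
    for _ in range(trials):
        sub = empirical_distribution(rng.sample(states, k), num_states)
        total += 0.5 * sum(abs(a - b) for a, b in zip(full, sub))
    return total / trials


rng = random.Random(0)
n, num_states = 10_000, 4
states = [rng.randrange(num_states) for _ in range(n)]  # toy local-agent states

errors = {k: subsample_error(states, num_states, k, rng) for k in (16, 64, 256)}
for k, err in errors.items():
    print(k, round(err, 3))
```

Each fourfold increase in `k` should roughly halve the error, which is the behaviour one would expect from a $1/\sqrt{k}$ rate.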

2026-03-04T06:14:24Z 57 pages, 10 figures, 4 tables Emile Anand Ishani Karmarkar http://arxiv.org/abs/2605.09894v1 Deterministic vs. LLM-Controlled Orchestration for COBOL-to-Python Modernization 2026-05-11T02:34:27Z

Modernizing legacy COBOL systems remains difficult due to scarce expertise, large and long-lived codebases, and strict correctness requirements. Recent large language model (LLM)-based modernization systems increasingly rely on agentic workflows in which the model controls multi-step tool execution. However, it remains unclear whether delegating execution control to the LLM improves correctness, robustness, or efficiency in structured software engineering workflows. We present a controlled empirical study of deterministic and LLM-controlled orchestration for COBOL-to-Python modernization. Using a unified experimental framework, we hold the language models, prompts, tools, configurations, and source programs constant while varying only the execution control strategy. This isolates orchestration as the sole experimental variable. We evaluate both approaches using functional correctness, robustness across repeated stochastic runs, and computational efficiency. Across multiple models, deterministic orchestration achieves comparable computational accuracy to LLM-controlled orchestration while improving worst-case robustness and reducing performance variability across runs. Deterministic execution also reduces token consumption by up to 3.5x, leading to substantially lower operational cost. These results suggest that, in structured modernization workflows with explicit validation stages, fixed execution policies provide more stable and cost-efficient behavior than fully agentic orchestration without reducing translation quality.

2026-05-11T02:34:27Z Naing Oo Lwin Rajesh Kumar http://arxiv.org/abs/2605.09889v1 Skill Description Deception Attack against Task Routing in Internet of Agents 2026-05-11T02:25:53Z

A new paradigm, Internet of Agents (IoA), is transforming networked systems into LLM-driven service networks, where heterogeneous agents collaborate through task routing based on their self-declared skill descriptions. Although this promising paradigm enables agentic, distributed, and advanced intelligence, it also exposes a new and overlooked attack surface. In particular, malicious agents can strategically manipulate their skill descriptions to bias routing decisions and increase their probability of being selected for task execution, thereby disrupting user tasks and degrading system reliability. To characterize this threat, we propose and formalize a new attack model, termed Skill Description Deception (SDD) attack. We further design an LLM-enabled SDD attack framework that automatically generates deceptive skill descriptions, enabling systematic vulnerability assessment of IoA systems. Experimental results on nine representative domains show that the proposed attack can achieve up to 98% attack success rate, demonstrating the severity and generality of the attack. Our paper reveals a new security vulnerability in IoA and calls for secure and trustworthy semantic routing mechanisms for future IoA systems.

2026-05-11T02:25:53Z Submitted to IEEE Globecom 2026 Jiayi He Xiaofeng Luo Jiawen Kang Ruichen Zhang Jianhang Tang Dong In Kim