https://arxiv.org/api/umhHLGqOumfrUEQUPVDZlAEpEuU2026-06-23T04:52:52Z1272776515http://arxiv.org/abs/2602.13276v2Supercritical Mass and Condensation in Fokker--Planck Equations for Consensus Formation2026-05-11T12:55:50ZInspired by recently developed Fokker--Planck models for Bose--Einstein statistics, we study a consensus formation model with condensation effects driven by a polynomial diffusion coefficient vanishing at the domain boundaries. For the underlying kinetic model, given by a nonlinear Fokker--Planck equation with superlinear drift, it was shown that if the initial mass exceeds a critical threshold, the solution may exhibit finite-time concentration in certain parameter regimes. Here, we show that this supercritical mass phenomenon persists for a broader class of diffusion functions and provide estimates of the critical mass required to induce finite-time loss of regularity.2026-02-05T13:28:43ZMonica CaloiMattia Zanellahttp://arxiv.org/abs/2510.03772v2Cooperation in public goods game on square lattices with agents changing interaction groups2026-05-11T12:51:06ZThe emergence of cooperation in the groups of interacting agents is one of the most fascinating phenomena observed in many complex systems studied in social science and ecology, even in the situations where one would expect the agent to use a free-rider policy. This is especially surprising in the situation where no external mechanisms based on reputation or punishment are present. One of the possible explanations of this effect is the inhomogeneity of the various aspects of interactions, which can be used to clarify the seemingly paradoxical behaviour. In this work we demonstrate that the diversity of interaction networks helps to some degree explaining the emergence of cooperation. We extend the model of spatial interaction diversity by enabling the evaluation of the interaction groups. We show that the process of the reevaluation of the interaction group facilitates the emergence of cooperation. Furthermore, we also observe that a significant participation of agents switching their interaction neighbourhoods has a negative impact on the formation of cooperation. The introduced scenario can help to understand the formation of cooperation in the systems where no additional mechanisms for controlling agents are included.2025-10-04T10:42:10Z18 pages, 8 figures, code available at https://github.com/jmiszczak/pgg_group_diversityPhysica A: Statistical Mechanics and its Applications, Volume 694, pp. 131613 (2026)Jarosław Adam Miszczak10.1016/j.physa.2026.131613http://arxiv.org/abs/2605.10481v1Safe Multi-Agent Behavior Must Be Maintained, Not Merely Asserted: Constraint Drift in LLM-Based Multi-Agent Systems2026-05-11T12:43:19ZModern LLM based agents are no longer passive text generators. They read repositories, call tools, browse the web, execute code, maintain memory, communicate with other agents, and act through long horizon workflows. This shift moves the unit of safety. A system may produce a compliant final answer while leaking private information through an internal message, delegating authority beyond its original scope, calling an external tool with sensitive context, or losing the evidence needed to reconstruct why an action was allowed. We argue that many emerging failures in LLM-based multi-agent systems share a common structure: safety critical constraints do not remain operative throughout the trajectory. We call this phenomenon constraint drift: the loss, distortion, weakening, or relaxation of constraints as they pass through memory, delegation, communication, tool use, audit, and optimization. The position taken here is that safe multi-agent behavior must be maintained, not merely asserted. Prompts, guardrails, tool schemas, access control, and final output checks are necessary, but they are insufficient unless constraints remain fresh, inherited, enforceable, and auditable across execution. We propose Constraint State Governance as a research paradigm for LLM-based multi-agent systems. In this paradigm, safety-critical constraints are maintained as explicit execution state, while constraint-native reinforcement learning improves utility only within maintained safety boundaries. The goal is not to freeze agentic systems under rigid rules, but to make safety operational across the trajectories through which modern agents actually act.2026-05-11T12:43:19Z12 pages, 2 figures, 4 tables. PreprintTianxiao LiYixing MaHaiquan WenZhenglin HuangQianyu ZhouZeyu FuGuangliang Chenghttp://arxiv.org/abs/2605.10447v1Statistical Model Checking of the Keynes+Schumpeter Model: A Transient Sensitivity Analysis of a Macroeconomic ABM2026-05-11T12:19:16ZAgent-based models (ABMs) are increasingly used in macroeconomics, but their analysis still often relies on ad hoc Monte Carlo campaigns with heterogeneous statistical effort across parameter settings. We show how statistical model checking (SMC), implemented through MultiVeStA, can provide a principled analysis layer for a realistic macroeconomic ABM without rewriting the simulator in a dedicated formalism. Our case study is the heuristic-switching Keynes+Schumpeter(K+S) model, analysed hrough a transient sensitivity campaign over one-parameter sweeps, two macro observables (unemployment and GDP growth), and one auxiliary micro-level probe (market share) on the post-warmup phase of a 600-step horizon. The analysis is driven by reusable temporal queries, observable-specific precision targets, and confidence-based stopping rules that automatically determine the simulation effort required by each configuration. Results show a clear contrast across parameter families: macro-financial and structural sweeps produce the strongest transient effects, whereas several heuristic-rule sweeps remain much weaker under the same precision policy. More broadly, the paper shows that SMC can support reproducible and informative quantitative analysis of substantively rich economic ABMs, while making uncertainty estimates and simulation cost explicit parts of the reported results.2026-05-11T12:19:16ZStefano BlandoGiorgio FagioloMauro NapoletanoTania TreibichAndrea Vandinhttp://arxiv.org/abs/2605.12555v1DelAC: A Multi-agent Reinforcement Learning of Team-Symmetric Stochastic Games2026-05-11T12:00:27ZIn this paper we study team-symmetric games with $m\ge 2$ teams. Players within a team have symmetric identity and have a common payoff function. We show that team-symmetric games always have a team-symmetric Nash equilibrium. We develop and solve a linear complementarity problem of team-symmetric Nash equilibria. We propose an actor-critic based multi-agent reinforcement learning algorithm for team-symmetric games. Through simulations, we show that this multi-agent reinforcement learning algorithm performs much better than many existing algorithms.2026-05-11T12:00:27ZDuan-Shin LeeYu-Hsiu Hunghttp://arxiv.org/abs/2605.10377v1PC3D: Zero-Shot Cooperation Across Variable Rosters via Personalized Context Distillation2026-05-11T11:20:17ZCooperative multi-agent reinforcement learning often assumes a fixed execution team, yet many decentralized systems must operate with varying numbers of active agents during deployment. We study this setting under episodic roster variation: each episode is executed by a set of homogeneous agents, with the team size varying across episodes. Agents act only from local histories, without execution-time communication, privileged coordinators, or online retraining. Therefore, effective cooperation requires each agent to recover relevant context about the active team and adapt its behavior accordingly. To this end, we propose PC3D (Personalized Central Coordination Context Distillation), a method for training decentralized policies to recover and use personalized coordination context from local interaction histories. During training, a set-structured centralized teacher compresses the active team into coordination tokens and personalizes them into agent-specific contexts, which are distilled into decentralized policies. At execution, each agent predicts its own context from local history and adaptively uses it to condition decision-making. Across three cooperative MARL benchmarks, PC3D achieves higher returns than the evaluated baselines with both seen and unseen roster sizes, and ablations attribute these gains to both context distillation and adaptive context use.2026-05-11T11:20:17ZAhmet Onur AkmanRafał Kucharskihttp://arxiv.org/abs/2606.09852v1LLM-Based Code Documentation Generation and Multi-Judge Evaluation2026-05-11T11:17:58ZHigh-quality source code documentation is vital yet often neglected, especially in critical domains like healthcare where reliability and maintainability are essential. We presented an AI powered framework that automates documentation generation from code and repositories using eight state of the art Large Language Models (LLMs), including GPT, Gemini, Qwen, and LLaMA variants. Built on the PocketFlow orchestration framework, the system applies modular pipelines and advanced prompt engineering to produce structured, context aware documentation. To ensure quality and guide model selection, we introduced a MultiLLMasJudges evaluation framework, where four independent LLMs assess outputs across nine criteria, such as Completeness, Clarity, and Faithfulness. Experiments conducted on an open-source medical physics library, demonstrated showed a 42% performance gap between top and bottom models. By combining diverse model outputs, optimized prompting, and rigorous evaluation, our approach enhances documentation quality and reduces manual effort, especially in safety critical healthcare software.2026-05-11T11:17:58ZICAHS, \c{opyright} 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other worksConference ICAHS IEEE, 2025Ikbel GhrabMohamed DhiebIsmail KhenissiInes Abdeljaoued-Tejhttp://arxiv.org/abs/2507.15518v5HAMLET: A Hierarchical and Adaptive Multi-Agent Framework for Live Embodied Theatrics2026-05-11T09:43:17ZCreating an immersive and interactive theatrical experience is a long-term goal in the field of interactive narrative. The emergence of large language models (LLMs) provides a new path to achieve this goal. However, existing drama generation methods often produce LLMs that lack initiative and cannot interact with the physical scene, while typically requiring detailed input that diminishes the immersion of live performance. To address these challenges, we propose HAMLET, a hierarchical adaptive multi-agent framework focused on drama creation and real-time online performance. Given a simple topic, the framework initially generates a narrative blueprint to guide the subsequent improvisational performance. During online performance, each actor is equipped with an adaptive reasoning module that enables decision-making based on their personas, memories, goals during complex group chat scenarios. Beyond dialogue, actor agents engage in embodied interactions by changing the state of scene props through actions such as opening a letter or picking up a weapon, which are broadcast to update the global environmental context. To objectively assess the quality of live embodied theatrics, we establish a comprehensive evaluation method and introduce HAMLETJudge, a specialized critic model for automated evaluation. Experimental results demonstrate that HAMLET excels in creating expressive, coherent, and physically interactive theatrical experiences in an autonomous manner.2025-07-21T11:36:39ZAccepted to the Fourteenth International Conference on Learning Representations (ICLR 2026)Shufan JiangSizhou ChenChios ChenChi ZhangXiao-Lei ZhangXuelong Lihttp://arxiv.org/abs/2506.01404v2Quantitative Error Feedback for Quantization Noise Reduction of Filtering over Graphs2026-05-11T08:43:58ZThis paper introduces an innovative error feedback framework designed to mitigate quantization noise in distributed graph filtering, where communications are constrained to quantized messages. It comes from error spectrum shaping techniques from state-space digital filters, and therefore establishes connections between quantized filtering processes over different domains. In contrast to existing error compensation methods, our framework quantitatively feeds back the quantization noise for exact compensation. We examine the framework under three key scenarios: (i) deterministic graph filtering, (ii) graph filtering over random graphs, and (iii) graph filtering with random node-asynchronous updates. Rigorous theoretical analysis demonstrates that the proposed framework significantly reduces the effect of quantization noise, and we provide closed-form solutions for the optimal error feedback coefficients. Moreover, this quantitative error feedback mechanism can be seamlessly integrated into communication-efficient decentralized optimization frameworks, enabling lower error floors. Numerical experiments validate the theoretical results, consistently showing that our method outperforms conventional quantization strategies in terms of both accuracy and robustness.2025-06-02T07:57:04ZAccepted by IEEE TSPXue Xian ZhengWeihang LiuXin LouStefan VlaskiTareq Al-Naffouri10.1109/TSP.2026.3664752http://arxiv.org/abs/2508.07722v2Robust Remote Reinforcement Learning over Unreliable Communication Channels using Homomorphic State Encoding2026-05-11T08:16:34ZTraditional Reinforcement Learning (RL) frameworks generally assume that the agent perceives the state of the underlying Markov process instantaneously and then takes actions accordingly. If the agent cannot directly observe the process, but rather receives state updates from a remote sensor over a lossy and/or delayed channel, it may be forced to operate with partial and intermittent information. In recent years, numerous learning architectures have been proposed to manage RL with imperfect or remote feedback; however, they offer solutions tailored to specific use cases, often with a substantial computational and communication burden. To address these limitations, we propose a novel learning architecture, named Homomorphic Robust Remote Reinforcement Learning (HR3L), that enables the distributed training of RL agents over unreliable communication channels without the need to exchange gradient information. Our experimental results demonstrate that HR3L significantly outperforms the state-of-the-art methods in terms of sample efficiency, leading to faster training and reduced communication overhead. In addition, we show that HR3L can adapt to different scenarios, including packet loss, delayed transmissions, and bandwidth limitations, without experiencing significant performance degradation.2025-08-11T07:50:25ZThis manuscript is currently under revisionPietro TalliFederico MasonFederico ChiariottiAndrea Zanellahttp://arxiv.org/abs/2503.05383v7AVA: Attentive VLM Agent for Mastering StarCraft II2026-05-11T06:27:53ZWe introduce AVACraft, a multimodal StarCraft II benchmark supporting both Multi-Agent Reinforcement Learning (MARL) and Vision-Language Model (VLM) paradigms. Unlike SMAC-family environments that rely on abstract state representations and exclude VLMs, AVACraft provides RGB visuals, natural language observations, and structured state information, enabling systematic comparison between training-based and zero-shot methods across 21 scenarios spanning micromanagement, coordination, and strategic planning. We establish comprehensive baselines: six MARL algorithms (IQL, QMIX, QTRAN, VDN, MAPPO, IPPO) with Swin-Transformer backbones trained for 5M steps, and multiple VLMs including proprietary (GPT-4o) and open-source (Qwen3-VL) models. Results reveal complementary strengths-MARL peaks at 19.3% win rate after 5M steps, while VLMs achieve 75-90% zero-shot with human-aligned decisions-exposing trade-offs between training efficiency, performance ceilings, interpretability, and deployment cost. Code: https://github.com/camel-ai/VLM-Play-StarCraft2.2025-03-07T12:54:25ZAccepted by ACL 2026Weiyu MaYuqian FuZecheng ZhangBernard GhanemGuohao Lihttp://arxiv.org/abs/2605.10046v1PixelFlowCast: Latent-Free Precipitation Nowcasting via Pixel Mean Flows2026-05-11T06:16:02ZPrecipitation nowcasting aims to forecast short-term radar echo sequences for extreme weather warning, where both prediction fidelity and inference efficiency are critical for real-world deployment. However, diffusion-based models, despite their strong generative capability, suffer from slow inference due to multi-step sampling trajectories, limiting their practical usability. Conditional Flow Matching (CFM) improves efficiency via straightened trajectories, but relies on latent space compression, which inevitably discards high-frequency physical details and degrades fine-grained prediction quality. To address these limitations, we propose PixelFlowCast, a two-stage probabilistic forecasting framework that achieves both high-efficiency and high-fidelity prediction without latent compression. Specifically, in the first stage, a deterministic model first produces coarse forecasts to capture global evolution trends. In the subsequent stage, the proposed KANCondNet extracts deep spatiotemporal evolution features to provide accurate conditional guidance. Based on this, a latent-free, few-step Pixel Mean Flows (PMF) predictor employs an $x$-prediction mechanism to generate high-quality predictions, effectively preserving fine-grained structures while maintaining fast inference. Experiments on the publicly available SEVIR dataset demonstrate that PixelFlowCast outperforms existing mainstream methods in both prediction accuracy and inference efficiency, particularly for long sequence forecasting, highlighting its strong potential for real-world operational deployment.2026-05-11T06:16:02Z26 pages, 7 figuresYufeng ZhuChunlei ShiYongchao FengDan Niuhttp://arxiv.org/abs/2603.03759v2Learning Approximate Nash Equilibria in Cooperative Multi-Agent Reinforcement Learning via Mean-Field Subsampling2026-05-11T04:03:08ZMany large-scale platforms and networked control systems have a centralized decision maker interacting with a massive population of agents under strict observability constraints. Motivated by such applications, we study a cooperative Markov game with a global agent and $n$ homogeneous local agents in a communication-constrained regime, where the global agent only observes a subset of $k$ local agent states per time step. We propose an alternating learning framework $(\texttt{ALTERNATING-MARL})$, where the global agent performs subsampled mean-field $Q$-learning against a fixed local policy, and local agents update by optimizing in an induced MDP. We prove that these approximate best-response dynamics converge to an $\widetilde{O}(1/\sqrt{k})$-approximate Nash Equilibrium, while separating the sample complexities between the joint state and action spaces. Finally, we validate our results in numerical simulations for multi-robot control.2026-03-04T06:14:24Z57 pages, 10 figures, 4 tablesEmile AnandIshani Karmarkarhttp://arxiv.org/abs/2605.09894v1Deterministic vs. LLM-Controlled Orchestration for COBOL-to-Python Modernization2026-05-11T02:34:27ZModernizing legacy COBOL systems remains difficult due to scarce expertise, large and long-lived codebases, and strict correctness requirements. Recent large language model (LLM)-based modernization systems increasingly rely on agentic workflows in which the model controls multi-step tool execution. However, it remains unclear whether delegating execution control to the LLM improves correctness, robustness, or efficiency in structured software engineering workflows. We present a controlled empirical study of deterministic and LLM-controlled orchestration for COBOL-to-Python modernization. Using a unified experimental framework, we hold the language models, prompts, tools, configurations, and source programs constant while varying only the execution control strategy. This isolates orchestration as the sole experimental variable. We evaluate both approaches using functional correctness, robustness across repeated stochastic runs, and computational efficiency. Across multiple models, deterministic orchestration achieves comparable computational accuracy to LLM-controlled orchestration while improving worst-case robustness and reducing performance variability across runs. Deterministic execution also reduces token consumption by up to 3.5x, leading to substantially lower operational cost. These results suggest that, in structured modernization workflows with explicit validation stages, fixed execution policies provide more stable and cost-efficient behavior than fully agentic orchestration without reducing translation quality.2026-05-11T02:34:27ZNaing Oo LwinRajesh Kumarhttp://arxiv.org/abs/2605.09889v1Skill Description Deception Attack against Task Routing in Internet of Agents2026-05-11T02:25:53ZA new paradigm, Internet of Agents (IoA), is transforming networked systems into LLM-driven service networks, where heterogeneous agents collaborate through task routing based on their self-declared skill descriptions. Although this promising paradigm enables agentic, distributed, and advanced intelligence, it also exposes a new and overlooked attack surface. In particular, malicious agents can strategically manipulate their skill descriptions to bias routing decisions and increase their probability of being selected for task execution, thereby disrupting user tasks and degrading system reliability. To characterize this threat, we propose and formalize a new attack model, termed \emph{Skill Description Deception} (SDD) attack. We further design an LLM-enabled SDD attack framework that automatically generates deceptive skill descriptions, enabling systematic vulnerability assessment of IoA systems. Experimental results on nine representative domains show that the proposed attack can achieve up to 98\% attack success rate, demonstrating the severity and generality of the attack. Our paper reveals a new security vulnerability in IoA and calls for secure and trustworthy semantic routing mechanisms for future IoA systems.2026-05-11T02:25:53ZSubmitted to IEEE Globecom 2026Jiayi HeXiaofeng LuoJiawen KangRuichen ZhangJianhang TangDong In Kim