https://arxiv.org/api/M2KLs7kAwgGhKrW4lvIZtEp0vss
2026-03-20T08:58:27Z
11553015

http://arxiv.org/abs/2108.00916v4
2-D Directed Formation Control Based on Bipolar Coordinates
2026-03-19T17:13:03Z
This work proposes a novel 2-D formation control scheme for acyclic triangulated directed graphs (a class of minimally acyclic persistent graphs) based on bipolar coordinates with (almost) global convergence to the desired shape. Prescribed performance control is employed to devise a decentralized control law that avoids singularities and introduces robustness against external disturbances while ensuring predefined transient and steady-state performance for the closed-loop system. Furthermore, it is shown that the proposed formation control scheme can handle formation maneuvering, scaling, and orientation specifications simultaneously. Additionally, the proposed control law is implementable in agents' arbitrarily oriented local coordinate frames using only low-cost onboard vision sensors, which are favorable for practical applications. Finally, a formation maneuvering simulation study verifies the proposed approach.
2021-08-02T14:09:55Z
16 pages, 10 figures; minor typos corrected; no change in results
Farhad Mehdifar, Charalampos P. Bechlioulis, Julien M. Hendrickx, Dimos V. Dimarogonas
10.1109/TAC.2022.3206603

http://arxiv.org/abs/2602.16424v2
Verifiable Semantics for Agent-to-Agent Communication
2026-03-19T17:12:58Z
Multiagent AI systems require consistent communication, but we lack methods to verify that agents share the same understanding of the terms used. Natural language is interpretable but vulnerable to semantic drift, while learned protocols are efficient but opaque. We propose a certification protocol based on the stimulus-meaning model, where agents are tested on shared observable events and terms are certified if empirical disagreement falls below a statistical threshold. 
In this protocol, agents restricting their reasoning to certified terms ("core-guarded reasoning") achieve provably bounded disagreement. We also outline mechanisms for detecting drift (recertification) and recovering shared vocabulary (renegotiation). In simulations with varying degrees of semantic divergence, core-guarding reduces disagreement by 72-96%. In a validation with fine-tuned language models, disagreement is reduced by 51%. Our framework provides a first step towards verifiable agent-to-agent communication.
2026-02-18T12:55:58Z
Philipp Schoenegger, Matt Carlson, Chris Schneider, Chris Daly

http://arxiv.org/abs/2603.14052v2
A Multi-Agent Perception-Action Alliance for Efficient Long Video Reasoning
2026-03-19T15:21:11Z
This paper presents a multi-agent perception-action exploration alliance, dubbed A4VL, for efficient long-video reasoning. A4VL operates in a multi-round perception-action exploration loop with a selection of VLM agents. In each round, the team of agents performs video question-answer (VideoQA) via perception exploration followed by action exploration. During perception exploration, each agent learns to extract query-specific perception clue(s) from a few sampled frames and performs clue-based alignment to find the video block(s) that are most relevant to the query-specific event. During action exploration, A4VL performs video reasoning in three steps: (1) each agent produces its initial answer with a rationale, (2) all agents collaboratively score one another through cross-reviews and relevance ranking, and (3) based on whether a satisfactory consensus is reached, the decision is made either to start a new round of perception-action deliberation by pruning (e.g., filtering out the lowest-performing agent) and re-staging (e.g., new-clue and matching-block based perception-action exploration), or to conclude by producing its final answer. 
The integration of the multi-agent alliance through multi-round perception-action exploration, coupled with event-driven partitioning and cue-guided block alignment, enables A4VL to effectively scale to real-world long videos while preserving high-quality video reasoning. Evaluation results on five popular VideoQA benchmarks show that A4VL outperforms 18 existing representative VLMs and 11 recent methods optimized for long-video reasoning, while achieving significantly lower inference latency. Our code is released at https://github.com/git-disl/A4VL.
2026-03-14T17:43:53Z
Accepted by CVPR2026
Yichang Xu, Gaowen Liu, Ramana Rao Kompella, Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Zachary Yahn, Ling Liu

http://arxiv.org/abs/2603.18958v1
Optimal Path Planning in Hostile Environments
2026-03-19T14:25:30Z
Coordinating agents through hazardous environments, such as aid-delivering drones navigating conflict zones or field robots traversing deployment areas filled with obstacles, poses fundamental planning challenges. We introduce and analyze the computational complexity of a new multi-agent path planning problem that captures this setting. A group of identical agents begins at a common start location and must navigate a graph-based environment to reach a common target. The graph contains hazards that eliminate agents upon contact but then enter a known cooldown period before reactivating. In this discrete-time, fully-observable, deterministic setting, the planning task is to compute a movement schedule that maximizes the number of agents reaching the target. We first prove that, despite the exponentially large space of feasible plans, optimal plans require only polynomially-many steps, establishing membership in NP. We then show that the problem is NP-hard even when the environment graph is a tree. On the positive side, we present a polynomial-time algorithm for graphs consisting of vertex-disjoint paths from start to target. 
Our results establish a rich computational landscape for this problem, identifying both intractable and tractable fragments.
2026-03-19T14:25:30Z
Accepted for publication at ICAPS-2026 (25 pages, 6 figures)
Andrzej Kaczmarczyk, Šimon Schierreich, Nicholas Axel Tanujaya, Haifeng Xu

http://arxiv.org/abs/2602.07975v2
Leader-following Consensus over Jointly Connected Switching Networks is Achievable for Exponentially Unstable Linear Systems
2026-03-19T13:48:09Z
The leader-following consensus problem for general linear multi-agent systems over jointly connected switching networks has been a challenging problem and the solvability of the problem has been limited to the class of linear multi-agent systems whose system matrix is marginally stable. This condition is restrictive since it even excludes the most commonly used double-integrator system. This paper presents a breakthrough by demonstrating that leader-following exponential consensus is achievable for general linear multi-agent systems over jointly connected switching networks, even when the system matrix is exponentially unstable. The degree of instability can be explicitly characterized by two key quantities that arise from the jointly connected condition on a switching graph. By exploiting duality, we further show that the output-based distributed observer design problem for a general leader system is solvable over jointly connected switching networks, even when the system matrix is exponentially unstable. 
This is also in sharp contrast to the existing distributed observers, which rely on the assumption that the leader system is marginally stable.
2026-02-08T14:03:50Z
Yuhan Chen, Tao Liu, Jie Huang

http://arxiv.org/abs/2603.18894v1
I Can't Believe It's Corrupt: Evaluating Corruption in Multi-Agent Governance Systems
2026-03-19T13:34:54Z
Large language models are increasingly proposed as autonomous agents for high-stakes public workflows, yet we lack systematic evidence about whether they would follow institutional rules when granted authority. We present evidence that integrity in institutional AI should be treated as a pre-deployment requirement rather than a post-deployment assumption. We evaluate multi-agent governance simulations in which agents occupy formal governmental roles under different authority structures, and we score rule-breaking and abuse outcomes with an independent rubric-based judge across 28,112 transcript segments. While we advance this position, the core contribution is empirical: among models operating below saturation, governance structure is a stronger driver of corruption-related outcomes than model identity, with large differences across regimes and model-governance pairings. Lightweight safeguards can reduce risk in some settings but do not consistently prevent severe failures. These results imply that institutional design is a precondition for safe delegation: before real authority is assigned to LLM agents, systems should undergo stress testing under governance-like constraints with enforceable rules, auditable logs, and human oversight on high-impact actions.
2026-03-19T13:34:54Z
Short Paper, Preprint
Vedanta S P, Ponnurangam Kumaraguru

http://arxiv.org/abs/2507.06542v3
On the Surprising Effectiveness of a Single Global Merging in Decentralized Learning
2026-03-19T10:05:02Z
Decentralized learning provides a scalable alternative to parameter-server-based training, yet its performance is often hindered by limited peer-to-peer communication. 
In this paper, we study how communication should be scheduled over time, including determining when and how frequently devices synchronize. Counterintuitive empirical results show that concentrating communication budgets in the later stages of decentralized training remarkably improves global test performance. Surprisingly, we uncover that fully connected communication at the final step, implemented by a single global merging, can significantly improve the performance of decentralized learning under high data heterogeneity. Our theoretical contributions, which explain these phenomena, are the first to establish that the globally merged model of decentralized SGD can match the convergence rate of parallel SGD. Technically, we reinterpret part of the discrepancy among local models, which were previously considered as detrimental noise, as constructive components essential for matching this rate. This work provides evidence that decentralized learning is able to generalize under high data heterogeneity and limited communication, while offering broad new avenues for model merging research.
2025-07-09T04:56:56Z
We discover and theoretically explain why and when a single global parameter merging in decentralized learning can recover the performance of federated learning, even in highly heterogeneous and communication-constrained environments
ICLR 2026 (Oral Presentation)
Tongtian Zhu, Tianyu Zhang, Mingze Wang, Zhanpeng Zhou, Can Wang

http://arxiv.org/abs/2510.26352v2
The Geometry of Dialogue: Graphing Language Models to Reveal Synergistic Teams for Multi-Agent Collaboration
2026-03-19T09:40:13Z
While a multi-agent approach based on large language models (LLMs) represents a promising strategy to surpass the capabilities of single models, its success is critically dependent on synergistic team composition. However, forming optimal teams is a significant challenge, as the inherent opacity of most models obscures the internal characteristics necessary for effective collaboration. 
In this paper, we propose an interaction-centric framework for automatic team composition that does not require any prior knowledge including their internal architectures, training data, or task performances. Our method constructs a "language model graph" that maps relationships between models from the semantic coherence of pairwise conversations, and then applies community detection to identify synergistic model clusters. Our experiments with diverse LLMs demonstrate that the proposed method discovers functionally coherent groups that reflect their latent specializations. Priming conversations with specific topics identified synergistic teams which outperform random baselines on downstream benchmarks and achieve comparable accuracy to that of manually-curated teams based on known model specializations. Our findings provide a new basis for the automated design of collaborative multi-agent LLM teams.
2025-10-30T11:04:15Z
Accepted at the AAAI-26 Workshop on LLM-based Multi-Agent Systems: Towards Responsible, Reliable, and Scalable Agentic Systems (LaMAS 2026) as an oral presentation
Kotaro Furuya, Yuichi Kitagawa

http://arxiv.org/abs/2510.11618v3
StoryBox: Collaborative Multi-Agent Simulation for Hybrid Bottom-Up Long-Form Story Generation Using Large Language Models
2026-03-19T07:37:38Z
Human writers often begin their stories with an overarching mental scene, where they envision the interactions between characters and their environment. Inspired by this creative process, we propose a novel approach to long-form story generation, termed hybrid bottom-up long-form story generation, using multi-agent simulations. In our method, agents interact within a dynamic sandbox environment, where their behaviors and interactions with one another and the environment generate emergent events. These events form the foundation for the story, enabling organic character development and plot progression. 
Unlike traditional top-down approaches that impose rigid structures, our hybrid bottom-up approach allows for the natural unfolding of events, fostering more spontaneous and engaging storytelling. The system is capable of generating stories exceeding 10,000 words while maintaining coherence and consistency, addressing some of the key challenges faced by current story generation models. We achieve state-of-the-art performance across several metrics. This approach offers a scalable and innovative solution for creating dynamic, immersive long-form stories that evolve organically from agent-driven interactions.
2025-10-13T16:57:32Z
Accepted by AAAI 2026. Project: https://storyboxproject.github.io
Proceedings of the AAAI Conference on Artificial Intelligence, 2026, 40(36), 30359-30367
Zehao Chen, Rong Pan, Haoran Li
10.1609/aaai.v40i36.40288

http://arxiv.org/abs/2603.18563v1
Reasonably reasoning AI agents can avoid game-theoretic failures in zero-shot, provably
2026-03-19T07:24:39Z
AI agents are increasingly deployed in interactive economic environments characterized by repeated AI-AI interactions. Despite AI agents' advanced capabilities, empirical studies reveal that such interactions often fail to stably induce a strategic equilibrium, such as a Nash equilibrium. Post-training methods have been proposed to induce a strategic equilibrium; however, it remains impractical to uniformly apply an alignment method across diverse, independently developed AI models in strategic settings. In this paper, we provide theoretical and empirical evidence that off-the-shelf reasoning AI agents can achieve Nash-like play zero-shot, without explicit post-training. Specifically, we prove that 'reasonably reasoning' agents, i.e., agents capable of forming beliefs about others' strategies from previous observation and learning to best respond to these beliefs, eventually behave along almost every realized play path in a way that is weakly close to a Nash equilibrium of the continuation game. 
In addition, we relax the common-knowledge payoff assumption by allowing stage payoffs to be unknown and by having each agent observe only its own privately realized stochastic payoffs, and we show that we can still achieve the same on-path Nash convergence guarantee. We then empirically validate the proposed theories by simulating five game scenarios, ranging from a repeated prisoner's dilemma game to stylized repeated marketing promotion games. Our findings suggest that AI agents naturally exhibit such reasoning patterns and therefore attain stable equilibrium behaviors intrinsically, obviating the need for universal alignment procedures in many real-world strategic interactions.
2026-03-19T07:24:39Z
Enoch Hyunwook Kang

http://arxiv.org/abs/2603.18503v1
Computationally Efficient Density-Driven Optimal Control via Analytical KKT Reduction and Contractive MPC
2026-03-19T05:23:21Z
Efficient coordination for collective spatial distribution is a fundamental challenge in multi-agent systems. Prior research on Density-Driven Optimal Control (D2OC) established a framework to match agent trajectories to a desired spatial distribution. However, implementing this as a predictive controller requires solving a large-scale Karush-Kuhn-Tucker (KKT) system, whose computational complexity grows cubically with the prediction horizon. To resolve this, we propose an analytical structural reduction that transforms the T-horizon KKT system into a condensed quadratic program (QP). This formulation achieves O(T) linear scalability, significantly reducing the online computational burden compared to conventional O(T^3) approaches. Furthermore, to ensure rigorous convergence in dynamic environments, we incorporate a contractive Lyapunov constraint and prove the Input-to-State Stability (ISS) of the closed-loop system against reference propagation drift. 
Numerical simulations verify that the proposed method facilitates rapid density coverage with substantial computational speed-up, enabling long-horizon predictive control for large-scale multi-agent swarms.
2026-03-19T05:23:21Z
Julian Martinez, Kooktae Lee

http://arxiv.org/abs/2509.14295v5
Aegis: Automated Error Generation and Attribution for Multi-Agent Systems
2026-03-19T04:45:30Z
Large language model based multi-agent systems (MAS) have unlocked significant advancements in tackling complex problems, but their increasing capability introduces a structural fragility that makes them difficult to debug. A key obstacle to improving their reliability is the severe scarcity of large-scale, diverse datasets for error attribution, as existing resources rely on costly and unscalable manual annotation. To address this bottleneck, we introduce Aegis, a novel framework for Automated error generation and attribution for multi-agent systems. Aegis constructs a large dataset of 9,533 trajectories with annotated faulty agents and error modes, covering diverse MAS architectures and task domains. This is achieved using an LLM-based manipulator that can adaptively inject context-aware errors into successful execution trajectories. Leveraging fine-grained labels and the structured arrangement of positive-negative sample pairs, Aegis supports three different learning paradigms: Supervised Fine-Tuning, Reinforcement Learning, and Contrastive Learning. We develop learning methods for each paradigm. Comprehensive experiments show that trained models consistently achieve substantial improvements in error attribution. Notably, several of our fine-tuned LLMs demonstrate performance competitive with or superior to proprietary models an order of magnitude larger, validating our automated data generation framework as a crucial resource for developing more robust and interpretable multi-agent systems. 
Our project website is available at https://kfq20.github.io/Aegis-Website/.
2025-09-17T02:31:03Z
Fanqi Kong, Ruijie Zhang, Huaxiao Yin, Guibin Zhang, Xiaofei Zhang, Ziang Chen, Zhaowei Zhang, Xiaoyuan Zhang, Song-Chun Zhu, Xue Feng

http://arxiv.org/abs/2512.18561v3
Adaptive Accountability in Networked MAS: Tracing and Mitigating Emergent Norms at Scale
2026-03-19T04:07:22Z
Large-scale networked multi-agent systems increasingly underpin critical infrastructure, yet their collective behavior can drift toward undesirable emergent norms such as collusion, resource hoarding, and implicit unfairness. We present the Adaptive Accountability Framework (AAF), an end-to-end runtime layer that (i) records cryptographically verifiable interaction provenance, (ii) detects distributional change points in streaming traces, (iii) attributes responsibility via a causal influence graph, and (iv) applies cost-bounded interventions (reward shaping and targeted policy patching) to steer the system back toward compliant behavior. We establish a bounded-compromise guarantee: if the expected cost of intervention exceeds an adversary's expected payoff, the long-run fraction of compromised interactions converges to a value strictly below one. We evaluate AAF in a large-scale factorial simulation suite (87,480 runs across two tasks; up to 100 agents plus a 500-agent scaling sweep; full and partial observability; Byzantine rates up to 10%; 10 seeds per regime). Across 324 regimes, AAF lowers the executed compromise ratio relative to a Proximal Policy Optimization baseline in 96% of regimes (median relative reduction 11.9%) while preserving social welfare (median change 0.4%). 
Under adversarial injections, AAF detects norm violations with a median delay of 71 steps (interquartile range 39-177) and achieves a mean top-ranked attribution accuracy of 0.97 at 10% Byzantine rate.
2025-12-21T02:04:47Z
Saad Alqithami

http://arxiv.org/abs/2603.18407v1
Interleaved Information Structures in Dynamic Games: A General Framework with Application to the Linear-Quadratic Case
2026-03-19T02:11:28Z
A fundamental problem in noncooperative dynamic game theory is the computation of Nash equilibria under different information structures, which specify the information available to each agent during decision-making. Prior work has extensively studied equilibrium solutions for two canonical information structures: feedback, where agents observe the current state at each time, and open-loop, where agents only observe the initial state. However, these paradigms are often too restrictive to capture realistic settings exhibiting interleaved information structures, in which each agent observes only a subset of other agents at every timestep. To date, there is no systematic framework for modeling and solving dynamic games under arbitrary interleaved information structures. To this end, we make two main contributions. First, we introduce a method to model deterministic dynamic games with arbitrary interleaved information structures as Mathematical Program Networks (MPNs), where the network structure encodes the informational dependencies between agents. Second, for linear-quadratic (LQ) dynamic games, we leverage the MPN formulation to develop a systematic procedure for deriving Riccati-like equations that characterize Nash equilibria. 
Finally, we illustrate our approach through an example involving three agents exhibiting a cyclic information structure.
2026-03-19T02:11:28Z
6 pages, 3 figures
Janani S K, Kushagra Gupta, Ufuk Topcu, David Fridovich-Keil

http://arxiv.org/abs/2603.05789v3
The Coordination Gap: Multi-Agent Alternation Metrics for Temporal Fairness in Repeated Games
2026-03-19T02:04:50Z
Multi-agent coordination dilemmas expose a fundamental tension between individual optimization and collective welfare, yet characterizing such coordination requires metrics sensitive to temporal structure and collective dynamics. As a diagnostic testbed, we study a BoE-derived multi-agent variant of the Battle of the Exes, formalizing it as a Markov game in which turn-taking emerges as a periodic coordination regime. Conventional outcome-based metrics (e.g., efficiency and min/max fairness) are temporally blind (they cannot distinguish structured alternation from monopolistic or random access patterns) and fairness ratios lose discriminative power as n grows, obscuring inequities.
To address this limitation, we introduce Perfect Alternation (PA) as a reference coordination regime and propose six novel Alternation (ALT) metrics designed as temporally sensitive observables of coordination quality. Using Q-learning agents as a minimal adaptive diagnostic baseline, and comparing against random-policy null processes, we uncover a clear measurement failure: despite exhibiting deceptively high traditional metrics (e.g., reward fairness often exceeding 0.9), learned policies perform up to 81% below random baselines under ALT-variant evaluation, a deficit already present in the two-agent case and intensifying as n grows.
These results demonstrate, in this setting, that high aggregate payoffs can coexist with poor temporal coordination, and that conventional metrics may severely mischaracterize emergent dynamics. Our findings underscore the necessity of temporally aware observables for analyzing coordination in multi-agent games and highlight random-policy baselines as essential null processes for interpreting coordination outcomes relative to chance-level behavior.
2026-03-06T00:43:53Z
41 pages, 5 figures, 4 tables, 1 supplementary pdf. Submitted to Social Choice & Welfare
Nikolaos Al. Papadopoulos, Konstantinos Psannis