https://arxiv.org/api/9M8mBCUvt58nTMJIX0Lv6xavshI 2026-03-28T10:44:39Z 11628 105 15 http://arxiv.org/abs/2603.18330v1 MemArchitect: A Policy Driven Memory Governance Layer 2026-03-18T22:37:05Z Persistent Large Language Model (LLM) agents expose a critical governance gap in memory management. Standard Retrieval-Augmented Generation (RAG) frameworks treat memory as passive storage, lacking mechanisms to resolve contradictions, enforce privacy, or prevent outdated information ("zombie memories") from contaminating the context window. We introduce MemArchitect, a governance layer that decouples memory lifecycle management from model weights. MemArchitect enforces explicit, rule-based policies, including memory decay, conflict resolution, and privacy controls. We demonstrate that governed memory consistently outperforms unmanaged memory in agentic settings, highlighting the necessity of structured memory governance for reliable and safe autonomous systems. 2026-03-18T22:37:05Z This is ongoing research work and will be updated periodically Lingavasan Suresh Kumar Yang Ba Rong Pan http://arxiv.org/abs/2504.09022v2 Game-Theoretic Coordination for Time-Critical Missions of UAV Systems 2026-03-18T22:14:39Z Coordinated missions involving Unmanned Aerial Vehicles (UAVs) in dynamic environments pose significant challenges in maintaining both coordination and agility. In this paper, relying on the cooperative path following framework and using a game-theoretic formulation, we introduce a novel and scalable approach in which each UAV acts autonomously in different mission conditions. This formulation naturally accommodates heterogeneous and time-varying objectives across the system. In our setting, each UAV optimizes a cost function that incorporates temporal and mission-specific constraints. The optimization is performed within a one-dimensional domain, significantly reducing the computational cost and enabling real-time application to complex and dynamic scenarios. 
The framework is distributed in structure, enabling global, system-wide coordination (a Nash equilibrium) by using only local information. For ideal systems, we prove that the Nash equilibrium exists and exhibits exponential convergence. Furthermore, we invoke model predictive control (MPC) for non-ideal scenarios. In particular, we propose a discrete-time optimization approach that tackles path-following errors and communication failures, ensuring reliable and agile performance in dynamic and uncertain environments. Simulation results demonstrate the effectiveness and agility of the approach in ensuring successful mission execution across diverse realistic scenarios. 2025-04-12T00:21:11Z Revised version with improved exposition, expanded introduction, updated abstract, minor corrections and updated author list Mikayel Aramyan Anna Manucharyan Lusine Poghosyan Tigran Bakaryan Naira Hovakimyan http://arxiv.org/abs/2603.17907v1 Actionable Recourse in Competitive Environments: A Dynamic Game of Endogenous Selection 2026-03-18T16:45:11Z Actionable recourse studies whether individuals can modify feasible features to overturn unfavorable outcomes produced by AI-assisted decision-support systems. However, many such systems operate in competitive settings, such as admission or hiring, where only a fraction of candidates can succeed. A fundamental question arises: what happens when actionable recourse is available to everyone in a competitive environment? This study proposes a framework that models recourse as a strategic interaction among candidates under a risk-based selection rule. Rejected individuals exert effort to improve actionable features along directions implied by the decision rule, while the success benchmark evolves endogenously as many candidates adjust simultaneously. This creates endogenous selection, in which both the decision rule and the selection threshold are determined by the population's current feature state. 
This interaction generates a closed-loop dynamical system linking candidate selection and strategic recourse. We show that the initially selected candidates determine both the benchmark of success and the direction of improvement, thereby amplifying initial disparities and producing persistent performance gaps across the population. 2026-03-18T16:45:11Z Ya-Ting Yang Quanyan Zhu http://arxiv.org/abs/2505.08448v2 Scalable UAV Multi-Hop Networking via Multi-Agent Reinforcement Learning with Large Language Models 2026-03-18T16:10:32Z In disaster scenarios, establishing robust emergency communication networks is critical, and unmanned aerial vehicles (UAVs) offer a promising solution to rapidly restore connectivity. However, organizing UAVs to form multi-hop networks in large-scale dynamic environments presents significant challenges, including limitations in algorithmic scalability and the vast exploration space required for coordinated decision-making. To address these issues, we propose MRLMN, a novel framework that integrates multi-agent reinforcement learning (MARL) and large language models (LLMs) to jointly optimize UAV agents toward achieving optimal networking performance. The framework incorporates a grouping strategy with reward decomposition to enhance algorithmic scalability and balance decision-making across UAVs. In addition, behavioral constraints are applied to selected key UAVs to improve the robustness of the network. Furthermore, the framework integrates LLM agents, leveraging knowledge distillation to transfer their high-level decision-making capabilities to MARL agents. This enhances both the efficiency of exploration and the overall training process. In the distillation module, a Hungarian algorithm-based matching scheme is applied to align the decision outputs of the LLM and MARL agents and define the distillation loss. 
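The matching step named in the preceding sentence can be illustrated with a small minimum-cost assignment. This is only a sketch under stated assumptions: the abstract does not specify the cost or the distillation loss, so the `cost` matrix below is hypothetical, and the exhaustive search is a stand-in that solves the same assignment problem the Hungarian algorithm solves in polynomial time (in practice a library routine such as SciPy's `linear_sum_assignment` would be used instead).

```python
from itertools import permutations


def min_cost_matching(cost):
    """Exhaustively find the one-to-one assignment with the lowest total cost.

    cost[i][j] is the (hypothetical) mismatch between LLM decision i and
    MARL decision j. Returns (assignment, total) where assignment[i] = j.
    This brute-force search is O(n!) and only suitable for tiny examples.
    """
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total


# Hypothetical mismatch scores, not taken from the paper.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total = min_cost_matching(cost)
print(assignment, total)
```

Once the outputs are paired this way, a per-pair discrepancy could be summed into a loss; how MRLMN actually defines that loss is not stated in the abstract.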
Extensive simulation results validate the effectiveness of our approach, demonstrating significant improvements in network performance over the MAPPO baseline and other comparison methods, including enhanced coverage and communication quality. 2025-05-13T11:23:25Z 18 pages, 23 figures Yanggang Xu Jirong Zha Weijie Hong Xiangmin Yi Geng Chen Jianfeng Zheng Chen-Chun Hsia Xinlei Chen http://arxiv.org/abs/2603.14697v2 Forecast-Aware Cooperative Planning on Temporal Graphs under Stochastic Adversarial Risk 2026-03-18T15:50:43Z Cooperative multi-robot missions often require teams of robots to traverse environments where traversal risk evolves due to adversary patrols or shifting hazards with stochastic dynamics. While support coordination--where robots assist teammates in traversing risky regions--can significantly reduce mission costs, its effectiveness depends on the team's ability to anticipate future risk. Existing support-based frameworks assume static risk landscapes and therefore fail to account for predictable temporal trends in risk evolution. We propose a forecast-aware cooperative planning framework that integrates stochastic risk forecasting with anticipatory support allocation on temporal graphs. By modeling adversary dynamics as a first-order Markov stay-move process over graph edges, we propagate the resulting edge-occupancy probabilities forward in time to generate time-indexed edge-risk forecasts. These forecasts guide the proactive allocation of support positions to forecasted risky edges for effective support coordination, while also informing joint robot path planning. Experimental results demonstrate that our approach consistently reduces total expected team cost compared to non-anticipatory baselines, approaching the performance of an oracle planner. 2026-03-16T01:06:58Z Manshi Limbu Xuan Wang Gregory J. 
Stein Daigo Shishika Xuesu Xiao http://arxiv.org/abs/2603.17787v1 Governed Memory: A Production Architecture for Multi-Agent Workflows 2026-03-18T14:49:31Z Enterprise AI deploys dozens of autonomous agent nodes across workflows, each acting on the same entities with no shared memory and no common governance. We identify five structural challenges arising from this memory governance gap: memory silos across agent workflows; governance fragmentation across teams and tools; unstructured memories unusable by downstream systems; redundant context delivery in autonomous multi-step executions; and silent quality degradation without feedback loops. We present Governed Memory, a shared memory and governance layer addressing this gap through four mechanisms: a dual memory model combining open-set atomic facts with schema-enforced typed properties; tiered governance routing with progressive context delivery; reflection-bounded retrieval with entity-scoped isolation; and a closed-loop schema lifecycle with AI-assisted authoring and automated per-property refinement. We validate each mechanism through controlled experiments (N=250, five content types): 99.6% fact recall with complementary dual-modality coverage; 92% governance routing precision; 50% token reduction from progressive delivery; zero cross-entity leakage across 500 adversarial queries; 100% adversarial governance compliance; and output quality saturation at approximately seven governed memories per entity. On the LoCoMo benchmark, the architecture achieves 74.8% overall accuracy, confirming that governance and schema enforcement impose no retrieval quality penalty. The system is in production at Personize.ai. 2026-03-18T14:49:31Z 18 pages, 4 figures, 11 tables, 7 appendices. 
Code and datasets: https://github.com/personizeai/governed-memory Hamed Taheri http://arxiv.org/abs/2603.18096v1 A Trace-Based Assurance Framework for Agentic AI Orchestration: Contracts, Testing, and Governance 2026-03-18T10:23:48Z In Agentic AI, Large Language Models (LLMs) are increasingly used in the orchestration layer to coordinate multiple agents and to interact with external services, retrieval components, and shared memory. In this setting, failures are not limited to incorrect final outputs. They also arise from long-horizon interaction, stochastic decisions, and external side effects (such as API calls, database writes, and message sends). Common failures include non-termination, role drift, propagation of unsupported claims, and attacks via untrusted context or external channels. This paper presents an assurance framework for such Agentic AI systems. Executions are instrumented as Message-Action Traces (MAT) with explicit step and trace contracts. Contracts provide machine-checkable verdicts, localize the first violating step, and support deterministic replay. The framework includes stress testing, formulated as a budgeted counterexample search over bounded perturbations. It also supports structured fault injection at service, retrieval, and memory boundaries to assess containment under realistic operational faults and degraded conditions. Finally, governance is treated as a runtime component, enforcing per-agent capability limits and action mediation (allow, rewrite, block) at the language-to-action boundary. To support comparative evaluations across stochastic seeds, models, and orchestration configurations, the paper defines trace-based metrics for task success, termination reliability, contract compliance, factuality indicators, containment rate, and governance outcome distributions. 
More broadly, the framework is intended as a common abstraction to support testing and evaluation of multi-agent LLM systems, and to facilitate reproducible comparison across orchestration designs and configurations. 2026-03-18T10:23:48Z Ciprian Paduraru Petru-Liviu Bouruc Alin Stefanescu http://arxiv.org/abs/2603.17564v1 In Trust We Survive: Emergent Trust Learning 2026-03-18T10:12:54Z We introduce Emergent Trust Learning (ETL), a lightweight, trust-based control algorithm that can be plugged into existing AI agents. It enables these to reach cooperation in competitive game environments under shared resources. Each agent maintains a compact internal trust state, which modulates memory, exploration, and action selection. ETL requires only individual rewards and local observations and incurs negligible computational and communication overhead. We evaluate ETL in three environments: In a grid-based resource world, trust-based agents reduce conflicts and prevent long-term resource depletion while achieving competitive individual returns. In a hierarchical Tower environment with strong social dilemmas and randomised floor assignments, ETL sustains high survival rates and recovers cooperation even after extended phases of enforced greed. In the Iterated Prisoner's Dilemma, the algorithm generalises to a strategic meta-game, maintaining cooperation with reciprocal opponents while avoiding long-term exploitation by defectors. Code will be released upon publication. 2026-03-18T10:12:54Z Qianpu Chen Giulio Barbero Mike Preuss Derya Soydaner http://arxiv.org/abs/2603.17472v1 Bringing Network Coding into Multi-Robot Systems: Interplay Study for Autonomous Systems over Wireless Communications 2026-03-18T08:26:17Z Communication is a core enabler for multi-robot systems (MRS), providing the mechanism through which robots exchange state information, coordinate actions, and satisfy safety constraints. 
While many MRS autonomy algorithms assume reliable and timely message delivery, realistic wireless channels introduce delay, erasures, and ordering stalls that can degrade performance and compromise safety-critical decisions of the robot task. In this paper, we investigate how transport-layer reliability mechanisms that mitigate communication losses and delays shape the autonomy-communication loop. We show that conventional non-coded retransmission-based protocols introduce long delays that are misaligned with the timeliness requirements of MRS applications, and may render the received data irrelevant. As an alternative, we advocate for adaptive and causal network coding, which proactively injects coded redundancy to achieve the desired delay and throughput that enable relevant data delivery to the robotic task. Specifically, this method adapts to channel conditions between robots and causally tunes the communication rates via efficient algorithms. We present two case studies: cooperative localization under delayed and lossy inter-robot communication, and a safety-critical overtaking maneuver where timely vehicle-to-vehicle message availability determines whether an ego vehicle can abort to avoid a crash. Our results demonstrate that coding-based communication significantly reduces in-order delivery stalls, preserves estimation consistency under delay, and improves deadline reliability relative to retransmission-based transport. Overall, the study highlights the need to jointly design autonomy algorithms and communication mechanisms, and positions network coding as a principled tool for dependable multi-robot operation over wireless networks. 
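To make the idea of proactively injected coded redundancy concrete, the sketch below uses the simplest possible erasure code: one XOR parity packet sent alongside a block of data packets, so that any single lost packet can be rebuilt at the receiver without waiting for a retransmission. It is an illustrative assumption, not the adaptive and causal network coding scheme studied in the paper; the function names and the equal-length packets are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(packets):
    """Append one parity packet: the XOR of all data packets."""
    parity = reduce(xor_bytes, packets)
    return list(packets) + [parity]


def recover(received):
    """Rebuild the data packets from a coded block with at most one erasure.

    `received` holds the data packets followed by the parity packet;
    an erased packet is represented by None.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("one parity packet can repair at most one erasure")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of every surviving packet equals the erased one.
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired[:-1]  # drop the parity packet


packets = [b"pose", b"velo", b"plan"]
coded = encode(packets)
coded[1] = None  # simulate a single erasure on the wireless link
restored = recover(coded)
print(restored == packets)
```

The receiver repairs the loss locally, which is the property that avoids the in-order delivery stalls of retransmission-based transport; real schemes add redundancy adaptively rather than at a fixed rate.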
2026-03-18T08:26:17Z Anil Zaher Kiril Solovey Alejandro Cohen http://arxiv.org/abs/2508.11401v4 FACET: Teacher-Centred LLM-Based Multi-Agent Systems-Towards Personalized Educational Worksheets 2026-03-18T07:32:06Z The increasing heterogeneity of student populations poses significant challenges for teachers, particularly in mathematics education, where cognitive, motivational, and emotional differences strongly influence learning outcomes. While AI-driven personalization tools have emerged, most remain performance-focused, offering limited support for teachers and neglecting broader pedagogical needs. This paper presents the FACET framework, a teacher-facing, large language model (LLM)-based multi-agent system designed to generate individualized classroom materials that integrate both cognitive and motivational dimensions of learner profiles. The framework comprises three specialized agents: (1) learner agents that simulate diverse profiles incorporating topic proficiency and intrinsic motivation, (2) a teacher agent that adapts instructional content according to didactical principles, and (3) an evaluator agent that provides automated quality assurance. We tested the system using authentic grade 8 mathematics curriculum content and evaluated its feasibility through a) automated agent-based assessment of output quality and b) exploratory feedback from K-12 in-service teachers. Results from ten internal evaluations highlighted high stability and alignment between generated materials and learner profiles, and teacher feedback particularly highlighted structure and suitability of tasks. The findings demonstrate the potential of multi-agent LLM architectures to provide scalable, context-aware personalization in heterogeneous classroom settings, and outline directions for extending the framework to richer learner profiles and real-world classroom trials. 
2025-08-15T11:10:40Z Jana Gonnermann-Müller Jennifer Haase Konstantin Fackeldey Sebastian Pokutta http://arxiv.org/abs/2311.17697v3 Swarm Self Clustering for Communication denied Environments without Global Positioning 2026-03-18T07:05:15Z In this work, we investigate swarm self-clustering, where robots autonomously organize into spatially coherent groups using only local sensing and decision-making, without external commands, global positioning, or inter-robot communication. Each robot forms and maintains clusters by responding to relative distances from nearby neighbors detected through onboard range sensors with limited fields of view. The method is suited for GPS-denied and communication-constrained environments and requires no prior knowledge of cluster size, number, or membership. A mechanism enables robots to alternate between consensus-based and random goal assignment based on local neighborhood size, ensuring robustness, scalability, and untraceable clustering independent of initial conditions. Extensive simulations and real-robot experiments demonstrate empirical convergence, adaptability to dynamic additions, and improved performance over local-only baselines across standard cluster quality metrics. 2023-11-29T15:03:14Z 36 Pages, 15 figures, 8 tables, pre-print version Sweksha Jain Rugved Katole Leena Vachhani http://arxiv.org/abs/2603.17392v1 Agentic Cognitive Profiling: Realigning Automated Alzheimer's Disease Detection with Clinical Construct Validity 2026-03-18T06:15:35Z Automated Alzheimer's Disease (AD) screening has predominantly followed the inductive paradigm of pattern recognition, which directly maps the input signal to the outcome label. This paradigm sacrifices construct validity of clinical protocol for statistical shortcuts. This paper proposes Agentic Cognitive Profiling (ACP), an agentic framework that realigns automated screening with clinical protocol logic across multiple cognitive domains. 
Rather than learning opaque mappings from transcripts to labels, the framework decomposes standardized assessments into atomic cognitive tasks and orchestrates specialized LLM agents to extract verifiable scoring primitives. Central to our design is decoupling semantic understanding from measurement by delegating all quantification to deterministic function calling, thereby mitigating hallucination and restoring construct validity. Unlike popular datasets that typically comprise around a hundred participants under a single task, we evaluate on a clinically annotated corpus of 402 participants across eight structured cognitive tasks spanning multiple cognitive domains. The framework achieves 90.5% score match rate in task examination and 85.3% accuracy in AD prediction, surpassing popular baselines while generating interpretable cognitive profiles grounded in behavioral evidence. This work demonstrates that construct validity and predictive performance need not be traded off, charting a path toward AD screening systems that explain rather than merely predict. 2026-03-18T06:15:35Z Jiawen Kang Kun Li Dongrui Han Jinchao Li Junan Li Lingwei Meng Xixin Wu Helen Meng http://arxiv.org/abs/2510.19995v2 Communication to Completion: Modeling Collaborative Workflows with Intelligent Multi-Agent Communication 2026-03-18T05:14:57Z Multi-agent LLM systems have demonstrated impressive capabilities in complex collaborative tasks, yet most frameworks treat communication as instantaneous and free, overlooking a fundamental constraint in real-world teamwork: collaboration cost. We propose a scalable framework implemented via Communication to Completion (C2C), which explicitly models communication as a constrained resource with realistic temporal costs. We introduce the Alignment Factor (AF), a dynamic metric inspired by Shared Mental Models, to quantify the link between task understanding and work efficiency. 
Through experiments on 15 software engineering workflows spanning three complexity tiers and team sizes from 5 to 17 agents, we demonstrate that cost-aware strategies achieve over 40% higher efficiency compared to unconstrained interaction. Our analysis reveals emergent coordination patterns: agents naturally adopt manager-centric hub-and-spoke topologies, strategically escalate from asynchronous to synchronous channels based on complexity, and prioritize high-value help requests. These patterns remain consistent across multiple frontier models (GPT-5.2, Claude Sonnet 4.5, Gemini 2.5 Pro). This study moves beyond simple agent construction, offering a theoretical foundation for quantifying and optimizing the dynamics of collaboration in future digital workplaces. 2025-10-22T19:48:17Z 13 pages Yiming Lu Xun Wang Simin Ma Shujian Liu Sathish Reddy Indurthi Song Wang Haoyun Deng Fei Liu Kaiqiang Song http://arxiv.org/abs/2603.17335v1 Distributed Equilibrium-Seeking in Target Coverage Games via Self-Configurable Networks under Limited Communication 2026-03-18T04:02:57Z We study a target coverage problem in which a team of sensing agents, operating under limited communication, must collaboratively monitor targets that may be adaptively repositioned by an attacker. We model this interaction as a zero-sum game between the sensing team (known as the defender) and the attacker. However, computing an exact Nash equilibrium (NE) for this game is computationally prohibitive as the action space of the defender grows exponentially with the number of sensors and their possible orientations. Exploiting the submodularity property of the game's utility function, we propose a distributed framework that enables agents to self-configure their communication neighborhoods under bandwidth constraints and collaboratively maximize the target coverage. We establish theoretical guarantees showing that the resulting sensing strategies converge to an approximate NE of the game. 
To our knowledge, this is the first distributed, communication-aware approach that scales effectively for games with combinatorial action spaces while explicitly incorporating communication constraints. To this end, we leverage the distributed bandit-submodular optimization framework and the notion of Value of Coordination that were introduced in [1]. Through simulations, we show that our approach attains near-optimal game value and higher target coverage compared to baselines. 2026-03-18T04:02:57Z Jayanth Bhargav Zirui Xu Vasileios Tzoumas Mahsa Ghasemi Shreyas Sundaram http://arxiv.org/abs/2603.17309v1 ReLMXEL: Adaptive RL-Based Memory Controller with Explainable Energy and Latency Optimization 2026-03-18T03:07:54Z Reducing latency and energy consumption is critical to improving the efficiency of memory systems in modern computing. This work introduces ReLMXEL (Reinforcement Learning for Memory Controller with Explainable Energy and Latency Optimization), an explainable multi-agent online reinforcement learning framework that dynamically optimizes memory controller parameters using reward decomposition. ReLMXEL operates within the memory controller, leveraging detailed memory behavior metrics to guide decision-making. Experimental evaluations across diverse workloads demonstrate consistent performance gains over baseline configurations, with refinements driven by workload-specific memory access behaviour. By incorporating explainability into the learning process, ReLMXEL not only enhances performance but also increases the transparency of control decisions, paving the way for more accountable and adaptive memory system designs. 2026-03-18T03:07:54Z Panuganti Chirag Sai Gandholi Sarat R. Raghunatha Sarma Venkata Kalyan Tavva Naveen M