https://arxiv.org/api/ma0Ld0R+WgKcj20MpsCPjoTrChA 2026-03-22T08:37:42Z 43966 60 15 http://arxiv.org/abs/2603.17836v1 Verification and Validation of Physics-Informed Surrogate Component Models for Dynamic Power-System Simulation 2026-03-18T15:28:18Z Physics-informed machine learning surrogates are increasingly explored to accelerate dynamic simulation of generators, converters, and other power grid components. The key question, however, is not only whether a surrogate matches a stand-alone component model on average, but whether it remains accurate after insertion into a differential-algebraic simulator, where the surrogate outputs enter the algebraic equations coupling the component to the rest of the system. This paper formulates that in-simulator use as a verification and validation (V&V) problem. A finite-horizon bound is derived that links allowable component-output error to algebraic-coupling sensitivity, dynamic error amplification, and the simulation horizon. Two complementary settings are then studied: model-based verification against a reference component solver, and data-based validation through conformal calibration of the component-output variables exchanged with the simulator. The framework is general, but the case study focuses on physics-informed neural-network surrogates of second-, fourth-, and sixth-order synchronous-machine models. Results show that good stand-alone surrogate accuracy does not by itself guarantee accurate in-simulator behavior, that the largest discrepancies concentrate in stressed operating regions, and that small equation residuals do not necessarily imply small state-trajectory errors.
2026-03-18T15:28:18Z Petros Ellinas Indrajit Chaudhuri Johanna Vorwerk Spyros Chatzivasileiadis http://arxiv.org/abs/2510.14959v5 CBF-RL: Safety Filtering Reinforcement Learning in Training with Control Barrier Functions 2026-03-18T15:28:16Z Reinforcement learning (RL), while powerful and expressive, can often prioritize performance at the expense of safety. Yet safety violations can lead to catastrophic outcomes in real-world deployments. Control Barrier Functions (CBFs) offer a principled method to enforce dynamic safety -- traditionally deployed online via safety filters. While the result is safe behavior, the fact that the RL policy does not have knowledge of the CBF can lead to conservative behaviors. This paper proposes CBF-RL, a framework for generating safe behaviors with RL by enforcing CBFs in training. CBF-RL has two key attributes: (1) minimally modifying a nominal RL policy to encode safety constraints via a CBF term, and (2) safety filtering of the policy rollouts in training. Theoretically, we prove that continuous-time safety filters can be deployed via closed-form expressions on discrete-time rollouts. Practically, we demonstrate that CBF-RL internalizes the safety constraints in the learned policy -- both enforcing safer actions and biasing towards safer rewards -- enabling safe deployment without the need for an online safety filter. We validate our framework through ablation studies on navigation tasks and on the Unitree G1 humanoid robot, where CBF-RL enables safer exploration, faster convergence, and robust performance under uncertainty, enabling the humanoid robot to avoid obstacles and climb stairs safely in real-world settings without a runtime safety filter. 2025-10-16T17:58:58Z To appear at ICRA 2026; sample code for the navigation example with CBF-RL reward core construction can be found at https://github.com/lzyang2000/cbf-rl-navigation-demo Lizhi Yang Blake Werner Massimiliano de Sa Aaron D. 
Ames http://arxiv.org/abs/2508.19945v5 Constraint Learning in Multi-Agent Dynamic Games from Demonstrations of Local Nash Interactions 2026-03-18T14:56:02Z We present an inverse dynamic game-based algorithm to learn parametric constraints from a given dataset of local Nash equilibrium interactions between multiple agents. Specifically, we introduce mixed-integer linear programs (MILP) encoding the Karush-Kuhn-Tucker (KKT) conditions of the interacting agents, which recover constraints consistent with the local Nash stationarity of the interaction demonstrations. We establish theoretical guarantees that our method learns inner approximations of the true safe and unsafe sets. We also use the interaction constraints recovered by our method to design motion plans that robustly satisfy the underlying constraints. Across simulations and hardware experiments, our methods accurately inferred constraints and designed safe interactive motion plans for various classes of constraints, both convex and non-convex, from interaction demonstrations of agents with nonlinear dynamics. 2025-08-27T15:01:09Z Zhouyu Zhang Chih-Yuan Chiu Glen Chou http://arxiv.org/abs/2603.17802v1 An HMDP-MPC Decision-making Framework with Adaptive Safety Margins and Hysteresis for Autonomous Driving 2026-03-18T14:55:58Z This paper presents a unified decision-making framework that integrates Hybrid Markov Decision Processes (HMDPs) with Model Predictive Control (MPC), augmented by velocity-dependent safety margins and a prediction-aware hysteresis mechanism. Both the ego and surrounding vehicles are modeled as HMDPs, allowing discrete maneuver transition and kinematic evolution to be jointly considered within the MPC optimization. Safety margins derived from the Intelligent Driver Model (IDM) adapt to traffic context but vary with speed, which can cause oscillatory decisions and velocity fluctuations. 
To mitigate this, we propose a frozen-release hysteresis mechanism with distinct trigger and release thresholds, effectively enlarging the reaction buffer and suppressing oscillations. Decision continuity is further safeguarded by a two-layer recovery scheme: a global bounded relaxation tied to IDM margins and a deterministic fallback policy. The framework is evaluated through a case study, an ablation against a no-hysteresis baseline, and large-scale randomized experiments across 18 traffic settings. Across 8,050 trials, it achieves a collision rate of only 0.05%, with 98.77% of decisions resolved by nominal MPC and minimal reliance on relaxation or fallback. These results demonstrate the robustness and adaptability of the proposed decision-making framework in heterogeneous traffic conditions. 2026-03-18T14:55:58Z 8 pages, 6 figures, to be published in ICRA 2026 proceedings Siyuan Li Loughborough University Chengyuan Liu Loughborough University Wen-Hua Chen The Hong Kong Polytechnic University http://arxiv.org/abs/2512.23636v3 NashOpt - A Python Library for Computing Generalized Nash Equilibria 2026-03-18T14:54:46Z NashOpt is an open-source Python library for computing and designing generalized Nash equilibria (GNEs) in noncooperative games with shared constraints and real-valued decision variables. The library exploits the joint Karush-Kuhn-Tucker (KKT) conditions of all players to handle both general nonlinear GNEs and linear-quadratic games, including their variational versions. Nonlinear games are solved via nonlinear least-squares formulations, relying on JAX for automatic differentiation. Linear-quadratic GNEs are reformulated as mixed-integer linear programs, enabling efficient computation of multiple equilibria. The framework also supports inverse-game and Stackelberg game-design problems. 
The capabilities of NashOpt are demonstrated through several examples, including noncooperative game-theoretic control problems of linear quadratic regulation and model predictive control. The library is available at https://github.com/bemporad/nashopt 2025-12-29T17:49:09Z 24 pages, 7 figures Alberto Bemporad http://arxiv.org/abs/2603.17780v1 Data-Driven Predictive Control for Stochastic Descriptor Systems: An Innovation-Based Approach Handling Non-Causal Dynamics 2026-03-18T14:45:51Z Descriptor systems arise naturally in applications governed by algebraic constraints, such as power networks and chemical processes. The singular system matrix in descriptor systems may introduce non-causal dynamics, where the current output depends on future inputs and, in the presence of stochastic process and measurement noise, on future noise realizations as well. This paper proposes a data-driven predictive control framework for stochastic descriptor systems that accommodates algebraic constraints and impulsive modes without explicit system identification. A causal innovation representation is constructed by augmenting the system state with a noise buffer that encapsulates the non-causal stochastic interactions, transforming the descriptor system into an equivalent proper state-space form. Willems' Fundamental Lemma is then extended to the innovation form with fully data-verifiable conditions. Building on these results, a practical Inno-DeePC algorithm is developed that integrates offline innovation estimation and online predictive control. Numerical experiments on a direct-current (DC) microgrid demonstrate the effectiveness of the proposed approach for stochastic descriptor systems. 2026-03-18T14:45:51Z 6 pages, 2 figures Yunxiang Ma Yibo Wang Zhongmei Li Chao Shang http://arxiv.org/abs/2603.17764v1 Robust Dynamic Pricing and Admission Control with Fairness Guarantees 2026-03-18T14:23:36Z Dynamic pricing is commonly used to regulate congestion in shared service systems. 
This paper is motivated by the fact that when heterogeneous user groups (in terms of price responsiveness) are present, conventional monotonic pricing can lead to unfair outcomes by disproportionately excluding price-elastic users, particularly under high or uncertain demand. The paper's contributions are twofold. First, we show that when fairness is imposed as a hard state constraint, the optimal (revenue-maximizing) pricing policy is generally non-monotonic in demand. This structural result departs fundamentally from standard surge pricing rules and reveals that price reduction under heavy load may be necessary to maintain equitable access. Second, we address the problem that price elasticity among heterogeneous users is unobservable. To solve it, we develop a robust dynamic pricing and admission control framework that enforces resource capacity and fairness constraints for all user type distributions consistent with aggregate measurements. By integrating integral High Order Control Barrier Functions (iHOCBFs) with a worst-case robust optimization framework, we obtain a controller that guarantees forward invariance of safety and fairness constraints while optimizing revenue. Numerical experiments demonstrate improved fairness and revenue performance relative to monotonic surge pricing policies. 2026-03-18T14:23:36Z Yingqing Chen Anni Li Christos G. Cassandras Homayoun Hamedmoghadam Fabian Wirth Robert Shorten http://arxiv.org/abs/2603.17751v1 Multi-Source Human-in-the-Loop Digital Twin Testbed for Connected and Autonomous Vehicles in Mixed Traffic Flow 2026-03-18T14:14:58Z In the emerging mixed traffic environments, Connected and Autonomous Vehicles (CAVs) have to interact with surrounding human-driven vehicles (HDVs). This paper introduces MSH-MCCT (Multi-Source Human-in-the-Loop Mixed Cloud Control Testbed), a novel CAV testbed that captures complex interactions between various CAVs and HDVs. 
Utilizing the Mixed Digital Twin concept, which combines Mixed Reality with Digital Twin, MSH-MCCT integrates physical, virtual, and mixed platforms, along with multi-source control inputs. Bridged by the mixed platform, MSH-MCCT allows human drivers and CAV algorithms to operate both physical and virtual vehicles within multiple fields of view. Particularly, this testbed facilitates the coexistence and real-time interaction of physical and virtual CAVs & HDVs, significantly enhancing the experimental flexibility and scalability. Experiments on vehicle platooning in mixed traffic showcase the potential of MSH-MCCT to conduct CAV testing with multi-source real human drivers in the loop through driving simulators of diverse fidelity. The videos for the experiments are available at our project website: https://dongjh20.github.io/MSH-MCCT. 2026-03-18T14:14:58Z Jianghong Dong Jiawei Wang Chunying Yang Mengchi Cai Chaoyi Chen Qing Xu Jianqiang Wang Keqiang Li http://arxiv.org/abs/2603.17686v1 On maximal positive invariant set computation for rank-deficient linear systems 2026-03-18T13:01:08Z The maximal positively invariant (MPI) set is obtained through a backward reachability procedure involving the iterative computation and intersection of predecessor sets under state and input constraints. However, standard static feedback synthesis may place some of the closed-loop eigenvalues at zero, leading to rank-deficient dynamics. This affects the MPI computation by inducing projections onto lower-dimensional subspaces during intermediate steps. By exploiting the Schur decomposition, we explicitly address this singular case and propose a robust algorithm that computes the MPI set in both polyhedral and constrained-zonotope representations. 
2026-03-18T13:01:08Z Bogdan Gheorghe Daniel Ioan Cristian Flutur Ionela Prodan Florin Stoican http://arxiv.org/abs/2603.17665v1 Physical Layer Security in Finite Blocklength Massive IoT with Randomly Located Eavesdroppers 2026-03-18T12:32:10Z This paper analyzes the physical layer security performance of massive uplink Internet of Things (IoT) networks operating under the finite blocklength (FBL) regime. IoT devices and base stations (BS) are modeled using a stochastic geometry approach, while an eavesdropper is placed at a random location around the transmitting device. This system model captures security risks common in dense IoT deployments. Analytical expressions for the secure success probability, secrecy outage probability and secrecy throughput are derived to characterize how stochastic interference, fading and eavesdropper spatial uncertainty interact with FBL constraints in short packet uplink transmissions. Numerical results illustrate key system behavior under different network and channel conditions. 2026-03-18T12:32:10Z Tijana Devaja Milica Petkovic Sokol Kosta Dejan Vukobratovic Cedomir Stefanovic http://arxiv.org/abs/2603.17640v1 Defending the power grid by segmenting the EV charging cyber infrastructure 2026-03-18T12:01:19Z This paper examines defending the power grid against load-altering attacks using electric vehicle charging. It proposes to preventively segment the cyber infrastructure that charging station operators (CSOs) use to communicate with and control their charging stations, thereby limiting the impact of successful cyber-attacks. Using real German charging station data and a reconstructed transmission grid model, a threat analysis shows that without segmentation, the successful hack of just two CSOs can overload two transmission grid branches, exceeding the N-1 security margin and necessitating defense measures. 
A novel defense design problem is then formulated that minimizes the number of imposed segmentations while bounding the number of branch overloads under worst-case attacks. The resulting IP-MILP bi-level problem can be solved with an exact column and constraint generation algorithm and with heuristics for fast computation on large-scale instances. For the near-real-world Germany case, the applicability of the heuristics is demonstrated and validated under relevant load and dispatch scenarios. It is found that the simple scheme of segmenting CSOs evenly by their installed capacity leads to only 23% more segments compared to the heuristic optimization result, suggesting potential relevance as a regulatory measure. 2026-03-18T12:01:19Z Kirill Kuroptev Florian Steinke Efthymios Karangelos http://arxiv.org/abs/2603.17634v1 Hierarchical Decision-Making under Uncertainty: A Hybrid MDP and Chance-Constrained MPC Approach 2026-03-18T11:56:44Z This paper presents a hierarchical decision-making framework for autonomous systems operating under uncertainty, demonstrated through autonomous driving as a representative application. Surrounding agents are modeled using Hybrid Markov Decision Processes (HMDPs) that jointly capture maneuver-level and dynamic-level uncertainties, enabling multi-modal environmental prediction. The ego agent is modeled using a separate HMDP and integrated into a Model Predictive Control (MPC) framework that unifies maneuver selection with dynamic feasibility within a single optimization. A set of joint chance constraints serves as the bridge between environmental prediction and optimization, incorporating multi-modal environment predictions into the MPC formulation and ensuring safety across all plausible interaction scenarios. 
The proposed framework provides theoretical guarantees on recursive feasibility and asymptotic stability, and its benefits in terms of safety and efficiency are validated through comprehensive evaluations in highway and urban environments, together with comparisons against a rule-based baseline. 2026-03-18T11:56:44Z 14 pages, 10 figures Siyuan Li Chengyuan Liu Wen-Hua Chen http://arxiv.org/abs/2603.17632v1 Real-Time Online Learning for Model Predictive Control using a Spatio-Temporal Gaussian Process Approximation 2026-03-18T11:53:21Z Learning-based model predictive control (MPC) can enhance control performance by correcting for model inaccuracies, enabling more precise state trajectory predictions than traditional MPC. A common approach is to model unknown residual dynamics as a Gaussian process (GP), which leverages data and also provides an estimate of the associated uncertainty. However, the high computational cost of online learning poses a major challenge for real-time GP-MPC applications. This work presents an efficient implementation of an approximate spatio-temporal GP model, offering online learning at constant computational complexity. It is optimized for GP-MPC, where it enables improved control performance by learning more accurate system dynamics online in real-time, even for time-varying systems. The performance of the proposed method is demonstrated by simulations and hardware experiments in the exemplary application of autonomous miniature racing. 2026-03-18T11:53:21Z to be published at 2026 IEEE International Conference on Robotics & Automation (ICRA) Lars Bartels Amon Lahr Andrea Carron Melanie N. 
Zeilinger http://arxiv.org/abs/2603.17631v1 Benchmarking Reinforcement Learning via Stochastic Converse Optimality: Generating Systems with Known Optimal Policies 2026-03-18T11:52:21Z The objective comparison of Reinforcement Learning (RL) algorithms is notoriously complex as outcomes and benchmarking of performances of different RL approaches are critically sensitive to environmental design, reward structures, and stochasticity inherent in both algorithmic learning and environmental dynamics. To manage this complexity, we introduce a rigorous benchmarking framework by extending converse optimality to discrete-time, control-affine, nonlinear systems with noise. Our framework provides necessary and sufficient conditions, under which a prescribed value function and policy are optimal for constructed systems, enabling the systematic generation of benchmark families via homotopy variations and randomized parameters. We validate it by automatically constructing diverse environments, demonstrating our framework's capacity for a controlled and comprehensive evaluation across algorithms. By assessing standard methods against a ground-truth optimum, our work delivers a reproducible foundation for precise and rigorous RL benchmarking. 2026-03-18T11:52:21Z Sinan Ibrahim Grégoire Ouerdane Hadi Salloum Henni Ouerdane Stefan Streif Pavel Osinenko http://arxiv.org/abs/2502.05228v2 Physics-Informed Evolution: An Evolutionary Framework for Solving Quantum Control Problems Involving the Schrödinger Equation 2026-03-18T11:46:04Z Physics-Informed Neural Networks (PINNs) have demonstrated that embedding physical laws directly into the learning objective can significantly enhance the efficiency and physical consistency of neural network solutions. Inspired by this principle, we ask a natural question: can physical information be similarly embedded into the fitness function of evolutionary algorithms? 
In this work, we propose Physics-Informed Evolution (PIE), a novel framework that incorporates physical information derived from governing physical laws into the evolutionary fitness landscape, bridging the long-standing connection between learning and evolution in artificial intelligence. As a concrete instantiation, we apply PIE to quantum control problems governed by the Schrödinger equation, where the goal is to find optimal control fields that drive quantum systems from initial states to desired target states. We validate PIE on three representative quantum control benchmarks: state preparation in V-type three-level systems, entangled state generation in superconducting quantum circuits, and two-atom cavity QED systems, under varying levels of system uncertainty. Extensive comparisons against ten single-objective and five multi-objective evolutionary baselines demonstrate that PIE consistently achieves higher fidelity, lower state deviation, and improved robustness. Our results suggest that the physics-informed principle extends naturally beyond neural network training to the broader domain of evolutionary computation. 2025-02-06T08:43:21Z 22 pages, 2 figures Kaichen Ouyang Mingyang Yu