https://arxiv.org/api/ma0Ld0R+WgKcj20MpsCPjoTrChA
2026-03-22T08:37:42Z

http://arxiv.org/abs/2603.17836v1
Verification and Validation of Physics-Informed Surrogate Component Models for Dynamic Power-System Simulation
2026-03-18T15:28:18Z
Physics-informed machine learning surrogates are increasingly explored to accelerate dynamic simulation of generators, converters, and other power grid components. The key question, however, is not only whether a surrogate matches a stand-alone component model on average, but whether it remains accurate after insertion into a differential-algebraic simulator, where the surrogate outputs enter the algebraic equations coupling the component to the rest of the system. This paper formulates that in-simulator use as a verification and validation (V&V) problem. A finite-horizon bound is derived that links allowable component-output error to algebraic-coupling sensitivity, dynamic error amplification, and the simulation horizon. Two complementary settings are then studied: model-based verification against a reference component solver, and data-based validation through conformal calibration of the component-output variables exchanged with the simulator. The framework is general, but the case study focuses on physics-informed neural-network surrogates of second-, fourth-, and sixth-order synchronous-machine models. Results show that good stand-alone surrogate accuracy does not by itself guarantee accurate in-simulator behavior, that the largest discrepancies concentrate in stressed operating regions, and that small equation residuals do not necessarily imply small state-trajectory errors.
2026-03-18T15:28:18Z
Petros Ellinas, Indrajit Chaudhuri, Johanna Vorwerk, Spyros Chatzivasileiadis

http://arxiv.org/abs/2510.14959v5
CBF-RL: Safety Filtering Reinforcement Learning in Training with Control Barrier Functions
2026-03-18T15:28:16Z
Reinforcement learning (RL), while powerful and expressive, can often prioritize performance at the expense of safety.
Yet safety violations can lead to catastrophic outcomes in real-world deployments. Control Barrier Functions (CBFs) offer a principled method to enforce dynamic safety -- traditionally deployed online via safety filters. While the result is safe behavior, the fact that the RL policy does not have knowledge of the CBF can lead to conservative behaviors. This paper proposes CBF-RL, a framework for generating safe behaviors with RL by enforcing CBFs in training. CBF-RL has two key attributes: (1) minimally modifying a nominal RL policy to encode safety constraints via a CBF term, and (2) safety filtering of the policy rollouts in training. Theoretically, we prove that continuous-time safety filters can be deployed via closed-form expressions on discrete-time roll-outs. Practically, we demonstrate that CBF-RL internalizes the safety constraints in the learned policy -- both enforcing safer actions and biasing towards safer rewards -- enabling safe deployment without the need for an online safety filter. We validate our framework through ablation studies on navigation tasks and on the Unitree G1 humanoid robot, where CBF-RL enables safer exploration, faster convergence, and robust performance under uncertainty, enabling the humanoid robot to avoid obstacles and climb stairs safely in real-world settings without a runtime safety filter.
2025-10-16T17:58:58Z
To appear at ICRA 2026; sample code for the navigation example with CBF-RL reward core construction can be found at https://github.com/lzyang2000/cbf-rl-navigation-demo
Lizhi Yang, Blake Werner, Massimiliano de Sa, Aaron D. Ames

http://arxiv.org/abs/2508.19945v5
Constraint Learning in Multi-Agent Dynamic Games from Demonstrations of Local Nash Interactions
2026-03-18T14:56:02Z
We present an inverse dynamic game-based algorithm to learn parametric constraints from a given dataset of local Nash equilibrium interactions between multiple agents.
Specifically, we introduce mixed-integer linear programs (MILP) encoding the Karush-Kuhn-Tucker (KKT) conditions of the interacting agents, which recover constraints consistent with the local Nash stationarity of the interaction demonstrations. We establish theoretical guarantees that our method learns inner approximations of the true safe and unsafe sets. We also use the interaction constraints recovered by our method to design motion plans that robustly satisfy the underlying constraints. Across simulations and hardware experiments, our methods accurately inferred constraints and designed safe interactive motion plans for various classes of constraints, both convex and non-convex, from interaction demonstrations of agents with nonlinear dynamics.
2025-08-27T15:01:09Z
Zhouyu Zhang, Chih-Yuan Chiu, Glen Chou

http://arxiv.org/abs/2603.17802v1
An HMDP-MPC Decision-making Framework with Adaptive Safety Margins and Hysteresis for Autonomous Driving
2026-03-18T14:55:58Z
This paper presents a unified decision-making framework that integrates Hybrid Markov Decision Processes (HMDPs) with Model Predictive Control (MPC), augmented by velocity-dependent safety margins and a prediction-aware hysteresis mechanism. Both the ego and surrounding vehicles are modeled as HMDPs, allowing discrete maneuver transition and kinematic evolution to be jointly considered within the MPC optimization. Safety margins derived from the Intelligent Driver Model (IDM) adapt to traffic context but vary with speed, which can cause oscillatory decisions and velocity fluctuations. To mitigate this, we propose a frozen-release hysteresis mechanism with distinct trigger and release thresholds, effectively enlarging the reaction buffer and suppressing oscillations. Decision continuity is further safeguarded by a two-layer recovery scheme: a global bounded relaxation tied to IDM margins and a deterministic fallback policy.
The framework is evaluated through a case study, an ablation against a no-hysteresis baseline, and large-scale randomized experiments across 18 traffic settings. Across 8,050 trials, it achieves a collision rate of only 0.05%, with 98.77% of decisions resolved by nominal MPC and minimal reliance on relaxation or fallback. These results demonstrate the robustness and adaptability of the proposed decision-making framework in heterogeneous traffic conditions.
2026-03-18T14:55:58Z
8 pages, 6 figures, to be published in ICRA 2026 proceedings
Siyuan Li (Loughborough University), Chengyuan Liu (Loughborough University), Wen-Hua Chen (The Hong Kong Polytechnic University)

http://arxiv.org/abs/2512.23636v3
NashOpt - A Python Library for Computing Generalized Nash Equilibria
2026-03-18T14:54:46Z
NashOpt is an open-source Python library for computing and designing generalized Nash equilibria (GNEs) in noncooperative games with shared constraints and real-valued decision variables. The library exploits the joint Karush-Kuhn-Tucker (KKT) conditions of all players to handle both general nonlinear GNEs and linear-quadratic games, including their variational versions. Nonlinear games are solved via nonlinear least-squares formulations, relying on JAX for automatic differentiation. Linear-quadratic GNEs are reformulated as mixed-integer linear programs, enabling efficient computation of multiple equilibria. The framework also supports inverse-game and Stackelberg game-design problems. The capabilities of NashOpt are demonstrated through several examples, including noncooperative game-theoretic control problems of linear quadratic regulation and model predictive control.
The library is available at https://github.com/bemporad/nashopt
2025-12-29T17:49:09Z
24 pages, 7 figures
Alberto Bemporad

http://arxiv.org/abs/2603.17780v1
Data-Driven Predictive Control for Stochastic Descriptor Systems: An Innovation-Based Approach Handling Non-Causal Dynamics
2026-03-18T14:45:51Z
Descriptor systems arise naturally in applications governed by algebraic constraints, such as power networks and chemical processes. The singular system matrix in descriptor systems may introduce non-causal dynamics, where the current output depends on future inputs and, in the presence of stochastic process and measurement noise, on future noise realizations as well. This paper proposes a data-driven predictive control framework for stochastic descriptor systems that accommodates algebraic constraints and impulsive modes without explicit system identification. A causal innovation representation is constructed by augmenting the system state with a noise buffer that encapsulates the non-causal stochastic interactions, transforming the descriptor system into an equivalent proper state-space form. Willems' Fundamental Lemma is then extended to the innovation form with fully data-verifiable conditions. Building on these results, a practical Inno-DeePC algorithm is developed that integrates offline innovation estimation and online predictive control. Numerical experiments on a direct-current (DC) microgrid demonstrate the effectiveness of the proposed approach for stochastic descriptor systems.
2026-03-18T14:45:51Z
6 pages, 2 figures
Yunxiang Ma, Yibo Wang, Zhongmei Li, Chao Shang

http://arxiv.org/abs/2603.17764v1
Robust Dynamic Pricing and Admission Control with Fairness Guarantees
2026-03-18T14:23:36Z
Dynamic pricing is commonly used to regulate congestion in shared service systems.
This paper is motivated by the fact that when heterogeneous user groups (in terms of price responsiveness) are present, conventional monotonic pricing can lead to unfair outcomes by disproportionately excluding price-elastic users, particularly under high or uncertain demand. The paper's contributions are twofold. First, we show that when fairness is imposed as a hard state constraint, the optimal (revenue maximizing) pricing policy is generally non-monotonic in demand. This structural result departs fundamentally from standard surge pricing rules and reveals that price reduction under heavy load may be necessary to maintain equitable access. Second, we address the problem that price elasticity among heterogeneous users is unobservable. To solve it, we develop a robust dynamic pricing and admission control framework that enforces resource capacity and fairness constraints for all user type distributions consistent with aggregate measurements. By integrating integral High Order Control Barrier Functions (iHOCBFs) with a worst case robust optimization framework, we obtain a controller that guarantees forward invariance of safety and fairness constraints while optimizing revenue. Numerical experiments demonstrate improved fairness and revenue performance relative to monotonic surge pricing policies.
2026-03-18T14:23:36Z
Yingqing Chen, Anni Li, Christos G. Cassandras, Homayoun Hamedmoghadam, Fabian Wirth, Robert Shorten

http://arxiv.org/abs/2603.17751v1
Multi-Source Human-in-the-Loop Digital Twin Testbed for Connected and Autonomous Vehicles in Mixed Traffic Flow
2026-03-18T14:14:58Z
In the emerging mixed traffic environments, Connected and Autonomous Vehicles (CAVs) have to interact with surrounding human-driven vehicles (HDVs). This paper introduces MSH-MCCT (Multi-Source Human-in-the-Loop Mixed Cloud Control Testbed), a novel CAV testbed that captures complex interactions between various CAVs and HDVs.
Utilizing the Mixed Digital Twin concept, which combines Mixed Reality with Digital Twin, MSH-MCCT integrates physical, virtual, and mixed platforms, along with multi-source control inputs. Bridged by the mixed platform, MSH-MCCT allows human drivers and CAV algorithms to operate both physical and virtual vehicles within multiple fields of view. Particularly, this testbed facilitates the coexistence and real-time interaction of physical and virtual CAVs & HDVs, significantly enhancing the experimental flexibility and scalability. Experiments on vehicle platooning in mixed traffic showcase the potential of MSH-MCCT to conduct CAV testing with multi-source real human drivers in the loop through driving simulators of diverse fidelity. The videos for the experiments are available at our project website: https://dongjh20.github.io/MSH-MCCT.
2026-03-18T14:14:58Z
Jianghong Dong, Jiawei Wang, Chunying Yang, Mengchi Cai, Chaoyi Chen, Qing Xu, Jianqiang Wang, Keqiang Li

http://arxiv.org/abs/2603.17686v1
On maximal positive invariant set computation for rank-deficient linear systems
2026-03-18T13:01:08Z
The maximal positively invariant (MPI) set is obtained through a backward reachability procedure involving the iterative computation and intersection of predecessor sets under state and input constraints.
However, standard static feedback synthesis may place some of the closed-loop eigenvalues at zero, leading to rank-deficient dynamics. This affects the MPI computation by inducing projections onto lower-dimensional subspaces during intermediate steps. By exploiting the Schur decomposition, we explicitly address this singular case and propose a robust algorithm that computes the MPI set in both polyhedral and constrained-zonotope representations.
2026-03-18T13:01:08Z
Bogdan Gheorghe, Daniel Ioan, Cristian Flutur, Ionela Prodan, Florin Stoican

http://arxiv.org/abs/2603.17665v1
Physical Layer Security in Finite Blocklength Massive IoT with Randomly Located Eavesdroppers
2026-03-18T12:32:10Z
This paper analyzes the physical layer security performance of massive uplink Internet of Things (IoT) networks operating under the finite blocklength (FBL) regime. IoT devices and base stations (BS) are modeled using a stochastic geometry approach, while an eavesdropper is placed at a random location around the transmitting device. This system model captures security risks common in dense IoT deployments. Analytical expressions for the secure success probability, secrecy outage probability and secrecy throughput are derived to characterize how stochastic interference, fading and eavesdropper spatial uncertainty interact with FBL constraints in short packet uplink transmissions. Numerical results illustrate key system behavior under different network and channel conditions.
2026-03-18T12:32:10Z
Tijana Devaja, Milica Petkovic, Sokol Kosta, Dejan Vukobratovic, Cedomir Stefanovic

http://arxiv.org/abs/2603.17640v1
Defending the power grid by segmenting the EV charging cyber infrastructure
2026-03-18T12:01:19Z
This paper examines defending the power grid against load-altering attacks using electric vehicle charging.
It proposes to preventively segment the cyber infrastructure that charging station operators (CSOs) use to communicate with and control their charging stations, thereby limiting the impact of successful cyber-attacks. Using real German charging station data and a reconstructed transmission grid model, a threat analysis shows that without segmentation, the successful hack of just two CSOs can overload two transmission grid branches, exceeding the N-1 security margin and necessitating defense measures. A novel defense design problem is then formulated that minimizes the number of imposed segmentations while bounding the number of branch overloads under worst-case attacks. The resulting IP-MILP bi-level problem can be solved with an exact column and constraint generation algorithm and with heuristics for fast computation on large-scale instances. For the near-real-world Germany case, the applicability of the heuristics is demonstrated and validated under relevant load and dispatch scenarios. It is found that the simple scheme of segmenting CSOs evenly by their installed capacity leads to only 23% more segments compared to the heuristic optimization result, suggesting potential relevance as a regulatory measure.
2026-03-18T12:01:19Z
Kirill Kuroptev, Florian Steinke, Efthymios Karangelos

http://arxiv.org/abs/2603.17634v1
Hierarchical Decision-Making under Uncertainty: A Hybrid MDP and Chance-Constrained MPC Approach
2026-03-18T11:56:44Z
This paper presents a hierarchical decision-making framework for autonomous systems operating under uncertainty, demonstrated through autonomous driving as a representative application. Surrounding agents are modeled using Hybrid Markov Decision Processes (HMDPs) that jointly capture maneuver-level and dynamic-level uncertainties, enabling the multi-modal environmental prediction.
The ego agent is modeled using a separate HMDP and integrated into a Model Predictive Control (MPC) framework that unifies maneuver selection with dynamic feasibility within a single optimization. A set of joint chance constraints serves as the bridge between environmental prediction and optimization, incorporating multi-modal environment predictions into the MPC formulation and ensuring safety across all plausible interaction scenarios. The proposed framework provides theoretical guarantees on recursive feasibility and asymptotic stability, and its benefits in terms of safety and efficiency are validated through comprehensive evaluations in highway and urban environments, together with comparisons against a rule-based baseline.
2026-03-18T11:56:44Z
14 pages, 10 figures
Siyuan Li, Chengyuan Liu, Wen-Hua Chen

http://arxiv.org/abs/2603.17632v1
Real-Time Online Learning for Model Predictive Control using a Spatio-Temporal Gaussian Process Approximation
2026-03-18T11:53:21Z
Learning-based model predictive control (MPC) can enhance control performance by correcting for model inaccuracies, enabling more precise state trajectory predictions than traditional MPC. A common approach is to model unknown residual dynamics as a Gaussian process (GP), which leverages data and also provides an estimate of the associated uncertainty. However, the high computational cost of online learning poses a major challenge for real-time GP-MPC applications. This work presents an efficient implementation of an approximate spatio-temporal GP model, offering online learning at constant computational complexity. It is optimized for GP-MPC, where it enables improved control performance by learning more accurate system dynamics online in real-time, even for time-varying systems.
The performance of the proposed method is demonstrated by simulations and hardware experiments in the exemplary application of autonomous miniature racing.
2026-03-18T11:53:21Z
to be published at 2026 IEEE International Conference on Robotics & Automation (ICRA)
Lars Bartels, Amon Lahr, Andrea Carron, Melanie N. Zeilinger

http://arxiv.org/abs/2603.17631v1
Benchmarking Reinforcement Learning via Stochastic Converse Optimality: Generating Systems with Known Optimal Policies
2026-03-18T11:52:21Z
The objective comparison of Reinforcement Learning (RL) algorithms is notoriously complex as outcomes and benchmarking of performances of different RL approaches are critically sensitive to environmental design, reward structures, and stochasticity inherent in both algorithmic learning and environmental dynamics. To manage this complexity, we introduce a rigorous benchmarking framework by extending converse optimality to discrete-time, control-affine, nonlinear systems with noise. Our framework provides necessary and sufficient conditions, under which a prescribed value function and policy are optimal for constructed systems, enabling the systematic generation of benchmark families via homotopy variations and randomized parameters. We validate it by automatically constructing diverse environments, demonstrating our framework's capacity for a controlled and comprehensive evaluation across algorithms.
By assessing standard methods against a ground-truth optimum, our work delivers a reproducible foundation for precise and rigorous RL benchmarking.
2026-03-18T11:52:21Z
Sinan Ibrahim, Grégoire Ouerdane, Hadi Salloum, Henni Ouerdane, Stefan Streif, Pavel Osinenko

http://arxiv.org/abs/2502.05228v2
Physics-Informed Evolution: An Evolutionary Framework for Solving Quantum Control Problems Involving the Schrödinger Equation
2026-03-18T11:46:04Z
Physics-Informed Neural Networks (PINNs) have demonstrated that embedding physical laws directly into the learning objective can significantly enhance the efficiency and physical consistency of neural network solutions. Inspired by this principle, we ask a natural question: can physical information be similarly embedded into the fitness function of evolutionary algorithms? In this work, we propose Physics-Informed Evolution (PIE), a novel framework that incorporates physical information derived from governing physical laws into the evolutionary fitness landscape, bridging the long-standing connection between learning and evolution in artificial intelligence. As a concrete instantiation, we apply PIE to quantum control problems governed by the Schrödinger equation, where the goal is to find optimal control fields that drive quantum systems from initial states to desired target states. We validate PIE on three representative quantum control benchmarks: state preparation in V-type three-level systems, entangled state generation in superconducting quantum circuits, and two-atom cavity QED systems, under varying levels of system uncertainty. Extensive comparisons against ten single-objective and five multi-objective evolutionary baselines demonstrate that PIE consistently achieves higher fidelity, lower state deviation, and improved robustness.
Our results suggest that the physics-informed principle extends naturally beyond neural network training to the broader domain of evolutionary computation.
2025-02-06T08:43:21Z
22 pages, 2 figures
Kaichen Ouyang, Mingyang Yu