http://arxiv.org/abs/2504.14853v1 2025-04-21T04:35:56Z 2025-04-21T04:35:56Z Output regulation for an unstable wave equation with output delay and one measurement only

This paper addresses the output regulation problem for a one-dimensional unstable wave equation subject to output delay and all-channel disturbances with unknown frequencies and amplitudes. First, this problem is transformed into a stabilization problem for an unstable wave equation with output delay and disturbances by employing regulator equations. Subsequently, a backstepping-based feedforward regulator is proposed to exponentially stabilize this system. To track the states of the unstable wave equation, the time interval is partitioned into two segments. Observers and predictors are designed on these two segments, respectively. The observers comprise two components: a state observer constructed via dynamic compensators and an adaptive observer designed by the adaptive internal model method. Finally, a novel error-based feedback controller is derived using a single measurement, ensuring exponential convergence of the tracking error to zero. This work provides the first solution to the output regulation problem for distributed parameter systems (DPS) with output delay. Numerical simulations are provided to illustrate the results.

Shen Wang Zhong-Jie Han Shuangxi Huang Zhi-Xue Zhao http://arxiv.org/abs/2407.00359v6 2025-04-21T04:24:58Z 2024-06-29T08:12:36Z Multicriteria Optimization and Decision Making: Principles, Algorithms and Case Studies

Real-world decision and optimization problems often involve constraints and conflicting criteria. For example, choosing a travel method must balance speed, cost, environmental footprint, and convenience. Similarly, designing an industrial process must consider safety, environmental impact, and cost efficiency. Ideal solutions where all objectives are optimally met are rare; instead, we seek good compromises and aim to avoid lose-lose scenarios. Multicriteria optimization offers computational techniques to compute Pareto optimal solutions, aiding decision analysis and decision making. This reader offers an introduction to this topic and has been developed on the basis of the revised edition of the reader for the MSc computer science course "Multicriteria Optimization and Decision Analysis" at the Leiden Institute of Advanced Computer Science, Leiden University, The Netherlands. This course was taught annually by the first author from 2007 to 2023 as a single semester course with lectures and practicals. Our aim was to make the material accessible to MSc students who do not study mathematics as their core discipline by introducing basic numerical analysis concepts when necessary and providing numerical examples for interesting cases. The introduction is organized in a unique didactic manner developed by the authors, starting from simpler concepts such as linear programming and single-point methods, and advancing from these to more difficult concepts such as optimality conditions for nonlinear optimization and set-oriented solution algorithms. In addition, we focus on the mathematical modeling and foundations rather than on specific algorithms, though not excluding the discussion of some representative examples of solution algorithms.
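
The notion of Pareto optimality mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the reader itself: the travel options and their numbers are hypothetical, all objectives are assumed to be minimized, and the helper names `dominates` and `pareto_front` are chosen here for illustration only.

```python
def dominates(a, b):
    """Return True if a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates (brute force, O(n^2))."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical travel options: (travel time in h, cost in EUR, CO2 in kg).
options = {
    "plane": (1.5, 120.0, 90.0),
    "train": (3.0, 60.0, 15.0),
    "car": (4.0, 70.0, 60.0),
    "bus": (5.0, 25.0, 20.0),
}

front = pareto_front(list(options.values()))
front_names = [name for name, objectives in options.items() if objectives in front]
```

In this made-up data set the car is slower, more expensive and more polluting than the train, so it drops out, while the remaining options are mutually incomparable compromises.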

Michael Emmerich André Deutz 102 pages, Lecture notes http://arxiv.org/abs/2410.21676v4 2025-04-21T04:19:56Z 2024-10-29T02:54:06Z How Does Critical Batch Size Scale in Pre-training?

Training large-scale models under given resources requires careful design of parallelism strategies. In particular, the efficiency notion of critical batch size (CBS), concerning the compromise between time and compute, marks the threshold beyond which greater data parallelism leads to diminishing returns. To operationalize it, we propose a measure of CBS and pre-train a series of auto-regressive language models, ranging from 85 million to 1.2 billion parameters, on the C4 dataset. Through extensive hyper-parameter sweeps and careful control of factors such as batch size, momentum, and learning rate along with its scheduling, we systematically investigate the impact of scale on CBS. Then we fit scaling laws with respect to model and data sizes to decouple their effects. Overall, our results demonstrate that CBS scales primarily with data size rather than model size, a finding we justify theoretically through the analysis of infinite-width limits of neural networks and infinite-dimensional least squares regression. Of independent interest, we highlight the importance of common hyper-parameter choices and strategies for studying large-scale pre-training beyond fixed training durations.

Hanlin Zhang Depen Morwani Nikhil Vyas Jingfeng Wu Difan Zou Udaya Ghai Dean Foster Sham Kakade ICLR 2025, Blog post: https://kempnerinstitute.harvard.edu/research/deeper-learning/how-does-critical-batch-size-scale-in-pre-training-decoupling-data-and-model-size http://arxiv.org/abs/2504.14841v1 2025-04-21T03:44:46Z 2025-04-21T03:44:46Z (Sub)Exponential Quantum Speedup for Optimization

We demonstrate provable (sub)exponential quantum speedups in both discrete and continuous optimization, achieved through simple and natural quantum optimization algorithms, namely the quantum adiabatic algorithm for discrete optimization and quantum Hamiltonian descent for continuous optimization. Our result builds on the Gilyén--Hastings--Vazirani (sub)exponential oracle separation for adiabatic quantum computing. With a sequence of perturbative reductions, we compile their construction into two standalone objective functions, whose oracles can be directly leveraged by the plain adiabatic evolution and Schrödinger operator evolution for discrete and continuous optimization, respectively.

Jiaqi Leng Kewen Wu Xiaodi Wu Yufan Zheng 69 pages, 6 figures http://arxiv.org/abs/2504.14834v1 2025-04-21T03:29:09Z 2025-04-21T03:29:09Z Output regulation for a reaction-diffusion system with input delay and unknown frequency

This study solves the output regulation problem for a reaction-diffusion system confronting concurrent input delay and fully unidentified disturbances (encompassing both unknown frequencies and amplitudes) across all channels. The principal innovation emerges from a novel adaptive control architecture that synergizes the modal decomposition technique with a dual-observer mechanism, enabling real-time concurrent estimation of unmeasurable system states and disturbances through a state observer and an adaptive disturbance estimator. Unlike existing approaches limited to either delay compensation or partial disturbance rejection, our methodology overcomes the technical barrier of coordinating these two requirements through a rigorously constructed tracking-error-based controller, achieving exponential convergence of system output to reference signals. Numerical simulations are presented to validate the effectiveness of the proposed output feedback control strategy.

Shen Wang Zhong-Jie Han Kai Liu Zhi-Xue Zhao http://arxiv.org/abs/2410.21863v4 2025-04-21T01:55:13Z 2024-10-29T08:56:34Z On invariance of observability for BSDEs and its applications to stochastic control systems

In this paper, we establish the invariance of observability for the observed backward stochastic differential equations (BSDEs) with constant coefficients, relative to the filtered probability space. This signifies that the observability of these observed BSDEs with constant coefficients remains unaffected by the selection of the filtered probability space. As an illustrative application, we demonstrate that for stochastic control systems with constant coefficients, weak observability, approximate null controllability with cost, and stabilizability are equivalent across some or any filtered probability spaces.

Bao-Zhu Guo Huaiqiang Yu Meixuan Zhang 26 Pages http://arxiv.org/abs/2504.14741v1 2025-04-20T21:07:59Z 2025-04-20T21:07:59Z AltGDmin: Alternating GD and Minimization for Partly-Decoupled (Federated) Optimization

This article describes a novel optimization solution framework, called alternating gradient descent (GD) and minimization (AltGDmin), that is useful for many problems for which alternating minimization (AltMin) is a popular solution. AltMin is a special case of the block coordinate descent algorithm that is useful for problems in which minimization w.r.t. one subset of variables, keeping the other fixed, is closed form or otherwise reliably solved. Denote the two blocks/subsets of the optimization variables Z by Za, Zb, i.e., Z = {Za, Zb}. AltGDmin is often a faster solution than AltMin for any problem for which (i) the minimization over one set of variables, Zb, is much quicker than that over the other set, Za; and (ii) the cost function is differentiable w.r.t. Za. Often, the reason for one minimization to be quicker is that the problem is "decoupled" for Zb and each of the decoupled problems is quick to solve. This decoupling is also what makes AltGDmin communication-efficient for federated settings. Important examples where this assumption holds include (a) low rank column-wise compressive sensing (LRCS) and low rank matrix completion (LRMC); (b) their outlier-corrupted extensions such as robust PCA, robust LRCS and robust LRMC; (c) phase retrieval and its sparse and low-rank model based extensions; (d) tensor extensions of many of these problems such as tensor LRCS and tensor completion; and (e) many partly discrete problems where GD does not apply -- such as clustering, unlabeled sensing, and mixed linear regression. LRCS finds important applications in multi-task representation learning and few shot learning, federated sketching, and accelerated dynamic MRI. LRMC and robust PCA find important applications in recommender systems, computer vision and video analytics.
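
The alternation described above (a gradient step on Za, an exact and decoupled minimization over Zb) can be sketched on a toy problem. The following is an illustrative assumption, not the article's algorithm for LRCS or LRMC: it fits a rank-1 model Y ≈ u vᵀ, where Za is the unit vector `u` and Zb is the coefficient vector `v`, whose entries decouple across the columns of Y. The function names, the step-size rule and the re-normalization of `u` (a stand-in for an orthonormalization step) are all choices made here.

```python
import math


def altgdmin_rank1(columns, u0, iters=200, step=0.1):
    """Toy alternating GD/minimization for min over (u, v) of sum_k ||y_k - u * v_k||^2.

    columns: list of the columns y_k of Y (assumed not all zero).
    u0: nonzero initial guess for the direction u.
    """
    d = len(u0)
    scale = sum(x * x for col in columns for x in col)  # ||Y||_F^2, used to scale the step
    norm = math.sqrt(sum(x * x for x in u0))
    u = [x / norm for x in u0]
    for _ in range(iters):
        # Minimization step: for unit-norm u, each v_k = <u, y_k> in closed form,
        # independently per column (this is the decoupled block).
        v = [sum(ui * yi for ui, yi in zip(u, col)) for col in columns]
        # Gradient of the cost w.r.t. u with v held fixed.
        grad = [0.0] * d
        for col, vk in zip(columns, v):
            for i in range(d):
                grad[i] += -2.0 * (col[i] - u[i] * vk) * vk
        # Single GD step on u, then re-normalize (assumed projection step).
        u = [ui - (step / scale) * gi for ui, gi in zip(u, grad)]
        norm = math.sqrt(sum(x * x for x in u))
        u = [x / norm for x in u]
    v = [sum(ui * yi for ui, yi in zip(u, col)) for col in columns]
    return u, v


def residual(columns, u, v):
    """Squared Frobenius error of the rank-1 fit."""
    return sum((yi - ui * vk) ** 2 for col, vk in zip(columns, v) for yi, ui in zip(col, u))


# Hypothetical exactly rank-1 data: Y = u_true * v_true^T.
u_true = [0.6, 0.8]
v_true = [1.0, 2.0, 3.0]
columns = [[ui * vk for ui in u_true] for vk in v_true]
u_hat, v_hat = altgdmin_rank1(columns, [1.0, 0.0])
fit_error = residual(columns, u_hat, v_hat)
```

In a federated reading of this sketch, each node would hold some of the columns, compute its own `v_k` locally, and share only its partial sum of the gradient for `u`.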

Namrata Vaswani To appear in Foundations and Trends in Optimization (NOW publishers) http://arxiv.org/abs/2504.14725v1 2025-04-20T19:39:37Z 2025-04-20T19:39:37Z Sensor Scheduling in Intrusion Detection Games with Uncertain Payoffs

We study the problem of sensor scheduling for an intrusion detection task. We model this as a two-player zero-sum game over a graph, where the defender (Player 1) seeks to identify the optimal strategy for scheduling sensor orientations to minimize the probability of missed detection at minimal cost, while the intruder (Player 2) aims to identify the optimal path selection strategy to maximize missed detection probability at minimal cost. The defender's strategy space grows exponentially with the number of sensors, making direct computation of the Nash Equilibrium (NE) strategies computationally expensive. To tackle this, we propose a distributed variant of the Weighted Majority algorithm that exploits the structure of the game's payoff matrix, enabling efficient computation of the NE strategies with provable convergence guarantees. Next, we consider a more challenging scenario where the defender lacks knowledge of the true sensor models and, consequently, the game's payoff matrix. For this setting, we develop online learning algorithms that leverage bandit feedback from sensors to estimate the NE strategies. By building on existing results from perturbation theory and online learning in matrix games, we derive high-probability order-optimal regret bounds for our algorithms. Finally, through simulations, we demonstrate the empirical performance of our proposed algorithms in both known and unknown payoff scenarios.
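
For intuition on how Weighted-Majority-style updates approximate Nash equilibrium strategies in a zero-sum matrix game, here is a minimal sketch of the standard centralized Hedge (exponential weights) update against a best-responding opponent. It is not the distributed variant proposed in the paper, and the 2x2 matrix of missed-detection probabilities is hypothetical. Rows are defender actions, columns are intruder actions, and entries are assumed to lie in [0, 1].

```python
import math


def hedge_zero_sum(loss, rounds):
    """Run Hedge for the row player (minimizer) against a best-responding
    column player and return the row player's time-averaged strategy.
    Requires at least two rows and loss entries in [0, 1]."""
    n_rows, n_cols = len(loss), len(loss[0])
    eta = math.sqrt(8.0 * math.log(n_rows) / rounds)  # standard learning rate
    cumulative = [0.0] * n_rows
    average = [0.0] * n_rows
    for _ in range(rounds):
        base = min(cumulative)  # shift exponents for numerical stability
        weights = [math.exp(-eta * (c - base)) for c in cumulative]
        total = sum(weights)
        p = [w / total for w in weights]
        # Column player picks the column that maximizes the expected loss.
        expected = [sum(p[i] * loss[i][j] for i in range(n_rows)) for j in range(n_cols)]
        j_star = max(range(n_cols), key=expected.__getitem__)
        for i in range(n_rows):
            cumulative[i] += loss[i][j_star]
            average[i] += p[i] / rounds
    return average


def worst_case_loss(p, loss):
    """Expected loss of mixed strategy p against the best column response."""
    n_rows, n_cols = len(loss), len(loss[0])
    return max(sum(p[i] * loss[i][j] for i in range(n_rows)) for j in range(n_cols))


# Hypothetical missed-detection probabilities (rows: sensor orientations,
# columns: intruder paths). The game value of this matrix is 0.5.
loss_matrix = [[0.2, 0.9], [0.8, 0.1]]
avg_strategy = hedge_zero_sum(loss_matrix, rounds=2000)
exploitability_bound = worst_case_loss(avg_strategy, loss_matrix)
```

The averaged strategy is only an approximate equilibrium: its worst-case loss exceeds the game value by at most the average regret, which shrinks on the order of 1/sqrt(rounds).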

Jayanth Bhargav Shreyas Sundaram Mahsa Ghasemi 19 pages, 1 figure http://arxiv.org/abs/2409.16087v2 2025-04-20T18:10:42Z 2024-09-24T13:40:05Z Exact Null Controllability of Non-Autonomous Conformable Fractional Semi-Linear Systems with Nonlocal Conditions

We study the exact null controllability of a class of non-autonomous conformable fractional semi-linear evolution systems with nonlocal initial conditions in Hilbert spaces. The analysis is carried out within the framework of conformable fractional calculus and linear evolution operator theory. Under suitable assumptions, we establish the existence of mild solutions and provide sufficient conditions for exact null controllability. Notably, the nonlocal term is allowed to be continuous without requiring compactness or Lipschitz-type conditions. An example is included to illustrate the applicability of the main results.

Dev Prakash Jha Raju K. George 24 pages, 0 figure. arXiv admin note: text overlap with arXiv:2408.13814 http://arxiv.org/abs/2504.00364v2 2025-04-20T17:21:22Z 2025-04-01T02:29:11Z Control Barrier Functions via Minkowski Operations for Safe Navigation among Polytopic Sets

Safely navigating around obstacles while respecting the dynamics, control, and geometry of the underlying system is a key challenge in robotics. Control Barrier Functions (CBFs) generate safe control policies by considering system dynamics and geometry when calculating safe forward-invariant sets. Existing CBF-based methods often rely on conservative shape approximations, like spheres or ellipsoids, which have explicit and differentiable distance functions. In this paper, we propose an optimization-defined CBF that directly considers the exact Signed Distance Function (SDF) between a polytopic robot and polytopic obstacles. Inspired by the Gilbert-Johnson-Keerthi (GJK) algorithm, we formulate both (i) minimum distance and (ii) penetration depth between polytopic sets as convex optimization problems in the space of Minkowski difference operations (the MD-space). Convenient geometric properties of the MD-space enable the derivatives of implicit SDF between two polytopes to be computed via differentiable optimization. We demonstrate the proposed framework in three scenarios including pure translation, initialization inside an unsafe set, and multi-obstacle avoidance. These three scenarios highlight the generation of a non-conservative maneuver, a recovery after starting in collision, and the consideration of multiple obstacles via pairwise CBF constraint, respectively.

Yi-Hsuan Chen Shuo Liu Wei Xiao Calin Belta Michael Otte 8 pages, 3 figures http://arxiv.org/abs/2504.14673v1 2025-04-20T16:33:09Z 2025-04-20T16:33:09Z Moment Sum-of-Squares Hierarchy for Gromov Wasserstein: Continuous Extensions and Sample Complexity

The Gromov-Wasserstein (GW) problem is an extension of the classical optimal transport problem to settings where the source and target distributions reside in incomparable spaces, and for which a cost function that attributes the price of moving resources is not available. The sum-of-squares (SOS) hierarchy is a principled method for deriving tractable semidefinite relaxations to generic polynomial optimization problems. In this work, we apply ideas from the moment-SOS hierarchy to solve the GW problem. More precisely, we identify extensions of the moment-SOS hierarchy, previously introduced for the discretized GW problem, such that they remain valid for general probability distributions. This process requires a suitable generalization of positive semidefiniteness over finite-dimensional vector spaces to the space of probability distributions. We prove the following properties concerning these continuous extensions: First, these relaxations form a genuine hierarchy in that the optimal value converges to the GW distance. Second, each of these relaxations induces a pseudo-metric over the collection of metric measure spaces. Crucially, unlike the GW problem, these induced instances are tractable to compute -- the discrete analogs are expressible as semidefinite programs and hence are tractable to solve. Separately from these properties, we also establish a statistical consistency result arising from sampling the source and target distributions. Our work suggests fascinating applications of the SOS hierarchy to optimization problems over probability distributions in settings where the objective and constraint depend on these distributions in a polynomial way.

Hoang Anh Tran Binh Tuan Nguyen Yong Sheng Soh http://arxiv.org/abs/2310.04321v3 2025-04-20T14:17:02Z 2023-10-06T15:33:40Z The Value of Ancillary Services for Electrolyzers

Although primarily designed for hydrogen production, electrolyzers can support power systems by providing various ancillary services, opening new revenue streams that enhance their economic viability. This paper investigates the participation of an electrolyzer in frequency-supporting reserve markets, analyzing how bid structures and activation intensities affect its value. We develop a mixed-integer linear program to co-optimize electricity procurement and reserve provision, and analytically derive the opportunity cost of reserve provision, which determines the optimal bid price. Using historical price and frequency data from western Denmark, we show that asymmetric, hourly reserve products often entail no opportunity cost and can increase profits by up to 47%. However, energy-intensive reserves may disrupt hydrogen production and risk unmet demand. Our findings reveal that flexible bidding can mitigate these risks while maintaining profitability. We also highlight the benefits of diversifying across reserve products and offer two recommendations: System operators should reconsider reserve bid structures to better accommodate electrolyzers, and electrolyzer owners should not overlook energy-intensive reserve services when hydrogen demand is flexible.

Andrea Gloppen Johnsen Lesia Mitridati Donato Zarrilli Jalal Kazempour http://arxiv.org/abs/2504.08223v2 2025-04-20T13:05:35Z 2025-04-11T03:11:51Z Stochastic momentum ADMM for nonconvex and nonsmooth optimization with application to PnP algorithm

This paper proposes SMADMM, a single-loop Stochastic Momentum Alternating Direction Method of Multipliers for solving a class of nonconvex and nonsmooth composite optimization problems. SMADMM achieves the optimal oracle complexity of $\mathcal{O}(\epsilon^{-3/2})$ in the online setting. Unlike previous stochastic ADMM algorithms that require large mini-batches or a double-loop structure, SMADMM uses only $\mathcal{O}(1)$ stochastic gradient evaluations per iteration and avoids costly restarts. To further improve practicality, we incorporate dynamic step sizes and penalty parameters, proving that SMADMM maintains its optimal complexity without the need for large initial batches. We also develop PnP-SMADMM by integrating plug-and-play priors, and establish its theoretical convergence under mild assumptions. Extensive experiments on classification, CT image reconstruction, and phase retrieval tasks demonstrate that our approach outperforms existing stochastic ADMM methods both in accuracy and efficiency, validating our theoretical results.

Kangkang Deng Shuchang Zhang Boyu Wang Jiachen Jin Juan Zhou Hongxia Wang 27 Pages http://arxiv.org/abs/2201.00193v2 2025-04-20T12:25:38Z 2022-01-01T14:19:03Z On the facet pivot simplex method for linear programming II: a linear iteration bound

The Hirsch Conjecture stated that any $d$-dimensional polytope with $n$ facets has a diameter at most equal to $n - d$. This conjecture was disproved by Santos (A counterexample to the Hirsch Conjecture, Annals of Mathematics, 176(1) 383-412, 2012). The implication of Santos' work is that no vertex pivot algorithm can solve the linear programming problem in at most $n - d$ vertex pivot iterations in the worst case. In the first part of this series of papers, we proposed a facet pivot method. In this paper, we show that the proposed facet pivot method can solve the canonical linear programming problem in the worst case in at most $n-d$ facet pivot iterations. This work was inspired by Smale's Problem 9 (Mathematical problems for the next century, In Arnold, V. I.; Atiyah, M.; Lax, P.; Mazur, B. Mathematics: frontiers and perspectives, American Mathematical Society, 271-294, 1999).

Yaguang Yang An error in Section 3 is found. I am working on a fix http://arxiv.org/abs/2503.22124v4 2025-04-20T11:27:48Z 2025-03-28T03:56:12Z Scheduling problem of aircrafts on a same runway and dual runways

In this paper, the scheduling problems of landing and takeoff aircraft on the same runway and on dual runways are addressed. In contrast to the approaches based on mixed-integer optimization models in existing works, our approach focuses on the minimum separation times between aircraft by introducing some necessary assumptions and new concepts including relevance, breakpoint aircraft, path and class-monotonically-decreasing sequence. Four scheduling problems are discussed, including the landing scheduling problem, the takeoff scheduling problem, and mixed landing and takeoff scheduling problems on the same runway and on dual runways, with the consideration of conversions between different aircraft sequences in typical scenarios. Two real-time optimal algorithms are proposed for the four scheduling problems by fully exploiting the combinations of different classes of aircraft, and necessary definitions, lemmas and theorems are presented for the optimal convergence of the algorithms. Numerical examples are presented to show the effectiveness of the proposed algorithms. In particular, when $100$ aircraft are considered, by using the algorithm in this paper, the optimal solution can be obtained in less than $5$ seconds, while by using the CPLEX software to solve the mixed-integer optimization model, the optimal solution cannot be obtained within $1$ hour.

Peng Lin Haopeng Yang Weihua Gui