http://arxiv.org/api/Gb3y7GBmmhIN/mz6P/hqeKMtxVk
2025-04-22T00:00:00-04:00
53757
15
15
http://arxiv.org/abs/2504.14853v1
2025-04-21T04:35:56Z
2025-04-21T04:35:56Z
Output regulation for an unstable wave equation with output delay and
one measurement only
This paper addresses the output regulation problem for a one-dimensional
unstable wave equation subject to output delay and all-channel disturbances
with unknown frequencies and amplitudes. First, this problem is transformed
into a stabilization problem for an unstable wave equation with output delay
and disturbances by employing regulator equations. Subsequently, a
backstepping-based feedforward regulator is proposed to exponentially stabilize
this system. To track the states of the unstable wave equation, the time
interval is partitioned into two segments. Observers and predictors are
designed on these two intervals, respectively. The observers
comprise two components: a state observer proposed via dynamic compensators and
an adaptive observer designed by the adaptive internal model method. Finally, a
novel error-based feedback controller is derived using a single measurement,
ensuring exponential convergence of the tracking error to zero. This work
provides the first solution to the output regulation problem for
distributed parameter systems (DPS) with output delay. Numerical simulations
are provided to illustrate the results.
Shen Wang
Zhong-Jie Han
Shuangxi Huang
Zhi-Xue Zhao
http://arxiv.org/abs/2407.00359v6
2025-04-21T04:24:58Z
2024-06-29T08:12:36Z
Multicriteria Optimization and Decision Making: Principles, Algorithms
and Case Studies
Real-world decision and optimization problems often involve constraints and
conflicting criteria. For example, choosing a travel method must balance speed,
cost, environmental footprint, and convenience. Similarly, designing an
industrial process must consider safety, environmental impact, and cost
efficiency. Ideal solutions where all objectives are optimally met are rare;
instead, we seek good compromises and aim to avoid lose-lose scenarios.
Multicriteria optimization offers computational techniques to compute Pareto
optimal solutions, aiding decision analysis and decision making. This reader
offers an introduction to this topic and has been developed on the basis of the
revised edition of the reader for the MSc computer science course
"Multicriteria Optimization and Decision Analysis" at the Leiden Institute of
Advanced Computer Science, Leiden University, The Netherlands. This course was
taught annually by the first author from 2007 to 2023 as a single semester
course with lectures and practicals. Our aim was to make the material
accessible to MSc students who do not study mathematics as their core
discipline by introducing basic numerical analysis concepts when necessary and
providing numerical examples for interesting cases. The introduction is
organized in a unique didactic manner developed by the authors, starting from
simpler concepts such as linear programming and single-point methods, and
advancing from these to more difficult concepts such as optimality conditions
for nonlinear optimization and set-oriented solution algorithms. In addition, we
focus on the mathematical modeling and foundations rather than on specific
algorithms, though not excluding the discussion of some representative examples
of solution algorithms.
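As a small illustration of Pareto optimality for the travel example mentioned above, the following sketch filters out dominated options. The option names and their (time, cost, emissions) values are hypothetical and chosen only for illustration; all three criteria are assumed to be minimized.

```python
from typing import Dict, List, Tuple

Criteria = Tuple[float, ...]


def dominates(a: Criteria, b: Criteria) -> bool:
    """Return True if a is no worse than b in every criterion and
    strictly better in at least one (all criteria are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_optimal(options: Dict[str, Criteria]) -> List[str]:
    """Return the sorted names of the options not dominated by any other."""
    return sorted(
        name
        for name, value in options.items()
        if not any(dominates(other, value) for other in options.values())
    )


# Hypothetical travel options: (travel time in hours, cost in EUR, kg CO2).
travel_options = {
    "plane": (1.5, 120.0, 90.0),
    "train": (3.0, 60.0, 10.0),
    "car": (3.5, 70.0, 40.0),
    "bus": (5.0, 25.0, 15.0),
}

efficient = pareto_optimal(travel_options)
print(efficient)
```

In this made-up data the car is dominated by the train (slower, more expensive, and higher emissions), while the remaining options are mutually incomparable compromises.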
Michael Emmerich
André Deutz
102 pages, Lecture notes
http://arxiv.org/abs/2410.21676v4
2025-04-21T04:19:56Z
2024-10-29T02:54:06Z
How Does Critical Batch Size Scale in Pre-training?
Training large-scale models under given resources requires careful design of
parallelism strategies. In particular, the efficiency notion of critical batch
size (CBS), concerning the compromise between time and compute, marks the
threshold beyond which greater data parallelism leads to diminishing returns.
To operationalize it, we propose a measure of CBS and pre-train a series of
auto-regressive language models, ranging from 85 million to 1.2 billion
parameters, on the C4 dataset. Through extensive hyper-parameter sweeps and
careful control of factors such as batch size, momentum, and learning rate
along with its scheduling, we systematically investigate the impact of scale on
CBS. Then we fit scaling laws with respect to model and data sizes to decouple
their effects. Overall, our results demonstrate that CBS scales primarily with
data size rather than model size, a finding we justify theoretically through
the analysis of infinite-width limits of neural networks and
infinite-dimensional least squares regression. Of independent interest, we
highlight the importance of common hyper-parameter choices and strategies for
studying large-scale pre-training beyond fixed training durations.
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Foster
Sham Kakade
ICLR 2025, Blog post:
https://kempnerinstitute.harvard.edu/research/deeper-learning/how-does-critical-batch-size-scale-in-pre-training-decoupling-data-and-model-size
http://arxiv.org/abs/2504.14841v1
2025-04-21T03:44:46Z
2025-04-21T03:44:46Z
(Sub)Exponential Quantum Speedup for Optimization
We demonstrate provable (sub)exponential quantum speedups in both discrete
and continuous optimization, achieved through simple and natural quantum
optimization algorithms, namely the quantum adiabatic algorithm for discrete
optimization and quantum Hamiltonian descent for continuous optimization. Our
result builds on the Gilyén--Hastings--Vazirani (sub)exponential oracle
separation for adiabatic quantum computing. With a sequence of perturbative
reductions, we compile their construction into two standalone objective
functions, whose oracles can be directly leveraged by the plain adiabatic
evolution and Schrödinger operator evolution for discrete and continuous
optimization, respectively.
Jiaqi Leng
Kewen Wu
Xiaodi Wu
Yufan Zheng
69 pages, 6 figures
http://arxiv.org/abs/2504.14834v1
2025-04-21T03:29:09Z
2025-04-21T03:29:09Z
Output regulation for a reaction-diffusion system with input delay and
unknown frequency
This study solves the output regulation problem for a reaction-diffusion
system subject to input delay and disturbances with unknown frequencies and
amplitudes in all channels. The
principal innovation emerges from a novel adaptive control architecture that
synergizes the modal decomposition technique with a dual-observer mechanism,
enabling real-time concurrent estimation of unmeasurable system states and
disturbances through a state observer and an adaptive disturbance estimator.
Unlike existing approaches limited to either delay compensation or partial
disturbance rejection, our methodology overcomes the technical barrier of
coordinating these two requirements through a rigorously constructed
tracking-error-based controller, achieving exponential convergence of system
output to reference signals. Numerical simulations are presented to validate
the effectiveness of the proposed output feedback control strategy.
Shen Wang
Zhong-Jie Han
Kai Liu
Zhi-Xue Zhao
http://arxiv.org/abs/2410.21863v4
2025-04-21T01:55:13Z
2024-10-29T08:56:34Z
On invariance of observability for BSDEs and its applications to
stochastic control systems
In this paper, we establish the invariance of observability for the observed
backward stochastic differential equations (BSDEs) with constant coefficients,
relative to the filtered probability space. This signifies that the
observability of these observed BSDEs with constant coefficients remains
unaffected by the selection of the filtered probability space. As an
illustrative application, we demonstrate that for stochastic control systems
with constant coefficients, weak observability, approximate null
controllability with cost, and stabilizability are equivalent across some or
any filtered probability spaces.
Bao-Zhu Guo
Huaiqiang Yu
Meixuan Zhang
26 Pages
http://arxiv.org/abs/2504.14741v1
2025-04-20T21:07:59Z
2025-04-20T21:07:59Z
AltGDmin: Alternating GD and Minimization for Partly-Decoupled
(Federated) Optimization
This article describes a novel optimization solution framework, called
alternating gradient descent (GD) and minimization (AltGDmin), that is useful
for many problems for which alternating minimization (AltMin) is a popular
solution. AltMin is a special case of the block coordinate descent algorithm
that is useful for problems in which minimization w.r.t. one subset of variables,
keeping the other fixed, is closed form or otherwise reliably solved. Denote the
two blocks/subsets of the optimization variables Z by Za, Zb, i.e., Z = {Za,
Zb}. AltGDmin is often a faster solution than AltMin for any problem for which
(i) the minimization over one set of variables, Zb, is much quicker than that
over the other set, Za; and (ii) the cost function is differentiable w.r.t. Za.
Often, the reason for one minimization to be quicker is that the problem is
"decoupled" for Zb and each of the decoupled problems is quick to solve. This
decoupling is also what makes AltGDmin communication-efficient for federated
settings.
Important examples where this assumption holds include (a) low rank
column-wise compressive sensing (LRCS) and low rank matrix completion (LRMC); (b)
their outlier-corrupted extensions such as robust PCA, robust LRCS and robust
LRMC; (c) phase retrieval and its sparse and low-rank model based extensions;
(d) tensor extensions of many of these problems such as tensor LRCS and tensor
completion; and (e) many partly discrete problems where GD does not apply --
such as clustering, unlabeled sensing, and mixed linear regression. LRCS finds
important applications in multi-task representation learning and few shot
learning, federated sketching, and accelerated dynamic MRI. LRMC and robust PCA
find important applications in recommender systems, computer vision and video
analytics.
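The alternation described above can be sketched on a toy problem. This is a minimal illustration, not the author's algorithm as published: it assumes a fully observed low-rank factorization cost f(U, B) = ||Y - U B||_F^2, where U plays the role of Za and B the role of Zb. The random initialization, the step size, and the QR re-orthonormalization are assumptions made for this sketch.

```python
import numpy as np


def altgdmin_sketch(Y, r, iters=300, seed=0):
    """Alternate exact minimization over B with one gradient step on U
    for the toy cost ||Y - U @ B||_F^2 (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    # Random orthonormal start for U (an assumption of this sketch).
    U, _ = np.linalg.qr(rng.standard_normal((n, r)))
    # Conservative step size based on the spectral norm of Y.
    eta = 0.5 / np.linalg.norm(Y, 2) ** 2
    losses = []
    for _ in range(iters):
        # Minimization over Zb = B: least squares, decoupled across columns of Y.
        B = np.linalg.lstsq(U, Y, rcond=None)[0]
        residual = Y - U @ B
        losses.append(float(np.sum(residual ** 2)))
        # Single gradient step over Za = U, then re-orthonormalize via QR.
        grad_U = -2.0 * residual @ B.T
        U, _ = np.linalg.qr(U - eta * grad_U)
    # Final B consistent with the returned U.
    B = np.linalg.lstsq(U, Y, rcond=None)[0]
    return U, B, losses


# Synthetic rank-3 data for a quick check.
data_rng = np.random.default_rng(1)
U_true, _ = np.linalg.qr(data_rng.standard_normal((30, 3)))
Y_demo = U_true @ data_rng.standard_normal((3, 40))
U_hat, B_hat, loss_history = altgdmin_sketch(Y_demo, r=3)
print(loss_history[0], loss_history[-1])
```

In a federated setting, each column of B could in principle be updated at the node holding the corresponding column of Y, so that only gradient information for U would need to be communicated; the sketch above runs everything on one machine.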
Namrata Vaswani
To appear in Foundations and Trends in Optimization (NOW publishers)
http://arxiv.org/abs/2504.14725v1
2025-04-20T19:39:37Z
2025-04-20T19:39:37Z
Sensor Scheduling in Intrusion Detection Games with Uncertain Payoffs
We study the problem of sensor scheduling for an intrusion detection task. We
model this as a two-player zero-sum game over a graph, where the defender
(Player 1) seeks to identify the optimal strategy for scheduling sensor
orientations to minimize the probability of missed detection at minimal cost,
while the intruder (Player 2) aims to identify the optimal path selection
strategy to maximize missed detection probability at minimal cost. The
defender's strategy space grows exponentially with the number of sensors,
making direct computation of the Nash Equilibrium (NE) strategies
computationally expensive. To tackle this, we propose a distributed variant of
the Weighted Majority algorithm that exploits the structure of the game's
payoff matrix, enabling efficient computation of the NE strategies with
provable convergence guarantees. Next, we consider a more challenging scenario
where the defender lacks knowledge of the true sensor models and, consequently,
the game's payoff matrix. For this setting, we develop online learning
algorithms that leverage bandit feedback from sensors to estimate the NE
strategies. By building on existing results from perturbation theory and online
learning in matrix games, we derive high-probability order-optimal regret
bounds for our algorithms. Finally, through simulations, we demonstrate the
empirical performance of our proposed algorithms in both known and unknown
payoff scenarios.
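For orientation, the sketch below shows the standard (centralized) multiplicative-weights scheme for approximating Nash equilibrium strategies of a zero-sum matrix game. It is not the distributed variant proposed in the paper, and the payoff matrix, learning rate, and iteration count are assumptions chosen for illustration. The row player (defender) minimizes x^T A y and the column player (intruder) maximizes it.

```python
import numpy as np


def multiplicative_weights_ne(A, iters=5000):
    """Self-play with multiplicative-weights updates on a zero-sum game.

    Returns the time-averaged strategies and their duality gap, which is
    zero exactly at a Nash equilibrium. Entries of A are assumed to lie
    in [-1, 1]."""
    m, n = A.shape
    eta = np.sqrt(np.log(max(m, n)) / iters)  # standard learning-rate choice
    x = np.full(m, 1.0 / m)  # row player's mixed strategy
    y = np.full(n, 1.0 / n)  # column player's mixed strategy
    x_avg = np.zeros(m)
    y_avg = np.zeros(n)
    for _ in range(iters):
        x_avg += x
        y_avg += y
        row_loss = A @ y    # expected payoff conceded by each row action
        col_gain = A.T @ x  # expected payoff earned by each column action
        x = x * np.exp(-eta * row_loss)
        y = y * np.exp(eta * col_gain)
        x /= x.sum()  # renormalize to keep the weights a distribution
        y /= y.sum()
    x_avg /= iters
    y_avg /= iters
    # Best-response value against x_avg minus best-response value against y_avg.
    gap = float(np.max(x_avg @ A) - np.min(A @ y_avg))
    return x_avg, y_avg, gap


# Hypothetical 2x2 payoff matrix (e.g., missed-detection scores).
A_demo = np.array([[0.5, -1.0], [-1.0, 1.0]])
x_bar, y_bar, duality_gap = multiplicative_weights_ne(A_demo)
print(x_bar, y_bar, duality_gap)
```

The averaged strategies, rather than the last iterates, are returned because the no-regret guarantee of this scheme applies to the time averages.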
Jayanth Bhargav
Shreyas Sundaram
Mahsa Ghasemi
19 pages, 1 figure
http://arxiv.org/abs/2409.16087v2
2025-04-20T18:10:42Z
2024-09-24T13:40:05Z
Exact Null Controllability of Non-Autonomous Conformable Fractional
Semi-Linear Systems with Nonlocal Conditions
We study the exact null controllability of a class of non-autonomous
conformable fractional semi-linear evolution systems with nonlocal initial
conditions in Hilbert spaces. The analysis is carried out within the framework
of conformable fractional calculus and linear evolution operator theory. Under
suitable assumptions, we establish the existence of mild solutions and provide
sufficient conditions for exact null controllability. Notably, the nonlocal
term is allowed to be continuous without requiring compactness or
Lipschitz-type conditions. An example is included to illustrate the
applicability of the main results.
Dev Prakash Jha
Raju K. George
24 pages, 0 figures. arXiv admin note: text overlap with
arXiv:2408.13814
http://arxiv.org/abs/2504.00364v2
2025-04-20T17:21:22Z
2025-04-01T02:29:11Z
Control Barrier Functions via Minkowski Operations for Safe Navigation
among Polytopic Sets
Safely navigating around obstacles while respecting the dynamics, control,
and geometry of the underlying system is a key challenge in robotics. Control
Barrier Functions (CBFs) generate safe control policies by considering system
dynamics and geometry when calculating safe forward-invariant sets. Existing
CBF-based methods often rely on conservative shape approximations, like spheres
or ellipsoids, which have explicit and differentiable distance functions. In
this paper, we propose an optimization-defined CBF that directly considers the
exact Signed Distance Function (SDF) between a polytopic robot and polytopic
obstacles. Inspired by the Gilbert-Johnson-Keerthi (GJK) algorithm, we
formulate both (i) minimum distance and (ii) penetration depth between
polytopic sets as convex optimization problems in the space of Minkowski
difference operations (the MD-space). Convenient geometric properties of the
MD-space enable the derivatives of implicit SDF between two polytopes to be
computed via differentiable optimization. We demonstrate the proposed framework
in three scenarios including pure translation, initialization inside an unsafe
set, and multi-obstacle avoidance. These three scenarios highlight the
generation of a non-conservative maneuver, a recovery after starting in
collision, and the consideration of multiple obstacles via pairwise CBF
constraints, respectively.
Yi-Hsuan Chen
Shuo Liu
Wei Xiao
Calin Belta
Michael Otte
8 pages, 3 figures
http://arxiv.org/abs/2504.14673v1
2025-04-20T16:33:09Z
2025-04-20T16:33:09Z
Moment Sum-of-Squares Hierarchy for Gromov Wasserstein: Continuous
Extensions and Sample Complexity
The Gromov-Wasserstein (GW) problem is an extension of the classical optimal
transport problem to settings where the source and target distributions reside
in incomparable spaces, and for which a cost function that attributes the price
of moving resources is not available. The sum-of-squares (SOS) hierarchy is a
principled method for deriving tractable semidefinite relaxations to generic
polynomial optimization problems. In this work, we apply ideas from the
moment-SOS hierarchy to solve the GW problem.
More precisely, we identify extensions of the moment-SOS hierarchy,
previously introduced for the discretized GW problem, such that they remain
valid for general probability distributions. This process requires a suitable
generalization of positive semidefiniteness over finite-dimensional vector
spaces to the space of probability distributions. We prove the following
properties concerning these continuous extensions: First, these relaxations
form a genuine hierarchy in that the optimal value converges to the GW
distance. Second, each of these relaxations induces a pseudo-metric over the
collection of metric measure spaces. Crucially, unlike the GW problem, these
induced instances are tractable to compute -- the discrete analogs are
expressible as semidefinite programs and hence are tractable to solve.
Separately from these properties, we also establish a statistical consistency
result arising from sampling the source and target distributions. Our work
suggests fascinating applications of the SOS hierarchy to optimization problems
over probability distributions in settings where the objective and constraint
depend on these distributions in a polynomial way.
Hoang Anh Tran
Binh Tuan Nguyen
Yong Sheng Soh
http://arxiv.org/abs/2310.04321v3
2025-04-20T14:17:02Z
2023-10-06T15:33:40Z
The Value of Ancillary Services for Electrolyzers
Although primarily designed for hydrogen production, electrolyzers can
support power systems by providing various ancillary services, opening new
revenue streams that enhance their economic viability. This paper investigates
the participation of an electrolyzer in frequency-supporting reserve markets,
analyzing how bid structures and activation intensities affect its value. We
develop a mixed-integer linear program to co-optimize electricity procurement
and reserve provision, and analytically derive the opportunity cost of reserve
provision, which determines the optimal bid price. Using historical price and
frequency data from western Denmark, we show that asymmetric, hourly reserve
products often entail no opportunity cost and can increase profits by up to
47%. However, energy-intensive reserves may disrupt hydrogen production and
risk unmet demand. Our findings reveal that flexible bidding can mitigate these
risks while maintaining profitability. We also highlight the benefits of
diversifying across reserve products and offer two recommendations: System
operators should reconsider reserve bid structures to better accommodate
electrolyzers, and electrolyzer owners should not overlook energy-intensive
reserve services when hydrogen demand is flexible.
Andrea Gloppen Johnsen
Lesia Mitridati
Donato Zarrilli
Jalal Kazempour
http://arxiv.org/abs/2504.08223v2
2025-04-20T13:05:35Z
2025-04-11T03:11:51Z
Stochastic momentum ADMM for nonconvex and nonsmooth optimization with
application to PnP algorithm
This paper proposes SMADMM, a single-loop Stochastic Momentum Alternating
Direction Method of Multipliers for solving a class of nonconvex and nonsmooth
composite optimization problems. SMADMM achieves the optimal oracle complexity
of $\mathcal{O}(\epsilon^{-3/2})$ in the online setting. Unlike previous
stochastic ADMM algorithms that require large mini-batches or a double-loop
structure, SMADMM uses only $\mathcal{O}(1)$ stochastic gradient evaluations
per iteration and avoids costly restarts. To further improve practicality, we
incorporate dynamic step sizes and penalty parameters, proving that SMADMM
maintains its optimal complexity without the need for large initial batches. We
also develop PnP-SMADMM by integrating plug-and-play priors, and establish its
theoretical convergence under mild assumptions. Extensive experiments on
classification, CT image reconstruction, and phase retrieval tasks demonstrate
that our approach outperforms existing stochastic ADMM methods both in accuracy
and efficiency, validating our theoretical results.
Kangkang Deng
Shuchang Zhang
Boyu Wang
Jiachen Jin
Juan Zhou
Hongxia Wang
27 Pages
http://arxiv.org/abs/2201.00193v2
2025-04-20T12:25:38Z
2022-01-01T14:19:03Z
On the facet pivot simplex method for linear programming II: a linear
iteration bound
The Hirsch Conjecture stated that any $d$-dimensional polytope with $n$ facets
has a diameter at most equal to $n - d$. This conjecture was disproved by
Santos (A counterexample to the Hirsch Conjecture, Annals of Mathematics,
172(1) 383-412, 2012). The implication of Santos' work is that no vertex
pivot algorithm can guarantee solving the linear programming problem within
$n - d$ vertex pivot iterations in the worst case.
In the first part of this series of papers, we proposed a facet pivot
method. In this paper, we show that the proposed facet pivot method can solve
the canonical linear programming problem in the worst case in at most $n-d$
facet pivot iterations. This work was inspired by Smale's Problem 9
(Mathematical problems for the next century, In Arnold, V. I.; Atiyah, M.; Lax,
P.; Mazur, B. Mathematics: frontiers and perspectives, American Mathematical
Society, 271-294, 1999).
Yaguang Yang
An error was found in Section 3. I am working on a fix.
http://arxiv.org/abs/2503.22124v4
2025-04-20T11:27:48Z
2025-03-28T03:56:12Z
Scheduling problem of aircrafts on a same runway and dual runways
In this paper, the scheduling problems of landing and takeoff aircraft on the
same runway and on dual runways are addressed. In contrast to the approaches
based on mixed-integer optimization models in existing works, our approach
focuses on the minimum separation times between aircraft by introducing some
necessary assumptions and new concepts including relevance, breakpoint
aircraft, path and class-monotonically-decreasing sequence. Four scheduling
problems are discussed: the landing scheduling problem, the takeoff scheduling
problem, and the mixed landing and takeoff scheduling problems on the same runway
and on dual runways, with the consideration of conversions between different
aircraft sequences in typical scenarios. Two real-time optimal algorithms are
proposed for the four scheduling problems by fully exploiting the combinations
of different classes of aircraft, and necessary definitions, lemmas and
theorems are presented for the optimal convergence of the algorithms. Numerical
examples are presented to show the effectiveness of the proposed algorithms. In
particular, for $100$ aircraft, the proposed algorithm obtains the optimal
solution in less than $5$ seconds, whereas the CPLEX software applied to the
mixed-integer optimization model cannot obtain the optimal solution within
$1$ hour.
Peng Lin
Haopeng Yang
Weihua Gui