http://arxiv.org/api/xUaf7PXciuOcp0i35YqClgHt3Aw
Updated: 2025-04-21T00:00:00-04:00

http://arxiv.org/abs/2410.14054v2
Updated: 2025-04-21T16:57:16Z
Published: 2024-10-17T21:52:00Z
Adaptive Gradient Normalization and Independent Sampling for (Stochastic) Generalized-Smooth Optimization

Recent studies have shown that many nonconvex machine learning problems
satisfy a generalized-smooth condition that extends beyond traditional smooth
nonconvex optimization. However, the existing algorithms are not fully adapted
to such generalized-smooth nonconvex geometry and encounter significant
technical limitations in their convergence analysis. In this work, we first
analyze the convergence of adaptively normalized gradient descent under
function geometries characterized by generalized smoothness and a generalized
PŁ condition, revealing the advantage of adaptive gradient normalization.
Our results provide theoretical insights into adaptive normalization across
various scenarios. For stochastic generalized-smooth nonconvex optimization, we
propose \textbf{I}ndependent-\textbf{A}daptively \textbf{N}ormalized
\textbf{S}tochastic \textbf{G}radient \textbf{D}escent, which leverages
adaptive gradient normalization, independent sampling, and gradient clipping to
achieve an $\mathcal{O}(\epsilon^{-4})$ sample complexity under relaxed noise
assumptions. Experiments on large-scale nonconvex generalized-smooth problems
demonstrate the fast convergence of our algorithm.
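
As a rough illustration of gradient normalization on a generalized-smooth objective, the sketch below applies a normalized step $x_{t+1} = x_t - \gamma \nabla f(x_t) / \|\nabla f(x_t)\|^{\beta}$ to the quartic $f(x) = x^4/4$, whose gradient is not globally Lipschitz. The update form, the exponent `beta`, the step size `lr`, and the test function are assumptions chosen for illustration; they are not taken from the paper, and the stochastic variant with independent sampling and clipping is not reproduced here.

```python
import numpy as np

def normalized_gd(grad, x0, lr=0.5, beta=2.0 / 3.0, steps=50, tiny=1e-300):
    """Gradient descent with the step divided by ||grad||**beta (illustrative)."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        g = grad(x)
        # Guard against division by zero once the gradient vanishes.
        scale = max(np.linalg.norm(g) ** beta, tiny)
        x = x - lr * g / scale
    return x

def quartic_grad(x):
    # Gradient of f(x) = sum(x**4) / 4; it grows cubically, so f is not L-smooth.
    return x ** 3

x_final = normalized_gd(quartic_grad, [10.0])
print(abs(x_final[0]))  # very close to zero
```

In one dimension with `beta = 2/3` the update reduces to multiplying `x` by `1 - lr`, so the iterates contract geometrically. An unnormalized step of the same size from `x = 10` would instead jump to `10 - 0.5 * 1000 = -490`.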
Yufeng Yang, Erin Tripp, Yifan Sun, Shaofeng Zou, Yi Zhou
40 pages, 1 table

http://arxiv.org/abs/2412.17012v3
Updated: 2025-04-21T16:33:22Z
Published: 2024-12-22T13:21:34Z
Adaptive Control of Positive Systems with Application to Learning SSP

An adaptive controller is proposed and analyzed for the class of
infinite-horizon optimal control problems in positive linear systems presented
in (Ohlin et al., 2024b). This controller is derived from the solution of a
"data-driven algebraic equation" constructed using the model-free Bellman
equation from Q-learning. The equation is driven by data correlation matrices
that do not scale with the number of data points, enabling efficient online
implementation. Consequently, a sufficient condition guaranteeing stability and
robustness to unmodeled dynamics is established. The derived results also
provide a quantitative characterization of the interplay between excitation
level and robustness to unmodeled dynamics. The class of optimal control
problems considered here is equivalent to Stochastic Shortest Path (SSP)
problems, allowing for a performance comparison between the proposed adaptive
policy and model-free algorithms for learning the stochastic shortest path, as
demonstrated in the numerical experiment.
Fethi Bencherki, Anders Rantzer
Accepted for publication in the Proceedings of the 7th Annual Learning for Dynamics and Control Conference (L4DC)

http://arxiv.org/abs/2504.15196v1
Updated: 2025-04-21T16:07:32Z
Published: 2025-04-21T16:07:32Z
Fully Adaptive Stepsizes: Which System Benefit More -- Centralized or Decentralized?

In decentralized optimization, the choice of stepsize plays a critical role
in algorithm performance. A common approach is to use a shared stepsize across
all agents to ensure convergence. However, selecting an optimal stepsize often
requires careful tuning, which can be time-consuming and may lead to slow
convergence, especially when there is significant variation in the smoothness
(L-smoothness) of local objective functions across agents. Individually tuning
stepsizes per agent is also impractical, particularly in large-scale networks.
To address these limitations, we propose AdGT, an adaptive gradient tracking
method that enables each agent to adjust its stepsize based on the smoothness
of its local objective. We prove that AdGT generates a sequence of iterates
that converges to the optimal consensus solution. Through numerical
experiments, we compare AdGT with fixed-stepsize gradient tracking methods and
demonstrate its superior performance. Additionally, we compare AdGT with
adaptive gradient descent (AdGD) in a centralized setting and observe that
fully adaptive stepsizes offer greater benefits in decentralized networks than
in centralized ones.
Diyako Ghaderyan, Stefan Werner

http://arxiv.org/abs/2504.15177v1
Updated: 2025-04-21T15:39:42Z
Published: 2025-04-21T15:39:42Z
An $rp$-adaptive method for accurate resolution of shock-dominated viscous flow based on implicit shock tracking

This work introduces an optimization-based $rp$-adaptive numerical method to
approximate solutions of viscous, shock-dominated flows using implicit shock
tracking and a high-order discontinuous Galerkin discretization on
traditionally coarse grids without nonlinear stabilization (e.g., artificial
viscosity or limiting). The proposed method adapts implicit shock tracking
methods, originally developed to align mesh faces with solution
discontinuities, to compress elements into viscous shocks and boundary layers,
functioning as a novel approach to aggressive $r$-adaptation. This form of
$r$-adaptation is achieved naturally as the minimizer of the enriched residual
with respect to the discrete flow variables and coordinates of the nodes of the
grid. Several innovations to the shock tracking optimization solver are
proposed to ensure sufficient mesh compression at viscous features to render
stabilization unnecessary, including residual weighting, step constraints and
modifications, and viscosity-based continuation. Finally, $p$-adaptivity is
used to locally increase the polynomial degree, with three clear benefits: it (1)
lessens the mesh compression requirements near shock waves and boundary layers,
(2) reduces the error in regions where $r$-adaptivity is not sufficient with
the given grid topology, and (3) reduces computational cost by performing a
majority of the $r$-adaptivity iterations on the coarsest discretization. A
series of numerical experiments show the proposed method effectively resolves
viscous, shock-dominated flows, including accurate prediction of heat flux
profiles produced by hypersonic flow over a cylinder, and compares favorably in
terms of accuracy per degree of freedom to $h$-adaptation with a high-order
discretization.
Huijing Dong, Masayuki Yano, Tianci Huang, Matthew J. Zahr
43 pages, 35 figures

http://arxiv.org/abs/2504.15117v1
Updated: 2025-04-21T14:13:15Z
Published: 2025-04-21T14:13:15Z
Symplectic Geometry in Hybrid and Impulsive Optimal Control

Hybrid dynamical systems are systems which undergo both continuous and
discrete transitions. The Bolza problem from optimal control theory is applied
to these systems and a hybrid version of Pontryagin's maximum principle is
presented. This hybrid maximum principle is formulated to emphasize its
geometric nature, which makes its study amenable to the tools of geometric
mechanics and symplectic geometry. One explicit benefit of this geometric
approach is that the symplectic structure (and hence the induced volume) is
preserved. This allows for a hybrid analog of caustics and conjugate points.
Additionally, an introductory analysis of singular solutions (beating and Zeno)
is discussed geometrically. This work concludes with a biological example where
beating can occur.
William Clark, Maria Oprea
Comments welcome

http://arxiv.org/abs/2404.08383v4
Updated: 2025-04-21T14:10:07Z
Published: 2024-04-12T10:37:24Z
Optimal Transport and Wasserstein Barycenter for Radially Contoured Distributions

The optimal transport and Wasserstein barycenter of Gaussian distributions
have been solved. In the literature, closed-form formulas for the Monge map, the
Wasserstein distance and the Wasserstein barycenter have been given. Moreover,
when Gaussian distributions extend more generally to elliptically contoured
distributions, similar results also hold true. In this case, Gaussian
distributions are regarded as elliptically contoured distributions with
generator function $e^{-x/2}$. However, there are few results about optimal
transport for elliptically contoured distributions with different generator
functions. In this paper, we specialize elliptically contoured distributions to
radially contoured distributions, study their optimal transport, and prove that
their Wasserstein barycenter is still radially contoured. For general
elliptically contoured distributions, we give two numerical counterexamples to
show that the Wasserstein barycenter of elliptically contoured distributions
does not have to be elliptically contoured.
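
For reference, the well-known Gaussian closed form mentioned at the start of this abstract can be evaluated directly: $W_2^2 = \|m_1 - m_2\|^2 + \operatorname{tr}\big(\Sigma_1 + \Sigma_2 - 2(\Sigma_1^{1/2} \Sigma_2 \Sigma_1^{1/2})^{1/2}\big)$. The sketch below covers only this Gaussian case; the function names and the numerical example are illustrative assumptions, and the radially contoured results of the paper are not reproduced.

```python
import numpy as np

def psd_sqrt(mat):
    """Symmetric positive semidefinite square root via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.T

def gaussian_w2_squared(m1, S1, m2, S2):
    """Squared 2-Wasserstein distance between N(m1, S1) and N(m2, S2)."""
    m1, m2 = np.asarray(m1, dtype=float), np.asarray(m2, dtype=float)
    S1, S2 = np.asarray(S1, dtype=float), np.asarray(S2, dtype=float)
    root1 = psd_sqrt(S1)
    cross = psd_sqrt(root1 @ S2 @ root1)
    return float(np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2.0 * cross))

d2 = gaussian_w2_squared([0.0, 0.0], np.eye(2), [3.0, 4.0], 4.0 * np.eye(2))
print(d2)  # approximately 27 = 25 + 2 * (1 + 4 - 2 * 2)
```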
Keyu Chen, Yunxin Zhang

http://arxiv.org/abs/2504.15113v1
Updated: 2025-04-21T14:06:25Z
Published: 2025-04-21T14:06:25Z
Adaptive sieving with semismooth Newton proximal augmented Lagrangian algorithm for multi-task Lasso problems

Multi-task learning enhances model generalization by jointly learning from
related tasks. This paper focuses on the $\ell_{1,\infty}$-norm constrained
multi-task learning problem, which promotes a shared feature representation
while inducing sparsity in task-specific parameters. We propose an adaptive
sieving (AS) strategy to efficiently generate a solution path for multi-task
Lasso problems. Each subproblem along the path is solved via an inexact
semismooth Newton proximal augmented Lagrangian (SSNPAL) algorithm,
achieving an asymptotically superlinear convergence rate. By exploiting the
Karush-Kuhn-Tucker (KKT) conditions and the inherent sparsity of multi-task
Lasso solutions, the SSNPAL algorithm solves a sequence of reduced
subproblems with small dimensions. This approach enables our method to scale
effectively to large problems. Numerical experiments on synthetic and
real-world datasets demonstrate the superior efficiency and robustness of our
algorithm compared to state-of-the-art solvers.
Lanyu Lin, Yong-Jin Liu, Bo Wang, Junfeng Yang

http://arxiv.org/abs/2409.04297v2
Updated: 2025-04-21T13:42:27Z
Published: 2024-09-06T14:13:57Z
Minimization of the Pseudospectral Abscissa of a Quadratic Matrix Polynomial

For a quadratic matrix polynomial dependent on parameters and a given
tolerance $\epsilon > 0$, the minimization of the $\epsilon$-pseudospectral
abscissa over the set of permissible parameter values is discussed, with
applications in damping optimization and brake squeal reduction in mind. An
approach is introduced that is based on nonsmooth and global optimization (or
smooth optimization techniques such as BFGS if there are many parameters)
equipped with a globally convergent criss-cross algorithm to compute the
$\epsilon$-pseudospectral abscissa objective when the matrix polynomial is of
small size. For the setting when the matrix polynomial is large, a subspace
framework is introduced, and it is argued formally that it solves the
minimization problem globally. The subspace framework restricts the
parameter-dependent matrix polynomial to small subspaces, and thus solves the
minimization problem for such restricted small matrix polynomials. It then
expands the subspaces using the minimizers for the restricted polynomials. The
proposed approach makes the global minimization of the
$\epsilon$-pseudospectral abscissa possible for a quadratic matrix polynomial
dependent on a few parameters and for sizes up to at least a few hundred. This
is illustrated on several examples originating from damping optimization.
Volker Mehrmann, Emre Mengi
29 pages, 5 figures

http://arxiv.org/abs/2504.15084v1
Updated: 2025-04-21T13:16:28Z
Published: 2025-04-21T13:16:28Z
Reconfiguration and Real-Time Operation of Networked Microgrids Under Load Uncertainty

Distribution networks are increasingly exposed to threats such as extreme
weather, aging infrastructure, and cyber risks--resulting in more frequent
contingencies and outages, a trend likely to persist. Microgrids, particularly
dynamic networked microgrids (DNMGs), offer a promising solution to mitigate
the impacts of such contingencies and enhance resiliency. However, distribution
networks present unique challenges due to their unbalanced nature and the
inherent uncertainty in both loads and generation. This paper builds upon our
prior work on the two-stage mixed-integer robust optimization problem for
configuring DNMGs, improving the solve time and scalability. Furthermore, we
introduce a model-free, real-time optimal power flow algorithm to manage DNMG
operations in the time between reconfigurations. A case study on a realistic
network based on part of the San Francisco Bay Area demonstrates the
scalability of both approaches. The case study also illustrates the ability to
maintain power flow feasibility as loads vary and operating conditions change
when the methods are used in tandem.
Hannah Moring, Bala Kameshwar Poolla, Harsha Nagarajan, Johanna L. Mathieu, Andrey Bernstein, David M. Fobes

http://arxiv.org/abs/2504.15062v1
Updated: 2025-04-21T12:41:35Z
Published: 2025-04-21T12:41:35Z
OPO: Making Decision-Focused Data Acquisition Decisions

We propose a model for making data acquisition decisions for variables in
contextual stochastic optimisation problems. Data acquisition decisions are
typically treated as separate and fixed. We explore problem settings in which
the acquisition of contextual variables is costly and consequently constrained.
The data acquisition problem is often solved heuristically for proxy objectives
such as coverage. The more intuitive objective is the downstream decision
quality as a result of data acquisition decisions. The whole pipeline can be
characterised as an optimise-then-predict-then-optimise (OPO) problem.
Analogously, much recent research has focused on how to integrate prediction
and optimisation (PO) in the form of decision-focused learning. We propose
leveraging differentiable optimisation to extend the integration to data
acquisition. We solve the data acquisition problem with well-defined
constraints by learning a surrogate linear objective function. We demonstrate
an application of this model on a shortest path problem for which we first have
to set a drone reconnaissance strategy to capture image segments serving as
inputs to a model that predicts travel costs. We ablate the problem with a
number of training modalities and demonstrate that the differentiable
optimisation approach outperforms random search strategies.
Egon Peršak, Miguel F. Anjos

http://arxiv.org/abs/2504.08191v2
Updated: 2025-04-21T11:52:34Z
Published: 2025-04-11T01:12:59Z
Optimal protection and vaccination against epidemics with reinfection risk

We consider the problem of optimal allocation of vaccination and protection
measures for the Susceptible-Infected-Recovered-Infected (SIRI) epidemiological
model, which generalizes the classical Susceptible-Infected-Recovered (SIR) and
Susceptible-Infected-Susceptible (SIS) epidemiological models by allowing for
reinfection. We first introduce the controlled SIRI dynamical model, and
discuss the existence and stability of the equilibrium points. We then
formulate a finite-horizon optimal control problem where the cost of
vaccination and protection is proportional to the mass of population that
adopts it. Our main contribution in this work arises from a detailed
investigation into the existence/non-existence of singular control inputs, and
establishing optimality of bang-bang controls. The optimality of bang-bang
control is established by solving an optimal control problem with a running
cost that is linear with respect to the input variables. The input variables
are associated with actions including vaccination and imposition of protective
measures, e.g., masking or isolation. In contrast to most prior works, we
rigorously establish the non-existence of singular controls, i.e., the
optimality of bang-bang control for our SIRI model. Under the assumption that
the reinfection rate exceeds the first-time infection rate, we characterize the
structure of both optimal control inputs and establish that the
vaccination control input admits a bang-bang structure. Numerical results
provide valuable insights into the evolution of the disease spread under
optimal control.
Urmee Maitra (Indian Institute of Technology, Kharagpur), Ashish R. Hota (Indian Institute of Technology, Kharagpur), Rohit Gupta (Indian Institute of Technology, Bombay), Alfred O. Hero (University of Michigan, Ann Arbor)
21 pages, 2 figures

http://arxiv.org/abs/2410.14592v2
Updated: 2025-04-21T11:13:32Z
Published: 2024-10-18T16:43:10Z
Contractivity and linear convergence in bilinear saddle-point problems: An operator-theoretic approach

We study the convex-concave bilinear saddle-point problem $\min_x \max_y f(x)
+ y^\top Ax - g(y)$, where both, only one, or neither of the functions $f$ and $g$
are strongly convex, and suitable rank conditions on the matrix $A$ hold. The
solution of this problem is at the core of many machine learning tasks. By
employing tools from monotone operator theory, we systematically prove the
contractivity (in turn, the linear convergence) of several first-order
primal-dual algorithms, including the Chambolle-Pock method. Our approach
results in concise proofs, and it yields new convergence guarantees and tighter
bounds compared to known results.
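
To make the problem class concrete, here is a minimal sketch of the standard Chambolle-Pock (primal-dual hybrid gradient) iteration on a toy instance with $f(x) = \tfrac{1}{2}\|x - a\|^2$ and $g(y) = \tfrac{1}{2}\|y\|^2$, so both functions are strongly convex. The choice of $f$, $g$, the matrix `A`, the vector `a`, and the step sizes are assumptions made for illustration; the paper's operator-theoretic analysis and its rate bounds are not reproduced here.

```python
import numpy as np

def chambolle_pock(A, a, iters=500):
    """PDHG for min_x max_y 0.5*||x - a||^2 + y^T A x - 0.5*||y||^2."""
    op_norm = np.linalg.norm(A, 2)   # spectral norm of A
    tau = sigma = 0.9 / op_norm      # ensures tau * sigma * ||A||^2 < 1
    x = np.zeros(A.shape[1])
    y = np.zeros(A.shape[0])
    for _ in range(iters):
        # Primal step: prox of tau*f with f(x) = 0.5*||x - a||^2.
        x_new = (x - tau * (A.T @ y) + tau * a) / (1.0 + tau)
        # Dual step at the extrapolated point 2*x_new - x:
        # prox of sigma*g with g(y) = 0.5*||y||^2.
        y = (y + sigma * (A @ (2.0 * x_new - x))) / (1.0 + sigma)
        x = x_new
    return x, y

A = np.array([[1.0, 2.0], [3.0, 4.0]])
a = np.array([1.0, -1.0])
x_cp, y_cp = chambolle_pock(A, a)

# For this instance the saddle point satisfies (I + A^T A) x = a and y = A x.
x_ref = np.linalg.solve(np.eye(2) + A.T @ A, a)
print(np.linalg.norm(x_cp - x_ref))  # distance to the closed-form solution
```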
Colin Dirren, Mattia Bianchi, Panagiotis D. Grontas, John Lygeros, Florian Dörfler
AISTATS 2025

http://arxiv.org/abs/2504.15019v1
Updated: 2025-04-21T11:05:03Z
Published: 2025-04-21T11:05:03Z
Feedback Stackelberg-Nash equilibria in difference games with quasi-hierarchical interactions and inequality constraints

In this paper, we study a class of two-player deterministic finite-horizon
difference games with coupled inequality constraints, where each player has two
types of decision variables: one involving sequential interactions and the
other simultaneous interactions. We refer to these as quasi-hierarchical
dynamic games and define a solution concept called the feedback
Stackelberg-Nash (FSN) equilibrium. Under a separability assumption on cost
functions, we formulate FSN solutions recursively using a dynamic
programming-like approach. We further show that the FSN solution for these
constrained games can be derived from the parametric feedback Stackelberg
solution of an associated unconstrained game with only sequential interactions,
given parameter choices that satisfy implicit complementarity conditions. For
the linear-quadratic case, we show that the FSN solutions are obtained by
reformulating these complementarity conditions as a single large-scale linear
complementarity problem. Finally, we illustrate our results with a dynamic
duopoly game with production constraints.
Partha Sarathi Mohapatra, Puduru Viswanadha Reddy, Georges Zaccour

http://arxiv.org/abs/2504.14987v1
Updated: 2025-04-21T09:30:21Z
Published: 2025-04-21T09:30:21Z
A general approach to distributed operator splitting

Splitting methods have emerged as powerful tools to address complex problems
by decomposing them into smaller solvable components. In this work, we develop
a general approach of forward-backward splitting methods for solving monotone
inclusion problems involving both set-valued and single-valued operators, where
the latter may lack cocoercivity. Our proposed approach, based on some
coefficient matrices, not only encompasses several important existing
algorithms but also extends to new ones, offering greater flexibility for
different applications. Moreover, by appropriately selecting the coefficient
matrices, the resulting algorithms can be implemented in a distributed and
decentralized manner.
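
As background, the classical forward-backward iteration alternates an explicit step on a single-valued operator with a resolvent (proximal) step on a set-valued one. The sketch below shows only this textbook case for $\min_x \tfrac{1}{2}\|x - b\|^2 + \lambda \|x\|_1$, where the single-valued operator $x - b$ is cocoercive. The problem instance, step size, and function names are illustrative assumptions; the paper's coefficient-matrix framework, its distributed variants, and the non-cocoercive setting are not reproduced.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Proximal operator (resolvent) of kappa * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def forward_backward(b, lam, step=0.5, iters=100):
    """Forward-backward iteration for min_x 0.5*||x - b||^2 + lam*||x||_1."""
    x = np.zeros_like(b)
    for _ in range(iters):
        # Forward (explicit) step on the single-valued operator x - b.
        forward = x - step * (x - b)
        # Backward step: resolvent of the subdifferential of lam*||.||_1.
        x = soft_threshold(forward, step * lam)
    return x

b = np.array([3.0, -0.5, 1.5])
x_fb = forward_backward(b, lam=1.0)
print(x_fb)  # approaches soft_threshold(b, 1.0), i.e. 2.0, 0.0, 0.5
```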
Minh N. Dao, Matthew K. Tam, Thang D. Truong

http://arxiv.org/abs/2504.14892v1
Updated: 2025-04-21T06:41:29Z
Published: 2025-04-21T06:41:29Z
A level set topology optimization theory based on Hamilton's principle

In this paper, we present a novel framework for deriving the evolution
equation of the level set function in topology optimization, departing from
conventional Hamilton-Jacobi based formulations. The key idea is the
introduction of an auxiliary domain, geometrically identical to the physical
design domain, occupied by fictitious matter which is dynamically excited by
the conditions prevailing in the design domain. By assigning kinetic and
potential energy to this matter and interpreting the level set function as the
generalized coordinate to describe its deformation, the governing equation of
motion is determined via Hamilton's principle, yielding a modified wave
equation. Appropriate combinations of model parameters enable the recovery of
classical physical behaviors, including the standard and biharmonic wave
equations. The evolution problem is formulated in weak form using variational
methods and implemented in the software environment FreeFEM++. The influence of
the numerical parameters is analyzed on the example of minimum mean compliance.
The results demonstrate that topological complexity and strut design can be
effectively controlled by the respective parameters. In addition, the method
allows for the nucleation of new holes and eliminates the need for
re-initializing the level set function. The inclusion of a damping term further
enhances numerical stability. To showcase the versatility and robustness of our
method, we also apply it to compliant mechanism design and a bi-objective
optimization problem involving self-weight and compliance minimization under
local stress constraints.
Jan Oellerich, Takayuki Yamada
66 pages, 27 figures