https://arxiv.org/api/U9qZWc2DU9LCZu2XS/NvEDum4LI 2026-03-24T08:22:13Z 60057 60 15 http://arxiv.org/abs/2405.18777v2 SPABA: A Single-Loop and Probabilistic Stochastic Bilevel Algorithm Achieving Optimal Sample Complexity 2026-03-21T11:45:43Z While stochastic bilevel optimization methods have been extensively studied for addressing large-scale nested optimization problems in machine learning, it remains an open question whether the optimal complexity bounds for solving bilevel optimization are the same as those in single-level optimization. Our main result resolves this question: SPABA, an adaptation of the PAGE method for nonconvex optimization in (Li et al., 2021) to the bilevel setting, can achieve optimal sample complexity in both the finite-sum and expectation settings. We show the optimality of SPABA by proving that there is no gap in complexity analysis between stochastic bilevel and single-level optimization when implementing PAGE. Notably, as indicated by the results of (Dagréou et al., 2022), there might exist a gap in complexity analysis when implementing other stochastic gradient estimators, like SGD and SAGA. In addition to SPABA, we propose several other single-loop stochastic bilevel algorithms, that either match or improve the state-of-the-art sample complexity results, leveraging our convergence rate and complexity analysis. Numerical experiments demonstrate the superior practical performance of the proposed methods. 
2024-05-29T05:36:03Z We have primarily fixed Lemma F.3 and revised the proofs of Theorems 3.7, 3.9, and 3.11 in this version, while the main results remain unchanged Tianshu Chu Dachuan Xu Wei Yao Jin Zhang http://arxiv.org/abs/2603.20766v1 A reliability-aware randomized simheuristic for the team orienteering problem with stochastic travel times 2026-03-21T11:30:57Z We study a stochastic variant of the Team Orienteering Problem (TOP) with uncertain travel times and an all-or-nothing reward policy, under which the reward of a route is lost if its travel time exceeds the available budget. This setting makes the trade-off between expected reward and route reliability a central issue in solution design. To address this problem, we propose a reliability-aware simheuristic that combines a savings-based constructive heuristic, controlled randomization, local search, and Monte Carlo simulation. The method evaluates candidate solutions directly under uncertainty and selects them using both estimated expected reward and a reliability criterion, rather than relying on deterministic optimization followed by ex-post stochastic evaluation. Computational experiments on benchmark instances adapted from the TOP literature show that the proposed approach substantially improves stochastic performance with respect to a deterministic baseline evaluated under uncertainty. In most instances, the simheuristic increases both expected reward and reliability, and in the loosest regimes reliability can approach 0.99 while keeping computation times moderate. 2026-03-21T11:30:57Z Michele Circelli http://arxiv.org/abs/2603.20759v1 Derivative-Free Bilevel Optimization with Inexact Lower-Level Solutions 2026-03-21T11:13:52Z In this work, we propose a derivative-free framework for bilevel optimization. 
We consider both the upper and lower-level problems with bound constraints on the variables, as well as general nonlinear constraints, assuming that first-order information (in the upper-level) is not available or it is impractical to obtain. The lower-level problem is solved with an accuracy that is progressively refined throughout the optimization process. We first analyze the case in which the upper-level problem is subject only to bound constraints, establishing convergence to Clarke-Jahn stationary points when the refinement process is allowed to reach its maximum precision. When a limitation is imposed on this refinement process, we prove convergence to approximate stationary points using an extended notion of Goldstein stationarity. Finally, we extend the proposed framework to handle more complex constraints via an exact penalty function approach, proving convergence to stationary points under suitable assumptions. A comprehensive numerical study on 160 problems from the BOLIB collection shows that the adaptive accuracy strategy consistently yields better results than fixed-precision solves, with its benefits becoming more pronounced as the required lower-level accuracy becomes more stringent. 2026-03-21T11:13:52Z Edoardo Cesaroni Giampaolo Liuzzi Stefano Lucidi http://arxiv.org/abs/2603.20751v1 Local Convergence Analysis of ADMM for Nonconvex Composite Optimization 2026-03-21T10:47:53Z In this paper, we study the local convergence of the standard ADMM scheme for a class of nonconvex composite problems arising from modern imaging and machine learning models. This problem is constrained by a closed convex set, while its objective is the sum of a continuously differentiable (possibly nonconvex) smooth term and a polyhedral convex nonsmooth term composed with a linear mapping. Our analysis is mainly motivated by the recent works of Rockafellar [29,30]. 
We begin with an elementary proof of a key local strong convexity property of the Moreau envelope of polyhedral convex functions. Building on this property, we show that the strong variational sufficiency condition holds for the considered problem under appropriate assumptions. Using the strong variational sufficiency condition, we further derive a descent inequality for the ADMM iterates, in a form analogous to the classical descent analysis of ADMM for convex problems. As a consequence, for a suitable choice of the penalty parameter, we establish local convergence of the ADMM scheme to a primal-dual solution, and a local linear convergence rate for the case where the constraint set is polyhedral convex. Finally, we present three analytic examples to illustrate the applicability of our local convergence result and the necessity of the local assumptions. 2026-03-21T10:47:53Z Xiyuan Xie Lihua Yang Qia Li http://arxiv.org/abs/2505.02140v3 Proximal Gradient Descent Ascent Methods for Nonsmooth Nonconvex-Concave Minimax Problems on Riemannian Manifolds 2026-03-21T10:28:01Z Nonsmooth nonconvex-concave minimax problems have attracted significant attention due to their wide applications in many fields. In this paper, we consider a class of nonsmooth nonconvex-concave minimax problems on Riemannian manifolds. Owing to the nonsmoothness of the objective function, existing minimax manifold optimization methods cannot be directly applied to solve this problem. We propose two manifold proximal gradient descent ascent (MPGDA) algorithms for solving the problem. The first algorithm alternately performs one or multiple manifold proximal gradient descent steps and a proximal ascent step at each iteration, and we prove that it can find an $\varepsilon$-game-stationary point and an $\varepsilon$-optimization-stationary point within $\mathcal{O}(\varepsilon^{-3})$ outer iterations. 
The second algorithm alternately performs one manifold proximal gradient descent step and a proximal gradient ascent step, and we show that it can reach an $\varepsilon$-game-stationary point and an $\varepsilon$-optimization-stationary point within $\mathcal{O}(\varepsilon^{-4})$ outer iterations. Numerical experiments on an analytic example, fair sparse PCA, and sparse spectral clustering are conducted to illustrate the advantages of the proposed algorithms. 2025-05-04T15:04:08Z Xiyuan Xie Qia Li http://arxiv.org/abs/2603.20735v1 Optimality in Decentralized Optimization under Bandwidth Constraints 2026-03-21T09:49:42Z We consider a realistic decentralized setup with bandwidth-constrained communication and derive optimal time complexities for non-convex stochastic parallel and asynchronous optimization (up to logarithmic factors). We develop the corresponding methods, Grace SGD and Leon SGD, for both homogeneous and heterogeneous settings. Unlike previous work, our optimal bounds are characterized in terms of min-cut/max-flow quantities and rely on tools from Gomory-Hu trees and Steiner Tree Packing problems, providing tighter and more practical complexities. 2026-03-21T09:49:42Z Alexander Tyurin http://arxiv.org/abs/2603.20728v1 Tackling heavy-tailed noise in distributed estimation: Asymptotic performance and tradeoffs 2026-03-21T09:27:43Z We present an algorithm for distributed estimation of an unknown vector parameter $\boldsymbol{\theta}^\ast \in \mathbb{R}^M$ in the presence of heavy-tailed observation and communication noises. Heavy-tailed noises frequently appear, e.g., in densely deployed Internet of Things (IoT) or wireless sensor network systems. The presented algorithm falls within the class of \emph{consensus+innovation} estimators and combats the effect of the heavy-tailed noises by adding general nonlinearities in the consensus and innovations update parts. We present results on almost sure convergence and asymptotic normality of the estimator. 
In addition, we provide novel analytical studies that reveal interesting tradeoffs between the system noises and the underlying network topology. 2026-03-21T09:27:43Z Dragana Bajovic Dusan Jakovetic Soummya Kar Manojlo Vukovic 10.1109/TELFOR63250.2024.10819058 http://arxiv.org/abs/2603.20726v1 Neural network model for mathematical programming problems with complementary constraints 2026-03-21T09:22:36Z In this paper, we propose a gradient-based neural network model to solve mathematical programming problems with complementary constraints (MPCC). To facilitate tractable optimization, the MPCC is transformed via a regularization approach into a relaxed nonlinear optimization problem NLP($\beta$). Then, employing the penalty function and the neural network model, an estimate of the optimal solution of the problem NLP($\beta$) is obtained. On the basis of Lyapunov stability theory and the LaSalle invariance principle, the equilibrium point of the proposed neural network is theoretically proven to be asymptotically stable and capable of generating an optimal solution of the MPCC. Further, we demonstrate the performance and dynamic behavior of the proposed neural network through various illustrative examples and its effectiveness via theoretical and numerical experiments. 2026-03-21T09:22:36Z 29 pages Anurag Jayswal Ajeet Kumar http://arxiv.org/abs/2603.20684v1 Centrality-Based Pruning for Efficient Echo State Networks 2026-03-21T06:55:50Z Echo State Networks (ESNs) are a reservoir computing framework widely used for nonlinear time-series prediction. However, despite their effectiveness, the randomly initialized reservoir often contains redundant nodes, leading to unnecessary computational overhead and reduced efficiency. In this work, we propose a graph centrality-based pruning approach that interprets the reservoir as a weighted directed graph and removes structurally less important nodes using centrality measures. 
Experiments on Mackey-Glass time-series prediction and electric load forecasting demonstrate that the proposed method can significantly reduce reservoir size while maintaining, and in some cases improving, prediction accuracy, while preserving the essential reservoir dynamics. 2026-03-21T06:55:50Z 8 pages, 3 figures, 2 tables Sudip Laudari http://arxiv.org/abs/2603.20656v1 Sinkhorn Based Associative Memory Retrieval Using Spherical Hellinger Kantorovich Dynamics 2026-03-21T05:25:15Z We propose a dense associative memory for empirical measures (weighted point clouds). Stored patterns and queries are finitely supported probability measures, and retrieval is defined by minimizing a Hopfield-style log-sum-exp energy built from the debiased Sinkhorn divergence. We derive retrieval dynamics as a spherical Hellinger Kantorovich (SHK) gradient flow, which updates both support locations and weights. Discretizing the flow yields a deterministic algorithm that uses Sinkhorn potentials to compute barycentric transport steps and a multiplicative simplex reweighting. Under local separation and PL-type conditions we prove basin invariance, geometric convergence to a local minimizer, and a bound showing the minimizer remains close to the corresponding stored pattern. Under a random pattern model, we further show that these Sinkhorn basins are disjoint with high probability, implying exponential capacity in the ambient dimension. Experiments on synthetic Gaussian point-cloud memories demonstrate robust recovery from perturbed queries versus a Euclidean Hopfield-type baseline. 2026-03-21T05:25:15Z Aratrika Mustafi Soumya Mukherjee http://arxiv.org/abs/2211.16715v4 Policy Optimization over General State and Action Spaces 2026-03-21T02:24:14Z Reinforcement learning (RL) problems over general state and action spaces are notoriously challenging. In contrast to the tableau setting, one can not enumerate all the states and then iteratively update the policies for each state. 
This prevents the application of many well-studied RL methods especially those with provable convergence guarantees. In this paper, we first present a substantial generalization of the recently developed policy mirror descent method to deal with general state and action spaces. We introduce new approaches to incorporate function approximation into this method, so that we do not need to use explicit policy parameterization at all. Moreover, we present a novel policy dual averaging method for which possibly simpler function approximation techniques can be applied. We establish linear convergence rate to global optimality or sublinear convergence to stationarity for these methods applied to solve different classes of RL problems under exact policy evaluation. We then define proper notions of the approximation errors for policy evaluation and investigate their impact on the convergence of these methods applied to general-state RL problems with either finite-action or continuous-action spaces. To the best of our knowledge, the development of these algorithmic frameworks as well as their convergence analysis appear to be new in the literature. Preliminary numerical results demonstrate the robustness of the aforementioned methods and show they can be competitive with state-of-the-art RL algorithms. 2022-11-30T03:44:44Z Writing updates and new experimental results Caleb Ju Guanghui Lan http://arxiv.org/abs/2409.19437v5 Strongly-polynomial time and validation analysis of policy gradient methods 2026-03-21T02:13:10Z This paper proposes a novel termination criterion, termed the advantage gap function, for finite state and action Markov decision processes (MDP) and reinforcement learning (RL). 
By incorporating this advantage gap function into the design of step size rules and deriving a new linear rate of convergence that is independent of the stationary state distribution of the optimal policy, we demonstrate that policy gradient methods can solve MDPs in strongly-polynomial time. To the best of our knowledge, this is the first time that such strong convergence properties have been established for policy gradient methods. Moreover, in the stochastic setting, where only stochastic estimates of policy gradients are available, we show that the advantage gap function provides close approximations of the optimality gap for each individual state and exhibits a sublinear rate of convergence at every state. The advantage gap function can be easily estimated in the stochastic case, and when coupled with easily computable upper bounds on policy values, they provide a convenient way to validate the solutions generated by policy gradient methods. Therefore, our developments offer a principled and computable measure of optimality for RL, whereas current practice tends to rely on algorithm-to-algorithm or baselines comparisons with no certificate of optimality. 2024-09-28T18:56:48Z Updated manuscript with new experiments Caleb Ju Guanghui Lan http://arxiv.org/abs/2603.15606v2 Saddle Point Evasion via Curvature-Regularized Gradient Dynamics 2026-03-20T23:30:58Z Nonconvex optimization underlies many modern machine learning and control tasks, where saddle points pose the dominant obstacle to reliable convergence in high-dimensional settings. Escaping these saddle points deterministically and at a controllable rate remains an open challenge: gradient descent is blind to curvature, stochastic perturbation methods lack deterministic guarantees, and Newton-type approaches suffer from Hessian singularity. 
We present Curvature-Regularized Gradient Dynamics (CRGD), which augments the objective with a smooth penalty on the most negative Hessian eigenvalue, yielding an augmented cost that serves as an optimization Lyapunov function with user-selectable convergence rates to second-order stationary points. Numerical experiments on a nonconvex matrix factorization example confirm that CRGD escapes saddle points across all tested configurations, with escape time that decreases with the eigenvalue gap, in contrast to gradient descent, whose escape time grows inversely with the gap. 2026-03-16T17:56:38Z This work has been submitted to the IEEE for possible publication. 6 pages, 3 figures Liraz Mudrik Isaac Kaminer Sean Kragelund Abram H. Clark http://arxiv.org/abs/2503.24075v3 Optimization on the Oblique Manifold for Sparse Simplex Constraints via Multiplicative Updates 2026-03-20T23:28:13Z Low-rank optimization problems with sparse simplex constraints involve variables that must satisfy nonnegativity, sparsity, and sum-to-1 conditions, making their optimization particularly challenging due to the interplay between low-rank structures and constraints. These problems arise in various applications, including machine learning, signal processing, environmental fields, and computational biology. In this work, we propose a novel manifold optimization approach to efficiently tackle these problems. Our method leverages the geometry of oblique manifolds to reformulate the problem and introduces a new Riemannian optimization method based on Riemannian gradient descent that strictly maintains the simplex constraints. By exploiting the underlying manifold structure, our approach improves optimization efficiency. Experiments on synthetic and real datasets demonstrate the effectiveness of the proposed method compared to standard Euclidean and Riemannian methods, paving the way for broader applications. 
2025-03-31T13:31:05Z 19 pages, 3 figures, 2 tables Flavia Esposito Andersen Ang http://arxiv.org/abs/2603.20521v1 Delightful Distributed Policy Gradient 2026-03-20T21:45:51Z Distributed reinforcement learning trains on data from stale, buggy, or mismatched actors, producing actions with high surprisal (negative log-probability) under the learner's policy. The core difficulty is not surprising data per se, but \emph{negative learning from surprising data}. High-surprisal failures can dominate the update direction despite carrying little useful signal, while high-surprisal successes reveal opportunities the current policy would otherwise miss. The \textit{Delightful Policy Gradient} (DG) separates these cases by gating each update with delight, the product of advantage and surprisal, suppressing rare failures and amplifying rare successes without behavior probabilities. Under contaminated sampling, the cosine similarity between the standard policy gradient and the true gradient collapses, while DG's grows as the policy improves. No sign-blind reweighting, including exact importance sampling, can reproduce this effect. On MNIST with simulated staleness, DG without off-policy correction outperforms importance-weighted PG with exact behavior probabilities. On a transformer sequence task with staleness, actor bugs, reward corruption, and rare discovery, DG achieves roughly $10{\times}$ lower error. When all four frictions act simultaneously, its compute advantage is order-of-magnitude and grows with task complexity. 2026-03-20T21:45:51Z Ian Osband