Identity-Paired Progressive Depth Training: When Trainability Persists Beyond Expressibility

2026-07-18T12:47:50Z

Variational Quantum Algorithms (VQAs) are a leading paradigm for near-term quantum computing, yet their training suffers from sensitivity to circuit depth, initialization, and landscape pathologies such as barren plateaus. We study \emph{progressive depth training} (PDT) -- a layerwise curriculum that trains a shallow circuit before appending new layers -- and identify a fundamental obstacle: fixed entangling gates (CNOTs) in hardware-efficient ansätze cause \emph{initialization shock}, an energy spike when new layers are added. We propose \emph{identity-paired progressive depth training} (IP-PDT), which appends forward/inverse block pairs -- each consisting of a standard rotation$+$CNOT block followed by its reverse -- that compose to the identity at initialization. Because the adjacent CNOT rings cancel, the effective circuit retains only \textit{a single entangling layer} surrounded by \textit{overparameterized local rotations}. We prove a simple \textit{Reachable Set Saturation Theorem}: under this construction the variational manifold expands exactly once (when post-entangler rotations are first introduced) and then \emph{saturates}; all subsequent depth increases provide pure overparameterization of single-qubit unitaries. Despite this saturation, progressive addition of rotation parameters can continue to improve optimization outcomes -- a phenomenon we term \emph{trainability beyond expressibility}. We formalize IP-PDT as a continuation method on nested manifolds, prove monotone energy guarantees under an acceptance rule, and connect energy error to ground-state fidelity through spectral-gap inequalities. A detailed resource analysis shows that IP-PDT achieves lower total gate cost than both baselines by eliminating most CNOT gates.
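
The identity-at-initialization property can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes RY rotations, a CNOT ring with control q and target q+1 (mod n), and zero initial angles, none of which the abstract specifies. The helper names (`ry`, `rotation_layer`, `cnot`, `cnot_ring`, `identity_pair`) are hypothetical.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation (assumed rotation gate for this sketch)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def rotation_layer(thetas):
    """Tensor product of one RY rotation per qubit."""
    U = np.array([[1.0]])
    for t in thetas:
        U = np.kron(U, ry(t))
    return U

def cnot(n, control, target):
    """CNOT on n qubits as a permutation matrix (qubit 0 is the leftmost bit)."""
    dim = 2 ** n
    U = np.zeros((dim, dim))
    for basis in range(dim):
        out = basis
        if (basis >> (n - 1 - control)) & 1:
            out = basis ^ (1 << (n - 1 - target))
        U[out, basis] = 1.0
    return U

def cnot_ring(n):
    """Ring of CNOTs: qubit q controls qubit (q + 1) mod n."""
    U = np.eye(2 ** n)
    for q in range(n):
        U = cnot(n, q, (q + 1) % n) @ U
    return U

def identity_pair(thetas_fwd, thetas_inv):
    """Forward block (rotations, then ring) followed by the reverse block
    (inverse ring, then rotations)."""
    n = len(thetas_fwd)
    ring = cnot_ring(n)
    forward = ring @ rotation_layer(thetas_fwd)
    # The ring is a real permutation matrix, so its transpose is its inverse.
    inverse = rotation_layer(thetas_inv) @ ring.T
    return inverse @ forward

# With all angles at zero the pair is the identity, so appending it cannot
# change the energy of the already-trained circuit.
pair_at_init = identity_pair(np.zeros(3), np.zeros(3))
```

In this sketch the two adjacent CNOT rings cancel for any angles of the trailing rotation layer, which is the cancellation the abstract refers to; only the rotation parameters remain as new degrees of freedom.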

Benchmarking Optimization Algorithms with Quality Profiles and Test Set Profiles

2026-07-18T12:40:11Z

We propose two novel tools for benchmarking optimization algorithms that may converge to different solutions on a test set: quality profiles and test set profiles. Their aim is to assess and compare algorithms in terms of the quality (i.e., the objective function value) of the solutions obtained, as well as to assess the consistency of the test set. A key distinguishing feature of the proposed quality profiles is their comparative deterministic procedure, which emphasizes the accuracy of the solution rather than the computational burden of the solvers. In this regard, several test set--dependent approaches for both comparing and ranking algorithms have already been proposed in the literature and are widely used as benchmarking procedures. We believe that the joint use of such procedures, along with the novel quality profiles detailed here, should enhance the benchmarking process in all cases where the comparison encompasses exact methods as well as heuristics. Moreover, over the last decade the literature on numerical optimization seems to have paid less attention to determining how appropriate a test set is for benchmarking the selected solvers. This motivates the introduction of test set profiles, which assess the appropriateness of a test set and represent the flip side of evaluating the robustness of the solvers on that test set. The paper also includes extensive numerical experiments showing the usefulness of quality profiles in both smooth and nonsmooth (derivative--free) optimization, along with a reference to MATLAB code for plotting quality profiles and test set profiles.

Tight Conic Relaxations for Rank-one Doubly Nonnegative Matrix Completion

2026-07-18T12:25:36Z

We study tight conic relaxations for a quadratically constrained quadratic programming (QCQP) formulation of rank-one doubly nonnegative (DNN) matrix completion. Motivated by sparse QCQPs whose lifted matrix variables include elements not directly specified by the objective or constraints, we interpret tightness as a rank-one completion property for the unspecified elements. For sparsity patterns whose blocks consist of cycles and edges, we prove that the dual formulations associated with the DNN and completely positive (CP) relaxations are equivalent. For cycle-type sparsity patterns, we derive explicit sufficient conditions under which the semidefinite programming (SDP) and DNN relaxations are tight. These sufficient conditions are stated explicitly in terms of local ratio bounds and cumulative-difference conditions on a rank-one optimal solution. We also show that adding suitable edges to the sparsity pattern relaxes the ratio conditions required for tightness. The results provide tractable certificates for when conic relaxations recover a rank-one optimal solution of the underlying QCQP.

Bayesian Risk Preference Persuasion

2026-07-18T12:13:57Z

A decision-maker's risk preference is inherently unstable and may adjust in response to external information, shaping subsequent choices and outcomes. This paper develops a persuasion framework to study how information can be designed to steer risk preferences and decision results. In our model, a receiver starts with an initial risk preference represented by a coherent risk measure and revises it after observing a system state generated by an information rule claimed by a sender. The revision must preserve time consistency of risk evaluations before and after the state realization. We characterize the sender's optimal information design by analyzing the induced distribution of posterior beliefs over states. Each belief leads to specific preference revisions and corresponding conditional risk assessments. We identify conditions under which information design benefits the sender across several settings and illustrate the framework's potential in risk management through an application to reinsurance design.

Dynamical Optimal Transport with $\mathfrak{so}(d)$-Invariance: From Theory to Computation

2026-07-18T11:33:15Z

We introduce a modified Benamou--Brenier (MBB) formulation of optimal transport that incorporates Euclidean invariance at the dynamical level. We establish existence of minimizers for the resulting variational problem and prove its equivalence to a static formulation defining the Procrustes--Wasserstein distance. In the Gaussian setting, we show that this distance admits a closed-form expression, reducing to the Euclidean distance between the vectors of square roots of the ordered eigenvalues of the covariance matrices. On the computational side, we formulate a primal--dual scheme for the discretized problem. We prove a local conditional subsequential convergence result through an abstract analysis of a class of parameter-dependent saddle-point problems and illustrate the method's performance numerically.
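
The Gaussian closed form described above can be sketched in a few lines. This is an illustration under assumptions, not the paper's code: it takes the two Gaussians to be centered (the abstract does not say how means enter), sorts eigenvalues in ascending order for both inputs, and uses the hypothetical function name `gaussian_eigen_distance`.

```python
import numpy as np

def gaussian_eigen_distance(cov_a, cov_b):
    """Euclidean distance between the vectors of square roots of the
    ordered eigenvalues of two covariance matrices (centered Gaussians
    assumed)."""
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    eig_a = np.linalg.eigvalsh(cov_a)
    eig_b = np.linalg.eigvalsh(cov_b)
    # Clip tiny negative values caused by round-off before taking roots.
    root_a = np.sqrt(np.clip(eig_a, 0.0, None))
    root_b = np.sqrt(np.clip(eig_b, 0.0, None))
    return float(np.linalg.norm(root_a - root_b))

# Example covariances with eigenvalues {1, 4} and {1, 9}.
cov_a = np.diag([4.0, 1.0])
cov_b = np.diag([1.0, 9.0])
dist_ab = gaussian_eigen_distance(cov_a, cov_b)
```

Because only eigenvalues enter, rotating either covariance leaves the value unchanged, which is consistent with the invariance the formulation is built around.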

When Is Heterogeneous Distance-Decay Facility Location Tractable? A Structural Classification, Exact Methods, and a Real-World Study

2026-07-18T11:03:18Z

We study continuous planar facility location in which a demand point's captured value decays with distance, with the per-point decay scale varying across points. This heterogeneity is ubiquitous yet underexploited, and one nearest-facility objective unifies decay, clustering, and median goals, containing k-means, the Weber/p-median problem, and maximum covering as special cases. We make four contributions. (i) A tractability classification: the discrete objective is always monotone submodular, so the (1-1/e) greedy guarantee holds regardless of decay shape or heterogeneity, and the continuous cooperative objective is concave if and only if the decay is concave in distance; the clip max(0,d) in common coverage specifications is what destroys concavity, and the classification is tight. (ii) An exact discrete method: the candidate-discretized maximum-cover MIP has an empirically tight LP relaxation (~0% gap) and is solved by branch-and-bound in seconds for n <= 500. (iii) A force-as-gradient / large-neighborhood-search heuristic, within 0.5% of the discrete optimum, that outperforms the (1-1/e) greedy, Cooper-style alternating location-allocation, particle swarm optimization, and weighted k-means (30/30 per-instance wins at K=30, p<10^-9) and is competitive with bespoke solvers on k-means, Weber/p-median, and shape-demand instances. (iv) A real-world study: on 592,667 urban-delivery orders, ignoring the calibrated decay variation loses up to 9.7% of captured demand and relocates facilities by up to 37% of the map; a retail dataset calibrates the decay as exponential with scale R ~ 1.4 km.
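
As an illustration of the greedy baseline in contribution (i), the sketch below selects facilities from a finite candidate set under a nearest-facility objective. It assumes an exponential decay exp(-d / r_i) with a per-point scale r_i and per-point weights; the function names and the data layout (`dist[i, j]` is the distance from demand point i to candidate site j) are hypothetical and not taken from the paper.

```python
import numpy as np

def captured_value(dist, scales, weights, chosen):
    """Total captured demand when each point is served by its best chosen site."""
    if not chosen:
        return 0.0
    # Per-point decay scale makes the capture curve heterogeneous.
    value = weights[:, None] * np.exp(-dist[:, chosen] / scales[:, None])
    return float(value.max(axis=1).sum())

def greedy_sites(dist, scales, weights, k):
    """Pick k candidate sites, each time adding the one with the largest gain."""
    chosen = []
    for _ in range(k):
        remaining = [j for j in range(dist.shape[1]) if j not in chosen]
        best = max(
            remaining,
            key=lambda j: captured_value(dist, scales, weights, chosen + [j]),
        )
        chosen.append(best)
    return chosen
```

Since this objective is monotone submodular, the value returned by `greedy_sites` is at least (1-1/e) times the best value over all size-k subsets of the candidates; the sketch makes no claim about the continuous problem.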

A Deep Second-Order Stochastic Residual Method for Fully Nonlinear Parabolic PDEs

2026-07-18T09:43:15Z

We introduce the Deep Second-Order Stochastic Residual Method (D2SRM) for high-dimensional, Hessian-dependent fully nonlinear parabolic PDEs. A single scalar space--time network generates derivative-consistent approximations of the solution, gradient, and Hessian, which are trained jointly through second-order Brownian one-step residuals and terminal value and gradient penalties. For globally Lipschitz equations with identity diffusion and sufficiently weak Hessian coupling, we establish well-posedness in a Brownian occupation space and develop a population-level convergence theory. Under additional regularity, an a posteriori estimate bounds the squared full-jet occupation error of any admissible candidate by the time step and its population objective. For approximate population minimizers, the error bound separates time discretization, neural approximation, and population suboptimality; when the latter two terms are $O(h)$, the full-jet occupation norm is $O(h^{1/2})$. Experiments on a 100-dimensional manufactured benchmark compare terminal treatments, probe Hessian couplings inside and outside the proved small-gain range, and show decreasing errors as the time step decreases. The code is available at https://github.com/ZZHPKU/D2SRM.

Laplacian Spectral Shaping for Non-Uniform Scaling Formation Control of Open Multi-Agent Systems

2026-07-18T08:53:01Z

Non-uniform scaling control enables a multi-agent formation to adjust its shape by compressing or stretching independently along different coordinate axes through inter-agent interactions, offering high flexibility in complex environments. The fundamental idea is to encode the desired formation shape as the kernel of a matrix-valued Laplacian. In open multi-agent systems, however, changes in the number of agents, the number of edges, and the leader selection dynamically alter this Laplacian, destroying the required spectral properties: positive semidefiniteness, correct kernel, and positive definiteness of the follower block (we refer to these properties collectively as the formation spectrum). In this paper, we develop distributed protocols that strategically adjust a subset of the weights of the Laplacian matrix for formation control in spaces of arbitrary dimension. Under these protocols, the desired formation spectrum is preserved under dynamic topology changes, including agent joining, edge addition, agent leaving, and edge removal, while any pair of agents can serve as leaders. Unlike existing Laplacian design methods for affine formation control under topology changes, the proposed approach requires a sparser sensing graph, avoids a predefined parent-child hierarchical structure, and supports leader reassignment. The effectiveness of the proposed protocols is validated through both theoretical analysis and numerical simulations.

Localisation for the second-order Beckmann problem: bimartingale couplings and leaf decompositions

2026-07-18T04:46:31Z

We develop a second-order localisation theory for optimal transport based on leaf decompositions and bimartingale couplings. It provides a second-order analogue of the classical decomposition of optimal transport into transport rays and monotone couplings: transport rays are replaced by the leaves of the $1$-Lipschitz derivative map $Du$ of an optimal dual potential $u\in C^{1,1}(\mathbb R^n)$, while monotone couplings are replaced by bimartingale couplings. We apply this framework to the three-marginal optimal transport problem introduced by Bolbotowski and Bouchitté, whose relaxation is the second-order Beckmann problem. We introduce bimartingale couplings and characterise their existence through a convex-concave order condition. This yields a generalisation of Strassen's theorem from convex order to the convex-concave setting. Equivalently, the dual problem associated with the second-order Beckmann problem admits an optimiser whose derivative is an isometry. For absolutely continuous measures with common barycentre, assuming the existence of an optimal plan with absolutely continuous third marginal, we prove that every optimal plan decomposes into a family of problems on the leaves of $Du$. On each leaf, all optimal plans are completely described by bimartingale couplings between the corresponding conditional measures. Without this absolute continuity assumption, we show that the leaf decomposition persists in a more general form: optimal plans are mixtures of plans concentrated on triples $(x,y,z)$ satisfying $x\in \mathcal{S}_1$, $y\in \mathcal{S}_2$, and $z\in \mathcal{S}_1\cap \mathcal{S}_2$, where $\mathcal{S}_1$ and $\mathcal{S}_2$ are neighbouring leaves of $Du$.

Dynamic mean-variance portfolio selection with no-shorting constraints and unknown investment opportunity sets

2026-07-18T03:58:34Z

We study continuous-time mean-variance portfolio selection with no-shorting constraints and unknown investment opportunity sets from a reinforcement learning (RL) perspective. The problem is a constrained stochastic linear--quadratic control problem for which the entropy-regularized exploratory formulation of Wang et al. (2020) is difficult to analyze theoretically, because enforcing the constraint on the support of randomized policies destroys the tractable Gaussian exploration. To tackle this challenge, we introduce an auxiliary exploratory problem without entropy in which exploratory policies remain Gaussian: their samples may violate the no-shorting requirement, but their means satisfy it. We then prove that, for a suitable choice of exploration variance, the mean of the optimal Gaussian policy of the auxiliary problem coincides with the optimal policy of the original problem. Motivated by this theoretical result, we develop a model-free RL algorithm that learns the optimal policy of the auxiliary (and hence the original) problem directly from trajectory data without estimating the investment opportunity set. A numerical example demonstrates the performance of the proposed algorithm.

End-to-End Supply Chain Planning in the Paper Industry Via Column Generation and Benders Decomposition

2026-07-18T03:31:00Z

Problem definition: The paper studies an integrated end-to-end planning problem in large-scale paper manufacturing, where production scheduling, trimming decisions, vehicle loading, and multi-period fulfillment of make-to-order and make-to-stock demand must be coordinated over time. In practice, these decisions are often optimized sequentially, leading to material waste, inefficient transportation, and degraded service levels. Solving the fully integrated problem at industrial scale remains computationally challenging due to its combinatorial structure. Methodology/results: A key structural feature of the problem is that downstream fulfillment decisions depend on upstream production and logistics choices only through aggregate supply availability over time. By exploiting this structure, the paper develops an exact mathematical formulation and proposes a two-phase hybrid framework (BDCG-DP) that integrates column generation (CG) using exact dynamic programming (DP) for supply-side decisions with Benders decomposition (BD) for downstream fulfillment. Computational experiments on proprietary instances from a major North American paper manufacturer show that BDCG-DP lowers total costs by 24.4% compared to a traditional CG-DP approach on challenging eight-week planning problems. Median runtime for four-week planning problems decreases from over five hours using CG-DP to under one hour using BDCG-DP. Managerial implications: This paper provides the first exact model that integrates production, trimming, load planning, and multi-period fulfillment at industrial scale. The proposed approach returns integer-feasible plans within 2.3 to 6 hours for the most complex planning problems, giving planners access to high-quality implementable schedules within hours, a capability that was previously unavailable in practice.

Group Steering: Approaches Based on Power Moments

2026-07-18T03:30:03Z

This paper considers the problem of steering a very large group of agents whose dynamics are governed by a discrete-time, asymptotically stable, first-order linear system. The group of agents is characterized in two ways, as a probability density function and as an occupation measure, and a corresponding treatment is given for each. We propose to use power moments to characterize the density function or occupation measure of the agents. A moment system representation of the original system is put forward for control, and a corresponding empirical control scheme is proposed. Under the designed control law, the moment sequence of the control at each time step is positive, which ensures the existence of the control for the moment system. We then realize the control as an analytic function through a convex optimization scheme, for which the existence and uniqueness of the solution were proved in our previous paper. The terminal density is proved to converge to the desired one, which distinguishes the proposed distribution steering scheme from existing ones. An analysis of the error between the terminal density and the specified one is also provided. For the problem in which the group of agents is characterized as an occupation measure, the control for each agent is determined by drawing independent and identically distributed (i.i.d.) samples from the realized analytic function. Finally, simulation results validate the proposed algorithms.
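
To illustrate what a moment system representation can look like, the sketch below propagates power moments through one step of a scalar system. It rests on assumptions that the abstract does not state: the dynamics are taken to be x_{k+1} = a x_k + u_k, and the control u_k is taken to be independent of the state x_k, so that the binomial expansion of (a x + u)^n factorizes in expectation. The function name `propagate_moments` is hypothetical.

```python
from math import comb

def propagate_moments(x_moments, u_moments, a):
    """One-step update of power moments for x_next = a * x + u.

    x_moments[j] and u_moments[j] hold E[x^j] and E[u^j] for j = 0..order,
    with the zeroth moment equal to 1. Assumes u is independent of x.
    """
    order = len(x_moments) - 1
    return [
        sum(
            comb(n, j) * a**j * x_moments[j] * u_moments[n - j]
            for j in range(n + 1)
        )
        for n in range(order + 1)
    ]

# Sanity example: x is the constant 2, u is the constant 1, a = 0.5,
# so x_next is the constant 2 and its n-th moment is 2**n.
next_moments = propagate_moments([2.0**j for j in range(5)], [1.0] * 5, 0.5)
```

Under these assumptions the update is linear in the control moments, which is the kind of structure that makes a moment-level control design tractable.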

Optimal $\mathbb{H}_2$ Control with Passivity-Constrained Feedback: Convex Approach

2026-07-18T02:42:19Z

We consider the $\mathbb{H}_2$-optimal feedback control problem for the case in which the plant is passive with bounded $\mathbb{L}_2$ gain and the feedback law is constrained to be output-strictly passive. We show that this problem reduces to a convex, infinite-dimensional optimal control problem in which the optimization variable is the Youla parameter for the closed-loop system. We devise truncated, finite-dimensional optimizations that yield sub-optimal controllers and lower bounds on the optimal objective. Furthermore, we show that both of these optimizations converge to the optimal objective of the original infinite-dimensional problem as their respective domains are enlarged. The idea is demonstrated on a simple vibration suppression example.

FST.ai 2.5: Explainable and Uncertainty-Aware AI for Olympic and Para-Taekwondo Decision Support, Athlete Digital Twins, and Federation-Scale Analytics

2026-07-18T02:32:44Z

The rapid digitalisation of elite sport has created new opportunities for integrating artificial intelligence (AI), performance analytics, and decision-support systems into athlete development and competition management. However, existing solutions remain fragmented, typically addressing isolated tasks such as performance analysis, athlete monitoring, or referee support. This paper presents \textbf{FST$\cdot$ai~2.5}, an explainable, uncertainty-aware, and secure AI framework for Olympic and Para-Taekwondo. \textbf{FST$\cdot$ai~2.5} introduces a unified digital ecosystem integrating athlete intelligence, competition analytics, federation-scale data management, AI-assisted decision support, athlete and event digital twins, explainable performance indicators, and adaptive training recommendations. The framework supports World Taekwondo (WT), Member National Associations (MNAs), coaches, referees, analysts, and athletes through transparent, secure, and federation-aware governance. By combining multi-source competition data, athlete-performance information, and contextual evidence, \textbf{FST$\cdot$ai~2.5} provides tactical diagnostics, longitudinal athlete monitoring, performance forecasting, personalised development planning, and federation-wide benchmarking using explainable and uncertainty-aware AI. Prototype deployments demonstrate the feasibility of the proposed framework. Although developed for Olympic and Para-Taekwondo, the methodology is broadly applicable to explainable AI, digital twins, and trustworthy decision support in combat sports and other high-performance sporting environments.

Near-Optimal Lower Bounds for Exact Zeroth-Order Convex Optimization

2026-07-18T00:04:07Z

The fundamental oracle limits of exact function value access remain poorly understood in zeroth-order optimization: even for canonical convex problems, the optimal joint dependence on dimension $d$ and accuracy $\varepsilon$ has remained unresolved. We resolve this question for arbitrary adaptive randomized algorithms minimizing nonsmooth $L_0$-Lipschitz convex functions over the $d$-dimensional Euclidean unit ball, where each query returns only the exact scalar value $f(\mathbf x)$ of a fixed objective. With universal $L_0>0$, let $T_\varepsilon$ denote the minimum number of queries required to return an $\varepsilon$-suboptimal point with probability at least $1/2$, uniformly over the function class. We prove that $$ T_\varepsilon\ge c\, \frac{ d\min\{d,\varepsilon^{-2}\} }{ \log\!\bigl(e\min\{d,\varepsilon^{-2}\}\bigr) }, $$ for all $d\ge d_0$ and $0<\varepsilon\le\varepsilon_0$, for universal constants $c,\varepsilon_0>0$ and $d_0\in\mathbb N$. In the low-accuracy regime $\varepsilon\ge d^{-1/2}$, this matches the $O(d\varepsilon^{-2})$ two-point exact value upper bound up to a logarithmic factor. In the high-accuracy regime $\varepsilon\le d^{-1/2}$, the lower bound saturates at $\Omega(d^2/\log(ed))$, independently of $\varepsilon$, matching the $\widetilde O(d^2)$ evaluation oracle upper bound up to polylogarithmic factors. The proof uses a random support function hard family and develops a posterior mean energy method for adaptive exact max observations, in place of first-order zero chain constructions and noise based transcript inequalities.
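
The two regimes follow by evaluating the minimum in the bound. The display below is only a direct substitution into the stated inequality, with the same constants $c$, $d_0$, and $\varepsilon_0$; it assumes the `amsmath` package for the `cases` environment.

```latex
\[
T_\varepsilon \;\ge\;
\begin{cases}
c\,\dfrac{d\,\varepsilon^{-2}}{\log\bigl(e\,\varepsilon^{-2}\bigr)},
  & \varepsilon \ge d^{-1/2}
  \quad\bigl(\min\{d,\varepsilon^{-2}\} = \varepsilon^{-2}\bigr),\\[2ex]
c\,\dfrac{d^{2}}{\log(e\,d)},
  & \varepsilon \le d^{-1/2}
  \quad\bigl(\min\{d,\varepsilon^{-2}\} = d\bigr).
\end{cases}
\]
```

The two cases agree at the crossover $\varepsilon = d^{-1/2}$, where $\varepsilon^{-2} = d$.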