https://arxiv.org/api/uGXyJ9rkEpzUtPLGgYKCYdOtFGw 2026-03-18T08:45:15Z 48240 0 15 http://arxiv.org/abs/2603.16850v1 Unifying Optimization and Dynamics to Parallelize Sequential Computation: A Guide to Parallel Newton Methods for Breaking Sequential Bottlenecks 2026-03-17T17:55:01Z Massively parallel hardware (GPUs) and long sequence data have made parallel algorithms essential for machine learning at scale. Yet dynamical systems, like recurrent neural networks and Markov chain Monte Carlo, were thought to suffer from sequential bottlenecks. Recent work showed that dynamical systems can in fact be parallelized across the sequence length by reframing their evaluation as a system of nonlinear equations, which can be solved with Newton's method using a parallel associative scan. However, these parallel Newton methods struggled with limitations, primarily inefficiency, instability, and lack of convergence guarantees. This thesis addresses these limitations with methodological and theoretical contributions, drawing particularly from optimization. Methodologically, we develop scalable and stable parallel Newton methods, based on quasi-Newton and trust-region approaches. The quasi-Newton methods are faster and more memory efficient, while the trust-region approaches are significantly more stable. Theoretically, we unify many fixed-point methods into our parallel Newton framework, including Picard and Jacobi iterations. We establish a linear convergence rate for these techniques that depends on the method's approximation accuracy and stability. Moreover, we give a precise condition, rooted in dynamical stability, that characterizes when parallelization provably accelerates a dynamical system and when it cannot. Specifically, the sign of the Largest Lyapunov Exponent of a dynamical system determines whether or not parallel Newton methods converge quickly. 
In sum, this thesis unlocks scalable and stable methods for parallelizing sequential computation, and provides a firm theoretical basis for when such techniques will and will not work. This thesis also serves as a guide to parallel Newton methods for researchers who want to write the next chapter in this ongoing story. 2026-03-17T17:55:01Z PhD Dissertation; Stanford University Xavier Gonzalez 10.25740/vf943fc9855 http://arxiv.org/abs/2510.14759v2 On the convergence of stochastic variance reduced gradient for linear inverse problems 2026-03-17T16:21:01Z Stochastic variance reduced gradient (SVRG) is an accelerated version of stochastic gradient descent based on variance reduction, and is promising for solving large-scale inverse problems. In this work, we analyze SVRG and a regularized version that incorporates a priori knowledge of the problem, for solving linear inverse problems in Hilbert spaces. We prove that, with suitable constant step size schedules and regularity conditions, the regularized SVRG can achieve optimal convergence rates in terms of the noise level without any early stopping rules, provided that the truncation level is chosen suitably, and standard SVRG is also optimal for problems with nonsmooth solutions under a priori stopping rules. The analysis is based on an explicit error recursion and suitable a priori estimates on the inner loop updates with respect to the anchor point. Numerical experiments are provided to complement the theoretical analysis. 2025-10-16T14:59:11Z 29 pages, 2 figures Bangti Jin Zehui Zhou http://arxiv.org/abs/2603.16644v1 Perturbation Analysis for Preconditioned Normal Equations in Mixed Precision 2026-03-17T15:16:30Z For real matrices of full column-rank, we analyze the conditioning of several types of normal equations that are preconditioned by a randomized preconditioner computed in lower precision. 
These include symmetrically preconditioned normal equations, half-preconditioned normal equations, seminormal equations and not-normal equations. Our perturbation bounds are realistic and informative, and suggest that the conditioning depends only mildly on the quality of the preconditioner; however, it does depend on the size of the least squares residual -- even if the normal equations do not originate from a least squares problem. We illustrate that a randomized preconditioner can deliver a solution accuracy comparable to that of Matlab's mldivide command, is efficient in practice, and well-suited to GPU implementations. For the computation of the preconditioner, we propose an automatic selection of the precision, based on a fast condition number estimation in lower precision. 2026-03-17T15:16:30Z James E. Garrison Ilse C. F. Ipsen http://arxiv.org/abs/2312.07762v2 Interpretable factorization of clinical questionnaires to identify latent factors of psychopathology 2026-03-17T15:06:54Z Psychiatry research seeks to understand the manifestations of psychopathology in behavior, as measured in questionnaire data, by identifying a small number of latent factors that explain them. While factor analysis is the traditional tool for this purpose, the resulting factors may not be interpretable, and may also be subject to confounding variables. Moreover, missing data are common, and explicit imputation is often required. To overcome these limitations, we introduce interpretability constrained questionnaire factorization (ICQF), a non-negative matrix factorization method with regularization tailored for questionnaire data. Our method aims to promote factor interpretability and solution stability. We provide an optimization procedure with theoretical convergence guarantees, and an automated procedure to detect latent dimensionality accurately. We validate these procedures using realistic synthetic data. 
We demonstrate the effectiveness of our method in a widely used general-purpose questionnaire, in two independent datasets (the Healthy Brain Network and Adolescent Brain Cognitive Development studies). Specifically, we show that ICQF improves interpretability, as defined by domain experts, while preserving diagnostic information across a range of disorders, and outperforms competing methods for smaller dataset sizes. This suggests that the regularization in our method matches domain characteristics. The python implementation for ICQF is available at https://github.com/jefferykclam/ICQF. 2023-12-12T22:10:38Z Ka Chun Lam Bridget W Mahony Armin Raznahan Francisco Pereira http://arxiv.org/abs/2603.16618v1 A Jacobi Field Approach to Splitting Detection in Schrödinger Bridge 2026-03-17T14:56:23Z We study the problem of detecting the onset of path splitting in stochastic interpolation between probability distributions. This question is especially subtle when the target distribution is nonconvex or supported on disconnected components, where interpolating trajectories may separate into distinct branches. Motivated by the stochastic control and Schrödinger bridge viewpoint, we propose a Jacobi field based indicator for identifying candidate splitting times and locations. Our approach is based on the Jacobi field associated with the linearization of an induced interpolating flow. Starting from a stochastic interpolation ansatz, we construct an Eulerian velocity field by conditional averaging and derive its spatial Jacobian in terms of the local posterior geometry of the target sample cloud. This allows us to interpret the symmetric part of the Jacobian as a local strain tensor and to use its spectral structure to quantify the amplification of infinitesimal perturbations along reference trajectories. 
Numerical experiments on non-convex and disconnected target distributions show that the proposed indicator consistently localizes the emergence of branching regions and captures the temporal development of splitting. These results suggest that Jacobi field analysis provides a natural mathematical framework for studying local instability and splitting phenomena in stochastic interpolation. 2026-03-17T14:56:23Z Chunhai Jiao Jin Guo Haoyan Zhang Jinqiao Duan Ting Gao http://arxiv.org/abs/2503.01057v2 Sparse Randomized Approximation of Normal Cycles 2026-03-17T14:33:23Z We extend our work for compression of currents and varifolds to a compression algorithm for the embedded normal cycles representation of shape, restricted to the constant normal kernel case, using the Nystrom approximation in Reproducing Kernel Hilbert Spaces (RKHS) and ridge leverage score (RLS) sampling. Our method comes with theoretical guarantees on the compression error decay, and the approximations are shown to be effective for downstream tasks such as nonlinear shape registration in the Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework, even for very high compression ratios. The performance of our algorithm is demonstrated on large-scale shape data from modern geometry processing datasets and is shown to accelerate downstream registration tasks significantly. 2025-03-02T23:34:30Z Allen Paul Neill Campbell Tony Shardlow http://arxiv.org/abs/2603.16571v1 Splitting horizontal and vertical polynomial order in a compatible finite element discretisation for numerical weather prediction 2026-03-17T14:27:25Z The accurate and efficient representation of atmospheric dynamics remains a central challenge in numerical weather prediction. A particular difficulty arises from the strong anisotropy of the atmosphere, in which horizontal and vertical motions occur on very different length scales, motivating numerical discretisations that can reflect this structure. 
In this study, we introduce a compatible finite element discretisation of the compressible Boussinesq and compressible Euler equations in which the horizontal and vertical polynomial orders are treated independently. The split-order discretisation is constructed using a tensor-product framework that preserves the discrete de Rham complex and associated mimetic properties. Its wave-propagation characteristics are examined through a discrete dispersion analysis that extends previous analyses to configurations with differing horizontal and vertical polynomial orders. The results show that increasing horizontal order improves the representation of gravity waves at low and intermediate wavenumbers, while increasing vertical order can degrade dispersion accuracy near the grid scale and introduce spectral gaps. A series of idealised numerical experiments, including gravity-wave propagation, advective transport, mountain-wave flow, and a global baroclinic-wave test, is used to assess the scheme's accuracy and convergence properties. These experiments demonstrate that increasing the polynomial order in the dominant direction of motion improves convergence, and that increasing the horizontal order yields the greatest gain in accuracy under typical atmospheric conditions. The results indicate that split-order compatible finite element discretisations provide a viable alternative for controlling accuracy and numerical behaviour in atmospheric dynamical cores. 2026-03-17T14:27:25Z Daniel Witt Thomas Bendall Jemma Shipton http://arxiv.org/abs/2603.16539v1 Perturbation Analysis of the QT-Drazin Inverse of Quaternion Tensors via the QT-Product 2026-03-17T14:01:12Z The motivation of this paper is to investigate the perturbation theory for the QT-Drazin inverse of quaternion tensors under the QT-product via the associated $z$-block circulant representation. 
A fundamental relationship between the QT-Drazin inverse of $\mathtt{bcirc}_z(\mathcal A)$ and the $z$-block circulant form of $\mathcal A^D$ is established. Moreover, the QT-index of a quaternion tensor is characterized by the indices of the diagonal blocks in the corresponding block-diagonalized matrix. As a consequence, a representation of the QT-Drazin inverse in terms of the QT-Moore--Penrose inverse is derived, which offers a practical approach for its direct computation in MATLAB. Furthermore, a decomposition theory for the QT-Drazin inverse is developed by combining the structure of $z$-block circulant matrices with the Jordan decomposition of quaternion matrices. Numerical examples are provided to demonstrate the theoretical results and computational feasibility. 2026-03-17T14:01:12Z Yue Zhao Daochang Zhang Jingqian Li Dijana Mosic http://arxiv.org/abs/2503.14978v2 Inferring diffusivity from killed diffusion 2026-03-17T14:00:23Z We consider diffusion of independent molecules in an insulated Euclidean domain with unknown diffusivity parameter. At a random time and position, the molecules may bind and stop diffusing in dependence of a given `binding potential'. The binding process can be modeled by an additive random functional corresponding to the canonical construction of a `killed' diffusion Markov process. We study the problem of conducting inference on the infinite-dimensional diffusion parameter from a histogram plot of the `killing' positions of the process. We show first that these positions follow a Poisson point process whose intensity measure is determined by the solution of a certain Schrödinger equation. The inference problem can then be re-cast as a non-linear inverse problem for this PDE, which we show to be consistently solvable in a Bayesian way under natural conditions on the initial state of the diffusion, provided the binding potential is not too `aggressive'. 
In the course of our proofs we obtain novel posterior contraction rate results for high-dimensional Poisson count data that are of independent interest. A numerical illustration of the algorithm by standard MCMC methods is also provided. 2025-03-19T08:16:16Z 33 pages, to appear in the Annals of Statistics Richard Nickl Fanny Seizilles http://arxiv.org/abs/2603.16516v1 Neural network parametrized level sets for image segmentation 2026-03-17T13:41:42Z The Chan-Vese functionals have proven to be a first-class method for segmentation and classification. Previously, they have been implemented with level-set methods based on a pixel-wise representation of the level-sets. Later, parametrized level-set approximations, such as splines, have been studied. In this paper we consider neural networks as parametrized approximations of level-set functions. We show, in particular, that parametrized two-layer networks are most efficient to approximate polyhedral segments and classes. We also prove the efficiency for segmentation and classification. 2026-03-17T13:41:42Z Otmar Scherzer Cong Shi Thi Lan Nhi Vu http://arxiv.org/abs/2602.13472v2 Non-Uniform Quantum Fourier Transform 2026-03-17T12:52:00Z The Discrete Fourier Transform (DFT) is central to the analysis of uniformly sampled signals, yet many practical applications involve non-uniform sampling, requiring the Non-Uniform Discrete Fourier Transform (NUDFT). While quantum algorithms for the standard DFT are well established, a corresponding framework for the non-uniform case remains underdeveloped. This work introduces a quantum algorithm for the Non-Uniform Quantum Fourier Transform (NUQFT) based on a low-rank factorization of the NUDFT matrix. 
The factorization is translated into an explicit quantum construction using block encodings, Quantum Signal Processing, and the Linear Combination of Unitaries framework, yielding an $ε$-accurate block encoding of the NUDFT matrix with controlled approximation error from both classical truncation and quantum implementation. Under standard oracle access assumptions for non-uniform sampling points, we derive explicit, non-asymptotic gate-level resource estimates. The resulting complexity scales polylogarithmically with target precision, quadratically with the number of qubits through the quantum Fourier transform, and logarithmically with a geometry-dependent conditioning parameter induced by the non-uniform grid. This establishes a concrete and resource-efficient quantum analogue of the NUDFT and provides a foundation for quantum algorithms on irregularly sampled data. 2026-02-13T21:26:02Z v2 includes numerical results in Section 6 Junaid Aftab Yuehaw Khoo Haizhao Yang http://arxiv.org/abs/2603.16452v1 The peak heat flux conjecture for the first Dirichlet eigenmode of convex planar domains 2026-03-17T12:35:30Z In this paper, we study the scale-invariant quantity \[\mathcal{G}(Ω)=\frac{\|\partial_n u_1\|_{L^\infty(\partialΩ)}}{λ_1},\]where $u_1$ is the first $L^2$-normalized Dirichlet Laplace eigenfunction of a Euclidean domain $Ω$ and $λ_1$ is its eigenvalue. This is related to the peak boundary heat flux in the long time limit. For convex domains we prove that $\|\partial_n u_1\|_{L^\infty(\partialΩ)}$ is upper-bounded by a (domain-independent) constant multiple of $λ_1$. Using layer potentials, we derive shape-derivative formulae for efficient gradient computations. When combined with high-order Nyström discretization, a fast boundary integral equation solver, and eigenvalue rootfinding, this allows us to numerically optimize $\mathcal{G}$ over a class of rounded polygonal discretized domains. 
Based on extensive numerical experiments, we then conjecture that, over the set of convex domains, $\mathcal{G}$ is maximized by the semidisk, with the peak flux at the center of the diameter. To lend analytical support to this conjecture, we prove that the semidisk is a critical point of $\mathcal{G}$ under infinitesimal perturbations of its circular arc. 2026-03-17T12:35:30Z Zijian Wang Jeremy G. Hoskins Manas Rachh Alex H. Barnett http://arxiv.org/abs/2603.16424v1 Early-Terminable Energy-Safe Iterative Coupling for Parallel Simulation of Port-Hamiltonian Systems 2026-03-17T11:59:30Z Parallel simulation and control of large-scale robotic systems often rely on partitioned time stepping, yet finite-iteration coupling can inject spurious energy by violating power consistency--even when each subsystem is passive. This letter proposes a novel energy-safe, early-terminable iterative coupling for port-Hamiltonian subsystems by embedding a Douglas--Rachford (DR) splitting scheme in scattering (wave) coordinates. The lossless interconnection is enforced as an orthogonal constraint in the wave domain, while each subsystem contributes a discrete-time scattering port map induced by its one-step integrator. Under a discrete passivity condition on the subsystem time steps and a mild impedance-tuning condition, we prove an augmented-storage inequality certifying discrete passivity of the coupled macro-step for any finite inner-iteration budget, with the remaining mismatch captured by an explicit residual. As the inner budget increases, the partitioned update converges to the monolithic discrete-time update induced by the same integrators, yielding a principled, adaptive accuracy--compute trade-off, supporting energy-consistent real-time parallel simulation under varying computational budgets. 
Experiments on a coupled-oscillator benchmark validate the passivity certificates at numerical roundoff (on the order of 10e-14 in double precision) and show that the reported RMS state error decays monotonically with increasing inner-iteration budgets, consistent with the hard-coupling limit. 2026-03-17T11:59:30Z Qi Wei Jianfeng Tao Hongyu Nie Wangtao Tan http://arxiv.org/abs/2406.09932v3 Compression of Currents and Varifolds 2026-03-17T11:39:13Z We derive an algorithm for compression of the currents and varifolds representations of shapes, using ridge leverage score (RLS) sampling, and the theory of Nystrom approximation in Reproducing Kernel Hilbert Spaces. Our method is faster than existing compression techniques and comes with theoretical guarantees on the rate of decay of the compression error as a function of the smoothness of the associated shape representation. The obtained compressions are shown to be useful for accelerating downstream tasks such as nonlinear shape registration in the Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework without loss of quality, even for very high compression ratios. The performance of our algorithm is demonstrated on large-scale shape data from modern geometry processing datasets, and is shown to be fast and scalable with rapid error decay. 2024-06-14T11:26:04Z SIAM Journal on Imaging Sciences (2026) 19, 327-363 Allen Paul Neill Campbell Tony Shardlow 10.1137/24M1699656 http://arxiv.org/abs/2603.16393v1 Robust Physics-Guided Diffusion for Full-Waveform Inversion 2026-03-17T11:27:37Z We develop a robust physics-guided diffusion framework for full-waveform inversion that combines a score-based generative prior with likelihood guidance computed through wave-equation simulations. 
We adopt a transport-based data-consistency potential (Wasserstein-2), incorporating wavefield enhancement via bounded weighting and observation-dependent normalization, thereby improving robustness to amplitude imbalance and time/phase misalignment. On the inference side, we introduce a preconditioned guided reverse-diffusion scheme that adapts the guidance strength and spatial scaling throughout the reverse-time dynamics, yielding a more stable and effective data-consistency guidance step than standard diffusion posterior sampling (DPS). Numerical experiments on OpenFWI datasets demonstrate improved reconstruction quality over deterministic optimization baselines and standard DPS under comparable computational budgets. 2026-03-17T11:27:37Z Jishen Peng Enze Jiang Zheng Ma Xiongbin Yan