http://arxiv.org/api/jPDNj5KJF5udWdedM2ZMAF16blk
Updated: 2025-04-22T00:00:00-04:00

http://arxiv.org/abs/2504.14768v1
Updated: 2025-04-20T23:39:30Z
Published: 2025-04-20T23:39:30Z
Title: A note on unshifted lattice rules for high-dimensional integration in weighted unanchored Sobolev spaces

This short article studies a deterministic quasi-Monte Carlo lattice rule in
weighted unanchored Sobolev spaces of smoothness $1$. Building on the error
analysis by Kazashi and Sloan, we prove the existence of unshifted rank-1
lattice rules that achieve a worst-case error of $O(n^{-1/4}(\log n)^{1/2})$,
with the implied constant independent of the dimension, under certain
summability conditions on the weights. Although this convergence rate is
inferior to the one achievable for the shift-averaged root mean squared
worst-case error, the result does not rely on random shifting or transformation
and holds unconditionally, without the conjecture assumed by Kazashi and
Sloan.
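
As a rough illustration of the object studied in this abstract, the sketch below evaluates an unshifted rank-1 lattice rule, that is, the equal-weight average of $f$ over the points $\{k z / n\}$ for $k = 0, \dots, n-1$. The generating vector (a Fibonacci lattice with $n = 987$, $z = (1, 610)$) and the test integrand are assumptions chosen for the example; they are not taken from the paper, which proves existence of good generating vectors rather than prescribing this one.

```python
import numpy as np

def rank1_lattice_points(n, z):
    """Return the n points {k*z/n}, k = 0..n-1, of an unshifted rank-1 lattice."""
    k = np.arange(n).reshape(-1, 1)        # column of point indices
    z = np.asarray(z, dtype=np.int64)      # generating vector (one integer per dimension)
    return (k * z % n) / n                 # shape (n, d), entries in [0, 1)

def lattice_rule(f, n, z):
    """Equal-weight quadrature: average f over the lattice points (no random shift)."""
    return float(np.mean(f(rank1_lattice_points(n, z))))

# Hypothetical test integrand on [0,1]^2 whose exact integral is 1.
def f(x):
    return 1.0 + np.cos(2 * np.pi * x[:, 0]) * np.cos(2 * np.pi * x[:, 1])

points = rank1_lattice_points(987, (1, 610))   # assumed Fibonacci lattice, d = 2
estimate = lattice_rule(f, 987, (1, 610))
print(estimate)
```

For this particular integrand the rule is exact up to rounding, because none of the nonzero frequencies of $f$ lies in the dual lattice of the chosen generating vector.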
Authors: Takashi Goda
Comments: 6 pages

http://arxiv.org/abs/2504.04951v2
Updated: 2025-04-20T20:57:38Z
Published: 2025-04-07T11:34:56Z
Title: Anisotropic space-time goal-oriented error control and mesh adaptivity for convection-diffusion-reaction equations

We present an anisotropic goal-oriented error estimator based on the Dual
Weighted Residual (DWR) method for time-dependent convection-diffusion-reaction
(CDR) equations. Using anisotropic interpolation operators, the estimator is
separated elementwise with respect to the single directions in space and time,
leading to adaptive, anisotropic mesh refinement in a natural way. To prevent
spurious oscillations, the streamline upwind Petrov-Galerkin (SUPG) method is
applied to stabilize the underlying system in the case of high Péclet
numbers. Efficiency and robustness of the underlying algorithm are demonstrated
for different goal functionals. The directional error indicators quantify
anisotropy of the solution with respect to the goal, and produce meshes that
efficiently capture sharp layers. Numerical examples show the superiority of
the proposed approach over isotropic adaptive and global mesh refinement using
established benchmarks for convection-dominated transport.
Authors: M. Bause, M. Bruchhäuser, B. Endtmayer, N. Margenberg, I. Toulopoulos, T. Wick

http://arxiv.org/abs/2504.14721v1
Updated: 2025-04-20T19:28:20Z
Published: 2025-04-20T19:28:20Z
Title: Data-driven model order reduction for T-Product-Based dynamical systems

Model order reduction plays a crucial role in simplifying complex systems
while preserving their essential dynamic characteristics, making it an
invaluable tool in a wide range of applications, including robotic systems,
signal processing, and fluid dynamics. However, traditional model order
reduction techniques like balanced truncation are not designed to handle tensor
data directly and instead require unfolding the data, which may lead to the
loss of important higher-order structural information. In this article, we
introduce a novel framework for data-driven model order reduction of
T-product-based dynamical systems (TPDSs), which are often used to capture the
evolution of third-order tensor data such as images and videos through the
T-product. Specifically, we develop advanced T-product-based techniques,
including T-balanced truncation, T-balanced proper orthogonal decomposition,
and the T-eigensystem realization algorithm for input-output TPDSs by
leveraging the unique properties of T-singular value decomposition. We
demonstrate that these techniques offer significant memory and computational
savings while achieving reduction errors that are comparable to those of
conventional methods. The effectiveness of the proposed framework is further
validated through synthetic and real-world examples.
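
For readers unfamiliar with the T-product mentioned in this abstract, the following sketch computes it for third-order tensors in the standard way: take the FFT along the third mode, multiply the frontal slices facewise, and invert the FFT. The function names and the random test tensors are assumptions made for the example; the reduction algorithms of the paper themselves are not reproduced here.

```python
import numpy as np

def t_product(A, B):
    """T-product of A (n1 x n2 x n3) and B (n2 x n4 x n3) via FFT along axis 2."""
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    # Facewise matrix product of corresponding frontal slices in the Fourier domain.
    C_hat = np.einsum("ijk,jlk->ilk", A_hat, B_hat)
    return np.fft.ifft(C_hat, axis=2).real   # real inputs give a real result

def t_product_direct(A, B):
    """Reference implementation: circular convolution of frontal slices."""
    n3 = A.shape[2]
    C = np.zeros((A.shape[0], B.shape[1], n3))
    for k in range(n3):
        for j in range(n3):
            C[:, :, k] += A[:, :, (k - j) % n3] @ B[:, :, j]
    return C

rng = np.random.default_rng(0)               # hypothetical test data
A = rng.standard_normal((3, 4, 5))
B = rng.standard_normal((4, 2, 5))
C_fft = t_product(A, B)
C_ref = t_product_direct(A, B)
print(np.max(np.abs(C_fft - C_ref)))
```

The FFT route is the reason T-product-based methods can work slice by slice in the Fourier domain instead of forming a large unfolded matrix.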
Authors: Shenghan Mei, Ziqin He, Yidan Mei, Xin Mao, Anqi Dong, Ren Wang, Can Chen
Comments: 12 pages, 1 figure

http://arxiv.org/abs/2404.09730v3
Updated: 2025-04-20T16:45:48Z
Published: 2024-04-15T12:29:28Z
Title: Convergence Analysis of Probability Flow ODE for Score-based Generative Models

Score-based generative models have emerged as a powerful approach for
sampling high-dimensional probability distributions. Despite their
effectiveness, their theoretical underpinnings remain relatively
underdeveloped. In this work, we study the convergence properties of
deterministic samplers based on probability flow ODEs from both theoretical and
numerical perspectives. Assuming access to $L^2$-accurate estimates of the
score function, we prove the total variation between the target and the
generated data distributions can be bounded above by
$\mathcal{O}(d^{3/4}\delta^{1/2})$ in the continuous time level, where $d$
denotes the data dimension and $\delta$ represents the $L^2$-score matching
error. For practical implementations using a $p$-th order Runge-Kutta
integrator with step size $h$, we establish error bounds of
$\mathcal{O}(d^{3/4}\delta^{1/2} + d\cdot(dh)^p)$ at the discrete level.
Finally, we present numerical studies on problems up to 128 dimensions to
verify our theory.
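
A minimal sketch of a deterministic probability flow sampler with a Runge-Kutta integrator is given below. It assumes a forward process $dX = -X\,dt + \sqrt{2}\,dW$ and a one-dimensional Gaussian target, for which the score is known in closed form; both choices, and all names in the code, are assumptions for illustration and may differ from the setting of the paper. In practice the exact score would be replaced by a learned estimate.

```python
import numpy as np

M, S = 2.0, 0.5   # assumed Gaussian target N(M, S^2)

def mean_t(t):
    return M * np.exp(-t)

def var_t(t):
    return 1.0 + (S**2 - 1.0) * np.exp(-2.0 * t)

def score(x, t):
    """Exact score of the time-t marginal; stands in for a learned score estimate."""
    return -(x - mean_t(t)) / var_t(t)

def velocity(x, t):
    """Probability flow ODE for dX = -X dt + sqrt(2) dW: dx/dt = -x - score(x, t)."""
    return -x - score(x, t)

def rk4_backward(x, T, steps):
    """Integrate the ODE from t = T down to t = 0 with the classical 4th-order RK scheme."""
    h = -T / steps            # negative step: we move backward in time
    t = T
    for _ in range(steps):
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = velocity(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = velocity(x + h * k3, t + h)
        x = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

T = 3.0
z = np.array([-1.0, 0.0, 1.0])                  # standardized starting positions
x_T = mean_t(T) + np.sqrt(var_t(T)) * z         # points at time T
x_0 = rk4_backward(x_T, T, steps=300)           # transported back to time 0
print(x_0)
```

In this Gaussian case the flow map is affine, so each starting point $m_T + \sqrt{v_T}\,z$ should be carried to approximately $M + S z$, which gives a simple check on the integrator.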
Authors: Daniel Zhengyu Huang, Jiaoyang Huang, Zhengjiang Lin
DOI: 10.1109/TIT.2025.3557050
Comments: 37 pages, 7 figures; To appear in IEEE Transactions on Information Theory

http://arxiv.org/abs/2504.07835v3
Updated: 2025-04-20T16:18:37Z
Published: 2025-04-10T15:12:29Z
Title: Pychop: Emulating Low-Precision Arithmetic in Numerical Methods and Neural Networks

Motivated by the growing demand for low-precision arithmetic in computational
science, we exploit lower-precision emulation in Python -- widely regarded as
the dominant programming language for numerical analysis and machine learning.
Low-precision training has revolutionized deep learning by enabling more
efficient computation and reduced memory and energy consumption while
maintaining model fidelity. To better enable numerical experimentation with and
exploration of low precision computation, we developed the Pychop library,
which supports customizable floating-point formats and a comprehensive set of
rounding modes in Python, allowing users to benefit from fast, low-precision
emulation in numerous applications. Pychop also introduces interfaces for both
PyTorch and JAX, enabling efficient low-precision emulation on GPUs for neural
network training and inference with unparalleled flexibility.
In this paper, we offer a comprehensive exposition of the design,
implementation, validation, and practical application of Pychop, establishing
it as a foundational tool for advancing efficient mixed-precision algorithms.
Furthermore, we present empirical results on low-precision emulation for image
classification and object detection using published datasets, illustrating the
sensitivity of the use of low precision and offering valuable insights into its
impact. Pychop enables in-depth investigations into the effects of numerical
precision, facilitates the development of novel hardware accelerators, and
integrates seamlessly into existing deep learning workflows. Software and
experimental code are publicly available at
https://github.com/inEXASCALE/pychop.
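
To make the idea of low-precision emulation concrete, the sketch below rounds double-precision values to a chosen number of significand bits using plain NumPy. It deliberately does not use the Pychop API, which is not described in this abstract; the function name `round_to_precision` is hypothetical, only round-to-nearest-even is shown, and exponent range limits (overflow, subnormals) are ignored.

```python
import numpy as np

def round_to_precision(x, t):
    """Round x to t significand bits (round-to-nearest-even), with unbounded exponent."""
    x = np.asarray(x, dtype=np.float64)
    mantissa, exponent = np.frexp(x)         # x = mantissa * 2**exponent, 0.5 <= |mantissa| < 1
    scaled = np.round(mantissa * 2.0**t)     # np.round rounds ties to the even integer
    return np.ldexp(scaled / 2.0**t, exponent)

values = np.array([0.1, 1.0 / 3.0, 3.14159, 1000.7])
half_like = round_to_precision(values, 11)   # binary16 carries an 11-bit significand
print(half_like)
```

For inputs inside the normal range of IEEE binary16, this should agree with casting to `np.float16`, which is a convenient way to sanity-check the emulation.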
Authors: Erin Carson, Xinye Chen

http://arxiv.org/abs/2504.08223v2
Updated: 2025-04-20T13:05:35Z
Published: 2025-04-11T03:11:51Z
Title: Stochastic momentum ADMM for nonconvex and nonsmooth optimization with application to PnP algorithm

This paper proposes SMADMM, a single-loop Stochastic Momentum Alternating
Direction Method of Multipliers for solving a class of nonconvex and nonsmooth
composite optimization problems. SMADMM achieves the optimal oracle complexity
of $\mathcal{O}(\epsilon^{-3/2})$ in the online setting. Unlike previous
stochastic ADMM algorithms that require large mini-batches or a double-loop
structure, SMADMM uses only $\mathcal{O}(1)$ stochastic gradient evaluations
per iteration and avoids costly restarts. To further improve practicality, we
incorporate dynamic step sizes and penalty parameters, proving that SMADMM
maintains its optimal complexity without the need for large initial batches. We
also develop PnP-SMADMM by integrating plug-and-play priors, and establish its
theoretical convergence under mild assumptions. Extensive experiments on
classification, CT image reconstruction, and phase retrieval tasks demonstrate
that our approach outperforms existing stochastic ADMM methods both in accuracy
and efficiency, validating our theoretical results.
Authors: Kangkang Deng, Shuchang Zhang, Boyu Wang, Jiachen Jin, Juan Zhou, Hongxia Wang
Comments: 27 pages

http://arxiv.org/abs/2305.06261v3
Updated: 2025-04-20T07:04:52Z
Published: 2023-05-10T15:47:43Z
Title: Multiscale analysis via pseudo-reversing and applications to manifold-valued sequences

Modeling data using manifold values is a powerful concept with numerous
advantages, particularly in addressing nonlinear phenomena. This approach
captures the intrinsic geometric structure of the data, leading to more
accurate descriptors and more efficient computational processes. However, even
fundamental tasks like compression and data enhancement present meaningful
challenges in the manifold setting. This paper introduces a multiscale
transform that aims to represent manifold-valued sequences at different scales,
enabling novel data processing tools for various applications. Similar to
traditional methods, our construction is based on a refinement operator that
acts as an upsampling operator and a corresponding downsampling operator.
Inspired by Wiener's lemma, we term the latter as the reverse of the former. It
turns out that some upsampling operators, for example, least-squares-based
refinement, do not have a practical reverse. Therefore, we introduce the notion
of pseudo-reversing and explore its analytical properties and asymptotic
behavior. We derive analytical properties of the induced multiscale transform
and conclude the paper with numerical illustrations showcasing different
aspects of the pseudo-reversing and two data processing applications involving
manifolds.
Authors: Wael Mattar, Nir Sharon

http://arxiv.org/abs/2504.14498v1
Updated: 2025-04-20T05:33:02Z
Published: 2025-04-20T05:33:02Z
Title: Assessing the Performance of Mixed-Precision ILU(0)-Preconditioned Multiple-Precision Real and Complex Krylov Subspace Methods

Krylov subspace methods are linear solvers based on matrix-vector
multiplications and vector operations. While easily parallelizable, they are
sensitive to rounding errors and may experience convergence issues. ILU(0), an
incomplete LU factorization with zero fill-in, is a well-known preconditioning
technique that enhances convergence for sparse matrices. In this paper, we
implement a double-precision and multiple-precision ILU(0) preconditioner,
compatible with product-type Krylov subspace methods, and evaluate its
performance.
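
The ILU(0) factorization described in this abstract can be sketched as Gaussian elimination that only updates entries lying in the sparsity pattern of the original matrix. The version below uses dense NumPy storage and double precision for readability; it is an assumption-laden illustration, not the author's multiple-precision implementation, and the function name `ilu0` and the test matrix are hypothetical.

```python
import numpy as np

def ilu0(A):
    """ILU(0) on a dense array: eliminate only at positions where A is nonzero."""
    n = A.shape[0]
    LU = A.astype(np.float64).copy()
    pattern = A != 0                          # zero fill-in: entries outside stay zero
    for i in range(1, n):
        for k in range(i):
            if not pattern[i, k]:
                continue
            LU[i, k] /= LU[k, k]              # multiplier stored in the L part
            for j in range(k + 1, n):
                if pattern[i, j]:             # skip updates that would create fill-in
                    LU[i, j] -= LU[i, k] * LU[k, j]
    L = np.tril(LU, -1) + np.eye(n)           # unit lower triangular factor
    U = np.triu(LU)                           # upper triangular factor
    return L, U

# Tridiagonal example: exact LU creates no fill-in, so ILU(0) reproduces A exactly.
A = np.array([[4.0, -1.0, 0.0, 0.0],
              [-1.0, 4.0, -1.0, 0.0],
              [0.0, -1.0, 4.0, -1.0],
              [0.0, 0.0, -1.0, 4.0]])
L, U = ilu0(A)
print(np.max(np.abs(L @ U - A)))
```

For matrices whose exact factors do have fill-in, the product `L @ U` only approximates `A`, which is why the factors are used as a preconditioner inside a Krylov method rather than as a direct solver.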
Authors: Tomonori Kouya

http://arxiv.org/abs/2303.03984v3
Updated: 2025-04-20T03:08:44Z
Published: 2023-03-07T15:33:12Z
Title: Enhanced Adaptive Gradient Algorithms for Nonconvex-PL Minimax Optimization

Minimax optimization has recently been widely applied in many machine learning
tasks such as generative adversarial networks, robust learning and
reinforcement learning. In this paper, we study a class of nonconvex-nonconcave
minimax optimization problems with nonsmooth regularization, where the objective
function is possibly nonconvex in the primal variable $x$, and it is nonconcave and
satisfies the Polyak-Lojasiewicz (PL) condition in the dual variable $y$. Moreover,
we propose a class of enhanced momentum-based gradient descent ascent methods
(i.e., MSGDA and AdaMSGDA) to solve these stochastic nonconvex-PL minimax
problems. In particular, our AdaMSGDA algorithm can use various adaptive
learning rates in updating the variables $x$ and $y$ without relying on any
specific type. Theoretically, we prove that our methods have the best known
sample complexity of $\tilde{O}(\epsilon^{-3})$, requiring only one sample at
each loop to find an $\epsilon$-stationary solution. Some numerical
experiments on PL-game and Wasserstein-GAN demonstrate the efficiency of our
proposed methods.
Authors: Feihu Huang, Chunyu Xuan, Xinrui Wang, Siqi Zhang, Songcan Chen
Comments: Published in AISTATS 2025

http://arxiv.org/abs/2407.09994v2
Updated: 2025-04-19T16:29:33Z
Published: 2024-07-13T20:17:41Z
Title: Distributed computing for physics-based data-driven reduced modeling at scale: Application to a rotating detonation rocket engine

High-performance computing (HPC) has revolutionized our ability to perform
detailed simulations of complex real-world processes. A prominent contemporary
example is from aerospace propulsion, where HPC is used for rotating detonation
rocket engine (RDRE) simulations in support of the design of next-generation
rocket engines; however, these simulations take millions of core hours even on
powerful supercomputers, which makes them impractical for engineering tasks
like design exploration and risk assessment. Data-driven reduced-order models
(ROMs) aim to address this limitation by constructing computationally cheap yet
sufficiently accurate approximations that serve as surrogates for the
high-fidelity model. This paper contributes a distributed memory algorithm that
achieves fast and scalable construction of predictive physics-based ROMs
trained from sparse datasets of extremely large state dimension. The algorithm
learns structured physics-based ROMs that approximate the dynamical systems
underlying those datasets. This enables model reduction for problems at a scale
and complexity that exceed the capabilities of standard, serial approaches. We
demonstrate our algorithm's scalability using up to $2,048$ cores on the
Frontera supercomputer at the Texas Advanced Computing Center. We focus on a
real-world three-dimensional RDRE for which one millisecond of simulated
physical time requires one million core hours on a supercomputer. Using a
training dataset of $2,536$ snapshots each of state dimension $76$ million, our
distributed algorithm enables the construction of a predictive data-driven
reduced model in just $13$ seconds on $2,048$ cores on Frontera.
Authors: Ionut-Gabriel Farcas, Rayomand P. Gundevia, Ramakanth Munipalli, Karen E. Willcox
DOI: 10.1016/j.cpc.2025.109619
Comments: 22 pages, 8 figures
Journal reference: Computer Physics Communications 313 (2025) 109619

http://arxiv.org/abs/2504.14343v1
Updated: 2025-04-19T16:21:28Z
Published: 2025-04-19T16:21:28Z
Title: Numerical analysis of a particle system for the calibrated Heston-type local stochastic volatility model

We analyse a Monte Carlo particle method for the simulation of the calibrated
Heston-type local stochastic volatility (H-LSV) model. The common application
of a kernel estimator for a conditional expectation in the calibration
condition results in a McKean-Vlasov (MV) stochastic differential equation
(SDE) with non-standard coefficients. The primary challenges lie in certain
mean-field terms in the drift and diffusion coefficients and the
$1/2$-Hölder regularity of the diffusion coefficient. We establish the
well-posedness of this equation for a fixed but arbitrarily small bandwidth of
the kernel estimator. Moreover, we prove a strong propagation of chaos result,
ensuring convergence of the particle system under a condition on the Feller
ratio and up to a critical time. For the numerical simulation, we employ an
Euler-Maruyama scheme for the log-spot process and a full truncation Euler
scheme for the CIR volatility process. Under certain conditions on the inputs
and the Feller ratio, we prove strong convergence of the Euler-Maruyama scheme
with rate $1/2$ in time, up to a logarithmic factor. Numerical experiments
illustrate the convergence of the discretisation scheme and validate the
propagation of chaos in practice.
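
The two discretisations named in this abstract, an Euler-Maruyama step for the log-spot and a full truncation Euler step for the CIR variance, can be sketched for a plain Heston model as follows. This is a simplified illustration under stated assumptions: the leverage function, the kernel estimator and the particle interaction of the calibrated H-LSV model are omitted, and the function name and all parameter values are hypothetical.

```python
import numpy as np

def heston_full_truncation(s0, v0, r, kappa, theta, xi, rho, T, steps, paths, seed=0):
    """Simulate plain Heston paths: Euler-Maruyama for log-spot, full truncation for variance."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    x = np.full(paths, np.log(s0))            # log-spot
    v = np.full(paths, float(v0))             # variance, allowed to go negative internally
    for _ in range(steps):
        z1 = rng.standard_normal(paths)
        z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(paths)
        v_plus = np.maximum(v, 0.0)           # full truncation: use v^+ in drift and diffusion
        x = x + (r - 0.5 * v_plus) * dt + np.sqrt(v_plus * dt) * z1
        v = v + kappa * (theta - v_plus) * dt + xi * np.sqrt(v_plus * dt) * z2
    return np.exp(x), np.maximum(v, 0.0)

# Hypothetical parameters, chosen only to exercise the scheme.
spot, variance = heston_full_truncation(
    s0=100.0, v0=0.04, r=0.0, kappa=2.0, theta=0.09, xi=0.3, rho=-0.7,
    T=1.0, steps=100, paths=1000,
)
print(spot.mean(), variance.mean())
```

Setting `xi=0` turns the variance recursion into a deterministic relaxation towards `theta`, which gives a quick consistency check of the update.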
Authors: Christoph Reisinger, Maria Olympia Tsianni

http://arxiv.org/abs/2504.14308v1
Updated: 2025-04-19T14:13:06Z
Published: 2025-04-19T14:13:06Z
Title: The Schur complements for $SDD_{1}$ matrices and their application to linear complementarity problems

In this paper we propose a new scaling method to study the Schur complements
of $SDD_{1}$ matrices. Its core is related to the non-negative property of the
inverse $M$-matrix, while numerically improving the Quotient formula. Based on
the Schur complement and a novel norm splitting manner, we establish an upper
bound for the infinity norm of the inverse of $SDD_{1}$ matrices, which depends
solely on the original matrix entries. We apply the new bound to derive an
error bound for linear complementarity problems of $B_{1}$-matrices.
Additionally, new lower and upper bounds for the determinant of $SDD_{1}$
matrices are presented. Numerical experiments validate the effectiveness and
superiority of our results.
Authors: Yang Hu, Jianzhou Liu, Wenlong Zeng
Comments: 26 pages

http://arxiv.org/abs/2504.14297v1
Updated: 2025-04-19T13:48:55Z
Published: 2025-04-19T13:48:55Z
Title: Time discretization in convected linearized thermo-visco-elastodynamics at large displacements

The fully-implicit time discretization (i.e. the backward Euler formula) is
applied to compressible nonlinear dynamical models of thermo-viscoelastic
solids in the Eulerian description, i.e. in the actual deforming configuration,
formulated in terms of rates. The Kelvin-Voigt rheology or also, in the
deviatoric part, the Jeffreys rheology (covering creep or plasticity) are
considered, using the additive Green-Naghdi's decomposition of total strain
into the elastic and the inelastic strains formulated in terms of (objective)
rates exploiting the Zaremba-Jaumann time derivative. A linearized convective
model at large displacements is considered, focusing on the case where the
internal energy splits additively into the (convex) mechanical and the thermal
parts. A suitably regularized time-discrete scheme is devised. The numerical
stability and, considering the multipolar 2nd-grade viscosity, also convergence
towards weak solutions are proved, exploiting the convexity of the kinetic
energy when written in terms of linear momentum instead of velocity and
estimating the temperature gradient from the entropy-like inequality.
Authors: Tomáš Roubíček

http://arxiv.org/abs/2502.07797v3
Updated: 2025-04-19T12:35:23Z
Published: 2025-01-28T13:50:51Z
Title: A combined Lax-Wendroff/interpolation approach with finite element method for a three-dimensional system of tectonic deformation model: application to landslides in Cameroon

This paper develops an efficient computational technique to assess the
landslide responses to tectonic deformation and to predict the implications of
large bedrock landslides on the short- and long-term development of the
disasters. The considered equations represent a three-dimensional system of
geological structure deformation subject to suitable initial and boundary
conditions. The space derivatives are approximated using the finite element
procedure, while the time derivative is approximated using the
Lax-Wendroff and interpolation techniques. The new approach is called a
combined Lax-Wendroff/interpolation method with finite element method. The
modified Lax-Wendroff/interpolation scheme is employed to efficiently treat the
time derivative term and to provide a suitable time step restriction for
stability. Under this time step requirement, both stability and error estimates
of the new algorithm are analyzed in detail using a constructed strong norm. The
theory suggests that the developed computational technique is second-order
accurate in time and convergent in space with order $O(h^{p})$, where $h$ denotes
the spatial mesh size and $p$ is a positive integer. A wide set of numerical examples
is carried out to confirm the theoretical results and to demonstrate the
utility and validity of the proposed numerical scheme. An application to
landslides observed in the west and center regions of Cameroon from October 2019 to
November 2024 is discussed.
Authors: Eric Ngondiep
Comments: 26 pages, 6 tables, 30 figures

http://arxiv.org/abs/2504.14116v1
Updated: 2025-04-19T00:23:09Z
Published: 2025-04-19T00:23:09Z
Title: Analysis of a finite element method for PDEs in evolving domains with topological changes

The paper presents the first rigorous error analysis of an unfitted finite
element method for a linear parabolic problem posed on an evolving domain
$\Omega(t)$ that may undergo a topological change, such as, for example, a
domain splitting. The domain evolution is assumed to be $C^2$-smooth away from
a critical time $t_c$, at which the topology may change instantaneously. To
accommodate such topological transitions in the error analysis, we introduce
several structural assumptions on the evolution of $\Omega(t)$ in the vicinity
of the critical time. These assumptions allow a specific stability estimate
even across singularities. Based on this stability result we derive
optimal-order discretization error bounds, provided the continuous solution is
sufficiently smooth. We demonstrate the applicability of our assumptions with
examples of level-set domains undergoing topological transitions and discuss
cases where the analysis fails. The theoretical error estimate is confirmed by
the results of a numerical experiment.
Authors: Maxim A. Olshanskii, Arnold Reusken