http://arxiv.org/abs/2512.21678v2 Some Patterns of Duplications in the outputs of Mersenne Twister Pseudorandom Number Generator MT19937 2026-01-02T00:34:40Z

The Mersenne Twister MT19937 pseudorandom number generator, introduced by the last two authors in 1998, is still widely used. It passes all existing statistical tests except for the linear complexity test, which measures the even-odd ratio (parity) of the number of 1's among specific bits (and hence should not be important for most applications). Harase reported that MT19937 is rejected by some birthday-spacing tests, which are rather artificially designed. In this paper, we report that MT19937 fails a natural test based on the distribution of the run-lengths at which we find an identical value among the 32-bit output integers. The number of observations of the run-length 623 is about 40 times larger than expected (and than the counts for neighboring run-lengths such as 622 and 624), which implies that the corresponding p-value is almost 0. We mathematically analyze the phenomenon and obtain a theorem that explains these failures. This does not seem to be a serious defect of MT19937, because detecting it requires an astronomical amount of effort. Still, the phenomenon should be reported to the academic community concerned with pseudorandom number generation.
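The kind of measurement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it interprets "run-length" as the distance between two consecutive occurrences of the same output value, which may differ from the paper's exact definition, and the function name `duplicate_gap_histogram` and its parameters are hypothetical. It relies on the fact that CPython's `random.Random` is an MT19937 implementation.

```python
import random
from collections import Counter

def duplicate_gap_histogram(n_outputs, seed=12345, bits=32):
    """Histogram of distances between consecutive occurrences of the same value.

    Assumption: "run-length" is read here as the index distance between two
    successive appearances of an identical output; the paper may define it
    differently.
    """
    rng = random.Random(seed)   # CPython's Random is MT19937
    last_seen = {}              # value -> index of its most recent occurrence
    gaps = Counter()            # gap length -> number of observations
    for i in range(n_outputs):
        value = rng.getrandbits(bits)
        if value in last_seen:
            gaps[i - last_seen[value]] += 1
        last_seen[value] = i
    return gaps

if __name__ == "__main__":
    # With 16-bit outputs duplicates are frequent, so the histogram fills quickly.
    hist = duplicate_gap_histogram(100_000, bits=16)
    for gap in (622, 623, 624):
        print(gap, hist[gap])
```

At this scale no anomaly should be expected to show up; the abstract states that observing the excess at run-length 623 takes an astronomical amount of effort, so the sketch shows only the bookkeeping, not the result.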

2025-12-25T13:45:06Z 8 pages, no figure Alain Schumacher Takuji Nishimura Makoto Matsumoto http://arxiv.org/abs/2505.19304v2 f4ncgb: High Performance Gröbner Basis Computations in Free Algebras 2026-01-01T22:11:41Z

We present f4ncgb, a new open-source C++ library for Gröbner basis computations in free algebras, which transfers recent advancements in commutative Gröbner basis software to the noncommutative setting. As our experiments show, f4ncgb establishes a new state-of-the-art for noncommutative Gröbner basis computations. We also discuss implementation details and design choices.

2025-05-25T20:24:22Z 21 pages, 2 figures, 3 tables Maximilian Heisinger Clemens Hofstadler http://arxiv.org/abs/2410.09053v3 Fast Symbolic Integer-Linear Spectra 2025-12-31T22:14:03Z

Here we contribute a fast symbolic eigenvalue solver for matrices whose eigenvalues are $\mathbb{Z}$-linear combinations of their entries, alongside efficient general and stochastic $M^{X}$ generators. Users can interact with a few degrees of freedom to create linear operators, making high-dimensional symbolic analysis feasible when numerical analyses are insufficient.

2024-09-18T06:59:39Z Jonny Luntzel Abraham Miller http://arxiv.org/abs/2508.12835v2 Rapid Variable Resolution Particle Initialization for Complex Geometries 2025-12-31T13:35:04Z

The accuracy of meshless methods like Smoothed Particle Hydrodynamics (SPH) is highly dependent on the quality of the particle distribution. Existing particle initialization techniques often struggle to simultaneously achieve adaptive resolution, handle intricate boundaries, and efficiently generate well-packed distributions inside and outside a boundary. This work presents a fast and robust particle initialization method that achieves these goals using standard SPH building blocks. Our approach enables simultaneous initialization of fluid and solid regions, supports arbitrary geometries, and achieves high-quality, quasi-uniform particle arrangements without complex procedures like surface bonding. Extensive results in both 2D and 3D demonstrate that the obtained particle distributions exhibit good boundary conformity, low spatial disorder, and minimal density variation, all with significantly reduced computational cost compared to existing approaches. This work paves the way for automated particle initialization to accurately model flow in and around bodies with meshless methods, particularly with SPH.

2025-08-18T11:19:10Z 39 pages, 24 figures Computer Physics Communications 320 (2026) 109992 Navaneet Villodi Prabhu Ramachandran 10.1016/j.cpc.2025.109992 http://arxiv.org/abs/2510.16172v2 Fast, Differentiable, GPU-Accelerated Ray Tracing for Multiple Diffraction and Reflection Paths 2025-12-31T13:24:02Z

We present a fast, differentiable, GPU-accelerated optimization method for ray path tracing in environments containing planar reflectors and straight diffraction edges. Based on Fermat's principle, our approach reformulates the path-finding problem as the minimization of total path length, enabling efficient parallel execution on modern GPU architectures. Unlike existing methods that require separate algorithms for reflections and diffractions, our unified formulation maintains consistent problem dimensions across all interaction sequences, making it particularly suitable for vectorized computation. Through implicit differentiation, we achieve efficient gradient computation without differentiating through solver iterations, significantly outperforming traditional automatic differentiation approaches. Numerical simulations demonstrate convergence rates comparable to specialized Newton methods while providing superior scalability for large-scale applications. The method integrates seamlessly with differentiable programming libraries such as JAX and DrJIT, enabling new possibilities in inverse design and optimization for wireless propagation modeling. The source code is openly available at https://github.com/jeertmans/fpt-jax.
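The Fermat-principle formulation can be illustrated on the simplest case: a single specular reflection off a planar mirror in 2D. The following is a minimal sketch in plain Python, not the authors' JAX implementation and without implicit differentiation. The mirror is assumed to lie along the line y = 0 with both endpoints above it, and the helper names are hypothetical. It minimizes the total path length numerically and compares the result with the classical mirror-image construction.

```python
import math

def path_length(x, src, dst):
    """Length of the path src -> (x, 0) -> dst, with the mirror along y = 0."""
    return math.hypot(x - src[0], src[1]) + math.hypot(dst[0] - x, dst[1])

def reflection_point(src, dst, tol=1e-10):
    """Minimize the path length over x by golden-section search.

    The path length is convex in x, so a bracketing search between the
    x-coordinates of the two endpoints is sufficient.
    """
    a, b = min(src[0], dst[0]), max(src[0], dst[0])
    inv_phi = (math.sqrt(5) - 1) / 2
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if path_length(c, src, dst) < path_length(d, src, dst):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return (a + b) / 2

def image_method_point(src, dst):
    """Closed-form specular point: intersect the mirror with the line
    from the mirror image of src to dst."""
    return src[0] + (dst[0] - src[0]) * src[1] / (src[1] + dst[1])

if __name__ == "__main__":
    src, dst = (0.0, 1.0), (4.0, 3.0)
    print(reflection_point(src, dst), image_method_point(src, dst))
```

The point of the minimization view is that the same objective (total length) extends to sequences of reflections and diffractions with one unknown per interaction, which is what makes it easy to vectorize.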

2025-10-17T19:27:00Z 5 pages, 3 figures, accepted at EuCAP 2026 Jérome Eertmans Sophie Lequeu Benoît Legat Laurent Jacques Claude Oestges http://arxiv.org/abs/2505.12980v2 Algorithms for Nonlinear Mixed-Integer Location Estimation 2025-12-31T06:05:43Z

For three decades, carrier-phase observations have been used to obtain the most accurate location estimates using global navigation satellite systems (GNSS). These estimates are computed by solving a nonlinear mixed-integer least-squares problem. Existing algorithms linearize the problem, orthogonally project it to eliminate real variables, and then solve the integer least-squares problem. There is now considerable interest in developing similar localization techniques for terrestrial and indoor settings. We show that algorithms that linearize first fail in these settings and we propose several algorithms for computing the estimates. Some of our algorithms are elimination algorithms that start by eliminating the non-linear terms in the constraints; others construct a geometric arrangement that allows us to efficiently enumerate integer solutions (in polynomial time). We focus on simplified localization problems in which the measurements are range (distance) measurements and carrier phase range measurements, with no nuisance parameters. The simplified problem allows us to focus on the core question of untangling the nonlinearity and the integer nature of some parameters. We show using simulations that the new algorithms are effective at close ranges at which the linearize-first approach fails.
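To make the structure of the simplified problem concrete, here is a naive brute-force baseline in Python. It is not one of the algorithms proposed in the paper. The measurement model is an assumption made for illustration: each anchor provides a coarse range r ≈ d and a fractional carrier phase phi with d ≈ wavelength · (n + phi) for an unknown integer n, where d is the true distance. For a fixed candidate position the best integers follow by rounding, so only the position needs to be searched.

```python
import math

def cost(x, y, anchors, ranges, phases, wavelength):
    """Sum of squared residuals at (x, y), with each integer ambiguity
    chosen optimally for this candidate position (assumed model)."""
    total = 0.0
    for (ax, ay), r, phi in zip(anchors, ranges, phases):
        d = math.hypot(x - ax, y - ay)
        total += (d - r) ** 2                       # range residual
        n = round(d / wavelength - phi)             # best integer for this d
        total += (d - wavelength * (n + phi)) ** 2  # carrier-phase residual
    return total

def grid_search(anchors, ranges, phases, wavelength, lo, hi, steps):
    """Exhaustively evaluate the cost on a square grid and keep the best point."""
    best = None
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        for j in range(steps + 1):
            y = lo + (hi - lo) * j / steps
            c = cost(x, y, anchors, ranges, phases, wavelength)
            if best is None or c < best[0]:
                best = (c, x, y)
    return best  # (cost, x, y)

if __name__ == "__main__":
    anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    truth, wavelength = (3.0, 4.0), 0.2
    dists = [math.hypot(truth[0] - ax, truth[1] - ay) for ax, ay in anchors]
    phases = [d / wavelength - math.floor(d / wavelength) for d in dists]
    print(grid_search(anchors, dists, phases, wavelength, 0.0, 10.0, 100))
```

The grid search scales poorly and its accuracy is limited by the grid spacing, which is the kind of cost that elimination or enumeration schemes like those described in the abstract aim to avoid.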

2025-05-19T11:17:17Z Ophir Uziel Efi Fogel Dan Halperin Sivan Toledo http://arxiv.org/abs/2511.03566v3 Improving Directions in Mixed Integer Bilevel Linear Optimization 2025-12-30T18:33:53Z

We consider the central role of improving directions in solution methods for mixed integer bilevel linear optimization problems (MIBLPs). Current state-of-the-art methods for solving MIBLPs employ the branch-and-cut framework originally developed for solving mixed integer linear optimization problems. This approach relies on oracles for two kinds of subproblems: those for checking whether a candidate pair of leader's and follower's decisions is bilevel feasible, and those required for generating valid inequalities. Typically, these two types of oracles are managed separately, but in this work, we explore their close connection and propose a solution framework based on solving a single type of subproblem: determining whether there exists a so-called improving feasible direction for the follower's problem. Solution of this subproblem yields information that can be used both to check feasibility and to generate strong valid inequalities. Building on prior works, we expose the foundational role of improving directions in enforcing the follower's optimality condition and extend a previously known hierarchy of optimality-based relaxations to the mixed-integer setting, showing that the associated relaxed feasible regions coincide exactly with the closure associated with intersection cuts derived from improving directions. Numerical results with an implementation using a modified version of the open source solver MibS show that this approach can yield practical improvements.

2025-11-05T15:48:56Z Federico Battista Ted K. Ralphs http://arxiv.org/abs/2312.04018v4 Ricci-Notation Tensor Framework for Model-based Approaches to Imaging 2025-12-29T22:57:04Z

Model-based approaches to imaging, like specialized image enhancements in astronomy, facilitate explanations of relationships between observed inputs and computed outputs. These models may be expressed with extended matrix-vector (EMV) algebra, especially when they involve only scalars, vectors, and matrices, and with n-mode or index notations, when they involve multidimensional arrays, also called numeric tensors or, simply, tensors. While this paper features an example, inspired by exoplanet imaging, that employs tensors to reveal (inverse) 2D fast Fourier transforms in an image enhancement model, the work is actually about the tensor algebra and software, or tensor frameworks, available for model-based imaging. The paper proposes a Ricci-notation tensor (RT) framework, comprising a dual-variant index notation, with Einstein summation convention, and codesigned object-oriented software, called the RTToolbox for MATLAB. Extensions to Ricci notation offer novel representations for entrywise, pagewise, and broadcasting operations popular in EMV frameworks for imaging. Complementing the EMV algebra computable with MATLAB, the RTToolbox demonstrates programmatic and computational efficiency via careful design of numeric tensor and dual-variant index classes. Compared to its closest competitor, also a numeric tensor framework that uses index notation, the RT framework enables superior ways to model imaging problems and, thereby, to develop solutions.

2023-12-07T03:25:40Z 15 pages, 7 figures, 5 tables Journal of Imaging Science and Technology, 68(4), 2024 Dileepan Joseph Electrical and Computer Engineering, University of Alberta 10.2352/J.ImagingSci.Technol.2024.68.4.040504 http://arxiv.org/abs/2502.03000v5 Armadillo: An Efficient Framework for Numerical Linear Algebra 2025-12-29T13:37:07Z

A major challenge in the deployment of scientific software solutions is the adaptation of research prototypes to production-grade code. While high-level languages like MATLAB are useful for rapid prototyping, they lack the resource efficiency required for scalable production applications, necessitating translation into lower-level languages like C++. Further, for machine learning and signal processing applications, the underlying linear algebra primitives, generally provided by the standard BLAS and LAPACK libraries, are unwieldy and difficult to use, requiring manual memory management and other tedium. To address this challenge, the Armadillo C++ linear algebra library provides an intuitive interface for writing linear algebra expressions that are easily compiled into efficient production-grade implementations. We describe the expression optimisations we have implemented in Armadillo, exploiting template metaprogramming. We demonstrate that these optimisations result in considerable efficiency gains on a variety of benchmark linear algebra expressions.

2025-02-05T08:52:37Z International Conference on Computer and Automation Engineering, 2025 Conrad Sanderson Ryan Curtin 10.1109/ICCAE64891.2025.10980539 http://arxiv.org/abs/2507.09435v2 GeoWarp: An automatically differentiable and GPU-accelerated implicit MPM framework for geomechanics based on NVIDIA Warp 2025-12-27T05:18:52Z

The material point method (MPM), a hybrid Lagrangian-Eulerian particle method, is increasingly used to simulate large-deformation and history-dependent behavior of geomaterials. While explicit time integration dominates current MPM implementations due to its algorithmic simplicity, such schemes are unsuitable for quasi-static and long-term processes typical in geomechanics. Implicit MPM formulations are free of these limitations but remain less adopted, largely due to the difficulty of computing the Jacobian matrix required for Newton-type solvers, especially when consistent tangent operators should be derived for complex constitutive models. In this paper, we introduce GeoWarp -- an implicit MPM framework for geomechanics built on NVIDIA Warp -- that exploits GPU parallelism and reverse-mode automatic differentiation to compute Jacobians without manual derivation. To enhance efficiency, we develop a sparse Jacobian construction algorithm that leverages the localized particle-grid interactions intrinsic to MPM. The framework is verified through forward and inverse examples in large-deformation elastoplasticity and coupled poromechanics. Results demonstrate that GeoWarp provides a robust, scalable, and extensible platform for differentiable implicit MPM simulation in computational geomechanics.

2025-07-13T00:11:36Z Adv. Eng. Softw. 212 (2026) 104072 Yidong Zhao Xuan Li Chenfanfu Jiang Jinhyun Choo 10.1016/j.advengsoft.2025.104072 http://arxiv.org/abs/2109.04193v2 OGRe: An Object-Oriented General Relativity Package for Mathematica 2025-12-25T02:09:03Z

We present OGRe, a modern Mathematica package for tensor calculus, designed to be both powerful and user-friendly. The package can be used in a variety of contexts where tensor calculations are needed, in both mathematics and physics, but it is especially suitable for general relativity. By implementing an object-oriented design paradigm, OGRe allows calculating arbitrarily complicated tensor formulas easily, and automatically transforms between index configurations and coordinate systems behind the scenes as needed, eliminating user errors by making it impossible for the user to combine tensors in inconsistent ways. Other features include displaying tensors in various forms, automatic calculation of curvature tensors and geodesic equations, easy importing and exporting of tensors between sessions, optimized algorithms and parallelization for improved performance, and more.

2021-09-06T00:31:23Z 4 pages, final version published in JOSS. NOTE: The software has been updated since this publication. Full and up-to-date documentation and source code for the latest version are available at https://github.com/bshoshany/OGRe Journal of Open Source Software, 6(65), 3416 (2021) Barak Shoshany 10.21105/joss.03416 http://arxiv.org/abs/2409.03803v3 OGRePy: An Object-Oriented General Relativity Package for Python 2025-12-24T23:02:19Z

OGRePy is a modern, open-source Python package designed to perform symbolic tensor calculations, with a particular focus on applications in general relativity. Built on an object-oriented architecture, OGRePy encapsulates tensors, metrics, and coordinate systems as self-contained objects, automatically handling raising and lowering of indices, coordinate transformations, contractions, partial or covariant derivatives, and all tensor operations. By leveraging the capabilities of SymPy and Jupyter Notebook, OGRePy provides a robust, user-friendly environment that facilitates both research and teaching in general relativity and differential geometry. This Python package reproduces the functionality of the popular Mathematica package OGRe, while greatly improving upon it by making use of Python's native object-oriented syntax. In this paper, we describe OGRePy's design and implementation, and discuss its potential for reuse across research and education in mathematics and physics.

2024-09-05T03:40:27Z 4 pages, final version published in JORS. NOTE: The software has been updated since this publication. Full and up-to-date documentation and source code for the latest version are available at https://github.com/bshoshany/OGRePy Journal of Open Research Software, 13: 9 (2025) Barak Shoshany 10.5334/jors.558 http://arxiv.org/abs/2512.20015v1 HeylandCircle: A Computational Framework for the Geometric Reconstruction of the Heyland Circle Diagram 2025-12-23T03:15:50Z

The Heyland circle diagram is a classical graphical tool for representing the steady-state behavior of induction machines using no-load and blocked-rotor test data. While widely used in alternating-current machinery texts, the diagram is typically presented as a hand-constructed aid and lacks a standardized computational formulation. This paper presents HeylandCircle, a computational framework that reconstructs the classical Heyland circle diagram directly from standard test parameters. The framework formalizes the traditional geometric construction as a deterministic, reproducible sequence of geometric operations, establishing a clear mapping between measured data, fixed geometric objects, and steady-state operating points. Quantities such as power factor, slip, output power, torque, and efficiency are obtained through explicit geometric relationships on the constructed diagram. Validation using a representative textbook example demonstrates close agreement with classical results. The framework provides a computational realization of the traditional Heyland diagram suitable for instruction, analysis, and systematic extension.

2025-12-23T03:15:50Z 8 pages, 1 figure Anubhav Gupta Abhinav Gupta http://arxiv.org/abs/2512.22215v1 SPUMA: a minimally invasive approach to the GPU porting of OPENFOAM 2025-12-22T10:04:36Z

High Performance Computing (HPC) on hybrid clusters represents a significant opportunity for Computational Fluid Dynamics (CFD), especially when modern accelerators are utilized effectively. However, despite the widespread adoption of GPUs, programmability remains a challenge, particularly in open-source contexts. In this paper, we present SPUMA, a full GPU port of OPENFOAM targeting NVIDIA and AMD GPUs. The implementation strategy is based on a portable programming model and the adoption of a memory pool manager that leverages the unified memory feature of modern GPUs. This approach is discussed alongside several numerical tests conducted on two pre-exascale clusters in Europe, LUMI and Leonardo, which host AMD MI250X and NVIDIA A100 GPUs, respectively. In the performance analysis section, we present results related to memory usage profiling and kernel wall-time, the impact of the memory pool, and energy consumption obtained by simulating the well-known DrivAer industrial test case. GPU utilization strongly affects strong scalability results, reaching 65% efficiency on both LUMI and Leonardo when approaching a load of 8 million cells per GPU. Weak scalability results, obtained on 20 GPUs with the OpenFOAM native multigrid solver, range from 75% on Leonardo to 85% on LUMI. Notably, efficiency is no lower than 90% when switching to the NVIDIA AmgX linear algebra solver. Our tests also reveal that one A100 GPU on Leonardo is equivalent to 200-300 Intel Sapphire Rapids cores, provided the GPUs are sufficiently oversubscribed (more than 10 million cells per GPU). Finally, energy consumption is reduced by up to 82% compared to analogous simulations executed on CPUs.

2025-12-22T10:04:36Z 43 pages Simone Bnà Giuseppe Giaquinto Ettore Fadiga Tommaso Zanelli Francesco Bottau http://arxiv.org/abs/2406.06095v4 An extension of C++ with memory-centric specifications for HPC to reduce memory footprints and streamline MPI development 2025-12-20T09:40:31Z

C++ leans towards a memory-inefficient storage of structs: the compiler inserts padding bits, and it cannot exploit knowledge about the range of integers, enums or bitsets. Furthermore, the language provides no support for arbitrary floating-point precisions. We propose a language extension based upon attributes through which developers can tell the compiler which memory arrangements would be beneficial: Can multiple booleans or integers with limited range be squeezed into one bit field, do floating-point numbers hold fewer significant bits than in the IEEE standard, and is a programmer willing to trade attribute ordering guarantees for a more compact object representation? The extension offers the opportunity to fall back to normal alignment and native C++ floating-point representations via plain C++ assignments, introduces no dependencies upon external libraries, and leaves the resulting code (syntactically) standard C++. As MPI remains the de-facto standard for distributed memory calculations in C++, we furthermore propose additional attributes which streamline the MPI datatype modelling in combination with our memory optimisation extensions. Our work implements the language annotations within LLVM and demonstrates their potential impact through smoothed particle hydrodynamics benchmarks. These benchmarks reveal the potential gains in terms of performance and development productivity.
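The padding and bit-packing effects mentioned above can be observed without any compiler extension. The sketch below uses Python's standard `struct` module and plain integer arithmetic purely as an illustration; it does not show the attribute syntax proposed in the paper, and the record layout and the helper names `pack_record` and `unpack_record` are hypothetical.

```python
import struct

# A record with the fields: bool flag_a, int32 count, bool flag_b.
# "@" uses the platform's native alignment, so padding is typically inserted
# before `count`; "=" uses standard sizes with no padding at all.
native_size = struct.calcsize("@?i?")
packed_size = struct.calcsize("=?i?")   # 1 + 4 + 1 bytes

def pack_record(flag_a, flag_b, level):
    """Squeeze two booleans and an integer in [0, 63] into a single byte."""
    if not 0 <= level < 64:
        raise ValueError("level must fit in 6 bits")
    return (level << 2) | (int(flag_b) << 1) | int(flag_a)

def unpack_record(byte):
    """Inverse of pack_record: returns (flag_a, flag_b, level)."""
    return bool(byte & 1), bool((byte >> 1) & 1), byte >> 2

if __name__ == "__main__":
    print("native:", native_size, "bytes; packed:", packed_size, "bytes")
    print(unpack_record(pack_record(True, False, 42)))
```

Writing such shifts and masks by hand is error-prone, which is one motivation for letting the compiler generate them from range annotations instead.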

2024-06-10T08:26:27Z Pawel K. Radtke Cristian G. Barrera-Hinojosa Mladen Ivkovic Tobias Weinzierl