https://arxiv.org/api/W8wZ7kzC8hm+xuE1d85+Cshnd60 2026-06-22T17:21:16Z 2664 495 15 http://arxiv.org/abs/2210.14219v2 Redistributor: Transforming Empirical Data Distributions 2024-07-05T22:18:53Z We present an algorithm and package, Redistributor, which forces a collection of scalar samples to follow a desired distribution. When given independent and identically distributed samples of some random variable $S$ and the continuous cumulative distribution function of some desired target $T$, it provably produces a consistent estimator of the transformation $R$ which satisfies $R(S)=T$ in distribution. As the distribution of $S$ or $T$ may be unknown, we also include algorithms for efficiently estimating these distributions from samples. This allows for various interesting use cases in image processing, where Redistributor serves as a remarkably simple and easy-to-use tool that is capable of producing visually appealing results. For color correction it outperforms other model-based methods and excels in achieving photorealistic style transfer, surpassing deep learning methods in content preservation. The package is implemented in Python and is optimized to efficiently handle large datasets, making it also suitable as a preprocessing step in machine learning. The source code is available at https://github.com/paloha/redistributor. 2022-10-25T17:59:03Z 16 pages, 13 figures - Added more use cases and comparisons with other methods Pavol Harar Dennis Elbrächter Monika Dörfler Kory D. Johnson http://arxiv.org/abs/2407.13712v1 Enabling MPI communication within Numba/LLVM JIT-compiled Python code using numba-mpi v1.0 2024-07-01T17:12:34Z The numba-mpi package offers access to the Message Passing Interface (MPI) routines from Python code that uses the Numba just-in-time (JIT) compiler. 
As a result, high-performance and multi-threaded Python code may utilize MPI communication facilities without leaving the JIT-compiled code blocks, which is not possible with the mpi4py package, a higher-level Python interface to MPI. For debugging purposes, numba-mpi retains full functionality of the code even if JIT compilation is disabled. The numba-mpi API constitutes a thin wrapper around the C API of MPI and is built around NumPy arrays, including handling of non-contiguous views over array slices. Project development is hosted on GitHub, leveraging the mpi4py/setup-mpi workflow enabling continuous integration tests on Linux (MPICH, OpenMPI & Intel MPI), macOS (MPICH & OpenMPI) and Windows (MS MPI). The paper covers an overview of the package features, architecture and performance. As of v1.0, the following MPI routines are exposed and covered by unit tests: size/rank, [i]send/[i]recv, wait[all|any], test[all|any], allreduce, bcast, barrier, scatter/[all]gather & wtime. The package is implemented in pure Python and depends on numpy, numba and mpi4py (the latter used at initialization and as a source of utility routines only). The performance advantage of using numba-mpi compared to mpi4py is depicted with a simple example, with the entirety of the code included in listings discussed in the text. Application of numba-mpi for handling domain decomposition in numerical solvers for partial differential equations is presented using two external packages that depend on numba-mpi: py-pde and PyMPDATA-MPI. 2024-07-01T17:12:34Z SoftwareX 28, 101897 (2024) Kacper Derlatka Maciej Manna Oleksii Bulenok David Zwicker Sylwester Arabas 10.1016/j.softx.2024.101897 http://arxiv.org/abs/2407.00746v1 Structured Sketching for Linear Systems 2024-06-30T16:12:36Z For linear systems $Ax=b$ we develop iterative algorithms based on a sketch-and-project approach. 
By using judicious choices for the sketch, such as the history of residuals, we develop weighting strategies that enable short recursive formulas. The proposed algorithms have a low memory footprint and iteration complexity compared to regular sketch-and-project methods. In a set of numerical experiments the new methods compare well to GMRES, SYMMLQ and state-of-the-art randomized solvers. 2024-06-30T16:12:36Z Johannes J Brust Michael A Saunders http://arxiv.org/abs/2406.16154v1 Automating Variational Differentiation 2024-06-23T16:28:46Z Many problems in Physics and Chemistry are formulated as the minimization of a functional. Therefore, methods for solving these problems typically require differentiating maps whose input and/or output are functions -- commonly referred to as variational differentiation. Such maps are not addressed at the mathematical level by the chain rule, which underlies modern symbolic and algorithmic differentiation (AD) systems. Although there are algorithmic solutions such as tracing and reverse accumulation, they do not provide human readability and introduce strict programming constraints that bottleneck performance, especially in high-performance computing (HPC) environments. In this manuscript, we propose a new computer theoretic model of differentiation by combining the pullback of the $\mathbf{B}$ and $\mathbf{C}$ combinators from the combinatory logic. Unlike frameworks based on the chain rule, this model differentiates a minimal complete basis for the space of computable functions. Consequently, the model is capable of analytic backpropagation and variational differentiation while supporting complex numbers. To demonstrate the generality of this approach we build a system named CombDiff, which can differentiate nontrivial variational problems such as Hartree-Fock (HF) theory and multilayer perceptrons. 
2024-06-23T16:28:46Z 16 pages Kangbo Li Anil Damle http://arxiv.org/abs/2406.15559v1 Introducing Moment: A toolkit for semi-definite programming with moment matrices 2024-06-21T18:00:08Z Non-commutative polynomial optimization is a powerful technique with numerous applications in quantum nonlocality, quantum key distribution, causal inference, and many-body physics, amongst others. The standard approach is to reduce such optimizations to a hierarchy of semi-definite programs, which can be solved numerically using well-understood interior-point methods. A key, but computationally costly, step is the formulation of moment matrices, whose size (and hence cost) grows exponentially with the depth of the hierarchy. It is therefore essential to have highly-optimized software to construct moment matrices. Here, we introduce Moment: a toolkit that produces moment matrix relaxations from the specification of a non-commutative optimization problem. In order to obtain the absolute best performance, Moment is written in C++, and for convenience of use provides an interface via MATLAB. We benchmark Moment's performance, and see that it can be up to four orders of magnitude faster than current software with similar functionality. 2024-06-21T18:00:08Z 49 + 13 pages, 4 figures, 5 tables Andrew J. P. Garner Mateus Araújo http://arxiv.org/abs/2406.14726v1 Utilization of G-Programming Language for Educational Control Application: Case Study of Magnetic Levitation of Elastic Beam 2024-06-20T20:44:10Z This paper presents the practical employment of G-Programming tools to demonstrate, design, and implement traditional control algorithms on a magnetic levitation system. The complexity of controlling this type of fast, dynamic, and sensitive system is vital for highlighting the capabilities of LabVIEW G-programming in control education. 
PID and Lead-Lag controllers are designed and implemented within the LabVIEW environment, with the ability to tune and optimize the controllers utilizing the Virtual Instruments (VIs) of the control design and simulation toolkit. The paper enables the reader to understand the modelling, the testing of the control action, and the dynamic simulation of the system, followed by the deployment of the control law in real time. It can be concluded that G programming is a suitable and easy-to-use tool for facilitating hands-on, experiential learning and validation in control systems engineering. 2024-06-20T20:44:10Z Abdallah Amr Mostafa Eshra Ayman A. Nada http://arxiv.org/abs/2405.16883v2 Scorch: A Library for Sparse Deep Learning 2024-06-20T06:24:23Z The rapid growth in the size of deep learning models strains the capabilities of traditional dense computation paradigms. Leveraging sparse computation has become increasingly popular for training and deploying large-scale models, but existing deep learning frameworks lack extensive support for sparse operations. To bridge this gap, we introduce Scorch, a library that seamlessly integrates efficient sparse tensor computation into the PyTorch ecosystem, with an initial focus on inference workloads on CPUs. Scorch provides a flexible and intuitive interface for sparse tensors, supporting diverse sparse data structures. Scorch introduces a compiler stack that automates key optimizations, including automatic loop ordering, tiling, and format inference. Combined with a runtime that adapts its execution to both dense and sparse data, Scorch delivers substantial speedups over hand-written PyTorch Sparse (torch.sparse) operations without sacrificing usability. More importantly, Scorch enables efficient computation of complex sparse operations that lack hand-optimized PyTorch implementations. This flexibility is crucial for exploring novel sparse architectures. 
We demonstrate Scorch's ease of use and performance gains on diverse deep learning models across multiple domains. With only minimal code changes, Scorch achieves 1.05-5.78x speedups over PyTorch Sparse on end-to-end tasks. Scorch's seamless integration and performance gains make it a valuable addition to the PyTorch ecosystem. We believe Scorch will enable wider exploration of sparsity as a tool for scaling deep learning and inform the development of other sparse libraries. 2024-05-27T06:59:20Z 25 pages, 8 figures Bobby Yan Alexander J. Root Trevor Gale David Broman Fredrik Kjolstad http://arxiv.org/abs/2301.12659v2 GPU Accelerated Newton for Taylor Series Solutions of Polynomial Homotopies in Multiple Double Precision 2024-06-18T22:20:02Z A polynomial homotopy is a family of polynomial systems, typically in one parameter $t$. Our problem is to compute power series expansions of the coordinates of the solutions in the parameter $t$, accurately, using multiple double arithmetic. One application of this problem is the location of the nearest singular solution in a polynomial homotopy, via the theorem of Fabry. Power series serve as input to construct Padé approximations. Exploiting the massive parallelism of Graphics Processing Units capable of performing several trillion floating-point operations per second, the objective is to compensate for the cost overhead caused by arithmetic with power series in multiple double precision. The application of Newton's method for this problem requires the evaluation and differentiation of polynomials, followed by solving a blocked lower triangular linear system. Experimental results are obtained on NVIDIA GPUs, in particular the RTX 2080, RTX 4080, P100, V100, and A100. Code generated by the CAMPARY software is used to obtain results in double double, quad double, and octo double precision. The programs in this study are self-contained and available in a public GitHub repository under the GPL-v3.0 License. 
2023-01-30T04:41:28Z Accepted by CASC 2024, the 27th International Workshop on Computer Algebra in Scientific Computing Jan Verschelde http://arxiv.org/abs/2201.04040v2 PEPit: computer-assisted worst-case analyses of first-order optimization methods in Python 2024-06-17T16:40:20Z PEPit is a Python package aiming at simplifying the access to worst-case analyses of a large family of first-order optimization methods possibly involving gradient, projection, proximal, or linear optimization oracles, along with their approximate, or Bregman variants. In short, PEPit is a package enabling computer-assisted worst-case analyses of first-order optimization methods. The key underlying idea is to cast the problem of performing a worst-case analysis, often referred to as a performance estimation problem (PEP), as a semidefinite program (SDP) which can be solved numerically. To do that, the package users are only required to write first-order methods nearly as they would have implemented them. The package then takes care of the SDP modeling parts, and the worst-case analysis is performed numerically via a standard solver. 2022-01-11T16:35:22Z Reference work for the PEPit package (available at https://github.com/bgoujaud/PEPit) Baptiste Goujaud Céline Moucer François Glineur Julien Hendrickx Adrien Taylor Aymeric Dieuleveut http://arxiv.org/abs/2406.10862v1 OpenCAEPoro: A Parallel Simulation Framework for Multiphase and Multicomponent Porous Media Flows 2024-06-16T09:12:54Z OpenCAEPoro is a parallel numerical simulation software developed in C++ for simulating multiphase and multicomponent flows in porous media. The software utilizes a set of general-purpose compositional model equations, enabling it to handle a diverse range of fluid dynamics, including the black oil model, compositional model, and thermal recovery models. OpenCAEPoro establishes a unified solving framework that integrates many widely used methods, such as IMPEC, FIM, and AIM. 
This framework allows dynamic collaboration between different methods. Specifically, based on this framework, we have developed an adaptively coupled domain decomposition method, which can provide initial solutions for global methods to accelerate the simulation. The reliability of OpenCAEPoro has been validated through benchmark testing with the SPE comparative solution project. Furthermore, its robust parallel efficiency has been tested in distributed parallel environments, demonstrating its suitability for large-scale simulation problems. 2024-06-16T09:12:54Z 29 pages, 19 figures Shizhe Li Chen-Song Zhang http://arxiv.org/abs/2406.10437v1 Learning from landmarks, curves, surfaces, and shapes in Geomstats 2024-06-14T22:59:03Z We introduce the shape module of the Python package Geomstats to analyze shapes of objects represented as landmarks, curves and surfaces across fields of natural sciences and engineering. The shape module first implements widely used shape spaces, such as the Kendall shape space, as well as elastic spaces of discrete curves and surfaces. The shape module further implements the abstract mathematical structures of group actions, fiber bundles, quotient spaces and associated Riemannian metrics which allow users to build their own shape spaces. The Riemannian geometry tools enable users to compare, average, interpolate between shapes inside a given shape space. These essential operations can then be leveraged to perform statistics and machine learning on shape data. We present the object-oriented implementation of the shape module along with illustrative examples and show how it can be used to perform statistics and machine learning on shape spaces. 2024-06-14T22:59:03Z Luís F. 
Pereira Alice Le Brigant Adele Myers Emmanuel Hartman Amil Khan Malik Tuerkoen Trey Dold Mengyang Gu Pablo Suárez-Serrato Nina Miolane http://arxiv.org/abs/2406.10301v1 12 Labours tools for developing Functional Tissue Units 2024-06-13T22:55:40Z A brief introduction is presented to the technical approach of modelling FTUs as an aggregate of cells whose state transition dynamics are mathematically represented as port-Hamiltonians or differential-algebraic equations. A Python library and browser-based tool is discussed that enables modellers to compose the FTU graph and to specify the cellular equations and the interconnection between the cells at the level of the physical quantities they exchange, consistent with the technical approach. 2024-06-13T22:55:40Z Jagir R. Hussan http://arxiv.org/abs/2406.09085v1 A Symbolic Computing Perspective on Software Systems 2024-06-13T13:10:47Z Symbolic mathematical computing systems have served as a canary in the coal mine of software systems for more than sixty years. They have introduced or have been early adopters of programming language ideas such as dynamic memory management, arbitrary precision arithmetic and dependent types. These systems have the feature of being highly complex while at the same time operating in a domain where results are well-defined and clearly verifiable. These software systems span multiple layers of abstraction with concerns ranging from instruction scheduling and cache pressure up to algorithmic complexity of constructions in algebraic geometry. All of the major symbolic mathematical computing systems include low-level code for arithmetic, memory management and other primitives, a compiler or interpreter for a bespoke programming language, a library of high level mathematical algorithms, and some form of user interface. Each of these parts invokes multiple deep issues. 
We present some lessons learned from this environment and free flowing opinions on topics including: * Portability of software across architectures and decades; * Infrastructure to embrace and infrastructure to avoid; * Choosing base abstractions upon which to build; * How to get the most out of a small code base; * How developments in compilers both to optimise and to validate code have always been and remain of critical importance, with plenty of remaining challenges; * The way in which individuals including in particular Alan Mycroft who has been able to span from hand-crafting Z80 machine code up to the most abstruse high level code analysis techniques are needed, and * Why it is important to teach full-stack thinking to the next generation. 2024-06-13T13:10:47Z Arthur C. Norman Stephen M. Watt http://arxiv.org/abs/2406.08186v1 Hiperwalk: Simulation of Quantum Walks with Heterogeneous High-Performance Computing 2024-06-12T13:17:05Z The Hiperwalk package is designed to facilitate the simulation of quantum walks using heterogeneous high-performance computing, taking advantage of the parallel processing power of diverse processors such as CPUs, GPUs, and acceleration cards. This package enables the simulation of both the continuous-time and discrete-time quantum walk models, effectively modeling the behavior of quantum systems on large graphs. Hiperwalk features a user-friendly Python package frontend with comprehensive documentation, as well as a high-performance C-based inner core that leverages parallel computing for efficient linear algebra calculations. This versatile tool empowers researchers to better understand quantum walk behavior, optimize implementation, and explore a wide range of potential applications, including spatial search algorithms. 2024-06-12T13:17:05Z 16 pages, 6 figures Proceedings of Quantum Week 2023 Paulo Motta Gustavo A. Bezerra Anderson F. P. 
Santos Renato Portugal 10.1109/QCE57702.2023.00055 http://arxiv.org/abs/2406.07751v1 A square root algorithm faster than Newton's method for multiprecision numbers, using floating-point arithmetic 2024-06-11T22:22:13Z In this paper, an optimized version of Bombelli's classical algorithm for computing integer square roots is presented. In particular, floating-point arithmetic is used to compute the initial guess of each digit of the root, following similar ideas to those used in "The Art of Computer Programming" Vol. 2, Sec. 4.3.1 for division. A program with an implementation of the algorithm in Java is also presented, and its running time is compared with that of the algorithm provided by the Java standard library, which uses Newton's method. In tests, the algorithm presented here turns out to be much faster. 2024-06-11T22:22:13Z 28 pages Fabio Romano
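
The last abstract above contrasts a digit-by-digit integer square root with Newton's method. As a rough illustration only, the Python sketch below (Python rather than the paper's Java) shows a textbook binary digit-by-digit routine next to a textbook integer Newton iteration. It is not the paper's optimized algorithm: it has no floating-point guess for each digit, and it is not the Java standard library's code. The function names isqrt_digit_by_digit and isqrt_newton are hypothetical.

```python
def isqrt_digit_by_digit(n: int) -> int:
    """Floor square root of a non-negative integer, one binary digit at a time.

    Textbook scheme only; the paper's floating-point digit guess is NOT implemented.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # Largest power of four not exceeding n (stays at 1 when n < 4).
    bit = 1
    while bit * 4 <= n:
        bit *= 4
    root = 0
    remainder = n
    while bit:
        trial = root + bit
        if remainder >= trial:
            # The next root digit is 1: subtract the trial value and record the digit.
            remainder -= trial
            root = (root >> 1) + bit
        else:
            # The next root digit is 0.
            root >>= 1
        bit >>= 2
    return root


def isqrt_newton(n: int) -> int:
    """Floor square root via integer Newton iteration (textbook form)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 0
    # Start from a power of two that is guaranteed to be >= sqrt(n).
    x = 1 << ((n.bit_length() + 1) // 2)
    while True:
        y = (x + n // x) // 2
        if y >= x:
            # The iterates stopped decreasing, so x is the floor of the root.
            return x
        x = y
```

Both routines use only integer arithmetic, so they stay exact for arbitrarily large inputs. The digit-by-digit version does one comparison and at most one subtraction per root bit, while the Newton version performs a full big-integer division in every iteration.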