https://arxiv.org/api/4DF8NG+qlkkA3TLFrmWTE5tcD+A
Updated: 2026-06-22T09:10:24Z

http://arxiv.org/abs/2412.12361v2
Title: The Ramanujan Library -- Automated Discovery on the Hypergraph of Integer Relations
Updated: 2025-01-19T10:51:22Z
Abstract: Fundamental mathematical constants appear in nearly every field of science, from physics to biology. Formulas that connect different constants often bring great insight by hinting at connections between previously disparate fields. Discoveries of such relations, however, have remained scarce events, relying on sporadic strokes of creativity by human mathematicians. Recent developments of algorithms for automated conjecture generation have accelerated the discovery of formulas for specific constants. Yet, the discovery of connections between constants has not been addressed. In this paper, we present the first library dedicated to mathematical constants and their interrelations. This library can serve as a central repository of knowledge for scientists from different areas, and as a collaborative platform for development of new algorithms. The library is based on a new representation that we propose for organizing the formulas of mathematical constants: a hypergraph, with each node representing a constant and each edge representing a formula. Using this representation, we propose and demonstrate a systematic approach for automatically enriching this library using PSLQ, an integer relation algorithm based on QR decomposition and lattice construction. During its development and testing, our strategy led to the discovery of 75 previously unknown connections between constants, including a new formula for the `first continued fraction' constant $C_1$, novel formulas for natural logarithms, and new formulas connecting $π$ and $e$. The latter formulas generalize a century-old relation between $π$ and $e$ by Ramanujan, which until now was considered a singular formula and is now found to be part of a broader mathematical structure.
The code supporting this library is a public, open-source API that can serve researchers in experimental mathematics and other fields of science.
Published: 2024-12-16T21:18:44Z
Comments: 20 pages, 7 figures
Journal reference: ICLR 2025 Conference
Authors: Itay Beit-Halachmi, Ido Kaminer

http://arxiv.org/abs/2501.08395v1
Title: A comparison of two effective methods for reordering columns within supernodes
Updated: 2025-01-14T19:25:19Z
Abstract: In some recent papers, researchers have found two very good methods for reordering columns within supernodes in sparse Cholesky factors; these reorderings can be very useful for certain factorization methods. The first of these reordering methods is based on modeling the underlying problem as a traveling salesman problem (TSP), and the second of these methods is based on partition refinement (PR). In this paper, we devise a fair way to compare the two methods. While the two methods are virtually the same in the quality of the reorderings that they produce, PR should be the method of choice because PR reorderings can be computed using far less time and storage than TSP reorderings.
Published: 2025-01-14T19:25:19Z
Authors: M. Ozan Karsavuran, Esmond G. Ng, Barry W. Peyton

http://arxiv.org/abs/2304.06881v3
Title: Designing a Framework for Solving Multiobjective Simulation Optimization Problems
Updated: 2025-01-09T21:36:01Z
Abstract: Multiobjective simulation optimization (MOSO) problems are optimization problems with multiple conflicting objectives, where evaluation of at least one of the objectives depends on a black-box numerical code or real-world experiment, which we refer to as a simulation. While an extensive body of research is dedicated to developing new algorithms and methods for solving these and related problems, it is challenging and time-consuming to integrate these techniques into real-world production-ready solvers. This is partly due to the diversity and complexity of modern state-of-the-art MOSO algorithms and methods and partly due to the complexity and specificity of many real-world problems and their corresponding computing environments.
The complexity of this problem is only compounded when introducing potentially complex and/or domain-specific surrogate modeling techniques, problem formulations, design spaces, and data acquisition functions. This paper carefully surveys the current state of the art in MOSO algorithms, techniques, and solvers, as well as problem types and computational environments where MOSO is commonly applied. We then present several key challenges in the design of a Parallel Multiobjective Simulation Optimization framework (ParMOO) and how they have been addressed. Finally, we provide two case studies demonstrating how customized ParMOO solvers can be quickly built and deployed to solve real-world MOSO problems.
Published: 2023-04-14T01:16:53Z
Authors: Tyler H. Chang, Stefan M. Wild
DOI: 10.1287/ijoc.2023.0250

http://arxiv.org/abs/2501.03398v1
Title: Rapid Experimentation with Python Considering Optional and Hierarchical Inputs
Updated: 2025-01-06T21:36:43Z
Abstract: Space-filling experimental design techniques are commonly used in many computer modeling and simulation studies to explore the effects of inputs on outputs. This research presents raxpy, a Python package that leverages expressive annotation of Python functions and classes to simplify space-filling experimentation. It incorporates code introspection to derive a Python function's input space and novel algorithms to automate the design of space-filling experiments for spaces with optional and hierarchical input dimensions. In this paper, we review the criteria for design evaluation given these types of dimensions and compare the proposed algorithms with numerical experiments. The results demonstrate the ability of the proposed algorithms to create improved space-filling experiment designs. The package includes support for parallelism and distributed execution.
raxpy is available as free and open-source software under an MIT license.
Published: 2025-01-06T21:36:43Z
Authors: Neil Ranly, Torrey Wagner

http://arxiv.org/abs/2312.13094v5
Title: Automated MPI-X code generation for scalable finite-difference solvers
Updated: 2025-01-06T17:33:30Z
Abstract: Partial differential equations (PDEs) are crucial in modeling diverse phenomena across scientific disciplines, including seismic and medical imaging, computational fluid dynamics, image processing, and neural networks. Solving these PDEs at scale is an intricate and time-intensive process that demands careful tuning. This paper introduces automated code-generation techniques specifically tailored for distributed memory parallelism (DMP) to execute explicit finite-difference (FD) stencils at scale, a fundamental challenge in numerous scientific applications. These techniques are implemented and integrated into the Devito DSL and compiler framework, a well-established solution for automating the generation of FD solvers based on a high-level symbolic math input. Users benefit from modeling simulations for real-world applications at a high-level symbolic abstraction and effortlessly harnessing HPC-ready distributed-memory parallelism without altering their source code. This results in drastic reductions both in execution time and developer effort. A comprehensive performance evaluation of Devito's DMP via MPI demonstrates highly competitive strong and weak scaling on CPU and GPU clusters, proving its effectiveness and capability to meet the demands of large-scale scientific simulations.
Published: 2023-12-20T15:15:56Z
Comments: 11 pages, 12 figures (23 pages with References and Appendix)
Authors: George Bisbas, Rhodri Nelson, Mathias Louboutin, Fabio Luporini, Paul H. J. Kelly, Gerard Gorman

http://arxiv.org/abs/2501.02573v1
Title: LeetDecoding: A PyTorch Library for Exponentially Decaying Causal Linear Attention with CUDA Implementations
Updated: 2025-01-05T15:11:26Z
Abstract: The machine learning and data science community has made significant, albeit dispersed, progress in accelerating transformer-based large language models (LLMs), and one promising approach is to replace the original causal attention in a generative pre-trained transformer (GPT) with \emph{exponentially decaying causal linear attention}. In this paper, we present LeetDecoding, which is the first Python package that provides a large set of computation routines for this fundamental operator. The launch of LeetDecoding was motivated by the current lack of (1) clear understanding of the complexity regarding this operator, (2) a comprehensive collection of existing computation methods (usually spread in seemingly unrelated fields), and (3) CUDA implementations for fast inference on GPU. LeetDecoding's design is easy to integrate with existing linear-attention LLMs, and allows researchers to benchmark and evaluate new computation methods for exponentially decaying causal linear attention. The usage of LeetDecoding does not require any knowledge of GPU programming or the underlying complexity analysis, intentionally making LeetDecoding accessible to LLM practitioners.
The source code of LeetDecoding is provided at \href{https://github.com/Computational-Machine-Intelligence/LeetDecoding}{this GitHub repository}, and users can simply install LeetDecoding by the command \texttt{pip install leet-decoding}.
Published: 2025-01-05T15:11:26Z
Comments: The source code of LeetDecoding is hosted at https://github.com/Computational-Machine-Intelligence/LeetDecoding
Authors: Jiaping Wang, Simiao Zhang, Qiao-Chu He, Yifan Chen

http://arxiv.org/abs/2403.02237v2
Title: Analytic continuations and numerical evaluation of the Appell $F_1$, $F_3$, Lauricella $F_D^{(3)}$ and Lauricella-Saran $F_S^{(3)}$ and their Application to Feynman Integrals
Updated: 2025-01-01T10:03:37Z
Abstract: We present our investigation of two-variable hypergeometric series, namely the Appell $F_{1}$ and $F_{3}$ series, and obtain a comprehensive list of their analytic continuations, sufficient to cover the whole real $(x,y)$ plane except on their singular loci. We also derive analytic continuations of their three-variable generalizations, the Lauricella $F_{D}^{(3)}$ series and the Lauricella-Saran $F_{S}^{(3)}$ series, leveraging the analytic continuations of $F_{1}$ and $F_{3}$, which ensures that the whole real $(x,y,z)$ space is covered, except on the singular loci of these functions. While these studies are motivated by the frequent occurrence of these multivariable hypergeometric functions in Feynman integral evaluation, they can also be used whenever they appear in other branches of mathematical physics. To facilitate their practical use, we provide four packages: AppellF1$.$wl, AppellF3$.$wl, LauricellaFD$.$wl, and LauricellaSaranFS$.$wl in MATHEMATICA. These packages are applicable for generic as well as non-generic values of parameters, keeping in mind their utility in the evaluation of the Feynman integrals. We explicitly present various physical applications of these packages in the context of Feynman integral evaluation and compare the results using other packages such as FIESTA.
Upon applying the appropriate conventions for numerical evaluation, we find that the results obtained from our packages are consistent. Various Mathematica notebooks demonstrating different numerical results are also provided along with this paper.
Published: 2024-03-04T17:29:25Z
Comments: Journal Version, Repository see https://github.com/souvik5151/Appell_Lauricella_Saran_functions
Journal reference: Comput.Phys.Commun. 306 (2025) 109386
Authors: Souvik Bera, Tanay Pathak
DOI: 10.1016/j.cpc.2024.109386

http://arxiv.org/abs/2303.04353v2
Title: Cascading GEMM: High Precision from Low Precision
Updated: 2024-12-30T16:01:27Z
Abstract: This paper lays out insights and opportunities for implementing higher-precision matrix-matrix multiplication (GEMM) from (in terms of) lower-precision high-performance GEMM. The driving case study approximates double-double precision (FP64x2) GEMM in terms of double precision (FP64) GEMM, leveraging how the BLAS-like Library Instantiation Software (BLIS) framework refactors the Goto Algorithm. With this, it is shown how approximate FP64x2 GEMM accuracy can be cast in terms of ten ``cascading'' FP64 GEMMs. Promising results from preliminary performance and accuracy experiments are reported. The demonstrated techniques open up new research directions for more general cascading of higher-precision computation in terms of lower-precision computation for GEMM-like functionality.
Published: 2023-03-08T03:26:12Z
Comments: 26 pages, 9 figures
Authors: Devangi N. Parikh, Robert A. van de Geijn, Greg M. Henry

http://arxiv.org/abs/2408.07843v2
Title: Portability of Fortran's `do concurrent' on GPUs
Updated: 2024-12-23T22:03:41Z
Abstract: There is a continuing interest in using standard language constructs for accelerated computing in order to avoid (sometimes vendor-specific) external APIs. For Fortran codes, the {\tt do concurrent} (DC) loop has been successfully demonstrated on the NVIDIA platform. However, support for DC on other platforms has taken longer to implement. Recently, Intel has added DC GPU offload support to its compiler, as has HPE for AMD GPUs.
In this paper, we explore the current portability of using DC across GPU vendors using the in-production solar surface flux evolution code, HipFT. We discuss implementation and compilation details, including when/where using directive APIs for data movement is needed/desired compared to using a unified memory system. The performance achieved on both data center and consumer platforms is shown.
Published: 2024-08-14T22:45:46Z
Comments: 10 pages, 7 figures, To appear in the workshop proceedings of WACCPD24 at SC24
Authors: Ronald M. Caplan, Miko M. Stulajter, Jon A. Linker, Jeff Larkin, Henry A. Gabb, Shiquan Su, Ivan Rodriguez, Zachary Tschirhart, Nicholas Malaya
DOI: 10.1109/SCW63240.2024.00240

http://arxiv.org/abs/2412.17265v1
Title: Evaluating the Design Features of an Intelligent Tutoring System for Advanced Mathematics Learning
Updated: 2024-12-23T04:22:05Z
Abstract: Xiaomai is an intelligent tutoring system (ITS) designed to help Chinese college students in learning advanced mathematics and preparing for the graduate school math entrance exam. This study investigates two distinctive features within Xiaomai: the incorporation of free-response questions with automatic feedback and the metacognitive element of reflecting on self-made errors.
Published: 2024-12-23T04:22:05Z
Authors: Ying Fang, Bo He, Zhi Liu, Sannyuya Liu, Zhonghua Yan, Jianwen Sun
DOI: 10.1007/978-3-031-64302-6_24

http://arxiv.org/abs/2410.10908v2
Title: The State of Julia for Scientific Machine Learning
Updated: 2024-12-20T08:17:23Z
Abstract: Julia has been heralded as a potential successor to Python for scientific machine learning and numerical computing, boasting ergonomic and performance improvements. Since Julia's inception in 2012 and declaration of language goals in 2017, its ecosystem and language-level features have grown tremendously. In this paper, we take a modern look at Julia's features and ecosystem, assess the current state of the language, and discuss its viability and pitfalls as a replacement for Python as the de facto scientific machine learning language.
We call for the community to address Julia's language-level issues that are preventing further adoption.
Published: 2024-10-14T01:43:23Z
Comments: Presented at the 2024 NeurIPS Machine Learning and the Physical Sciences Workshop
Authors: Edward Berman, Jacob Ginesin

http://arxiv.org/abs/2310.19051v4
Title: Typical Algorithms for Estimating Hurst Exponent of Time Sequence: A Data Analyst's Perspective
Updated: 2024-12-20T01:30:02Z
Abstract: The Hurst exponent is a significant metric for characterizing time sequences with the long-term memory property, and it arises in many fields. The available methods for estimating the Hurst exponent can be categorized into time-domain and spectrum-domain methods. Although there are various estimation methods for the Hurst exponent, there are still some disadvantages that should be overcome: firstly, the estimation methods are mathematics-oriented instead of engineering-oriented; secondly, the accuracy and effectiveness of the estimation algorithms are inadequately assessed; thirdly, the framework of classification for the estimation methods is insufficient; and lastly, there is a lack of clear guidance for selecting a proper estimation method in practical problems involved in data analysis. The contributions of this paper lie in four aspects: 1) the optimal sequence partition method is proposed for designing the estimation algorithms for the Hurst exponent; 2) the algorithmic pseudo-codes are adopted to describe the estimation algorithms, which improves the understandability and usability of the estimation methods and also reduces the difficulty of implementation with computer programming languages; 3) the performance assessment is carried out for the typical estimation algorithms via the ideal time sequence with given Hurst exponent and the practical time sequence captured in applications; 4) the guidance for selecting proper algorithms for estimating the Hurst exponent is presented and discussed.
It is expected that the systematic survey of available estimation algorithms could help users understand the principles, and that the assessment of the various estimation methods could help users select, implement and apply the estimation algorithms of interest in practical situations in an easy way.
Published: 2023-10-29T15:56:53Z
Comments: 46 pages, 8 figures, 4 tables, 24 algorithms with pseudo-codes
Journal reference: IEEE Access, 12(12): 185528--185556, 2024
Authors: Hong-Yan Zhang, Zhi-Qiang Feng, Si-Yu Feng, Yu Zhou
DOI: 10.1109/ACCESS.2024.3512542

http://arxiv.org/abs/2412.14572v1
Title: Accelerated Patient-Specific Calibration via Differentiable Hemodynamics Simulations
Updated: 2024-12-19T06:42:57Z
Abstract: One of the goals of personalized medicine is to tailor diagnostics to individual patients. Diagnostics are performed in practice by measuring quantities, called biomarkers, that indicate the existence and progress of a disease. In common cardiovascular diseases, such as hypertension, biomarkers that are closely related to the clinical representation of a patient can be predicted using computational models. Personalizing computational models translates to considering patient-specific flow conditions, for example, the compliance of blood vessels that cannot be a priori known and quantities such as the patient geometry that can be measured using imaging. Therefore, a patient is identified by a set of measurable and nonmeasurable parameters needed to well-define a computational model; otherwise, the computational model is not personalized, meaning it is prone to large prediction errors. Therefore, to personalize a computational model, sufficient information needs to be extracted from the data. The current methods by which this is done are either inefficient, due to relying on slow-converging optimization methods, or hard to interpret, due to using `black box` deep-learning algorithms.
We propose a personalized diagnostic procedure based on a differentiable 0D-1D Navier-Stokes reduced order model solver and fast parameter inference methods that take advantage of gradients through the solver. By providing a faster method for performing parameter inference and sensitivity analysis through differentiability while maintaining the interpretability of well-understood mathematical models and numerical methods, the best of both worlds is combined. The performance of the proposed solver is validated against a well-established process on different geometries, and different parameter inference processes are successfully performed.
Published: 2024-12-19T06:42:57Z
Authors: Diego Renner, Georgios Kissas

http://arxiv.org/abs/2308.12644v2
Title: EDOLAB: An Open-Source Platform for Education and Experimentation with Evolutionary Dynamic Optimization Algorithms
Updated: 2024-12-15T02:19:17Z
Abstract: Many real-world optimization problems exhibit dynamic characteristics, posing significant challenges for traditional optimization techniques. Evolutionary Dynamic Optimization Algorithms (EDOAs) are designed to address these challenges effectively. However, in existing literature, the reported results for a given EDOA can vary significantly. This inconsistency often arises because the source codes for many EDOAs, which are typically complex, have not been made publicly available, leading to error-prone re-implementations. To support researchers in conducting experiments and comparing their algorithms with various EDOAs, we have developed an open-source MATLAB platform called the Evolutionary Dynamic Optimization LABoratory (EDOLAB). This platform not only facilitates research but also includes an educational module designed for instructional purposes. The education module allows users to observe: a) a 2-dimensional problem space and its morphological changes following each environmental change, b) the behaviors of individuals over time, and c) how the EDOA responds to environmental changes and tracks the moving optimum.
The current version of EDOLAB features 25 EDOAs and four fully parametric benchmark generators. The MATLAB source code for EDOLAB is publicly available and can be accessed from [https://github.com/Danial-Yazdani/EDOLAB-MATLAB].
Published: 2023-08-24T08:37:32Z
Comments: This work was submitted to ACM Transactions on Mathematical Software on December 7, 2022
Authors: Mai Peng, Delaram Yazdani, Zeneng She, Danial Yazdani, Wenjian Luo, Changhe Li, Juergen Branke, Trung Thanh Nguyen, Amir H. Gandomi, Shengxiang Yang, Yaochu Jin, Xin Yao

http://arxiv.org/abs/2412.10129v1
Title: TIGRE v3: Efficient and easy to use iterative computed tomographic reconstruction toolbox for real datasets
Updated: 2024-12-13T13:21:47Z
Abstract: Computed Tomography (CT) has been widely adopted in medicine and it is increasingly being used in scientific and industrial applications. In parallel, research in different mathematical areas concerning discrete inverse problems has led to the development of new sophisticated numerical solvers that can be applied in the context of CT. The Tomographic Iterative GPU-based Reconstruction (TIGRE) toolbox was born almost a decade ago precisely in the gap between mathematics and high performance computing for real CT data, providing user-friendly open-source software tools for image reconstruction. However, since its inception, the tools' features and codebase have grown more than twenty-fold and now include greater geometric flexibility, a variety of modern algorithms for image reconstruction, high-performance computing features and support for other CT modalities, like proton CT.
The purpose of this work is two-fold: first, it provides a structured overview of the current version of the TIGRE toolbox, providing appropriate descriptions and references, and serving as a comprehensive and peer-reviewed guide for the user; second, it is an opportunity to illustrate the performance of several of the available solvers showcasing real CT acquisitions, which are typically not openly available to algorithm developers.
Published: 2024-12-13T13:21:47Z
Authors: Ander Biguri, Tomoyuki Sadakane, Reuben Lindroos, Yi Liu, Malena Sabaté Landman, Yi Du, Manasavee Lohvithee, Stefanie Kaser, Sepideh Hatamikia, Robert Bryll, Emilien Valat, Sarinrat Wonglee, Thomas Blumensath, Carola-Bibiane Schönlieb