http://arxiv.org/abs/2411.10904v2 Invariant Polydiagonal Subspaces of Matrices and Constraint Programming 2024-12-13T02:29:19Z In a polydiagonal subspace of the Euclidean space, certain components of the vectors are equal (synchrony) or opposite (anti-synchrony). Polydiagonal subspaces invariant under a matrix have many applications in graph theory and dynamical systems, especially coupled cell networks. We describe invariant polydiagonal subspaces in terms of coloring vectors. This approach gives an easy formulation of a constraint satisfaction problem for finding invariant polydiagonal subspaces. Solving the resulting problem with existing state-of-the-art constraint solvers greatly outperforms the currently known algorithms. 2024-11-16T22:38:40Z 16 pages John M. Neuberger Nándor Sieben James W. Swift http://arxiv.org/abs/2410.08760v2 Unlocking FedNL: Self-Contained Compute-Optimized Implementation 2024-12-12T14:43:48Z Federated Learning (FL) is an emerging paradigm that enables intelligent agents to collaboratively train Machine Learning (ML) models in a distributed manner, eliminating the need for sharing their local data. The recent work (arXiv:2106.02969) introduces a family of Federated Newton Learn (FedNL) algorithms, marking a significant step towards applying second-order methods to FL and large-scale optimization. However, the reference FedNL prototype exhibits three serious practical drawbacks: (i) It requires 4.8 hours to launch a single experiment on a server-grade workstation; (ii) The prototype only simulates a multi-node setting; (iii) Prototype integration into resource-constrained applications is challenging. To bridge the gap between theory and practice, we present a self-contained implementation of FedNL, FedNL-LS, FedNL-PP for single-node and multi-node settings. Our work resolves the aforementioned issues and reduces the wall clock time by a factor of 1000.
With this, FedNL outperforms alternatives for training logistic regression in a single-node setting -- CVXPY (arXiv:1603.00943), and in a multi-node setting -- Apache Spark (arXiv:1505.06807), Ray/Scikit-Learn (arXiv:1712.05889). Finally, we propose two practically oriented compressors for FedNL - adaptive TopLEK and cache-aware RandSeqK, which fulfill the theory of FedNL. 2024-10-11T12:19:18Z 55 pages, 12 figures, 12 tables Konstantin Burlachenko Peter Richtárik http://arxiv.org/abs/2412.05300v2 AD-HOC: A C++ Expression Template package for high-order derivatives backpropagation 2024-12-11T21:56:35Z This document presents a new C++ Automatic Differentiation (AD) tool, AD-HOC (Automatic Differentiation for High-Order Calculations). This tool aims to have the following features: -Calculation of user-specified derivatives of arbitrary order -To be able to run with similar speeds as handwritten code -All derivative calculations are computed in a single backpropagation tree pass -No source code generation is used, relying heavily on the C++ compiler to statically build the computation tree before runtime -A simple interface -The ability to be used in conjunction with other established, general-purpose dynamic AD tools -Header-only library, with no external dependencies -Open source, with a business-friendly license 2024-11-25T10:24:29Z Juan Lucas Rey http://arxiv.org/abs/2412.07631v2 Direct Low-Dose CT Image Reconstruction on GPU using Out-Of-Core: Precision and Quality Study 2024-12-11T07:18:12Z Algebraic methods applied to the reconstruction of Sparse-view Computed Tomography (CT) can provide both a high image quality and a decrease in the dose received by patients, although with an increased reconstruction time since their computational costs are higher. In our work, we present a new algebraic implementation that obtains an exact solution to the system of linear equations that models the problem and is based on single-precision floating-point arithmetic.
By applying Out-Of-Core (OOC) techniques, the dimensions of the system can be increased regardless of the main memory size and as long as there is enough secondary storage (disk). These techniques have made it possible to process images of 768 x 768 pixels. A comparative study of our method on a GPU using both single-precision and double-precision arithmetic has been carried out. The goal is to assess the single-precision arithmetic implementation both in terms of time improvement and quality of the reconstructed images to determine if it is sufficient to consider it a viable option. Using single-precision arithmetic approximately halves the reconstruction time of the double-precision implementation, whereas the obtained images retain all internal structures despite having higher noise levels. 2024-12-10T16:11:51Z 22 pages, 12 figures, 9 tables M. Chillarón G. Quintana-Ortí V. Vidal G. Verdú http://arxiv.org/abs/2409.06752v3 A tutorial on automatic differentiation with complex numbers 2024-12-10T18:34:46Z Automatic differentiation is everywhere, but there exists only minimal documentation of how it works in complex arithmetic beyond stating "derivatives in $\mathbb{C}^d$" $\cong$ "derivatives in $\mathbb{R}^{2d}$" and, at best, shallow references to Wirtinger calculus. Unfortunately, the equivalence $\mathbb{C}^d \cong \mathbb{R}^{2d}$ becomes insufficient as soon as we need to derive custom gradient rules, e.g., to avoid differentiating "through" expensive linear algebra functions or differential equation simulators. To combat such a lack of documentation, this article surveys forward- and reverse-mode automatic differentiation with complex numbers, covering topics such as Wirtinger derivatives, a modified chain rule, and different gradient conventions while explicitly avoiding holomorphicity and the Cauchy--Riemann equations (which would be far too restrictive).
To be precise, we will derive, explain, and implement a complex version of Jacobian-vector and vector-Jacobian products almost entirely with linear algebra without relying on complex analysis or differential geometry. This tutorial is a call to action, for users and developers alike, to take complex values seriously when implementing custom gradient propagation rules -- the manuscript explains how. 2024-09-10T14:04:58Z Nicholas Krämer http://arxiv.org/abs/2412.07556v1 Optimization-Driven Design of Monolithic Soft-Rigid Grippers 2024-12-10T14:47:09Z Sim-to-real transfer remains a significant challenge in soft robotics due to the unpredictability introduced by common manufacturing processes such as 3D printing and molding. These processes often result in deviations from simulated designs, requiring multiple prototypes before achieving a functional system. In this study, we propose a novel methodology to address these limitations by combining advanced rapid prototyping techniques and an efficient optimization strategy. Firstly, we employ rapid prototyping methods typically used for rigid structures, leveraging their precision to fabricate compliant components with reduced manufacturing errors. Secondly, our optimization framework minimizes the need for extensive prototyping, significantly reducing the iterative design process. The methodology enables the identification of stiffness parameters that are more practical and achievable within current manufacturing capabilities. The proposed approach demonstrates a substantial improvement in the efficiency of prototype development while maintaining the desired performance characteristics. This work represents a step forward in bridging the sim-to-real gap in soft robotics, paving the way towards a faster and more reliable deployment of soft robotic systems. 
2024-12-10T14:47:09Z Soft Robotics, 2025 Pierluigi Mansueto Mihai Dragusanu Anjum Saeed Monica Malvezzi Matteo Lapucci Gionata Salvietti http://arxiv.org/abs/2412.10420v1 Monte Carlo Analysis of Boid Simulations with Obstacles: A Physics-Based Perspective 2024-12-10T04:34:23Z Boids, developed by Craig W. Reynolds in 1986, is one of the earliest emergent models where the global pattern emerges from the interaction between many individuals within the local scale. In the original model, Boids follow three rules: separation, alignment, and cohesion; which allow them to move around and create a flock without intention in the empty environment. In the real world, however, the Boids' movement also faces obstacles preventing the flock's direction. In this project, I propose two new simple rules of the Boids model to represent the more realistic movement in nature and analyze the model from the physics perspective using the Monte Carlo method. From those results, the physics metrics related to the forming of the flocking phenomenon show that it is reasonable to explain why birds or fishes prefer to move in a flock, rather than sole movement. 2024-12-10T04:34:23Z 6 pages, 2 figures Quoc Chuong Nguyen http://arxiv.org/abs/2402.02441v5 TopoX: A Suite of Python Packages for Machine Learning on Topological Domains 2024-12-09T02:29:37Z We introduce TopoX, a Python software suite that provides reliable and user-friendly building blocks for computing and machine learning on topological domains that extend graphs: hypergraphs, simplicial, cellular, path and combinatorial complexes. 
TopoX consists of three packages: TopoNetX facilitates constructing and computing on these domains, including working with nodes, edges and higher-order cells; TopoEmbedX provides methods to embed topological domains into vector spaces, akin to popular graph-based embedding algorithms such as node2vec; TopoModelX is built on top of PyTorch and offers a comprehensive toolbox of higher-order message passing functions for neural networks on topological domains. The extensively documented and unit-tested source code of TopoX is available under MIT license at https://pyt-team.github.io/. 2024-02-04T10:41:40Z Mustafa Hajij Mathilde Papillon Florian Frantzen Jens Agerberg Ibrahem AlJabea Rubén Ballester Claudio Battiloro Guillermo Bernárdez Tolga Birdal Aiden Brent Peter Chin Sergio Escalera Simone Fiorellino Odin Hoff Gardaa Gurusankar Gopalakrishnan Devendra Govil Josef Hoppe Maneel Reddy Karri Jude Khouja Manuel Lecha Neal Livesay Jan Meißner Soham Mukherjee Alexander Nikitin Theodore Papamarkou Jaro Prílepok Karthikeyan Natesan Ramamurthy Paul Rosen Aldo Guzmán-Sáenz Alessandro Salatiello Shreyas N. Samaga Simone Scardapane Michael T. Schaub Luca Scofano Indro Spinelli Lev Telyatnikov Quang Truong Robin Walters Maosheng Yang Olga Zaghen Ghada Zamzmi Ali Zia Nina Miolane http://arxiv.org/abs/2412.02882v2 iSEEtree: interactive explorer for hierarchical data 2024-12-05T07:11:03Z $\textbf{Motivation:}$ Hierarchical data structures are prevalent across several fields of research, as they represent an organised and efficient approach to study complex interconnected systems. Their significance is particularly evident in microbiome analysis, where microbial communities are classified at various taxonomic levels along the phylogenetic tree.
In light of this trend, the R/Bioconductor community has established a reproducible analytical framework for hierarchical data, which relies on the highly generic and optimised TreeSummarizedExperiment data container. However, using this framework requires basic proficiency in programming. $\textbf{Results:}$ To reduce the entry requirements, we developed iSEEtree, an R shiny app which provides a visual interface for the analysis and exploration of TreeSummarizedExperiment objects, thereby expanding the interactive graphics capabilities of related work to hierarchical structures. This way, users can interactively explore several aspects of their data without the need for extensive knowledge of R programming. We describe how iSEEtree enables the exploration of hierarchical multi-table data and demonstrate its functionality with applications to microbiome analysis. $\textbf{Availability and Implementation:}$ iSEEtree was implemented in the R programming language and is available on Bioconductor at https://bioconductor.org/packages/iSEEtree under an Artistic 2.0 license. $\textbf{Contact:}$ giulio.benedetti@utu.fi or leo.lahti@utu.fi. 2024-12-03T22:34:38Z 4 pages, 1 figure Giulio Benedetti Ely Seraidarian Theotime Pralas Akewak Jeba Tuomas Borman Leo Lahti http://arxiv.org/abs/2412.15221v1 GPS-2-GTFS: A Python package to process and transform raw GPS data of public transit to GTFS format 2024-12-03T23:30:17Z The gps2gtfs package addresses a critical need for converting raw Global Positioning System (GPS) trajectory data from public transit vehicles into the widely used GTFS (General Transit Feed Specification) format. This transformation enables various software applications to efficiently utilize real-time transit data for purposes such as tracking, scheduling, and arrival time prediction. 
Developed in Python, gps2gtfs employs techniques like geo-buffer mapping, parallel processing, and data filtering to manage challenges associated with raw GPS data, including high volume, discontinuities, and localization errors. This open-source package, available on GitHub and PyPI, enhances the development of intelligent transportation solutions and fosters improved public transit systems globally. 2024-12-03T23:30:17Z Shiveswarran Ratneswaran Uthayasanker Thayasivam Sivakumar Thillaiambalam http://arxiv.org/abs/2412.02004v1 Open Source Evolutionary Computation with Chips-n-Salsa 2024-12-02T22:18:31Z When it was first introduced, the Chips-n-Salsa Java library provided stochastic local search and related algorithms, with a focus on self-adaptation and parallel execution. For the past four years, we expanded its scope to include evolutionary computation. This paper concerns the evolutionary algorithms that Chips-n-Salsa now provides, which includes multiple evolutionary models, common problem representations, a wide range of mutation and crossover operators, and a variety of benchmark problems. Well-defined Java interfaces enable easily integrating custom representations and evolutionary operators, as well as defining optimization problems. Chips-n-Salsa's evolutionary algorithms include implementations with adaptive mutation and crossover rates, as well as both sequential and parallel execution. Source code is maintained on GitHub, and immutable artifacts are regularly published to the Maven Central Repository to enable easily importing into projects for reproducible builds. Effective development processes such as test-driven development, as well as a variety of static analysis tools help ensure code quality. 2024-12-02T22:18:31Z Proceedings of the 16th International Joint Conference on Computational Intelligence (IJCCI 2024), pages 330-337 Vincent A. 
Cicirello 10.5220/0013040600003837 http://arxiv.org/abs/2411.18321v1 Learning optimal objective values for MILP 2024-11-27T13:22:31Z Modern Mixed Integer Linear Programming (MILP) solvers use the Branch-and-Bound algorithm together with a plethora of auxiliary components that speed up the search. In recent years, there has been an explosive development in the use of machine learning for enhancing and supporting these algorithmic components. Within this line, we propose a methodology for predicting the optimal objective value, or, equivalently, predicting if the current incumbent is optimal. For this task, we introduce a predictor based on a graph neural network (GNN) architecture, together with a set of dynamic features. Experimental results on diverse benchmarks demonstrate the efficacy of our approach, achieving high accuracy in the prediction task and outperforming existing methods. These findings suggest new opportunities for integrating ML-driven predictions into MILP solvers, enabling smarter decision-making and improved performance. 2024-11-27T13:22:31Z Lara Scavuzzo Karen Aardal Neil Yorke-Smith http://arxiv.org/abs/2411.16509v1 Jaya R Package -- A Parameter-Free Solution for Advanced Single and Multi-Objective Optimization 2024-11-25T15:46:54Z The Jaya R package offers a robust and versatile implementation of the parameter-free Jaya optimization algorithm, suitable for solving both single-objective and multi-objective optimization problems. By integrating advanced features such as constraint handling, adaptive population management, Pareto front tracking for multi-objective trade-offs, and parallel processing for computational efficiency, the package caters to a wide range of optimization challenges. Its intuitive design and flexibility allow users to solve complex, real-world problems across various domains. 
To demonstrate its practical utility, a case study on energy modeling explores the optimization of renewable energy shares, showcasing the package's ability to minimize carbon emissions and costs while enhancing system reliability. The Jaya R package is an invaluable tool for researchers and practitioners seeking efficient and adaptive optimization solutions. 2024-11-25T15:46:54Z Neeraj Dhanraj Bokde http://arxiv.org/abs/2411.13259v1 Interface for Sparse Linear Algebra Operations 2024-11-20T12:20:45Z The standardization of an interface for dense linear algebra operations in the BLAS standard has enabled interoperability between different linear algebra libraries, thereby boosting the success of scientific computing, in particular in scientific HPC. Despite numerous efforts in the past, the community has not yet agreed on a standardization for sparse linear algebra operations due to numerous reasons. One is the fact that sparse linear algebra objects allow for many different storage formats, and different hardware may favor different storage formats. This makes the definition of a FORTRAN-style all-circumventing interface extremely challenging. Another reason is that opposed to dense linear algebra functionality, in sparse linear algebra, the size of the sparse data structure for the operation result is not always known prior to the information. Furthermore, as opposed to the standardization effort for dense linear algebra, we are late in the technology readiness cycle, and many production-ready software libraries using sparse linear algebra routines have implemented and committed to their own sparse BLAS interface. At the same time, there exists a demand for standardization that would improve interoperability, and sustainability, and allow for easier integration of building blocks. 
In an inclusive, cross-institutional effort involving numerous academic institutions, US National Labs, and industry, we spent two years designing a hardware-portable interface for basic sparse linear algebra functionality that serves the user needs and is compatible with the different interfaces currently used by different vendors. In this paper, we present a C++ API for sparse linear algebra functionality, discuss the design choices, and detail how software developers preserve a lot of freedom in terms of how to implement functionality behind this API. 2024-11-20T12:20:45Z 43 pages Ahmad Abdelfattah Willow Ahrens Hartwig Anzt Chris Armstrong Ben Brock Aydin Buluc Federico Busato Terry Cojean Tim Davis Jim Demmel Grace Dinh David Gardener Jan Fiala Mark Gates Azzam Haider Toshiyuki Imamura Pedro Valero Lara Jose Moreira Sherry Li Piotr Luszczek Max Melichenko Jose Moeira Yvan Mokwinski Riley Murray Spencer Patty Slaven Peles Tobias Ribizel Jason Riedy Siva Rajamanickam Piyush Sao Manu Shantharam Keita Teranishi Stan Tomov Yu-Hsiang Tsai Heiko Weichelt http://arxiv.org/abs/2411.10143v1 Cascaded Prediction and Asynchronous Execution of Iterative Algorithms on Heterogeneous Platforms 2024-11-15T12:33:58Z Owing to the diverse scales and varying distributions of sparse matrices arising from practical problems, a multitude of choices are present in the design and implementation of sparse matrix-vector multiplication (SpMV). Researchers have proposed many machine learning-based optimization methods for SpMV. However, these efforts only support one area of sparse matrix format selection, SpMV algorithm selection, or parameter configuration, and rarely consider a large amount of time overhead associated with feature extraction, model inference, and compression format conversion. This paper introduces a machine learning-based cascaded prediction method for SpMV computations that spans various computing stages and hierarchies. 
In addition, an asynchronous and concurrent computing model has been designed and implemented for runtime model prediction and iterative algorithm solving on heterogeneous computing platforms. It not only offers comprehensive support for the iterative algorithm-solving process leveraging machine learning technology, but also effectively mitigates the preprocessing overheads. Experimental results demonstrate that the cascaded prediction introduced in this paper accelerates SpMV by 1.33x on average, and the iterative algorithm, enhanced by cascaded prediction and asynchronous execution, is accelerated by 2.55x on average. 2024-11-15T12:33:58Z 12 pages, 9 figures, 7 tables Jianhua Gao Bingjie Liu Yizhuo Wang Weixing Ji Hua Huang