https://arxiv.org/api//fo7UChCY1VWcz/6wdE74upz6C4
2026-06-22T10:10:13Z

http://arxiv.org/abs/2411.10904v2
Invariant Polydiagonal Subspaces of Matrices and Constraint Programming
2024-12-13T02:29:19Z
In a polydiagonal subspace of the Euclidean space, certain components of the vectors are equal (synchrony) or opposite (anti-synchrony). Polydiagonal subspaces invariant under a matrix have many applications in graph theory and dynamical systems, especially coupled cell networks. We describe invariant polydiagonal subspaces in terms of coloring vectors. This approach gives an easy formulation of a constraint satisfaction problem for finding invariant polydiagonal subspaces. Solving the resulting problem with existing state-of-the-art constraint solvers greatly outperforms the currently known algorithms.
2024-11-16T22:38:40Z
16 pages
John M. Neuberger, Nándor Sieben, James W. Swift

http://arxiv.org/abs/2410.08760v2
Unlocking FedNL: Self-Contained Compute-Optimized Implementation
2024-12-12T14:43:48Z
Federated Learning (FL) is an emerging paradigm that enables intelligent agents to collaboratively train Machine Learning (ML) models in a distributed manner, eliminating the need for sharing their local data. The recent work (arXiv:2106.02969) introduces a family of Federated Newton Learn (FedNL) algorithms, marking a significant step towards applying second-order methods to FL and large-scale optimization. However, the reference FedNL prototype exhibits three serious practical drawbacks: (i) It requires 4.8 hours to launch a single experiment on a server-grade workstation; (ii) The prototype only simulates a multi-node setting; (iii) Prototype integration into resource-constrained applications is challenging. To bridge the gap between theory and practice, we present a self-contained implementation of FedNL, FedNL-LS, FedNL-PP for single-node and multi-node settings. Our work resolves the aforementioned issues and reduces the wall clock time by x1000. 
With this, FedNL outperforms alternatives for training logistic regression in a single-node setting (CVXPY, arXiv:1603.00943) and in a multi-node setting (Apache Spark, arXiv:1505.06807; Ray/Scikit-Learn, arXiv:1712.05889). Finally, we propose two practically oriented compressors for FedNL, adaptive TopLEK and cache-aware RandSeqK, which fulfill the theory of FedNL.
2024-10-11T12:19:18Z
55 pages, 12 figures, 12 tables
Konstantin Burlachenko, Peter Richtárik

http://arxiv.org/abs/2412.05300v2
AD-HOC: A C++ Expression Template package for high-order derivatives backpropagation
2024-12-11T21:56:35Z
This document presents a new C++ Automatic Differentiation (AD) tool, AD-HOC (Automatic Differentiation for High-Order Calculations). This tool aims to have the following features:
- Calculation of user-specified derivatives of arbitrary order
- To be able to run with similar speeds as handwritten code
- All derivative calculations are computed in a single backpropagation tree pass
- No source code generation is used, relying heavily on the C++ compiler to statically build the computation tree before runtime
- A simple interface
- The ability to be used in conjunction with other established, general-purpose dynamic AD tools
- Header-only library, with no external dependencies
- Open source, with a business-friendly license
2024-11-25T10:24:29Z
Juan Lucas Rey

http://arxiv.org/abs/2412.07631v2
Direct Low-Dose CT Image Reconstruction on GPU using Out-Of-Core: Precision and Quality Study
2024-12-11T07:18:12Z
Algebraic methods applied to the reconstruction of Sparse-view Computed Tomography (CT) can provide both a high image quality and a decrease in the dose received by patients, although with an increased reconstruction time since their computational costs are higher. In our work, we present a new algebraic implementation that obtains an exact solution to the system of linear equations that models the problem and is based on single-precision floating-point arithmetic. 
By applying Out-Of-Core (OOC) techniques, the dimensions of the system can be increased regardless of the main memory size, as long as there is enough secondary storage (disk). These techniques have made it possible to process images of 768 x 768 pixels. A comparative study of our method on a GPU using both single-precision and double-precision arithmetic has been carried out. The goal is to assess the single-precision arithmetic implementation both in terms of time improvement and quality of the reconstructed images to determine if it is sufficient to consider it a viable option. Using single-precision arithmetic approximately halves the reconstruction time of the double-precision implementation, whereas the obtained images retain all internal structures despite having higher noise levels.
2024-12-10T16:11:51Z
22 pages, 12 figures, 9 tables
M. Chillarón, G. Quintana-Ortí, V. Vidal, G. Verdú

http://arxiv.org/abs/2409.06752v3
A tutorial on automatic differentiation with complex numbers
2024-12-10T18:34:46Z
Automatic differentiation is everywhere, but there exists only minimal documentation of how it works in complex arithmetic beyond stating "derivatives in $\mathbb{C}^d$" $\cong$ "derivatives in $\mathbb{R}^{2d}$" and, at best, shallow references to Wirtinger calculus. Unfortunately, the equivalence $\mathbb{C}^d \cong \mathbb{R}^{2d}$ becomes insufficient as soon as we need to derive custom gradient rules, e.g., to avoid differentiating "through" expensive linear algebra functions or differential equation simulators. To combat such a lack of documentation, this article surveys forward- and reverse-mode automatic differentiation with complex numbers, covering topics such as Wirtinger derivatives, a modified chain rule, and different gradient conventions while explicitly avoiding holomorphicity and the Cauchy--Riemann equations (which would be far too restrictive). 
To be precise, we will derive, explain, and implement a complex version of Jacobian-vector and vector-Jacobian products almost entirely with linear algebra without relying on complex analysis or differential geometry. This tutorial is a call to action, for users and developers alike, to take complex values seriously when implementing custom gradient propagation rules -- the manuscript explains how.
2024-09-10T14:04:58Z
Nicholas Krämer

http://arxiv.org/abs/2412.07556v1
Optimization-Driven Design of Monolithic Soft-Rigid Grippers
2024-12-10T14:47:09Z
Sim-to-real transfer remains a significant challenge in soft robotics due to the unpredictability introduced by common manufacturing processes such as 3D printing and molding. These processes often result in deviations from simulated designs, requiring multiple prototypes before achieving a functional system. In this study, we propose a novel methodology to address these limitations by combining advanced rapid prototyping techniques and an efficient optimization strategy. Firstly, we employ rapid prototyping methods typically used for rigid structures, leveraging their precision to fabricate compliant components with reduced manufacturing errors. Secondly, our optimization framework minimizes the need for extensive prototyping, significantly reducing the iterative design process. The methodology enables the identification of stiffness parameters that are more practical and achievable within current manufacturing capabilities. The proposed approach demonstrates a substantial improvement in the efficiency of prototype development while maintaining the desired performance characteristics. 
This work represents a step forward in bridging the sim-to-real gap in soft robotics, paving the way towards a faster and more reliable deployment of soft robotic systems.
2024-12-10T14:47:09Z
Soft Robotics, 2025
Pierluigi Mansueto, Mihai Dragusanu, Anjum Saeed, Monica Malvezzi, Matteo Lapucci, Gionata Salvietti

http://arxiv.org/abs/2412.10420v1
Monte Carlo Analysis of Boid Simulations with Obstacles: A Physics-Based Perspective
2024-12-10T04:34:23Z
Boids, developed by Craig W. Reynolds in 1986, is one of the earliest emergent models where the global pattern emerges from the interaction between many individuals within the local scale. In the original model, Boids follow three rules: separation, alignment, and cohesion, which allow them to move around and create a flock without intention in the empty environment. In the real world, however, the Boids' movement also faces obstacles preventing the flock's direction. In this project, I propose two new simple rules for the Boids model to represent the more realistic movement in nature and analyze the model from the physics perspective using the Monte Carlo method. From those results, the physics metrics related to the forming of the flocking phenomenon show that it is reasonable to explain why birds or fishes prefer to move in a flock, rather than alone.
2024-12-10T04:34:23Z
6 pages, 2 figures
Quoc Chuong Nguyen

http://arxiv.org/abs/2402.02441v5
TopoX: A Suite of Python Packages for Machine Learning on Topological Domains
2024-12-09T02:29:37Z
We introduce TopoX, a Python software suite that provides reliable and user-friendly building blocks for computing and machine learning on topological domains that extend graphs: hypergraphs, simplicial, cellular, path and combinatorial complexes. 
TopoX consists of three packages: TopoNetX facilitates constructing and computing on these domains, including working with nodes, edges and higher-order cells; TopoEmbedX provides methods to embed topological domains into vector spaces, akin to popular graph-based embedding algorithms such as node2vec; TopoModelX is built on top of PyTorch and offers a comprehensive toolbox of higher-order message passing functions for neural networks on topological domains. The extensively documented and unit-tested source code of TopoX is available under the MIT license at https://pyt-team.github.io/.
2024-02-04T10:41:40Z
Mustafa Hajij, Mathilde Papillon, Florian Frantzen, Jens Agerberg, Ibrahem AlJabea, Rubén Ballester, Claudio Battiloro, Guillermo Bernárdez, Tolga Birdal, Aiden Brent, Peter Chin, Sergio Escalera, Simone Fiorellino, Odin Hoff Gardaa, Gurusankar Gopalakrishnan, Devendra Govil, Josef Hoppe, Maneel Reddy Karri, Jude Khouja, Manuel Lecha, Neal Livesay, Jan Meißner, Soham Mukherjee, Alexander Nikitin, Theodore Papamarkou, Jaro Prílepok, Karthikeyan Natesan Ramamurthy, Paul Rosen, Aldo Guzmán-Sáenz, Alessandro Salatiello, Shreyas N. Samaga, Simone Scardapane, Michael T. Schaub, Luca Scofano, Indro Spinelli, Lev Telyatnikov, Quang Truong, Robin Walters, Maosheng Yang, Olga Zaghen, Ghada Zamzmi, Ali Zia, Nina Miolane

http://arxiv.org/abs/2412.02882v2
iSEEtree: interactive explorer for hierarchical data
2024-12-05T07:11:03Z
$\textbf{Motivation:}$ Hierarchical data structures are prevalent across several fields of research, as they represent an organised and efficient approach to study complex interconnected systems. Their significance is particularly evident in microbiome analysis, where microbial communities are classified at various taxonomic levels along the phylogenetic tree. In light of this trend, the R/Bioconductor community has established a reproducible analytical framework for hierarchical data, which relies on the highly generic and optimised TreeSummarizedExperiment data container. 
However, using this framework requires basic proficiency in programming.
$\textbf{Results:}$ To reduce the entry requirements, we developed iSEEtree, an R shiny app which provides a visual interface for the analysis and exploration of TreeSummarizedExperiment objects, thereby expanding the interactive graphics capabilities of related work to hierarchical structures. This way, users can interactively explore several aspects of their data without the need for extensive knowledge of R programming. We describe how iSEEtree enables the exploration of hierarchical multi-table data and demonstrate its functionality with applications to microbiome analysis.
$\textbf{Availability and Implementation:}$ iSEEtree was implemented in the R programming language and is available on Bioconductor at https://bioconductor.org/packages/iSEEtree under an Artistic 2.0 license.
$\textbf{Contact:}$ giulio.benedetti@utu.fi or leo.lahti@utu.fi.
2024-12-03T22:34:38Z
4 pages, 1 figure
Giulio Benedetti, Ely Seraidarian, Theotime Pralas, Akewak Jeba, Tuomas Borman, Leo Lahti

http://arxiv.org/abs/2412.15221v1
GPS-2-GTFS: A Python package to process and transform raw GPS data of public transit to GTFS format
2024-12-03T23:30:17Z
The gps2gtfs package addresses a critical need for converting raw Global Positioning System (GPS) trajectory data from public transit vehicles into the widely used GTFS (General Transit Feed Specification) format. This transformation enables various software applications to efficiently utilize real-time transit data for purposes such as tracking, scheduling, and arrival time prediction. Developed in Python, gps2gtfs employs techniques like geo-buffer mapping, parallel processing, and data filtering to manage challenges associated with raw GPS data, including high volume, discontinuities, and localization errors. This open-source package, available on GitHub and PyPI, enhances the development of intelligent transportation solutions and fosters improved public transit systems globally.
2024-12-03T23:30:17Z
Shiveswarran Ratneswaran, Uthayasanker Thayasivam, Sivakumar Thillaiambalam

http://arxiv.org/abs/2412.02004v1
Open Source Evolutionary Computation with Chips-n-Salsa
2024-12-02T22:18:31Z
When it was first introduced, the Chips-n-Salsa Java library provided stochastic local search and related algorithms, with a focus on self-adaptation and parallel execution. For the past four years, we expanded its scope to include evolutionary computation. This paper concerns the evolutionary algorithms that Chips-n-Salsa now provides, which include multiple evolutionary models, common problem representations, a wide range of mutation and crossover operators, and a variety of benchmark problems. Well-defined Java interfaces enable easily integrating custom representations and evolutionary operators, as well as defining optimization problems. 
Chips-n-Salsa's evolutionary algorithms include implementations with adaptive mutation and crossover rates, as well as both sequential and parallel execution. Source code is maintained on GitHub, and immutable artifacts are regularly published to the Maven Central Repository to enable easily importing into projects for reproducible builds. Effective development processes such as test-driven development, as well as a variety of static analysis tools help ensure code quality.
2024-12-02T22:18:31Z
Proceedings of the 16th International Joint Conference on Computational Intelligence (IJCCI 2024), pages 330-337
Vincent A. Cicirello
10.5220/0013040600003837

http://arxiv.org/abs/2411.18321v1
Learning optimal objective values for MILP
2024-11-27T13:22:31Z
Modern Mixed Integer Linear Programming (MILP) solvers use the Branch-and-Bound algorithm together with a plethora of auxiliary components that speed up the search. In recent years, there has been an explosive development in the use of machine learning for enhancing and supporting these algorithmic components. Within this line, we propose a methodology for predicting the optimal objective value, or, equivalently, predicting if the current incumbent is optimal. For this task, we introduce a predictor based on a graph neural network (GNN) architecture, together with a set of dynamic features. Experimental results on diverse benchmarks demonstrate the efficacy of our approach, achieving high accuracy in the prediction task and outperforming existing methods. 
These findings suggest new opportunities for integrating ML-driven predictions into MILP solvers, enabling smarter decision-making and improved performance.
2024-11-27T13:22:31Z
Lara Scavuzzo, Karen Aardal, Neil Yorke-Smith

http://arxiv.org/abs/2411.16509v1
Jaya R Package -- A Parameter-Free Solution for Advanced Single and Multi-Objective Optimization
2024-11-25T15:46:54Z
The Jaya R package offers a robust and versatile implementation of the parameter-free Jaya optimization algorithm, suitable for solving both single-objective and multi-objective optimization problems. By integrating advanced features such as constraint handling, adaptive population management, Pareto front tracking for multi-objective trade-offs, and parallel processing for computational efficiency, the package caters to a wide range of optimization challenges. Its intuitive design and flexibility allow users to solve complex, real-world problems across various domains. To demonstrate its practical utility, a case study on energy modeling explores the optimization of renewable energy shares, showcasing the package's ability to minimize carbon emissions and costs while enhancing system reliability. The Jaya R package is an invaluable tool for researchers and practitioners seeking efficient and adaptive optimization solutions.
2024-11-25T15:46:54Z
Neeraj Dhanraj Bokde

http://arxiv.org/abs/2411.13259v1
Interface for Sparse Linear Algebra Operations
2024-11-20T12:20:45Z
The standardization of an interface for dense linear algebra operations in the BLAS standard has enabled interoperability between different linear algebra libraries, thereby boosting the success of scientific computing, in particular in scientific HPC. Despite numerous efforts in the past, the community has not yet agreed on a standardization for sparse linear algebra operations due to numerous reasons. 
One is the fact that sparse linear algebra objects allow for many different storage formats, and different hardware may favor different storage formats. This makes the definition of a FORTRAN-style all-circumventing interface extremely challenging. Another reason is that opposed to dense linear algebra functionality, in sparse linear algebra, the size of the sparse data structure for the operation result is not always known prior to the information. Furthermore, as opposed to the standardization effort for dense linear algebra, we are late in the technology readiness cycle, and many production-ready software libraries using sparse linear algebra routines have implemented and committed to their own sparse BLAS interface. At the same time, there exists a demand for standardization that would improve interoperability, and sustainability, and allow for easier integration of building blocks. In an inclusive, cross-institutional effort involving numerous academic institutions, US National Labs, and industry, we spent two years designing a hardware-portable interface for basic sparse linear algebra functionality that serves the user needs and is compatible with the different interfaces currently used by different vendors. 
In this paper, we present a C++ API for sparse linear algebra functionality, discuss the design choices, and detail how software developers preserve a lot of freedom in terms of how to implement functionality behind this API.
2024-11-20T12:20:45Z
43 pages
Ahmad Abdelfattah, Willow Ahrens, Hartwig Anzt, Chris Armstrong, Ben Brock, Aydin Buluc, Federico Busato, Terry Cojean, Tim Davis, Jim Demmel, Grace Dinh, David Gardener, Jan Fiala, Mark Gates, Azzam Haider, Toshiyuki Imamura, Pedro Valero Lara, Jose Moreira, Sherry Li, Piotr Luszczek, Max Melichenko, Jose Moeira, Yvan Mokwinski, Riley Murray, Spencer Patty, Slaven Peles, Tobias Ribizel, Jason Riedy, Siva Rajamanickam, Piyush Sao, Manu Shantharam, Keita Teranishi, Stan Tomov, Yu-Hsiang Tsai, Heiko Weichelt

http://arxiv.org/abs/2411.10143v1
Cascaded Prediction and Asynchronous Execution of Iterative Algorithms on Heterogeneous Platforms
2024-11-15T12:33:58Z
Owing to the diverse scales and varying distributions of sparse matrices arising from practical problems, a multitude of choices are present in the design and implementation of sparse matrix-vector multiplication (SpMV). Researchers have proposed many machine learning-based optimization methods for SpMV. However, these efforts only support one area of sparse matrix format selection, SpMV algorithm selection, or parameter configuration, and rarely consider a large amount of time overhead associated with feature extraction, model inference, and compression format conversion. This paper introduces a machine learning-based cascaded prediction method for SpMV computations that spans various computing stages and hierarchies. Besides, an asynchronous and concurrent computing model has been designed and implemented for runtime model prediction and iterative algorithm solving on heterogeneous computing platforms. It not only offers comprehensive support for the iterative algorithm-solving process leveraging machine learning technology, but also effectively mitigates the preprocessing overheads. 
Experimental results demonstrate that the cascaded prediction introduced in this paper accelerates SpMV by 1.33x on average, and the iterative algorithm, enhanced by cascaded prediction and asynchronous execution, achieves a 2.55x speedup on average.
2024-11-15T12:33:58Z
12 pages, 9 figures, 7 tables
Jianhua Gao, Bingjie Liu, Yizhuo Wang, Weixing Ji, Hua Huang
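
For readers unfamiliar with the SpMV kernel that the abstract of arXiv:2411.10143 refers to, the following is a minimal sketch of sparse matrix-vector multiplication in the compressed sparse row (CSR) format. It is not taken from the paper: the function name `csr_spmv`, the variable names, and the example matrix are illustrative assumptions, and CSR is only one of the storage formats among which a format-selection method of the kind described might choose.

```python
from typing import List, Sequence


def csr_spmv(
    values: Sequence[float],
    col_indices: Sequence[int],
    row_ptr: Sequence[int],
    x: Sequence[float],
) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR format.

    Hypothetical helper for illustration only (not from the paper).
    values      -- nonzero entries of A, stored row by row
    col_indices -- column index of each entry in `values`
    row_ptr     -- row i owns entries row_ptr[i] .. row_ptr[i + 1] - 1
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Visit only the stored nonzeros of this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_indices[k]]
        y[row] = acc
    return y


# Example matrix (an assumption chosen for illustration):
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_indices = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_spmv(values, col_indices, row_ptr, x)
print(y)  # [7.0, 4.0, 18.0]
```

The inner loop touches only stored nonzeros, so the cost scales with the number of nonzeros rather than with the full matrix dimensions. How those nonzeros are distributed across rows is the kind of property that can make one storage format or kernel variant preferable to another.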