http://arxiv.org/abs/2502.04514v1
Flip Graphs with Symmetry and New Matrix Multiplication Schemes
2025-02-06T21:33:00Z
The flip graph algorithm is a method for discovering new matrix multiplication schemes by following random walks on a graph. We introduce a version of the flip graph algorithm for matrix multiplication schemes that admit certain symmetries. This significantly reduces the size of the search space, allowing for more efficient exploration of the flip graph. The symmetry in the resulting schemes also facilitates the process of lifting solutions from $F_2$ to $\mathbb{Z}$. Our results are new schemes for multiplying $5\times 5$ matrices using $93$ multiplications and $6\times 6$ matrices using $153$ multiplications over arbitrary ground fields.
2025-02-06T21:33:00Z
Jakob Moosbauer, Michael Poole

http://arxiv.org/abs/2403.11545v3
First-order factors of linear Mahler operators
2025-02-03T17:26:03Z
We develop and compare two algorithms for computing first-order right-hand factors in the ring of linear Mahler operators $\ell_r M^r + \dots + \ell_1 M + \ell_0$ where $\ell_0, \dots, \ell_r$ are polynomials in~$x$ and $Mx = x^b M$ for some integer $b \geq 2$. In other words, we give algorithms for finding all formal infinite product solutions of linear functional equations $\ell_r(x) f(x^{b^r}) + \dots + \ell_1(x) f(x^b) + \ell_0(x) f(x) = 0$. The first of our algorithms is adapted from Petkovšek's classical algorithm for the analogous problem in the case of linear recurrences. The second one proceeds by computing a basis of generalized power series solutions of the functional equation and by using Hermite-Padé approximants to detect those linear combinations of the solutions that correspond to first-order factors.
We present implementations of both algorithms and discuss their use in combination with criteria from the literature to prove the differential transcendence of power series solutions of Mahler equations.
2024-03-18T07:56:49Z
Dedicated to the memory of Marko Petkovšek. Accepted for publication in the Journal of Symbolic Computation
Journal of Symbolic Computation. Vol. 130, (2025), paper 102424
Frédéric Chyzak (MATHEXP), Thomas Dreyfus (IMB), Philippe Dumas (MATHEXP), Marc Mezzarobba (LIX)
10.1016/j.jsc.2025.102424

http://arxiv.org/abs/2409.01416v2
Active Symbolic Discovery of Ordinary Differential Equations via Phase Portrait Sketching
2025-02-03T02:45:31Z
The symbolic discovery of Ordinary Differential Equations (ODEs) from trajectory data plays a pivotal role in AI-driven scientific discovery. Existing symbolic methods predominantly rely on fixed, pre-collected training datasets, which often result in suboptimal performance, as demonstrated in our case study in Figure 1. Drawing inspiration from active learning, we investigate strategies to query informative trajectory data that can enhance the evaluation of predicted ODEs. However, the butterfly effect in dynamical systems reveals that small variations in initial conditions can lead to drastically different trajectories, necessitating the storage of vast quantities of trajectory data using conventional active learning. To address this, we introduce Active Symbolic Discovery of Ordinary Differential Equations via Phase Portrait Sketching (APPS). Instead of directly selecting individual initial conditions, our APPS first identifies an informative region within the phase space and then samples a batch of initial conditions from this region. Compared to traditional active learning methods, APPS mitigates the burden of maintaining a large amount of data.
Extensive experiments demonstrate that APPS consistently discovers more accurate ODE expressions than baseline methods using passively collected datasets.
2024-09-02T18:24:39Z
Extended Version of the Paper Accepted at AAAI 2025
Nan Jiang, Md Nasim, Yexiang Xue

http://arxiv.org/abs/2501.17759v1
Yin-Yang: Developing Motifs With Long-Term Structure And Controllability
2025-01-29T16:50:09Z
Transformer models have made great strides in generating symbolically represented music with local coherence. However, controlling the development of motifs in a structured way with global form remains an open research area. One reason for this challenge is the note-by-note autoregressive generation of such models, which lack the ability to correct themselves after deviations from the motif. In addition, their structural performance on datasets with shorter durations has not been studied in the literature. In this study, we propose Yin-Yang, a framework consisting of a phrase generator, phrase refiner, and phrase selector models for the development of motifs into melodies with long-term structure and controllability. The phrase refiner is trained on a novel corruption-refinement strategy which allows it to produce melodic and rhythmic variations of an original motif at generation time, thereby rectifying deviations of the phrase generator. We also introduce a new objective evaluation metric for quantifying how smoothly the motif manifests itself within the piece. Evaluation results show that our model achieves better performance compared to state-of-the-art transformer models while having the advantage of being controllable and making the generated musical structure semi-interpretable, paving the way for musical analysis. Our code and demo page can be found at https://github.com/keshavbhandari/yinyang.
2025-01-29T16:50:09Z
16 Pages, 4 Figures, Accepted at Artificial Intelligence in Music, Sound, Art and Design: 14th International Conference, EvoMUSART 2025
Keshav Bhandari, Geraint A. Wiggins, Simon Colton

http://arxiv.org/abs/2402.07782v2
Solving parameter-dependent semi-algebraic systems
2025-01-24T10:34:42Z
We consider systems of polynomial equations and inequalities in $\mathbb{Q}[\boldsymbol{y}][\boldsymbol{x}]$ where $\boldsymbol{x} = (x_1, \ldots, x_n)$ and $\boldsymbol{y} = (y_1, \ldots, y_t)$. The $\boldsymbol{y}$ indeterminates are considered as parameters and we assume that when specialising them generically, the set of common complex solutions to the resulting equations is finite. We consider the problem of real root classification for such parameter-dependent problems, i.e. identifying the possible number of real solutions depending on the values of the parameters and computing a description of the regions of the space of parameters over which the number of real roots remains invariant.
We design an algorithm for solving this problem. The formulas it outputs enjoy a determinantal structure. Under genericity assumptions, we show that its arithmetic complexity is polynomial in both the maximum degree $d$ and the number $s$ of the input inequalities and exponential in $nt+t^2$. The output formulas consist of polynomials of degree bounded by $(2s+n)d^{n+1}$. This is the first algorithm with such a singly exponential complexity. We report on practical experiments showing that a first implementation of this algorithm can tackle examples which were previously out of reach.
2024-02-12T16:47:13Z
10 pages
Louis Gaillard, Mohab Safey El Din
10.1145/3666000.3669718

http://arxiv.org/abs/2404.08404v2
A Complexity Map of Probabilistic Reasoning for Neurosymbolic Classification Techniques
2025-01-23T09:52:48Z
Neurosymbolic artificial intelligence is a growing field of research aiming to combine neural network learning capabilities with the reasoning abilities of symbolic systems. Informed multi-label classification is a sub-field of neurosymbolic AI which studies how to leverage prior knowledge to improve neural classification systems. Recently, a family of neurosymbolic techniques for informed classification based on probabilistic reasoning has gained significant traction. Unfortunately, depending on the language used to represent prior knowledge, solving certain probabilistic reasoning problems can become prohibitively hard when the number of classes increases. Therefore, the asymptotic complexity of probabilistic reasoning is of cardinal importance to assess the scalability of such techniques. In this paper, we develop a unified formalism for four probabilistic reasoning problems. Then, we compile several known and new tractability results into a single complexity map of probabilistic reasoning. We build on top of this complexity map to characterize the domains of scalability of several techniques.
We hope this work will help neurosymbolic AI practitioners navigate the scalability landscape of probabilistic neurosymbolic techniques.
2024-04-12T11:31:37Z
36 pages, 10 figures
Arthur Ledaguenel, Céline Hudelot, Mostepha Khouadjia

http://arxiv.org/abs/2405.08300v3
Vector-Symbolic Architecture for Event-Based Optical Flow
2025-01-22T03:19:36Z
From the perspective of feature matching, optical flow estimation for event cameras involves identifying event correspondences by comparing feature similarity across accompanying event frames. In this work, we introduce an effective and robust high-dimensional (HD) feature descriptor for event frames, utilizing Vector Symbolic Architectures (VSA). The topological similarity among neighboring variables within VSA contributes to the enhanced representation similarity of feature descriptors for flow-matching points, while its structured symbolic representation capacity facilitates feature fusion from both event polarities and multiple spatial scales. Based on this HD feature descriptor, we propose a novel feature matching framework for event-based optical flow, encompassing both model-based (VSA-Flow) and self-supervised learning (VSA-SM) methods. In VSA-Flow, accurate optical flow estimation validates the effectiveness of HD feature descriptors. In VSA-SM, a novel similarity maximization method based on the HD feature descriptor is proposed to learn optical flow in a self-supervised way from events alone, eliminating the need for auxiliary grayscale images. Evaluation results demonstrate that our VSA-based method achieves superior accuracy in comparison to both model-based and self-supervised learning methods on the DSEC benchmark, while remaining competitive among both methods on the MVSEC benchmark.
This contribution marks a significant advancement in event-based optical flow within the feature matching methodology.
2024-05-14T03:50:07Z
Hongzhi You, Yijun Cao, Wei Yuan, Fanjun Wang, Ning Qiao, Yongjie Li

http://arxiv.org/abs/2412.07939v2
A Monadic Calculus with Episodic Flows
2025-01-21T23:20:19Z
We define computational atoms named "actions" equipped primarily with three operations: reduction, collection, and inspection. We show how actions can be used for decision-making algorithms from simple axioms. We describe the encodings of typical data structures as actions, and provide a method of analysis for algorithms on the basis of data mutation.
2024-12-10T21:51:52Z
18 pages, 2 figures
Sotirios Henning

http://arxiv.org/abs/2411.16348v2
Extracting Linear Relations from Gröbner Bases for Formal Verification of And-Inverter Graphs
2025-01-20T12:07:57Z
Formal verification techniques based on computer algebra have proven highly effective for circuit verification. The circuit, given as an and-inverter graph, is encoded as a set of polynomials that automatically generates a Gröbner basis with respect to a lexicographic term ordering. Correctness of the circuit can be derived by computing the polynomial remainder of the specification. However, the main obstacle is the monomial blow-up during the rewriting of the specification, which leads to the development of dedicated heuristics to overcome this issue. In this paper, we investigate an orthogonal approach and focus the computational effort on rewriting the Gröbner basis itself. Our goal is to ensure the basis contains linear polynomials that can be effectively used to rewrite the linearized specification. We first prove the soundness and completeness of this technique and then demonstrate its practical application.
Our implementation of this method shows promising results on benchmarks related to multiplier verification.
2024-11-25T12:55:49Z
Accepted at 31st International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS) 2025
Daniela Kaufmann, Jérémy Berthomieu

http://arxiv.org/abs/2501.09729v1
Generating particle physics Lagrangians with transformers
2025-01-16T18:25:50Z
In physics, Lagrangians provide a systematic way to describe laws governing physical systems. In the context of particle physics, they encode the interactions and behavior of the fundamental building blocks of our universe. By treating Lagrangians as complex, rule-based constructs similar to linguistic expressions, we trained a transformer model -- proven to be effective in natural language tasks -- to predict the Lagrangian corresponding to a given list of particles. We report on the transformer's performance in constructing Lagrangians respecting the Standard Model $\mathrm{SU}(3)\times \mathrm{SU}(2)\times \mathrm{U}(1)$ gauge symmetries. The resulting model is shown to achieve high accuracies (over 90\%) with Lagrangians up to six matter fields, with the capacity to generalize beyond the training distribution, albeit within architectural constraints. We show through an analysis of input embeddings that the model has internalized concepts such as group representations and conjugation operations as it learned to generate Lagrangians. We make the model and training datasets available to the community.
An interactive demonstration can be found at: \url{https://huggingface.co/spaces/JoseEliel/generate-lagrangians}.
2025-01-16T18:25:50Z
32 pages, 11 figures, 18 tables
Yong Sheng Koay, Rikard Enberg, Stefano Moretti, Eliel Camargo-Molina

http://arxiv.org/abs/2501.09201v1
Towards Semantics Lifting for Scientific Computing: A Case Study on FFT
2025-01-15T23:24:32Z
The rise of automated code generation tools, such as large language models (LLMs), has introduced new challenges in ensuring the correctness and efficiency of scientific software, particularly in complex kernels, where numerical stability, domain-specific optimizations, and precise floating-point arithmetic are critical. We propose a stepwise semantics lifting approach using an extended SPIRAL framework with symbolic execution and theorem proving to statically derive high-level code semantics from LLM-generated kernels. This method establishes a structured path for verifying the source code's correctness via a step-by-step lifting procedure to high-level specification. We conducted preliminary tests on the feasibility of this approach by successfully lifting GPT-generated fast Fourier transform code to high-level specifications.
2025-01-15T23:24:32Z
Accepted at the Theory and Practice of Static Analysis Workshop (TPSA), in conjunction with the ACM SIGPLAN Symposium on Principles of Programming Languages (POPL), 2025
Naifeng Zhang, Sanil Rao, Mike Franusich, Franz Franchetti

http://arxiv.org/abs/2501.08086v1
NOMTO: Neural Operator-based symbolic Model approximaTion and discOvery
2025-01-14T12:55:48Z
While many physical and engineering processes are most effectively described by non-linear symbolic models, existing non-linear symbolic regression (SR) methods are restricted to a limited set of continuous algebraic functions, thereby limiting their applicability to discover higher-order non-linear differential relations.
In this work, we introduce the Neural Operator-based symbolic Model approximaTion and discOvery (NOMTO) method, a novel approach to symbolic model discovery that leverages Neural Operators to encompass a broad range of symbolic operations. We demonstrate that NOMTO can successfully identify symbolic expressions containing elementary functions with singularities, special functions, and derivatives. Additionally, our experiments demonstrate that NOMTO can accurately rediscover second-order non-linear partial differential equations. By broadening the set of symbolic operations available for discovery, NOMTO significantly advances the capabilities of existing SR methods. It provides a powerful and flexible tool for model discovery, capable of capturing complex relations in a variety of physical systems.
2025-01-14T12:55:48Z
Sergei Garmaev, Siddhartha Mishra, Olga Fink

http://arxiv.org/abs/2501.07123v1
Inferring Interpretable Models of Fragmentation Functions using Symbolic Regression
2025-01-13T08:25:14Z
Machine learning is rapidly making its way into the natural sciences, including high-energy physics. We present the first study that infers, directly from experimental data, a functional form of fragmentation functions. The latter represent a key ingredient to describe physical observables measured in high-energy physics processes that involve hadron production, and predict their values at different energies. Fragmentation functions cannot be calculated in theory and have to be determined instead from data. Traditional approaches rely on global fits of experimental data using a pre-assumed functional form inspired by phenomenological models to learn its parameters. This novel approach uses an ML technique, namely symbolic regression, to learn an analytical model from measured charged hadron multiplicities. The function learned by symbolic regression resembles the Lund string function and describes the data well, thus representing a potential candidate for use in global FF fits.
This study represents an approach to follow in such QCD-related phenomenology studies and more generally in sciences.
2025-01-13T08:25:14Z
Nour Makke, Sanjay Chawla

http://arxiv.org/abs/2501.06707v1
ELIZA Reanimated: The world's first chatbot restored on the world's first time sharing system
2025-01-12T04:23:34Z
ELIZA, created by Joseph Weizenbaum at MIT in the early 1960s, is usually considered the world's first chatbot. It was developed in MAD-SLIP on MIT's CTSS, the world's first time-sharing system, on an IBM 7094. We discovered an original ELIZA printout in Prof. Weizenbaum's archives at MIT, including an early version of the famous DOCTOR script, a nearly complete version of the MAD-SLIP code, and various support functions in MAD and FAP. Here we describe the reanimation of this original ELIZA on a restored CTSS, itself running on an emulated IBM 7094. The entire stack is open source, so that any user of a unix-like OS can run the world's first chatbot on the world's first time-sharing system.
2025-01-12T04:23:34Z
In review
Rupert Lane, Anthony Hay, Arthur Schwarz, David M. Berry, Jeff Shrager

http://arxiv.org/abs/2501.06699v1
Large Language Models, Knowledge Graphs and Search Engines: A Crossroads for Answering Users' Questions
2025-01-12T03:32:12Z
Much has been discussed about how Large Language Models, Knowledge Graphs and Search Engines can be combined in a synergistic manner. A dimension largely absent from current academic discourse is the user perspective. In particular, there remain many open questions regarding how best to address the diverse information needs of users, incorporating varying facets and levels of difficulty. This paper introduces a taxonomy of user information needs, which guides us to study the pros, cons and possible synergies of Large Language Models, Knowledge Graphs and Search Engines. From this study, we derive a roadmap for future research.
2025-01-12T03:32:12Z
Aidan Hogan, Xin Luna Dong, Denny Vrandečić, Gerhard Weikum