https://arxiv.org/api/2pjONL+7rjzs9LKFXuV9nhKBF7A 2026-04-12T11:46:21Z 3074 330 15 http://arxiv.org/abs/2505.23311v1 Towards LLM-based Generation of Human-Readable Proofs in Polynomial Formal Verification 2025-05-29T10:11:42Z Verification is one of the central tasks in circuit and system design. While simulation and emulation are widely used, complete correctness can only be ensured based on formal proof techniques. But these approaches often have very high run time and memory requirements. Recently, Polynomial Formal Verification (PFV) has been introduced showing that for many instances of practical relevance upper bounds on needed resources can be given. But proofs have to be provided that are human-readable. Here, we study how modern approaches from Artificial Intelligence (AI) based on Large Language Models (LLMs) can be used to generate proofs that later on can be validated based on reasoning engines. Examples are given that show how LLMs can interact with proof engines, and directions for future work are outlined. 2025-05-29T10:11:42Z 4 pages; keynote given at 7th International Symposium on Devices, Circuits and Systems (ISDCS 2025), May 27-30, 2025, IIEST Shibpur, Kolkata, India Rolf Drechsler http://arxiv.org/abs/2502.04350v2 CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance 2025-05-29T00:38:10Z Existing methods fail to effectively steer Large Language Models (LLMs) between textual reasoning and code generation, leaving symbolic computing capabilities underutilized. We introduce CodeSteer, an effective method for guiding LLM code/text generation. We construct a comprehensive benchmark SymBench comprising 37 symbolic tasks with adjustable complexity and also synthesize datasets of 12k multi-turn guidance/generation trajectories and 5.5k guidance comparison pairs. We fine-tune the Llama-3-8B model with a newly designed multi-turn supervised fine-tuning (SFT) and direct preference optimization (DPO). The resulting model, CodeSteerLLM, augmented with the proposed symbolic and self-answer checkers, effectively guides the code/text generation of larger models. Augmenting GPT-4o with CodeSteer raises its average performance score from 53.3 to 86.4, even outperforming the existing best LLM OpenAI o1 (82.7), o1-preview (74.8), and DeepSeek R1 (76.8) across all 37 tasks (28 seen, 9 unseen). Trained for GPT-4o, CodeSteer demonstrates superior generalizability, providing an average 41.8 performance boost on Claude, Mistral, and GPT-3.5. CodeSteer-guided LLMs fully harness symbolic computing to maintain strong performance on highly complex tasks. Models, Datasets, and Codes are available at https://github.com/yongchao98/CodeSteer-v1.0 and https://huggingface.co/yongchao98. 2025-02-04T15:53:59Z 28 pages, 12 figures International Conference on Machine Learning (ICML'2025) Yongchao Chen Yilun Hao Yueying Liu Yang Zhang Chuchu Fan http://arxiv.org/abs/2505.23851v1 ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark 2025-05-28T23:11:14Z Large language models (LLMs) are rapidly approaching the level of proficiency in university-level symbolic mathematics required for applications in advanced science and technology. However, existing benchmarks fall short in assessing the core skills of LLMs in symbolic mathematics-such as integration, differential equations, and algebraic simplification. To address this gap, we introduce ASyMOB, a novel assessment framework focused exclusively on symbolic manipulation, featuring 17,092 unique math challenges, organized by similarity and complexity. ASyMOB enables analysis of LLM generalization capabilities by comparing performance in problems that differ by simple numerical or symbolic `perturbations'. Evaluated LLMs exhibit substantial degradation in performance for all perturbation types (up to -70.3%), suggesting reliance on memorized patterns rather than deeper understanding of symbolic math, even among models achieving high baseline accuracy. Comparing LLM performance to computer algebra systems, we identify examples where they fail while LLMs succeed, as well as problems solved only by combining both approaches. Models capable of integrated code execution yielded higher accuracy compared to their performance without code, particularly stabilizing weaker models (up to +33.1% for certain perturbation types). Notably, the most advanced models (o4-mini, Gemini 2.5 Flash) demonstrate not only high symbolic math proficiency (scoring 96.8% and 97.6% on the unperturbed set), but also remarkable robustness against perturbations, (-21.7% and -21.2% vs. average -50.4% for the other models). This may indicate a recent "phase transition" in the generalization capabilities of frontier LLMs. It remains to be seen whether the path forward lies in deeper integration with sophisticated external tools, or in developing models so capable that symbolic math systems like CAS become unnecessary. 2025-05-28T23:11:14Z Code repository: https://github.com/RamanujanMachine/ASyMOB Complete benchmark dataset: https://huggingface.co/datasets/Shalyt/ASyMOB-Algebraic_Symbolic_Mathematical_Operations_Benchmark Michael Shalyt Rotem Elimelech Ido Kaminer http://arxiv.org/abs/2505.21879v1 Symbolic Foundation Regressor on Complex Networks 2025-05-28T01:53:29Z In science, we are interested not only in forecasting but also in understanding how predictions are made, specifically what the interpretable underlying model looks like. Data-driven machine learning technology can significantly streamline the complex and time-consuming traditional manual process of discovering scientific laws, helping us gain insights into fundamental issues in modern science. In this work, we introduce a pre-trained symbolic foundation regressor that can effectively compress complex data with numerous interacting variables while producing interpretable physical representations. Our model has been rigorously tested on non-network symbolic regression, symbolic regression on complex networks, and the inference of network dynamics across various domains, including physics, biochemistry, ecology, and epidemiology. The results indicate a remarkable improvement in equation inference efficiency, being three times more effective than baseline approaches while maintaining accurate predictions. Furthermore, we apply our model to uncover more intuitive laws of interaction transmission from global epidemic outbreak data, achieving optimal data fitting. This model extends the application boundary of pre-trained symbolic regression models to complex networks, and we believe it provides a foundational solution for revealing the hidden mechanisms behind changes in complex phenomena, enhancing interpretability, and inspiring further scientific discoveries. 2025-05-28T01:53:29Z 60 pages Weiting Liu Jiaxu Cui Jiao Hu En Wang Bo Yang http://arxiv.org/abs/2505.20497v1 Polynomial-Time Algorithms for Black-Box Distributive Expanded Groups 2025-05-26T19:59:22Z Let $Ω$ be a finite set of finitary operation symbols. An $Ω$-expanded group is a group (written additively and called the additive group of the $Ω$-expanded group) with an $Ω$-algebra structure. We use the black-box model of computation in $Ω$-expanded groups. In this model, elements of a finite $Ω$-expanded group $H$ are represented (not necessarily uniquely) by bit strings of the same length, say, $n$. Given representations of elements of $H$, equality testing and the fundamental operations of $H$ are performed by an oracle. Assume that $H$ is distributive, i.e., all its fundamental operations associated with nonnullary operation symbols in $Ω$ are distributive over addition. Suppose $s=(s_1,\dots,s_m)$ is a generating system of $H$. In this paper, we present probabilistic polynomial-time black-box $Ω$-expanded group algorithms for the following problems: (i) given $(1^n,s)$, construct a generating system of the additive group of $H$, (ii) given $(1^n,s,(t_1,\dots,t_k))$ with $t_1,\dots,t_k\in H$, find a generating system of the additive group of the ideal in $H$ generated by $\{t_1,\dots,t_k\}$, and (iii) given $(1^n,s)$, decide whether $H\in\mathfrak V$, where $\mathfrak V$ is an arbitrary finitely based variety of distributive $Ω$-expanded groups with nilpotent additive groups. The error probability of these algorithms is exponentially small in $n$. In particular, this can be applied to groups, rings, $R$-modules, and $R$-algebras, where $R$ is a fixed finitely generated commutative associative ring with $1$. Rings and $R$-algebras may be here with or without $1$, where $1$ is considered as a nullary fundamental operation. 2025-05-26T19:59:22Z 13 pages Mikhail Anokhin http://arxiv.org/abs/2505.20397v1 Computing transcendence and linear relations of 1-periods 2025-05-26T18:00:01Z A 1-period is a complex number given by the integral of a univariate algebraic function, where all data involved -- the integrand and the domain of integration -- are defined over algebraic numbers. We give an algorithm that, given a finite collection of 1-periods, computes the space of all linear relations among them with algebraic coefficients. In particular, the algorithm decides whether a given 1-period is transcendental, and whether two 1-periods are equal. This resolves, in the case of 1-periods, a problem posed by Kontsevich and Zagier, asking for an algorithm to decide equality of periods. The algorithm builds on the work of Huber and Wüstholz, who showed that all linear relations among 1-periods arise from 1-motives; we make this perspective effective by reducing the problem to divisor arithmetic on curves and providing the theoretical foundations for a practical and fully explicit algorithm. To illustrate the broader applicability of our methods, we also give an algorithmic classification of autonomous first-order (non-linear) differential equations. 2025-05-26T18:00:01Z 68 pages, comments welcome Emre Can Sertöz Joël Ouaknine James Worrell http://arxiv.org/abs/2505.18849v1 Fractal Attractors in Random Nonlinear Iterated Function Systems: Existence, Stability, and Dimensional Properties 2025-05-24T19:51:57Z This study develops a comprehensive theoretical and computational framework for Random Nonlinear Iterated Function Systems (RNIFS), a generalization of classical IFS models that incorporates both nonlinearity and stochasticity. We establish mathematical guarantees for the existence and stability of invariant fractal attractors by leveraging contractivity conditions, Lyapunov-type criteria, and measure-theoretic arguments. Empirically, we design a set of high-resolution simulations across diverse nonlinear functions and probabilistic schemes to analyze the emergent attractors geometry and dimensionality. A box-counting method is used to estimate the fractal dimension, revealing attractors with rich internal structure and dimensions ranging from 1.4 to 1.89. Additionally, we present a case study comparing RNIFS to the classical Sierpiński triangle, demonstrating the generalization's ability to preserve global shape while enhancing geometric complexity. These findings affirm the capacity of RNIFS to model intricate, self-similar structures beyond the reach of traditional deterministic systems, offering new directions for the study of random fractals in both theory and applications. 2025-05-24T19:51:57Z Mohamed Aly Bouke http://arxiv.org/abs/2402.07141v3 Reading Rational Univariate Representations on lexicographic Groebner bases 2025-05-23T16:42:05Z In this contribution, we consider a zero-dimensional polynomial system in $n$ variables defined over a field $\mathbb{K}$. In the context of computing a Rational Univariate Representation (RUR) of its solutions, we address the problem of certifying a separating linear form and, once certified, calculating the RUR that comes from it, without any condition on the ideal else than being zero-dimensional. Our key result is that the RUR can be read (closed formula) from lexicographic Groebner bases of bivariate elimination ideals, even in the case where the original ideal that is not in shape position, so that one can use the same core as the well known FGLM method to propose a simple algorithm. Our first experiments, either with a very short code (300 lines) written in Maple or with a Julia code using straightforward implementations performing only classical Gaussian reductions in addition to Groebner bases for the degree reverse lexicographic ordering, show that this new method is already competitive with sophisticated state of the art implementations which do not certify the parameterizations. 2024-02-11T09:50:49Z Alexander Demin Fabrice Rouillier Joao Ruiz http://arxiv.org/abs/2403.13457v2 OSVAuto: automatic proofs about functional specifications in OS verification 2025-05-22T02:06:35Z We present OSVAuto for automatic proofs about functional specifications that commonly arise when verifying operating system kernels. The algorithm behind OSVAuto is designed to support natively those data types that commonly occur in OS verification, including sequences, maps, structures and enumerations. Propositions about these data are encoded into a form that is suitable for SMT solving. For quantifier instantiation, we propose an extension of recent work for automatic proofs about sequences. We evaluate the algorithm on proof obligations adapted from existing verification of the uC-OS/II kernel in Coq, demonstrating that a large number of proof obligations can be solved automatically, significantly reducing the proof effort on the functional side. 2024-03-20T10:07:07Z Yulun Wu Bican Xia Jiale Xu Bohua Zhan Tianqi Zhao http://arxiv.org/abs/2505.15720v1 Linearized Polynomial Chinese remainder codes 2025-05-21T16:29:11Z In this paper, we introduce a new family of codes relevent for rank and sum-rank metrics. These codes are based on an effective Chinese remainders theorem for linearized polynomials over finite fields. We propose a decoding algorithm for some instances of these codes. 2025-05-21T16:29:11Z 26 pages, 7 figures Philippe Gaborit Camille Garnier Olivier Ruatta http://arxiv.org/abs/2308.03885v4 A New Bound on Cofactors of Sparse Polynomials 2025-05-21T06:03:07Z We prove that for polynomials $f, g, h \in \mathbb{Z}[x]$ satisfying $f = gh$ and $f(0) \neq 0$, the $\ell_2$-norm of the cofactor $h$ is bounded by $\|h\|_2 \leq \|f\|_1 \cdot\left( \widetilde{O}\left(\|g\|_0^3 \frac{\text{deg }{(f)}^2}{\sqrt{\text{deg }{(g)}}}\right)\right)^{\|g\|_0 - 1}$, where $\|g\|_0$ is the number of nonzero coefficients of $g$ (its sparsity). We also obtain similar results for polynomials over $\mathbb{C}$. This result significantly improves upon previously known exponential bounds (in $\text{deg }{(f)}$) for general polynomials. It further implies that, under exact division, the polynomial division algorithm runs in quasi-linear time with respect to the input size and the number of terms in the quotient $h$. This resolves a long-standing open problem concerning the exact divisibility of sparse polynomials. In particular, our result demonstrates a quadratic separation between the runtime (and representation size) of exact and non-exact divisibility by sparse polynomials. Notably, prior to our work, it was not even known whether the representation size of the quotient polynomial could be bounded by a sub-quadratic function of its number of terms, specifically of $\text{deg }{(f)}$. 2023-08-07T19:33:53Z Forum of Mathematics, Sigma 14 (2026) e52 Ido Nahshon Amir Shpilka 10.1017/fms.2026.10202 http://arxiv.org/abs/2505.14940v1 To Be or Not To Be: Vector ontologies as a truly formal ontological framework 2025-05-20T21:58:38Z Since Edmund Husserl coined the term "Formal Ontologies" in the early 20th century, a field that identifies itself with this particular branch of sciences has gained increasing attention. Many authors, and even Husserl himself have developed what they claim to be formal ontologies. I argue that under close inspection, none of these so claimed formal ontologies are truly formal in the Husserlian sense. More concretely, I demonstrate that they violate the two most important notions of formal ontology as developed in Husserl's Logical Investigations, namely a priori validity independent of perception and formalism as the total absence of content. I hence propose repositioning the work previously understood as formal ontology as the foundational ontology it really is. This is to recognize the potential of a truly formal ontology in the Husserlian sense. Specifically, I argue that formal ontology following his conditions, allows us to formulate ontological structures, which could capture what is more objectively without presupposing a particular framework arising from perception. I further argue that the ability to design the formal structure deliberately allows us to create highly scalable and interoperable information artifacts. As concrete evidence, I showcase that a class of formal ontology, which uses the axioms of vector spaces, is able to express most of the conceptualizations found in foundational ontologies. Most importantly, I argue that many information systems, specifically artificial intelligence, are likely already using some type of vector ontologies to represent reality in their internal worldviews and elaborate on the evidence that humans do as well. I hence propose a thorough investigation of the ability of vector ontologies to act as a human-machine interoperable ontological framework that allows us to understand highly sophisticated machines and machines to understand us. 2025-05-20T21:58:38Z Kaspar Rothenfusser http://arxiv.org/abs/2505.14790v1 Generalised Burnside and Dixon algorithms for irreducible projective representations 2025-05-20T18:00:39Z Based on the recently proposed character theory of projective representations of finite groups proposed, we generalise several algorithms for computing character tables and matrices of irreducible linear representations to projective representations. In particular, we present an algorithm based on that of Burnside to compute the characters of all irreducible projective representations of a finite group with a given Schur multiplier, and transpose it to exact integer arithmetic following Dixon's character table algorithm. We also describe an algorithm based on that of Dixon to split a projective representation into irreducible subspaces in floating-point arithmetic, and discuss how it can be used to compute matrices for all projective irreps with a given multiplier. Our algorithms bypass the construction of the representation group of the Schur multiplier, which makes them especially attractive for floating-point computations, where exact values of the multiplier are not necessarily available. 2025-05-20T18:00:39Z 11 pages Attila Szabó http://arxiv.org/abs/2505.14273v1 X-KAN: Optimizing Local Kolmogorov-Arnold Networks via Evolutionary Rule-Based Machine Learning 2025-05-20T12:26:03Z Function approximation is a critical task in various fields. However, existing neural network approaches struggle with locally complex or discontinuous functions due to their reliance on a single global model covering the entire problem space. We propose X-KAN, a novel method that optimizes multiple local Kolmogorov-Arnold Networks (KANs) through an evolutionary rule-based machine learning framework called XCSF. X-KAN combines KAN's high expressiveness with XCSF's adaptive partitioning capability by implementing local KAN models as rule consequents and defining local regions via rule antecedents. Our experimental results on artificial test functions and real-world datasets demonstrate that X-KAN significantly outperforms conventional methods, including XCSF, Multi-Layer Perceptron, and KAN, in terms of approximation accuracy. Notably, X-KAN effectively handles functions with locally complex or discontinuous structures that are challenging for conventional KAN, using a compact set of rules (average 7.2 $\pm$ 2.3 rules). These results validate the effectiveness of using KAN as a local model in XCSF, which evaluates the rule fitness based on both accuracy and generality. Our X-KAN implementation is available at https://github.com/YNU-NakataLab/X-KAN. 2025-05-20T12:26:03Z Accepted by the 34th International Joint Conference on Artificial Intelligence (IJCAI 2025) 34th International Joint Conference on Artificial Intelligence (IJCAI 2025) Hiroki Shiraishi Hisao Ishibuchi Masaya Nakata 10.24963/ijcai.2025/993 http://arxiv.org/abs/2505.13980v1 Symbolic and Numerical Tools for $L_{\infty}$-Norm Calculation 2025-05-20T06:26:45Z The computation of the $L_\infty $-norm is an important issue in $H_{\infty}$ control, particularly for analyzing system stability and robustness. This paper focuses on symbolic computation methods for determining the $L_{\infty} $-norm of finite-dimensional linear systems, highlighting their advantages in achieving exact solutions where numerical methods often encounter limitations. Key techniques such as Sturm-Habicht sequences, Rational Univariate Representations (RUR), and Cylindrical Algebraic Decomposition (CAD) are surveyed, with an emphasis on their theoretical foundations, practical implementations, and specific applicability to $ L_{\infty} $-norm computation. A comparative analysis is conducted between symbolic and conventional numerical approaches, underscoring scenarios in which symbolic computation provides superior accuracy, particularly in parametric cases. Benchmark evaluations reveal the strengths and limitations of both approaches, offering insights into the trade-offs involved. Finally, the discussion addresses the challenges of symbolic computation and explores future opportunities for its integration into control theory, particularly for robust and stable system analysis. 2025-05-20T06:26:45Z Grace Younes Alban Quadrat Fabrice Rouillier