https://arxiv.org/api/LYppmoN0TriHBHKcUfEKDwFr7v4 2026-06-13T13:53:57Z 3138 45 15 http://arxiv.org/abs/2605.19379v1 Graph-based automated discovery of concise soil hydraulic functions from data: beyond the Mualem - van Genuchten model 2026-05-19T05:25:53Z

Soil hydraulic functions are fundamental to modelling water flow and transport in vadose-zone hydrology and are central to a wide range of hydrological and geoscientific applications. Yet in practice, these functions are still predominantly specified through expert-designed empirical formulations, such as the Mualem-van Genuchten (MvG) model. Although such models have proved highly influential, their derivation relies on predefined functional assumptions that make it difficult to simultaneously achieve accuracy, compactness, and robustness across diverse soil textures. Here we present a graph-based automated model discovery framework for discovering explicit soil hydraulic functions directly from experimental data. Applied to the original datasets used in the development of the MvG model, the method identifies a concise soil water retention function and its associated unsaturated hydraulic conductivity function whose mathematical structure differs fundamentally from classical empirical forms. Across 249 real soil samples spanning diverse textural classes, the discovered functions achieve more accurate predictions of unsaturated hydraulic conductivity than the MvG model. The fitted parameters also exhibit correlations with soil physical properties. This work demonstrates that data-driven model discovery can move beyond traditional empirical derivation and provide a promising route for developing accurate and explicit constitutive models.
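
For reference, the classical Mualem-van Genuchten baseline that the abstract compares against can be sketched as follows. This is the standard textbook MvG form, not the function discovered in the paper. The function names, the pore-connectivity default `l = 0.5`, and all parameter values in the usage note are illustrative assumptions.

```python
def vg_effective_saturation(h, alpha, n):
    """van Genuchten effective saturation Se at pressure head h (h < 0 when unsaturated)."""
    if h >= 0:
        return 1.0  # saturated conditions
    m = 1.0 - 1.0 / n  # Mualem constraint m = 1 - 1/n, requires n > 1
    return (1.0 + (alpha * abs(h)) ** n) ** (-m)


def vg_water_content(h, theta_r, theta_s, alpha, n):
    """Water retention curve: theta(h) = theta_r + (theta_s - theta_r) * Se(h)."""
    return theta_r + (theta_s - theta_r) * vg_effective_saturation(h, alpha, n)


def mvg_conductivity(h, k_s, alpha, n, l=0.5):
    """Mualem-van Genuchten conductivity: K = K_s * Se^l * (1 - (1 - Se^(1/m))^m)^2."""
    m = 1.0 - 1.0 / n
    se = vg_effective_saturation(h, alpha, n)
    return k_s * se ** l * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2
```

With hypothetical values such as `alpha = 0.01`, `n = 2` and `k_s = 10`, calling `mvg_conductivity(0.0, 10.0, 0.01, 2.0)` returns the saturated conductivity `k_s`, and the conductivity decreases as `h` becomes more negative.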

2026-05-19T05:25:53Z Hao Xu Jinshen Sun Yuntian Chen Dongxiao Zhang http://arxiv.org/abs/2605.18980v1 Computing Certificates in Archimedean Univariate Saturated Quadratic Modules 2026-05-18T18:03:07Z

A new symbolic algorithm to compute sums of squares multipliers (certificates) to witness the membership of non-negative univariate polynomials in a saturated univariate quadratic module is presented. Certificates are first computed in terms of natural generators introduced by Kuhlmann and Marshall for an Archimedean saturated quadratic module; natural generators can be easily read off from a semialgebraic set. In the univariate case, an Archimedean quadratic module is also a preordering since it is closed under multiplication; certificates have different representations when a polynomial is viewed as a member in a quadratic module versus in a preordering. An algorithm is given to compute certificates of natural generators in terms of the original generators; it uses a construction introduced by Kuhlmann, Marshall, and Schwartz known as the ``Basic Lemma'', which splits the non-negative factors of generators. To compute a quadratic module certificate, certificates of products of natural generators are computed using a detailed case analysis based on the types of natural generators. A Maple implementation of the proposed algorithms is also discussed. The certificates obtained using this implementation are compared with those generated by RealCertify. We discuss examples where RealCertify is unable to find certificates while the proposed method is successful.

2026-05-18T18:03:07Z 51 pages, conference Jose Abel Castellanos-Joo Deepak Kapur http://arxiv.org/abs/2605.18454v1 Scheduling That Speaks: An Interpretable Programmatic Reinforcement Learning Framework 2026-05-18T14:19:59Z

Deep reinforcement learning (DRL) has recently emerged as a promising approach to solve combinatorial optimization problems such as job shop scheduling. However, the policies learned by DRL are typically represented by deep neural networks (DNNs), whose opaque neural architectures and non-interpretable policy decisions can lead to critical trust and usability concerns for human decision makers. In addition, the computational requirements of DNNs can further hinder practical deployment in resource constrained environments. In this work, we propose ProRL, a novel interpretable programmatic reinforcement learning framework that achieves high-performance scheduling with human-readable and editable programmatic policies (i.e., programs). We first introduce a domain-specific language for scheduling (DSL-S) to represent scheduling strategies as structured programs. ProRL then explores the program space defined by DSL-S using local search to identify incomplete programs, which are subsequently completed by learning their parameters via Bayesian optimization. ProRL learns which scheduling heuristic rules to select, and hence, it naturally incorporates existing heuristics already used in industrial scenarios. Experiments on widely used benchmark instances demonstrate the strong performance of ProRL against existing heuristics and DRL baselines. Furthermore, ProRL performs well under strongly constrained computational resources, such as training with only 100 episodes. Our code is available at https://github.com/HcPlu/ProRL.

2026-05-18T14:19:59Z Chengpeng Hu Yingqian Zhang Hendrik Baier http://arxiv.org/abs/2605.17505v1 Explicit cost analysis of Toom-4 multiplication for incomplete NTT in lattice-based cryptography 2026-05-17T15:34:27Z

Polynomial multiplication is fundamental in lattice-based cryptography. While the Number Theoretic Transform (NTT) enables fast multiplication, it imposes constraints on the modulus of the coefficient field. Hafiz et al. (2025) addressed this limitation by analyzing the incomplete NTT, which combines a truncated NTT with conventional multiplication methods. In this work, we revisit Toom-4 multiplication in the context of incomplete NTT. Although Toom-4 is asymptotically faster than Karatsuba, its precise cost has not been expressed in a form compatible with the incomplete NTT framework. We present a concrete Toom-4 implementation and derive explicit operation counts that separate additions/subtractions and multiplications over the coefficient field. Our analysis based on addition chains yields a simple cost model for incomplete NTT. Using this model, we analyze hybrid strategies combining Toom-4, Karatsuba, and incomplete NTT. We identify parameter ranges where Toom-4 is advantageous and validate the predicted behavior experimentally.
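
The asymptotic comparison between Toom-4 and Karatsuba mentioned above follows from the standard divide-and-conquer recurrences. The symbols $T_{\mathrm{K}}$ and $T_{4}$ are notation introduced here for illustration, and the linear terms hide exactly the addition and interpolation costs that the paper sets out to count explicitly.

```latex
% Karatsuba: 3 half-size products; Toom-4: 2*4 - 1 = 7 quarter-size products
T_{\mathrm{K}}(n) = 3\,T_{\mathrm{K}}(n/2) + O(n)
  \;\Longrightarrow\; T_{\mathrm{K}}(n) = O\!\left(n^{\log_2 3}\right) \approx O\!\left(n^{1.585}\right),
\qquad
T_{4}(n) = 7\,T_{4}(n/4) + O(n)
  \;\Longrightarrow\; T_{4}(n) = O\!\left(n^{\log_4 7}\right) \approx O\!\left(n^{1.404}\right).
```

The smaller exponent comes with larger hidden constants, which is why the crossover point depends on explicit operation counts.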

2026-05-17T15:34:27Z 11 pages, Comments are welcome! Sakura Oku Momonari Kudo http://arxiv.org/abs/2511.03389v2 Terracini matroids: algebraic matroids of secants and embedded joins 2026-05-14T16:31:20Z

Applications of algebraic geometry have sparked much recent work on algebraic matroids. An algebraic matroid encodes algebraic dependencies among coordinate functions on a variety. We study the behavior of algebraic matroids under joins and secants of varieties. Motivated by Terracini's lemma, we introduce the notion of a Terracini union of matroids, which captures when the algebraic matroid of a join coincides with the matroid union of the algebraic matroids of its summands. We illustrate applications of our results with a discussion of the implications for toric surfaces and threefolds.

2025-11-05T11:50:03Z Fatemeh Mohammadi Jessica Sidman Louis Theran http://arxiv.org/abs/2603.00073v2 A Separation Method for Quartic Positivity and the Valid Region of Gram-Charlier densities 2026-05-14T13:09:58Z

The positivity of the Gram-Charlier probability density function has been a subject of extensive study for decades. Since Barton and Dennis (1952) introduced numerical positivity conditions, no analytic closed-form expression was available until Kwon (2019, 2022) proposed analytic solutions for the valid region of Gram-Charlier densities. Despite the significance of the analytical solutions, the expressions remain algebraically complex. As these conditions for the Gram-Charlier densities are determined by a quartic polynomial, it is essential to investigate its positivity. In this work, necessary and sufficient conditions for the positivity of a quartic polynomial are derived through a separation method. Based on these conditions, more concise analytic expressions for the positivity of the Gram-Charlier density are proposed.
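
To make the link to quartic positivity concrete, the usual Gram-Charlier (Type A) density truncated at fourth order can be written as below. The symbols $s$ (skewness), $k$ (excess kurtosis) and $q$ are notation assumed here, not taken from the paper.

```latex
% Gram-Charlier density with standard normal density \varphi and Hermite polynomials He_3, He_4
f(x) = \varphi(x)\left[1 + \frac{s}{6}\,\mathrm{He}_3(x) + \frac{k}{24}\,\mathrm{He}_4(x)\right],
\qquad
\mathrm{He}_3(x) = x^3 - 3x,\quad \mathrm{He}_4(x) = x^4 - 6x^2 + 3 .

% Since \varphi(x) > 0, f is a valid density exactly when the quartic below is non-negative on the real line
q(x) = 1 + \frac{s}{6}\left(x^3 - 3x\right) + \frac{k}{24}\left(x^4 - 6x^2 + 3\right) \ge 0
\quad \text{for all } x \in \mathbb{R}.
```

The valid region is then the set of pairs $(s, k)$ for which $q$ has no sign change.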

2026-02-12T04:48:27Z 8 pages, 2 figures Taehun Kim Jung Chan Lee ByoungSeon Choi http://arxiv.org/abs/2602.04707v2 Exact Volumes of Semi-Algebraic Convex Bodies 2026-05-14T08:32:21Z

We compute the volumes of convex bodies that are given by inequalities of concave polynomials. These volumes are found to arbitrary precision thanks to the representation of periods by linear differential equations. Our approach rests on work of Lairez, Mezzarobba, and Safey El Din. We present a novel method to identify the relevant critical values. Convexity allows us to reduce the required number of creative telescoping steps by an exponential factor. We provide an implementation based on the ore_algebra package in SageMath. We present examples computed with our implementation in 2, 3 and 4 dimensions.

2026-02-04T16:19:07Z 9 pages, 3 figures. Revised version Lakshmi Ramesh Nicolas Weiss http://arxiv.org/abs/2605.12704v1 FePySR: A Neural Feature Extraction Framework for Efficient and Scalable Symbolic Regression 2026-05-12T20:04:59Z

A fundamental challenge in symbolic regression (SR) is efficiently recovering complex mathematical expressions from observational data. Although this problem is NP-hard, many expressions of practical interest decompose naturally into combinations of nonlinear feature modules, concentrating structural complexity into a small number of reusable components. Here, we introduce FePySR, a two-stage framework that reduces the SR search space by extracting valid features prior to equation search. FePySR first employs a heterogeneous neural network to constrain observational data to a set of candidate expressions, then performs structural optimization within this refined expression space using PySR. Across five standard benchmarks, FePySR outperforms state-of-the-art methods by achieving higher equation recovery rates. On a set of 75 highly complex synthesized equations, FePySR recovers 36 equations, while producing substantially smaller mean squared errors on the remaining unrecovered cases, with reduced computation time compared to PySR. FePySR's first stage also maintains consistent performance under varying numbers of selected top features and increasing levels of noise in the observational data. Applied to ordinary differential equations governing biological systems, FePySR successfully identifies governing equations in 24 out of 100 tests where PySR recovers none. Taken together, FePySR is a generalizable framework that can enhance the SR solvers, enabling the efficient and reliable recovery of symbolic expressions across scientific domains.

2026-05-12T20:04:59Z Data and Code Availability: https://github.com/laixn/FePySR Zhiming Yu Wangtao Lu Xin Lai http://arxiv.org/abs/2602.09702v2 On semidefinite-representable sets over valued fields 2026-05-12T08:29:15Z

Polyhedra and spectrahedra over the real numbers, or more generally their images under linear maps, are respectively the feasible sets of linear and semidefinite programming, and form the family of semidefinite-representable sets. This paper studies analogues of these sets, as well as the associated optimization problems, when the data are taken over a valued field $K$. For $K$-polyhedra and linear programming over $K$ we present an algorithm based on the computation of Smith normal forms. We prove that fundamental properties of semidefinite-representable sets extend to the valued setting. In particular, we exhibit examples of non-polyhedral $K$-spectrahedra, as well as sets that are semidefinite-representable over $K$ but are not $K$-spectrahedra.

2026-02-10T12:01:20Z 9 pages, 1 figure ISSAC 2026 Corentin Cornou Simone Naldi Tristan Vaccon 10.1145/3815436.3815464 http://arxiv.org/abs/2601.05026v3 A data structure for monomial ideals with applications to signature Gröbner bases 2026-05-11T21:55:07Z

We introduce monomial divisibility diagrams (MDDs), a data structure for monomial ideals that supports insertion of new generators and fast membership tests. MDDs stem from a canonical tree representation by maximally sharing equal subtrees, yielding a directed acyclic graph. We establish basic complexity bounds for membership and insertion, and study empirically the size of MDDs. As an application, we integrate MDDs into the signature Gröbner basis implementation of the Julia package AlgebraicSolving.jl. Membership tests in monomial ideals are used to detect some reductions to zero, and the use of MDDs leads to substantial speed-ups compared to the existing representation by lists of generators with divmasks.
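
The two operations an MDD supports, membership and insertion, can be illustrated on the baseline representation the abstract mentions, a plain list of generators. The sketch below is that naive list version without divmasks, not an MDD and not the AlgebraicSolving.jl code. Monomials are assumed to be exponent tuples of equal length, and the function names are hypothetical.

```python
def divides(g, m):
    """True if monomial g divides monomial m (componentwise exponent comparison)."""
    return all(a <= b for a, b in zip(g, m))


def is_member(generators, m):
    """A monomial lies in a monomial ideal iff some generator divides it."""
    return any(divides(g, m) for g in generators)


def insert_generator(generators, g):
    """Return a minimal generating list for the ideal generated by generators and g."""
    if is_member(generators, g):
        return list(generators)  # g is redundant, ideal unchanged
    # Drop old generators that are multiples of g, then add g.
    return [h for h in generators if not divides(g, h)] + [g]
```

For example, with generators `x^2` and `y^3` encoded as `(2, 0)` and `(0, 3)`, the monomial `x^2*y` is a member and `x*y^2` is not. Each membership test scans the whole list, which is the cost a shared tree structure aims to reduce.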

2026-01-08T15:33:58Z Pierre Lairez Rafael Mohr Théo Ternier 10.1145/3815436.3815473 http://arxiv.org/abs/2605.10327v1 SCALAR: A Neurosymbolic Framework for Automated Conjecture and Reasoning in Quantum Circuit Analysis 2026-05-11T10:31:18Z

In this paper, we present SCALAR (Symbolic Conjecture and LLM-Assisted Reasoning), a neurosymbolic framework for automated conjecture generation in quantum circuit analysis built on top of the CUDA-Q open source framework. The system integrates quantum simulation, symbolic conjecture generation, and LLM-based interpretation. We evaluate SCALAR on 82 MaxCut instances from the MQLib benchmark dataset and extend the analysis to 2,000 randomly generated graphs across four topologies: regular, Erdos-Renyi, Barabasi-Albert, and Watts-Strogatz. The framework generates conjectured bounds relating optimal QAOA parameters to graph invariants, including known relationships such as periodicity constraints on the phase separation parameter $γ$. SCALAR also recovers previously reported parameter transfer phenomena across structurally similar instances. Additionally, the system identifies correlations between graph structural features and optimization landscape properties, which we characterize through invariant-based descriptors. Using the CUDA-Q tensor network simulator, we scale experiments to instances of up to 77 qubits. We discuss the accuracy, generality, and limitations of the generated conjectures, including sensitivity to graph class and quantum circuit depth.

2026-05-11T10:31:18Z Sean Feeney Pooja Rao Andreas Klappenecker Reuben Tate Yuri Alexeev Stefano Mensa Elica Kyoseva Stephan Eidenbenz http://arxiv.org/abs/2505.10246v4 An Algorithm for Computing the Leading Monomials of a Minimal Groebner Basis of Generic Sequences 2026-05-11T01:49:39Z

We present an efficient algorithm for computing the leading monomials of a minimal Groebner basis of a generic sequence of homogeneous polynomials. Our approach bypasses costly polynomial reductions by exploiting structural properties conjectured to hold for generic sequences, specifically that their leading monomial ideals are weakly reverse lexicographic and that their Hilbert series follow a known closed-form expression. The algorithm incrementally constructs the set of leading monomials degree by degree by comparing Hilbert functions of monomial ideals with the expected Hilbert series of the input ideal. To enhance computational efficiency, we introduce several optimization techniques that progressively narrow the search space and reduce the number of divisibility checks required at each step. We also refine the loop termination condition using degree bounds, thereby avoiding unnecessary recomputation of Hilbert series. Experimental results confirm that, for generic sequences, the proposed method computes these leading monomials with substantially less computation time and memory than conventional Groebner basis computations.
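
The expected Hilbert series used for comparison can be computed directly from the input degrees. The sketch below assumes the closed form meant is the one from Fröberg's conjecture, namely the series of $\prod_i (1 - t^{d_i}) / (1 - t)^n$ truncated at its first non-positive coefficient. That identification, the function name, and the `max_degree` cutoff are assumptions made here for illustration.

```python
def generic_hilbert_series(n_vars, degrees, max_degree):
    """Coefficients of [prod(1 - t^d) / (1 - t)^n_vars]_+ up to t^max_degree."""
    coeffs = [0] * (max_degree + 1)
    coeffs[0] = 1
    for d in degrees:
        # Multiply by (1 - t^d) in place; going downward keeps coeffs[k - d] unmodified.
        for k in range(max_degree, d - 1, -1):
            coeffs[k] -= coeffs[k - d]
    for _ in range(n_vars):
        # Dividing by (1 - t) is a running (prefix) sum of the coefficients.
        for k in range(1, max_degree + 1):
            coeffs[k] += coeffs[k - 1]
    truncated = []
    for c in coeffs:
        if c <= 0:
            break  # truncate at the first non-positive coefficient
        truncated.append(c)
    return truncated
```

For four quadrics in three variables, `generic_hilbert_series(3, [2, 2, 2, 2], 6)` gives `[1, 3, 2]`, matching the count by hand: 6 quadratic monomials minus 4 generators leaves 2 in degree 2, and nothing survives in degree 3.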

2025-05-15T13:00:44Z Kosuke Sakata Tsuyoshi Takagi http://arxiv.org/abs/2605.09696v1 Discovery of Nonlinear Dynamics with Automated Basis Function Generation 2026-05-10T18:30:17Z

Discovering governing equations from observational data remains a fundamental challenge in scientific modeling, particularly when the underlying mathematical structure is unknown. Traditional sparse identification methods like SINDy excel at discovering parsimonious models but require researchers to specify candidate basis functions a priori, a limitation that often leads to model failure when critical terms are omitted or when systems exhibit unconventional dynamics. Purely symbolic regression approaches offer unlimited flexibility but struggle with noise sensitivity and frequently produce overly complex, unstable equations. We present AutoSINDy, a hybrid Discovery-then-Solve framework that combines the exploratory power of symbolic regression with the robust sparsity-promoting capabilities of SINDy. Our method operates in three stages: (1) PySR-based symbolic regression discovers candidate functional forms from bootstrapped data chunks; (2) a curation pipeline decomposes, expands, and filters these expressions using collinearity analysis to construct a minimal yet comprehensive library; and (3) SINDy identifies sparse governing equations from this custom-tailored library. Extensive experiments across canonical nonlinear systems demonstrate that AutoSINDy consistently recovers ground-truth equations even under high observational noise, achieving a ground-truth recovery rate of 92.8% across all trials. Compared with standard SINDy using enriched libraries and standalone symbolic regression, AutoSINDy achieves higher predictive accuracy, superior generalization to unseen trajectories, and substantially lower symbolic complexity.

2026-05-10T18:30:17Z 53 pages, 17 figures. Code available at https://github.com/mabasiri95/AutoSINDy Mohammad Amin Basiri Charles Nicholson http://arxiv.org/abs/2605.09286v1 Matrix equivalence to Smith normal form: new theoretical results for multivariate polynomial matrices 2026-05-10T03:18:49Z

This paper investigates the Smith normal form equivalence problem for multivariate polynomial matrices. Using methods from matrix theory and polynomial ideal theory, we prove that Frost and Storey's 1978 conjecture holds for a broad class of matrices: such a matrix is equivalent to its Smith normal form if and only if its reduced minors of each order generate the unit ideal. Moreover, by extending the original matrix class via automorphisms of the polynomial ring, we show that our framework applies in a substantially more general setting.

2026-05-10T03:18:49Z Dong Lu Yuanyuan Ruan Dingkang Wang Fanghui Xiao http://arxiv.org/abs/2605.07784v1 Computing bases in Hermite normal form of lattices of integer relations 2026-05-08T14:21:30Z

Given a full column rank $M \in \mathbb{Z}^{\ell \times m}$ and an $F \in \mathbb{Z}^{n \times m}$, we present an algorithm to compute the $n \times n$ basis in Hermite form of the integer lattice consisting of all rows $p \in \mathbb{Z}^{1 \times n}$ such that $pF \in \mathbb{Z}^{1 \times m}$ is in the integer lattice generated by the rows of $M$. The algorithm is randomized of the Las Vegas type: it may report failure with probability at most $1/2$, but whenever it does not report failure, its output is guaranteed to be correct. When $M$ is square and $F=I_m$, the computed basis is the Hermite normal form of $M$, and the algorithm uses about the same number of bit operations as required to multiply together two matrices of the same dimension and size of entries as $M$.

2026-05-08T14:21:30Z George Labahn Arne Storjohann