https://arxiv.org/api/LYppmoN0TriHBHKcUfEKDwFr7v4
2026-06-13T13:53:57Z
31384515

http://arxiv.org/abs/2605.19379v1
Graph-based automated discovery of concise soil hydraulic functions from data: beyond the Mualem - van Genuchten model
2026-05-19T05:25:53Z
Soil hydraulic functions are fundamental to modelling water flow and transport in vadose-zone hydrology and are central to a wide range of hydrological and geoscientific applications. Yet in practice, these functions are still predominantly specified through expert-designed empirical formulations, such as the Mualem-van Genuchten (MvG) model. Although such models have proved highly influential, their derivation relies on predefined functional assumptions that make it difficult to simultaneously achieve accuracy, compactness, and robustness across diverse soil textures. Here we present a graph-based automated model discovery framework for discovering explicit soil hydraulic functions directly from experimental data. Applied to the original datasets used in the development of the MvG model, the method identifies a concise soil water retention function and its associated unsaturated hydraulic conductivity function whose mathematical structure differs fundamentally from classical empirical forms. Across 249 real soil samples spanning diverse textural classes, the discovered functions achieve more accurate predictions of unsaturated hydraulic conductivity than the MvG model. The fitted parameters also exhibit correlations with soil physical properties.
This work demonstrates that data-driven model discovery can move beyond traditional empirical derivation and provide a promising route for developing accurate and explicit constitutive models.
2026-05-19T05:25:53Z
Hao Xu, Jinshen Sun, Yuntian Chen, Dongxiao Zhang

http://arxiv.org/abs/2605.18980v1
Computing Certificates in Archimedean Univariate Saturated Quadratic Modules
2026-05-18T18:03:07Z
A new symbolic algorithm to compute sums of squares multipliers (certificates) to witness the membership of non-negative univariate polynomials in a saturated univariate quadratic module is presented. Certificates are first computed in terms of natural generators introduced by Kuhlmann and Marshall for an Archimedean saturated quadratic module; natural generators can be easily read off from a semialgebraic set. In the univariate case, an Archimedean quadratic module is also a preordering since it is closed under multiplication; certificates have different representations when a polynomial is viewed as a member in a quadratic module versus in a preordering. An algorithm is given to compute certificates of natural generators in terms of the original generators; it uses a construction introduced by Kuhlmann, Marshall, and Schwartz known as the ``Basic Lemma'', which splits the non-negative factors of generators. To compute a quadratic module certificate, certificates of products of natural generators are computed using a detailed case analysis based on the types of natural generators.
A Maple implementation of the proposed algorithms is also discussed. The certificates obtained using this implementation are compared with those generated by RealCertify. We discuss examples where RealCertify is unable to find certificates while the proposed method is successful.
2026-05-18T18:03:07Z
51 pages, conference
Jose Abel Castellanos-Joo, Deepak Kapur

http://arxiv.org/abs/2605.18454v1
Scheduling That Speaks: An Interpretable Programmatic Reinforcement Learning Framework
2026-05-18T14:19:59Z
Deep reinforcement learning (DRL) has recently emerged as a promising approach to solve combinatorial optimization problems such as job shop scheduling. However, the policies learned by DRL are typically represented by deep neural networks (DNNs), whose opaque neural architectures and non-interpretable policy decisions can lead to critical trust and usability concerns for human decision makers. In addition, the computational requirements of DNNs can further hinder practical deployment in resource-constrained environments. In this work, we propose ProRL, a novel interpretable programmatic reinforcement learning framework that achieves high-performance scheduling with human-readable and editable programmatic policies (i.e., programs). We first introduce a domain-specific language for scheduling (DSL-S) to represent scheduling strategies as structured programs. ProRL then explores the program space defined by DSL-S using local search to identify incomplete programs, which are subsequently completed by learning their parameters via Bayesian optimization. ProRL learns which scheduling heuristic rules to select, and hence, it naturally incorporates existing heuristics already used in industrial scenarios. Experiments on widely used benchmark instances demonstrate the strong performance of ProRL against existing heuristics and DRL baselines. Furthermore, ProRL performs well under strongly constrained computational resources, such as training with only 100 episodes.
Our code is available at https://github.com/HcPlu/ProRL.
2026-05-18T14:19:59Z
Chengpeng Hu, Yingqian Zhang, Hendrik Baier

http://arxiv.org/abs/2605.17505v1
Explicit cost analysis of Toom-4 multiplication for incomplete NTT in lattice-based cryptography
2026-05-17T15:34:27Z
Polynomial multiplication is fundamental in lattice-based cryptography. While the Number Theoretic Transform (NTT) enables fast multiplication, it imposes constraints on the modulus of the coefficient field. Hafiz et al. (2025) addressed this limitation by analyzing the incomplete NTT, which combines a truncated NTT with conventional multiplication methods. In this work, we revisit Toom-4 multiplication in the context of incomplete NTT. Although Toom-4 is asymptotically faster than Karatsuba, its precise cost has not been expressed in a form compatible with the incomplete NTT framework. We present a concrete Toom-4 implementation and derive explicit operation counts that separate additions/subtractions and multiplications over the coefficient field. Our analysis based on addition chains yields a simple cost model for incomplete NTT. Using this model, we analyze hybrid strategies combining Toom-4, Karatsuba, and incomplete NTT. We identify parameter ranges where Toom-4 is advantageous and validate the predicted behavior experimentally.
2026-05-17T15:34:27Z
11 pages, Comments are welcome!
Sakura Oku, Momonari Kudo

http://arxiv.org/abs/2511.03389v2
Terracini matroids: algebraic matroids of secants and embedded joins
2026-05-14T16:31:20Z
Applications of algebraic geometry have sparked much recent work on algebraic matroids. An algebraic matroid encodes algebraic dependencies among coordinate functions on a variety. We study the behavior of algebraic matroids under joins and secants of varieties. Motivated by Terracini's lemma, we introduce the notion of a Terracini union of matroids, which captures when the algebraic matroid of a join coincides with the matroid union of the algebraic matroids of its summands.
We illustrate applications of our results with a discussion of the implications for toric surfaces and threefolds.
2025-11-05T11:50:03Z
Fatemeh Mohammadi, Jessica Sidman, Louis Theran

http://arxiv.org/abs/2603.00073v2
A Separation Method for Quartic Positivity and the Valid Region of Gram-Charlier densities
2026-05-14T13:09:58Z
The positivity of the Gram-Charlier probability density function has been a subject of extensive study for decades. Since Barton and Dennis (1952) introduced numerical positivity conditions, no analytic closed-form expression was available until Kwon (2019, 2022) proposed analytic solutions for the valid region of Gram-Charlier densities. Despite the significance of the analytical solutions, the expressions remain algebraically complex. As these conditions for the Gram-Charlier densities are determined by a quartic polynomial, it is essential to investigate its positivity. In this work, necessary and sufficient conditions for the positivity of a quartic polynomial are derived through a separation method. Based on these conditions, more concise analytic expressions for the positivity of the Gram-Charlier density are proposed.
2026-02-12T04:48:27Z
8 pages, 2 figures
Taehun Kim, Jung Chan Lee, ByoungSeon Choi

http://arxiv.org/abs/2602.04707v2
Exact Volumes of Semi-Algebraic Convex Bodies
2026-05-14T08:32:21Z
We compute the volumes of convex bodies that are given by inequalities of concave polynomials. These volumes are found to arbitrary precision thanks to the representation of periods by linear differential equations. Our approach rests on work of Lairez, Mezzarobba, and Safey El Din. We present a novel method to identify the relevant critical values. Convexity allows us to reduce the required number of creative telescoping steps by an exponential factor. We provide an implementation based on the ore_algebra package in SageMath. We present examples computed with our implementation in 2, 3 and 4 dimensions.
2026-02-04T16:19:07Z
9 pages, 3 figures.
Revised version
Lakshmi Ramesh, Nicolas Weiss

http://arxiv.org/abs/2605.12704v1
FePySR: A Neural Feature Extraction Framework for Efficient and Scalable Symbolic Regression
2026-05-12T20:04:59Z
A fundamental challenge in symbolic regression (SR) is efficiently recovering complex mathematical expressions from observational data. Although this problem is NP-hard, many expressions of practical interest decompose naturally into combinations of nonlinear feature modules, concentrating structural complexity into a small number of reusable components. Here, we introduce FePySR, a two-stage framework that reduces the SR search space by extracting valid features prior to equation search. FePySR first employs a heterogeneous neural network to constrain observational data to a set of candidate expressions, then performs structural optimization within this refined expression space using PySR. Across five standard benchmarks, FePySR outperforms state-of-the-art methods by achieving higher equation recovery rates. On a set of 75 highly complex synthesized equations, FePySR recovers 36 equations, while producing substantially smaller mean squared errors on the remaining unrecovered cases, with reduced computation time compared to PySR. FePySR's first stage also maintains consistent performance under varying numbers of selected top features and increasing levels of noise in the observational data. Applied to ordinary differential equations governing biological systems, FePySR successfully identifies governing equations in 24 out of 100 tests where PySR recovers none.
Taken together, FePySR is a generalizable framework that can enhance SR solvers, enabling the efficient and reliable recovery of symbolic expressions across scientific domains.
2026-05-12T20:04:59Z
Data and Code Availability: https://github.com/laixn/FePySR
Zhiming Yu, Wangtao Lu, Xin Lai

http://arxiv.org/abs/2602.09702v2
On semidefinite-representable sets over valued fields
2026-05-12T08:29:15Z
Polyhedra and spectrahedra over the real numbers, or more generally their images under linear maps, are respectively the feasible sets of linear and semidefinite programming, and form the family of semidefinite-representable sets. This paper studies analogues of these sets, as well as the associated optimization problems, when the data are taken over a valued field $K$. For $K$-polyhedra and linear programming over $K$ we present an algorithm based on the computation of Smith normal forms. We prove that fundamental properties of semidefinite-representable sets extend to the valued setting. In particular, we exhibit examples of non-polyhedral $K$-spectrahedra, as well as sets that are semidefinite-representable over $K$ but are not $K$-spectrahedra.
2026-02-10T12:01:20Z
9 pages, 1 figure
ISSAC 2026
Corentin Cornou, Simone Naldi, Tristan Vaccon
10.1145/3815436.3815464

http://arxiv.org/abs/2601.05026v3
A data structure for monomial ideals with applications to signature Gröbner bases
2026-05-11T21:55:07Z
We introduce monomial divisibility diagrams (MDDs), a data structure for monomial ideals that supports insertion of new generators and fast membership tests. MDDs stem from a canonical tree representation by maximally sharing equal subtrees, yielding a directed acyclic graph. We establish basic complexity bounds for membership and insertion, and study empirically the size of MDDs. As an application, we integrate MDDs into the signature Gröbner basis implementation of the Julia package AlgebraicSolving.jl.
Membership tests in monomial ideals are used to detect some reductions to zero, and the use of MDDs leads to substantial speed-ups compared to the existing representation by lists of generators with divmasks.
2026-01-08T15:33:58Z
Pierre Lairez, Rafael Mohr, Théo Ternier
10.1145/3815436.3815473

http://arxiv.org/abs/2605.10327v1
SCALAR: A Neurosymbolic Framework for Automated Conjecture and Reasoning in Quantum Circuit Analysis
2026-05-11T10:31:18Z
In this paper, we present SCALAR (Symbolic Conjecture and LLM-Assisted Reasoning), a neurosymbolic framework for automated conjecture generation in quantum circuit analysis built on top of the CUDA-Q open source framework. The system integrates quantum simulation, symbolic conjecture generation, and LLM-based interpretation. We evaluate SCALAR on 82 MaxCut instances from the MQLib benchmark dataset and extend the analysis to 2,000 randomly generated graphs across four topologies: regular, Erdos-Renyi, Barabasi-Albert, and Watts-Strogatz. The framework generates conjectured bounds relating optimal QAOA parameters to graph invariants, including known relationships such as periodicity constraints on the phase separation parameter $γ$. SCALAR also recovers previously reported parameter transfer phenomena across structurally similar instances. Additionally, the system identifies correlations between graph structural features and optimization landscape properties, which we characterize through invariant-based descriptors. Using the CUDA-Q tensor network simulator, we scale experiments to instances of up to 77 qubits.
We discuss the accuracy, generality, and limitations of the generated conjectures, including sensitivity to graph class and quantum circuit depth.
2026-05-11T10:31:18Z
Sean Feeney, Pooja Rao, Andreas Klappenecker, Reuben Tate, Yuri Alexeev, Stefano Mensa, Elica Kyoseva, Stephan Eidenbenz

http://arxiv.org/abs/2505.10246v4
An Algorithm for Computing the Leading Monomials of a Minimal Groebner Basis of Generic Sequences
2026-05-11T01:49:39Z
We present an efficient algorithm for computing the leading monomials of a minimal Groebner basis of a generic sequence of homogeneous polynomials. Our approach bypasses costly polynomial reductions by exploiting structural properties conjectured to hold for generic sequences, specifically, that their leading monomial ideals are weakly reverse lexicographic and that their Hilbert series follow a known closed-form expression. The algorithm incrementally constructs the set of leading monomials degree by degree by comparing Hilbert functions of monomial ideals with the expected Hilbert series of the input ideal. To enhance computational efficiency, we introduce several optimization techniques that progressively narrow the search space and reduce the number of divisibility checks required at each step. We also refine the loop termination condition using degree bounds, thereby avoiding unnecessary recomputation of Hilbert series. Experimental results confirm that the proposed method substantially reduces both computation time and memory usage compared to conventional Groebner basis computations for computing the leading monomials of a minimal Groebner basis of generic sequences.
2025-05-15T13:00:44Z
Kosuke Sakata, Tsuyoshi Takagi

http://arxiv.org/abs/2605.09696v1
Discovery of Nonlinear Dynamics with Automated Basis Function Generation
2026-05-10T18:30:17Z
Discovering governing equations from observational data remains a fundamental challenge in scientific modeling, particularly when the underlying mathematical structure is unknown.
Traditional sparse identification methods like SINDy excel at discovering parsimonious models but require researchers to specify candidate basis functions a priori, a limitation that often leads to model failure when critical terms are omitted or when systems exhibit unconventional dynamics. Purely symbolic regression approaches offer unlimited flexibility but struggle with noise sensitivity and frequently produce overly complex, unstable equations. We present AutoSINDy, a hybrid Discovery-then-Solve framework that combines the exploratory power of symbolic regression with the robust sparsity-promoting capabilities of SINDy. Our method operates in three stages: (1) PySR-based symbolic regression discovers candidate functional forms from bootstrapped data chunks; (2) a curation pipeline decomposes, expands, and filters these expressions using collinearity analysis to construct a minimal yet comprehensive library; and (3) SINDy identifies sparse governing equations from this custom-tailored library. Extensive experiments across canonical nonlinear systems demonstrate that AutoSINDy consistently recovers ground-truth equations even under high observational noise, achieving a ground-truth recovery rate of 92.8% across all trials. Compared with standard SINDy using enriched libraries and standalone symbolic regression, AutoSINDy achieves higher predictive accuracy, superior generalization to unseen trajectories, and substantially lower symbolic complexity.
2026-05-10T18:30:17Z
53 pages, 17 figures. Code available at https://github.com/mabasiri95/AutoSINDy
Mohammad Amin Basiri, Charles Nicholson

http://arxiv.org/abs/2605.09286v1
Matrix equivalence to Smith normal form: new theoretical results for multivariate polynomial matrices
2026-05-10T03:18:49Z
This paper investigates the Smith normal form equivalence problem for multivariate polynomial matrices.
Using methods from matrix theory and polynomial ideal theory, we prove that Frost and Storey's 1978 conjecture holds for a broad class of matrices: such a matrix is equivalent to its Smith normal form if and only if its reduced minors of each order generate the unit ideal. Moreover, by extending the original matrix class via automorphisms of the polynomial ring, we show that our framework applies in a substantially more general setting.
2026-05-10T03:18:49Z
Dong Lu, Yuanyuan Ruan, Dingkang Wang, Fanghui Xiao

http://arxiv.org/abs/2605.07784v1
Computing bases in Hermite normal form of lattices of integer relations
2026-05-08T14:21:30Z
Given a full column rank $M \in \Z^{\ell \times m}$ and an $F \in \Z^{n \times m}$ we present an algorithm to compute the $n \times n$ basis in Hermite form of the integer lattice comprised of all rows $p \in \Z^{1 \times n}$ such that $pF \in \Z^{1 \times m}$ is in the integer lattice generated by the rows of $M$. The algorithm is randomized of the Las Vegas type, that is, it can fail with probability at most $1/2$, but if fail is not returned it guarantees to produce the correct result. When $M$ is square and $F=I_m$, then the computed basis is the Hermite normal form of $M$, and the algorithm uses about the same number of bit operations as required to multiply together two matrices of the same dimension and size of entries as $M$.
2026-05-08T14:21:30Z
George Labahn, Arne Storjohann