http://arxiv.org/abs/2602.20557v1
GENSR: Symbolic Regression Based in Equation Generative Space
Updated: 2026-02-24T05:14:34Z

Symbolic Regression (SR) tries to reveal the hidden equations behind observed data. However, most methods search within a discrete equation space, where the structural modifications of equations rarely align with their numerical behavior, leaving fitting error feedback too noisy to guide exploration. To address this challenge, we propose GenSR, a generative latent space-based SR framework following the "map construction -> coarse localization -> fine search" paradigm. Specifically, GenSR first pretrains a dual-branch Conditional Variational Autoencoder (CVAE) to reparameterize symbolic equations into a generative latent space with symbolic continuity and local numerical smoothness. This space can be regarded as a well-structured "map" of the equation space, providing directional signals for search. At inference, the CVAE coarsely localizes the input data to promising regions in the latent space. Then, a modified CMA-ES refines the candidate region, leveraging smooth latent gradients. From a Bayesian perspective, GenSR reframes the SR task as maximizing the conditional distribution $p(\mathrm{Equ.} \mid \mathrm{Num.})$, with CVAE training achieving this objective through the Evidence Lower Bound (ELBO). This new perspective provides a theoretical guarantee for the effectiveness of GenSR. Extensive experiments show that GenSR jointly optimizes predictive accuracy, expression simplicity, and computational efficiency, while remaining robust under noise.

Published: 2026-02-24T05:14:34Z
Authors: Qian Li, Yuxiao Hu, Juncheng Liu, Yuntian Chen

http://arxiv.org/abs/2506.07751v4
AbstRaL: Augmenting LLMs' Reasoning by Reinforcing Abstract Thinking
Updated: 2026-02-23T18:25:13Z

Recent studies have shown that large language models (LLMs), especially smaller ones, often lack robustness in grade school math (GSM) reasoning.
In particular, they tend to experience performance drops when faced with distribution shifts, such as changes to numerical or nominal variables, or insertions of distracting clauses. A possible strategy to address this involves generating synthetic data to further "instantiate" reasoning problems on potential variations. In this work, we instead focus on the strategy of "abstracting" reasoning problems. This not only helps counteract distribution shifts but also facilitates the connection to symbolic tools for deriving solutions. Focusing on GSM, we find that this abstraction process is better acquired through reinforcement learning (RL) than just supervised fine-tuning, which often fails to produce faithful abstractions. Our method, AbstRaL -- which promotes abstract reasoning in LLMs using RL on granular abstraction data -- significantly mitigates performance degradation on recent GSM perturbation benchmarks. Besides, improving GSM robustness via AbstRaL is shown to also implicitly benefit LLMs' capabilities on OOD mathematical and general reasoning tasks, indicating that abstract thinking broadly enables better generalizability.

Published: 2025-06-09T13:34:50Z
Comment: ICLR 2026
Authors: Silin Gao, Antoine Bosselut, Samy Bengio, Emmanuel Abbe

http://arxiv.org/abs/2602.19886v1
Order Bounds for Hypergeometric and q-Hypergeometric Creative Telescoping
Updated: 2026-02-23T14:31:44Z

Leveraging a general framework adapted from symbolic integration, a unified reduction-based algorithm for computing telescopers of minimal order for hypergeometric and q-hypergeometric terms has been recently developed. In this paper, we conduct a deeper exploration and put forth a new argument for the termination of the algorithm. This not only provides an independent proof of existence of telescopers, but also allows us to derive unified upper and lower bounds on the order of telescopers for hypergeometric terms and their q-analogues.
Compared with known bounds in the literature, our bounds, in the hypergeometric case, are exactly the same as the tight ones obtained in 2016; while in the q-hypergeometric case, no lower bounds were known before, and our upper bound is sometimes better and never worse than the known one.

Published: 2026-02-23T14:31:44Z
Authors: Hui Huang

http://arxiv.org/abs/2602.19686v1
A Flow Extension to Coroutine Types for Deadlock Detection in Go
Updated: 2026-02-23T10:31:22Z

Coroutines, as an abstract programming construct, are a generalization of functions that can suspend execution partway for later resumption. Coroutine Types are behavioral types to model interactions of coroutines with a single receiving operation followed by a single yielding operation. Coroutine Types have been applied to model-driven engineering, smart contracts, and test case generation. We contribute a Flow extension to Coroutine Types, so that coroutines with more than one receiving and yielding operation can be modeled. We accordingly revise the reduction rules of Coroutine Types. To show the usefulness of the Flow extension, we contribute a type system that maps expressions of the Go programming language to Coroutine Types. If the reduction result is 0, the two channel operations are paired properly and the program has no deadlocks. We choose Go because it is a popular programming language for distributed systems, but a frequent kind of bug in Go is deadlock caused by the wrong use of concurrency features. We concentrate on the most commonly used semantics in Go: unbuffered channels with the keywords go and defer. Our Flow extension and the type system recognize 17 patterns of channels and goroutine interactions, including mismatched receivers and senders, nested goroutines, etc. We also integrate the Z3 SMT solver to take account of conditional execution and type inheritance. Other static or dynamic deadlock detectors crashed or gave wrong predictions in some patterns.
Therefore, our type-based deadlock analyzer not only fills the gap in the landscape of value-based detection, but also complements existing detectors.

Published: 2026-02-23T10:31:22Z
Comment: Accepted in ICSESS 2025, Macao
Authors: Qiqi Jason Gu, Lixue Liu, Wei Ke

http://arxiv.org/abs/2602.19255v1
Statistical Analysis of Hairpins and BasePairs in RNA Secondary Structures
Updated: 2026-02-22T16:11:22Z

We derive precise asymptotic expressions for the expectations, variances, covariance, and quite a few further mixed moments for the number of hairpins and the number of basepairs in RNA secondary structures, and give convincing evidence that the central-scaled distribution of the pair of random variables (hairpins, basepairs) tends in distribution to the bi-variate normal distribution with correlation $\sqrt{5 \sqrt{5} -11}/2= 0.2123322205\dots$

Published: 2026-02-22T16:11:22Z
Authors: AJ Bu, Manuel Kauers, Doron Zeilberger

http://arxiv.org/abs/2602.08885v4
Breaking the Simplification Bottleneck in Amortized Neural Symbolic Regression
Updated: 2026-02-21T18:07:07Z

Symbolic regression (SR) aims to discover interpretable analytical expressions that accurately describe observed data. Amortized SR promises to be much more efficient than the predominant genetic programming SR methods, but currently struggles to scale to realistic scientific complexity. We find that a key obstacle is the lack of a fast reduction of equivalent expressions to a concise normalized form. Amortized SR has addressed this by general-purpose Computer Algebra Systems (CAS) like SymPy, but the high computational cost severely limits training and inference speed. We propose SimpliPy, a rule-based simplification engine achieving a 100-fold speed-up over SymPy at comparable quality. This enables substantial improvements in amortized SR, including scalability to much larger training sets, more efficient use of the per-expression token budget, and systematic training set decontamination with respect to equivalent test expressions.
We demonstrate these advantages in our Flash-ANSR framework, which achieves much better accuracy than amortized baselines (NeSymReS, E2E) on the FastSRB benchmark. Moreover, it performs on par with state-of-the-art direct optimization (PySR) while recovering more concise instead of more complex expressions with increasing inference budget.

Published: 2026-02-09T16:47:00Z
Comment: main text: 8 pages, 7 figures; appendix: 12 pages, 11 figures; code available at https://github.com/psaegert/simplipy and https://github.com/psaegert/flash-ansr; v2: Fixed rendering artifact in Figure 7; v3: Fixed Figure 3 title and formula; v4: Fixed Eq (1), example in App. M, Fig 13
Authors: Paul Saegert, Ullrich Köthe

http://arxiv.org/abs/2602.18916v1
Adaptive Collaboration of Arena-Based Argumentative LLMs for Explainable and Contestable Legal Reasoning
Updated: 2026-02-21T17:47:13Z

Legal reasoning requires not only high accuracy but also the ability to justify decisions through verifiable and contestable arguments. However, existing Large Language Model (LLM) approaches, such as Chain-of-Thought (CoT) and Retrieval-Augmented Generation (RAG), often produce unstructured explanations that lack a formal mechanism for verification or user intervention. To address this limitation, we propose Adaptive Collaboration of Argumentative LLMs (ACAL), a neuro-symbolic framework that integrates adaptive multi-agent collaboration with an Arena-based Quantitative Bipolar Argumentation Framework (A-QBAF). ACAL dynamically deploys expert agent teams to construct arguments, employs a clash resolution mechanism to adjudicate conflicting claims, and utilizes uncertainty-aware escalation for borderline cases. Crucially, our framework supports a Human-in-the-Loop (HITL) contestability workflow, enabling users to directly audit and modify the underlying reasoning graph to influence the final judgment.
Empirical evaluations on the LegalBench benchmark demonstrate that ACAL outperforms strong baselines across Gemini-2.5-Flash-Lite and Gemini-2.5-Flash architectures, effectively balancing efficient predictive performance with structured transparency and contestability. Our implementation is available at: https://github.com/loc110504/ACAL.

Published: 2026-02-21T17:47:13Z
Authors: Hoang-Loc Cao, Phuc Ho, Truong Thanh Hung Nguyen, Phuc Truong Loc Nguyen, Dinh Thien Loc Nguyen, Hung Cao

http://arxiv.org/abs/1811.10062v5
On Exact Reznick, Hilbert-Artin and Putinar's Representations
Updated: 2026-02-20T22:12:08Z

We consider the problem of computing exact sums of squares (SOS) decompositions for certain classes of non-negative multivariate polynomials, relying on semidefinite programming (SDP) solvers.
We provide a hybrid numeric-symbolic algorithm computing exact rational SOS decompositions with rational coefficients for polynomials lying in the interior of the SOS cone. The first step of this algorithm computes an approximate SOS decomposition for a perturbation of the input polynomial with an arbitrary-precision SDP solver. Next, an exact SOS decomposition is obtained thanks to the perturbation terms and a compensation phenomenon. We prove that bit complexity estimates on output size and runtime are both singly exponential in the cardinality of the Newton polytope (or doubly exponential in the number of variables). Next, we apply this algorithm to compute exact Reznick, Hilbert-Artin's representation and Putinar's representations respectively for positive definite forms and positive polynomials over basic compact semi-algebraic sets. We also report on practical experiments done with the implementation of these algorithms and existing alternatives such as the critical point method and cylindrical algebraic decomposition.

Published: 2018-11-25T17:51:55Z
Comment: 35 pages, 4 tables, extended version of the paper from ISSAC'18 conference (available at arXiv::1802.10339), fixed the statement and proof of Proposition 24
Authors: Victor Magron, Mohab Safey El Din

http://arxiv.org/abs/2511.10164v2
Two Constraint Compilation Methods for Lifted Planning
Updated: 2026-02-20T16:29:23Z

We study planning in a fragment of PDDL with qualitative state-trajectory constraints, capturing safety requirements, task ordering conditions, and intermediate sub-goals commonly found in real-world problems. A prominent approach to tackle such problems is to compile their constraints away, leading to a problem that is supported by state-of-the-art planners. Unfortunately, existing compilers do not scale on problems with a large number of objects and high-arity actions, as they necessitate grounding the problem before compilation.
To address this issue, we propose two methods for compiling away constraints without grounding, making them suitable for large-scale planning problems. We prove the correctness of our compilers and outline their worst-case time complexity. Moreover, we present a reproducible empirical evaluation on the domains used in the latest International Planning Competition. Our results demonstrate that our methods are efficient and produce planning specifications that are orders of magnitude more succinct than the ones produced by compilers that ground the domain, while remaining competitive when used for planning with a state-of-the-art planner.

Published: 2025-11-13T10:24:31Z
Authors: Periklis Mantenoglou, Luigi Bonassi, Enrico Scala, Pedro Zuidberg Dos Martires

http://arxiv.org/abs/2602.17904v1
Hilbert's Nullstellensatz is in the Counting Hierarchy
Updated: 2026-02-20T00:01:49Z

We show that Hilbert's Nullstellensatz, the problem of deciding if a system of multivariate polynomial equations has a solution in the algebraic closure of the underlying field, lies in the counting hierarchy. More generally, we show that the number of solutions to a system of equations can be computed in polynomial time with oracle access to the counting hierarchy. Our results hold in particular for polynomials with coefficients in either the rational numbers or a finite field. Previously, the best-known bounds on the complexities of these problems were PSPACE and FPSPACE, respectively.
Our main technical contribution is the construction of a uniform family of constant-depth arithmetic circuits that compute the multivariate resultant.

Published: 2026-02-20T00:01:49Z
Authors: Robert Andrews, Abhibhav Garg, Éric Schost

http://arxiv.org/abs/2602.17826v1
Ontology-Guided Neuro-Symbolic Inference: Grounding Language Models with Mathematical Domain Knowledge
Updated: 2026-02-19T20:45:16Z

Language models exhibit fundamental limitations -- hallucination, brittleness, and lack of formal grounding -- that are particularly problematic in high-stakes specialist fields requiring verifiable reasoning. I investigate whether formal domain ontologies can enhance language model reliability through retrieval-augmented generation. Using mathematics as proof of concept, I implement a neuro-symbolic pipeline leveraging the OpenMath ontology with hybrid retrieval and cross-encoder reranking to inject relevant definitions into model prompts. Evaluation on the MATH benchmark with three open-source models reveals that ontology-guided context improves performance when retrieval quality is high, but irrelevant context actively degrades it -- highlighting both the promise and challenges of neuro-symbolic approaches.

Published: 2026-02-19T20:45:16Z
Comment: Submitted to NeuS 2026. Supplementary materials and code: https://doi.org/10.5281/zenodo.18665030
Authors: Marcelo Labre

http://arxiv.org/abs/2602.11041v2
Exploiting the Structure in Tensor Decompositions for Matrix Multiplication
Updated: 2026-02-19T20:18:59Z

We present a new algorithm for fast matrix multiplication using tensor decompositions which have special features. Thanks to these features we obtain exponents lower than what the rank of the tensor decomposition suggests.
In particular for $6\times 6$ matrix multiplication we reduce the exponent of the recent algorithm by Moosbauer and Poole from $2.8075$ to $2.8016$, while retaining a reasonable leading coefficient.

Published: 2026-02-11T17:12:34Z
Authors: Manuel Kauers, Jakob Moosbauer, Isaac Wood

http://arxiv.org/abs/2511.09943v2
SeQuant Framework for Symbolic and Numerical Tensor Algebra. I. Core Capabilities
Updated: 2026-02-19T13:13:51Z

SeQuant is an open-source library for symbolic algebra of tensors over commutative (scalar) and non-commutative (operator) rings. The key innovation supporting most of its functionality is a graph-theoretic tensor network (TN) canonicalizer that can handle tensor networks with symmetries faster than their standard group-theoretic counterparts. The TN canonicalizer is used for routine simplification of conventional tensor expressions, for optimizing application of Wick's theorem (used to canonicalize products of tensors over operator fields), and for manipulation of the intermediate representation leading to the numerical evaluation. Notable features of SeQuant include support for noncovariant tensor networks (which often arise from tensor decompositions) and for tensors with modes that depend parametrically on indices of other tensor modes (such dependencies between degrees of freedom are naturally viewed as nesting of tensors, "tensors of tensors" arising in block-wise data compressions in data science and modern quantum simulation). SeQuant blurs the line between pure symbolic manipulation/code generation and numerical evaluation by including compiler-like components to optimize and directly interpret tensor expressions using external numerical tensor algebra frameworks. The SeQuant source code is available at https://github.com/ValeevGroup/SeQuant.

Published: 2025-11-13T04:17:05Z
Authors: Bimal Gaudel, Robert G. Adam, Ajay Melekamburath, Conner Masteran, Nakul Teke, Azam Besharatnik, Andreas Köhn, Edward F. Valeev

http://arxiv.org/abs/2602.15603v1
Symbolic recovery of PDEs from measurement data
Updated: 2026-02-17T14:20:36Z

Models based on partial differential equations (PDEs) are powerful for describing a wide range of complex relationships in the natural sciences. Accurately identifying the PDE model, which represents the underlying physical law, is essential for a proper understanding of the problem. This reconstruction typically relies on indirect and noisy measurements of the system's state and, without specifically tailored methods, rarely yields symbolic expressions, thereby hindering interpretability. In this work, we address this issue by considering existing neural network architectures based on rational functions for the symbolic representation of physical laws. These networks leverage the approximation power of rational functions while also benefiting from their flexibility in representing arithmetic operations. Our main contribution is an identifiability result, showing that, in the limit of noiseless, complete measurements, such symbolic networks can uniquely reconstruct the simplest physical law within the PDE model. Specifically, reconstructed laws remain expressible within the symbolic network architecture, with regularization-minimizing parameterizations promoting interpretability and sparsity in case of $L^1$-regularization. In addition, we provide regularity results for symbolic networks. Empirical validation using the ParFam architecture supports these theoretical findings, providing evidence for the practical reconstructibility of physical laws.

Published: 2026-02-17T14:20:36Z
Authors: Erion Morina, Philipp Scholl, Martin Holler

http://arxiv.org/abs/2602.15539v1
Dynamic Training-Free Fusion of Subject and Style LoRAs
Updated: 2026-02-17T12:42:30Z

Recent studies have explored the combination of multiple LoRAs to simultaneously generate user-specified subjects and styles.
However, most existing approaches fuse LoRA weights using static statistical heuristics that deviate from LoRA's original purpose of learning adaptive feature adjustments and ignore the randomness of sampled inputs. To address this, we propose a dynamic training-free fusion framework that operates throughout the generation process. During the forward pass, at each LoRA-applied layer, we dynamically compute the KL divergence between the base model's original features and those produced by subject and style LoRAs, respectively, and adaptively select the most appropriate weights for fusion. In the reverse denoising stage, we further refine the generation trajectory by dynamically applying gradient-based corrections derived from objective metrics such as CLIP and DINO scores, providing continuous semantic and stylistic guidance. By integrating these two complementary mechanisms (feature-level selection and metric-guided latent adjustment) across the entire diffusion timeline, our method dynamically achieves coherent subject-style synthesis without any retraining. Extensive experiments across diverse subject-style combinations demonstrate that our approach consistently outperforms state-of-the-art LoRA fusion methods both qualitatively and quantitatively.

Published: 2026-02-17T12:42:30Z
Authors: Qinglong Cao, Yuntian Chen, Chao Ma, Xiaokang Yang