https://arxiv.org/api/Do2ABz2YN+QBlym5ibBD5NBnUSg 2026-06-13T11:15:51Z 3138 15 15 http://arxiv.org/abs/2606.07152v1 A Data-Free Symbolic Regression Approach for Solving Equations 2026-06-05T11:09:23Z

Many equations arising in science currently cannot be solved by available analytical techniques and are therefore solved numerically, without yielding explicit symbolic expressions. Existing symbolic regression approaches can recover symbolic expressions, but require training data obtained from the underlying process, rather than the governing equation alone. We propose the Symbolic Equation Solver (SES), a framework that formulates equation solving as an optimization problem over differentiable symbolic models. SES constructs its objective from the equation together with initial or boundary conditions, eliminating the need for paired input-output data. The learned model is expressed in explicit symbolic form, enabling further analysis. We evaluate SES on representative algebraic and differential equations, including a system of algebraic equations, an equation with transcendental terms, an ordinary differential equation, and partial differential equations with different initial or boundary conditions. Across these settings, SES recovers compact symbolic expressions that match the corresponding analytical solutions.

2026-06-05T11:09:23Z Sergei Garmaev Vinay Sharma Olga Fink http://arxiv.org/abs/2604.23873v2 Enhanced CAD-Based Quantifier Elimination With Multiple Equational Constraints 2026-06-05T08:24:06Z

This paper presents two enhancements to cylindrical algebraic decomposition (CAD) based quantifier elimination (QE) for cases in which multiple equational constraints are present in the given input formula $φ^*$. The first enhancement provides more detail in the output when there is a conceptual partition of the set of variables of $φ^*$ into parameters and unknowns. In such cases, we describe how to partition the parameter space so that: (1) in each open set of the partition the number $ν$ of associated unknowns is a finite constant or is infinite; and (2) for each such open set for which $ν$ is finite, an expression for the unknowns in terms of the parameters is provided. The second enhancement is an efficiency gain achievable in certain situations. Indeed, when certain conditions are met, the second CAD equational projection step can be reduced more significantly than is supported by the prior existing theory. Relevant theorems and worked examples for both enhancements are provided. Application areas include approximation theory, cuspidal manipulator classification, and biological/chemical systems.

2026-04-26T20:32:37Z Preliminary Draft; updated to reference related prior work James H. Davenport Matthew England Scott McCallum http://arxiv.org/abs/2606.06386v1 On GPU Implementation for Multi-Precision Integer Division 2026-06-04T16:51:22Z

This paper presents the issues arising in implementing a fast integer division algorithm on general purpose GPUs. The algorithm uses a Newton iteration based on the shifted inverse operation, keeping all arithmetic in the integer domain and relying on data-parallel operators. The principal contribution is an efficient GPU/CUDA implementation for integer precisions from $2^{15}$ to $2^{18}$ -- sizes not supported by \cgbn{} division. We propose algorithmic refinements, define a cost model in terms of multiplications, build on prefix sums and previous work on multi-precision multiplication, and present an evaluation showing near-optimal performance relative to the model for the target precision.

2026-06-04T16:51:22Z Martin B. Marchioro Aske N. Raahauge Marc I. Løvenskjold Cosmin E. Oancea Stephen M. Watt http://arxiv.org/abs/2606.06344v1 Equivariant Neural Belief Propagation 2026-06-04T16:16:51Z

Probabilistic inference over spatially embedded variables requires beliefs that respect $SE(3)$ symmetry, yet existing equivariant networks produce only scalars and vectors -- not the rank-2 precision tensors needed for anisotropic uncertainty, and single-component messages collapse multi-modal energy landscapes to physically meaningless averages. We introduce Equivariant Neural Belief Propagation (ENBP), a factor-graph framework whose messages are equivariant Gaussian mixture models with sufficient statistics that transform exactly under $SE(3)$. Rank-2 precision matrices are synthesised via equivariant outer products, ingested through differentiable spectral decomposition, and kept tractable by a greedy KL-based mixture reduction that provably commutes with $SE(3)$. On GEOM-QM9 and GEOM-Drugs, ENBP achieves 98.9% conformational coverage at 0.090 $\mathring{A}$ error with sub-second latency -- over $100\times$ faster than diffusion baselines at higher accuracy. On multi-body robotic inference, vanilla loopy BP diverges at 15+ agents while ENBP converges with near-zero collision rates and machine-precision equivariance error (${\sim}10^{-7}$ vs.\ $10^{-1}$ for augmented baselines).

2026-06-04T16:16:51Z 18 pages Zehua Cheng Wei Dai Jiahao Sun http://arxiv.org/abs/2606.06136v1 A Finite Certificate for the Positive $n=9$ Vasc Inequality 2026-06-04T13:19:19Z

We prove the positive-real $n=9$ case of the Vasc cyclic inequality. The proof was obtained with human-guided assistance from the AI agent MechMath Agent Team: the human-readable part reduces the rational inequality to a homogeneous polynomial inequality, fixes a cyclic maximum, and parametrizes each sorted fixed-maximum cone by cumulative gaps; the finite part is a certificate covering all $8!=40320$ sorted cones. MechMath Agent Team generated the certificate verification workflow through Python tool calls, including the case split, verification programs, and terminal classifications. The published certificate has $36815$ coefficient leaves, $2236$ ordinary Polya multiplier leaves, and $1269$ AM-GM midpoint overlay leaves. Human authors audited the mathematical reductions and verification logic, and a separate artifact contains the certificate, an independent verifier, and a from-source rebuild route.

2026-06-04T13:19:19Z Dakai Guo Ruichen Qiu Yichuan Cao Ruyong Feng http://arxiv.org/abs/2606.05042v1 In-Context Graphical Inference 2026-06-03T16:04:00Z

Marginal inference in discrete graphical models forces a choice between exactness and scalability: exact algorithms are intractable for high-treewidth graphs, while iterative approximations (Belief Propagation, variational methods) sacrifice convergence guarantees on frustrated topologies. We argue that this dichotomy stems from a mismatched inductive bias: iterative methods abandon the sequential elimination structure that makes exact inference correct. We introduce In-Context Graphical Inference (ICG-I), an autoregressive Graph Transformer that restores this structure by mimicking Variable Elimination with learned, Tensor- Train-compressed intermediate factors, paired with a Dirichlet output layer and Weighted Conformal Prediction for calibrated, distribution-free coverage guarantees under topological shift. We prove that TT compression errors propagate at most lincarly through the autoregressive chain, that the Dirichlet-Multinomial loss is a proper scoring rule, and that WCP maintains coverage with a quantifiable degradation under estimated density ratios. We conducted intensive experiments to evaluate ICG-I and achieved state-of-the-art performance across all benchmarks. ICG-I reduces MAE from 0.041 (best baseline) to 0.020 on standard instances and achieves 0.048 on N=500 frustrated spin glasses where BP diverges entirely.

2026-06-03T16:04:00Z 19 Pages Zehua Cheng Wei Dai Jiahao Sun http://arxiv.org/abs/2606.05030v1 Imbuing Large Language Models with Bidirectional Logic for Robust Chain Repair 2026-06-03T15:58:48Z

Autoregressive chain-of-thought (CoT) reasoning in large language models (LLMs) is fundamentally forward-directed: each step conditions only on prior tokens. This unidirectional inductive bias renders even capable models susceptible to error snowballing, wherein a single logical or arithmetic mistake in an early step irreversibly corrupts the entire reasoning chain. We introduce Teleological Reasoning Infilling (\TRI{}), a training framework that endows decoder-only transformers with a native \emph{goal-conditioned bridging} capability. The key insight is to reframe erroneous reasoning segments as fill-in-the-middle (FIM) tasks: given a verified prefix premise $P$, a verified downstream milestone $S$, and the original query $Q$, the model must synthesise the logical bridge $M$ that connects $P$ to $S$ rigorously and completely. To achieve this with standard causal architectures, we introduce a Prefix-Suffix-Middle (PSM) sequence rearrangement with three non-overlapping sentinel tokens, enabling $M$ to attend to both $P$ and $S$ without any structural modification to the self-attention mechanism. Training proceeds in two stages: (i) Supervised Fine-Tuning (SFT) on symbolically verified $(P, S, M)$ triples extracted from formal mathematics corpora, and (ii) Direct Preference Optimisation (DPO) with a deterministic symbolic verifier (Lean 4 / Python) as the sole reward oracle, eliminating LLM-judge sycophancy. At inference, TRI operates as a surgical repair module within a dual-system loop: a causal draft model generates an initial trace, the verifier pinpoints failures, and TRI infills only the damaged segment, leaving verified sections intact. Comprehensive experiments on three benchmarks demonstrate that TRI achieves state-of-the-art performance across all tasks, while reducing per-problem token expenditure by 31.2%.

2026-06-03T15:58:48Z 25 Pages In Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2026 Zehua Cheng Wei Dai Jiahao Sun Thomas Lukasiewicz http://arxiv.org/abs/2606.04858v1 Integer points close to a transcendental curve: an algorithmic approach 2026-06-03T13:25:50Z

In this article, we propose an algorithmic approach to determine the integer points located near a transcendental curve. This approach is closely related to a celebrated work by Bombieri and Pila and to the so-called Coppersmith's method. We establish the underlying theoretical foundations, prove the algorithms, study their complexity and present practical experiments; we also compare our approach with previously existing ones. From a practical point of view, we focus on an instance of our general problem, called the Table Maker's Dilemma, whose solving makes it possible to evaluate a given function with correct rounding. Our experiments show a significant speedup. In particular, our results show that the development of a correctly rounded mathematical library for the binary128 format is now possible at a much smaller cost than with previously existing approaches.

2026-06-03T13:25:50Z Nicolas Brisebarre Guillaume Hanrot http://arxiv.org/abs/2601.23169v2 Names Don't Matter: Symbol-Invariant Transformer for Open-Vocabulary Learning 2026-06-02T13:42:55Z

Current neural architectures lack a principled way to handle interchangeable tokens, i.e., symbols that are semantically equivalent yet distinguishable, such as bound variables. As a result, models trained on fixed vocabularies often struggle to generalize to unseen symbols, even when the underlying semantics remain unchanged. We propose a novel Transformer-based mechanism that is provably invariant to the renaming of interchangeable tokens. Our approach employs parallel embedding streams to isolate the contribution of each interchangeable token in the input, combined with an aggregated attention mechanism that enables structured information sharing across streams. Experimental results confirm the theoretical guarantees of our method and demonstrate substantial performance gains on open-vocabulary tasks that require generalization to novel symbols. Project page: https://bu-depend-lab.github.io/Symbol-Invariant-Transformer/

2026-01-30T16:53:01Z ICML 2026 Poster (Camera-Ready Version) İlker Işık Wenchao Li http://arxiv.org/abs/2606.03031v1 AUDITFLOW: Executable Symbolic Environments for Structured Financial Reporting Verification 2026-06-02T02:14:42Z

Structured financial audit verification is difficult for language-model agents because correctness depends on structured evidence rather than text alone. A model must link reported facts to taxonomy concepts, traverse calculation or dimensional relations, and recompute expected values before applying an audit rule. We propose AuditFlow, a graph-grounded multi-agent framework that separates adaptive search from deterministic verification. AuditFlow builds a symbolic environment from a static US-GAAP taxonomy graph and a dynamic XBRL filing graph, and exposes it through typed tools for fact retrieval, taxonomy traversal, numerical checking, and rule evaluation. Two junior auditors inspect each case from regulatory and evidentiary views, while a senior auditor resolves disagreements and can request further investigation. The final reports are fused through evidential aggregation to produce an audit verdict, expected value, evidence trail, and trustworthiness score. On a FinAuditing-derived FinMR sample, AuditFlow reaches 82.09% joint audit accuracy under GPT-5.5, outperforming the strongest baseline by 14.93 points. Removing deterministic checks drops accuracy to 17.91%, showing that the symbolic environment performs the verification step that the model cannot reliably replace.

2026-06-02T02:14:42Z Yan Wang Xuguang Ai Jaisal Patel Xueqing Peng Fengran Mo Yupeng Cao Haohang Li Mingyu Cao Lingfei Qian Víctor Gutiérrez-Basulto http://arxiv.org/abs/2407.15510v2 Algebraic anti-unification 2026-06-01T14:13:08Z

Abstraction is key to human and artificial intelligence as it allows one to identify common structure in otherwise distinct objects or situations. Anti-unification (or generalization) is the branch of theoretical computer science and artificial intelligence that studies abstraction and has found applications in areas such as inductive logic programming, program synthesis, and analogy-making. To date, anti-unification has been studied almost exclusively from a syntactic perspective. In this paper, we initiate an algebraic (i.e.\ semantic) theory of anti-unification in the general setting of universal algebra, thereby extending anti-unification from term-based representations to arbitrary algebras and beyond equational theories. In particular, we introduce the notions of algebraic generalization ordering and minimally general generalization, establish basic structural properties, prove compatibility with homomorphisms and isomorphisms, and investigate computability in finite unary algebras and finite algebras via automata-theoretic methods.

2024-07-22T09:49:46Z Christian Antić http://arxiv.org/abs/2602.08885v5 Breaking the Simplification Bottleneck in Amortized Neural Symbolic Regression 2026-05-29T09:26:40Z

Symbolic regression (SR) aims to discover interpretable analytical expressions that accurately describe observed data. Amortized SR promises to be much more efficient than the predominant genetic programming SR methods, but currently struggles to scale to realistic scientific complexity. We find that a key obstacle is the lack of a fast reduction of equivalent expressions to a concise normalized form. Amortized SR has addressed this with general-purpose Computer Algebra Systems (CAS) like SymPy, but the high computational cost severely limits training and inference speed. We propose SimpliPy, a rule-based simplification engine achieving a 100-fold speed-up over SymPy at comparable quality. This enables substantial improvements in amortized SR, including scalability to much larger training sets, more efficient use of the per-expression token budget, and systematic training set decontamination with respect to equivalent test expressions. We demonstrate these advantages in our Flash-ANSR framework, which achieves much better accuracy than amortized baselines (NeSymReS, E2E) on the FastSRB benchmark. Moreover, it performs on par with state-of-the-art direct optimization (PySR) while recovering more concise rather than more complex expressions with increasing inference budget.

2026-02-09T16:47:00Z main text: 8 pages, 7 figures; appendix: 12 pages, 11 figures; code available at https://github.com/psaegert/simplipy and https://github.com/psaegert/flash-ansr; v2: Fixed rendering artifact in Figure 7; v3: Fixed Figure 3 title and formula; v4: Fixed Eq (1), example in App. M, Fig 13; v5: ICML 2026 Camera-Ready Version Paul Saegert Ullrich Köthe http://arxiv.org/abs/2601.16366v2 Post-Training Neural Network Pruning using Graph Curvature 2026-05-28T20:33:53Z

This paper provides a fresh view of the neural network (NN) pruning problem through the lens of graph theory. To achieve effective pruning, we aim to identify the main NN data flows and the corresponding NN connections that are most and least important for the performance of the full model. Unlike the standard approach to NN data flow analysis, which is based on information theory, we employ the notion of graph curvature, specifically Ollivier-Ricci curvature (ORC). ORC has been successfully used to identify important graph edges in various domains such as road traffic analysis, biological networks, and social networks. In particular, edges with negative ORC are considered bottlenecks and are therefore critical to the graph's overall connectivity, whereas positive-ORC edges are less essential. We use this intuition for NNs to (1) construct a graph induced by the NN structure and introduce the notion of neural curvature (NC) based on ORC; (2) calculate curvatures based on activation patterns for a set of input examples; and (3) demonstrate that NC can be used to rank edges according to their importance for overall NN functionality. We evaluate our method through pruning experiments on a variety of small and medium size models trained on three image datasets: MNIST, CIFAR-10, and CIFAR-100. The results indicate that our method can identify a larger number of unimportant edges compared to existing pruning methods.

2026-01-22T23:35:10Z Shuhang Tan Jayson Sia Paul Bogdan Radoslav Ivanov http://arxiv.org/abs/2605.04916v2 A Foundation Model for Zero-Shot Logical Rule Induction 2026-05-28T07:50:40Z

Inductive Logic Programming (ILP) learns interpretable logical rules from data. Existing methods are transductive: their learned parameters are bound to specific predicates and require retraining for each new task. We introduce Neural Rule Inducer (NRI), a pretrained model for zero-shot rule induction. Rather than encoding literal identities, NRI represents literals using domain-agnostic statistical properties such as class-conditional rates, entropy, and co-occurrence, which generalize across variable identities and counts without retraining. The model consists of a statistical encoder and a parallel slot-based decoder. Parallel decoding preserves the permutation invariance of logical disjunction; an autoregressive decoder would instead impose an arbitrary clause order. Product T-norm relaxation makes rule execution differentiable, allowing end-to-end training on prediction accuracy alone. We evaluate NRI on rule recovery, robustness to label noise and spurious correlations, and zero-shot transfer to real-world benchmarks, and we believe this work opens up the possibility of foundation models for symbolic reasoning. Code and the reference checkpoint are available at https://github.com/phuayj/neural-rule-inducer.

2026-05-06T13:45:31Z Camera-ready version accepted at IJCAI 2026, with full appendices Yin Jun Phua http://arxiv.org/abs/2604.23256v2 Architecture-Induced Recoverability Bias in Differentiable Symbolic Regression 2026-05-27T19:15:50Z

Symbolic regression aims to recover closed-form expressions from numerical data, but in differentiable symbolic regression the recovered expression depends not only on the grammar but also on the fixed architecture through which variables are routed during training. This is relevant to signal-processing settings in which closed-form models and interpretable nonlinear structure are useful. This architecture-specific effect has rarely been isolated directly, because existing comparisons often vary architecture together with operator family, grammar, or search procedure. Three depth-3 architectures are compared across twenty-four operator--shape--leaf combinations, holding operator family, grammar, and training protocol fixed as far as possible while varying the variable-routing architecture. Recovery changes from $0/64$ to $64/64$ trials on the same target under an architecture-plus-native-training-protocol comparison. The best architecture on one target is the worst on another, and trees with two equal-depth subtrees fail in every configuration tested ($0/3{,}776$). As a proof-of-concept mitigation, a small architecture set is trained and the hardened expression with the lowest held-out RMSE is selected. On the jointly-run subset, this improves recovery from $34.4\%$ for the only architecture present in all three configurations to $50.1\%$. On a Shockley diode target, the validation selector recovers cases missed by that baseline architecture, which by itself recovers $0/32$ seeds. Since the jointly-run subset contains only three configurations, the selector result is evidence that validation-based architecture selection is promising, not a complete benchmark. These results support treating architecture as a measurable design variable that should be reported, stress-tested, and selected using held-out validation rather than fixed a priori.

2026-04-25T11:32:09Z 6 pages, 4 figures, 3 tables; submitted to IEEE MLSP 2026 Chakshu Gupta Theodore J. LaGrow