https://arxiv.org/api/Do2ABz2YN+QBlym5ibBD5NBnUSg2026-06-13T11:15:51Z31381515http://arxiv.org/abs/2606.07152v1A Data-Free Symbolic Regression Approach for Solving Equations2026-06-05T11:09:23ZMany equations arising in science currently cannot be solved by available analytical techniques and are therefore solved numerically, without yielding explicit symbolic expressions. Existing symbolic regression approaches can recover symbolic expressions, but require training data obtained from the underlying process, rather than the governing equation alone. We propose the Symbolic Equation Solver (SES), a framework that formulates equation solving as an optimization problem over differentiable symbolic models. SES constructs its objective from the equation together with initial or boundary conditions, eliminating the need for paired input-output data. The learned model is expressed in explicit symbolic form, enabling further analysis. We evaluate SES on representative algebraic and differential equations, including a system of algebraic equations, an equation with transcendental terms, an ordinary differential equation, and partial differential equations with different initial or boundary conditions. Across these settings, SES recovers compact symbolic expressions that match the corresponding analytical solutions.2026-06-05T11:09:23ZSergei GarmaevVinay SharmaOlga Finkhttp://arxiv.org/abs/2604.23873v2Enhanced CAD-Based Quantifier Elimination With Multiple Equational Constraints2026-06-05T08:24:06ZThis paper presents two enhancements to cylindrical algebraic decomposition (CAD) based quantifier elimination (QE) for cases in which multiple equational constraints are present in the given input formula $φ^*$. The first enhancement provides more detail in the output when there is a conceptual partition of the set of variables of $φ^*$ into parameters and unknowns. In such cases, we describe how to partition the parameter space so that: (1) in each open set of the partition the number $ν$ of associated unknowns is a finite constant or is infinite; and (2) for each such open set for which $ν$ is finite, an expression for the unknowns in terms of the parameters is provided. The second enhancement is an efficiency gain achievable in certain situations. Indeed, when certain conditions are met, the second CAD equational projection step can be reduced more significantly than is supported by the prior existing theory. Relevant theorems and worked examples for both enhancements are provided. Application areas include approximation theory, cuspidal manipulator classification, and biological/chemical systems.2026-04-26T20:32:37ZPreliminary Draft; updated to reference related prior workJames H. DavenportMatthew EnglandScott McCallumhttp://arxiv.org/abs/2606.06386v1On GPU Implementation for Multi-Precision Integer Division2026-06-04T16:51:22ZThis paper presents the issues arising in implementing a fast integer division algorithm on general purpose GPUs. The algorithm uses a Newton iteration based on the shifted inverse operation, keeping all arithmetic in the integer domain and relying on data-parallel operators. The principal contribution is an efficient GPU/CUDA implementation for integer precisions from $2^{15}$ to $2^{18}$ -- sizes not supported by \cgbn{} division. We propose algorithmic refinements, define a cost model in terms of multiplications, build on prefix sums and previous work on multi-precision multiplication, and present an evaluation showing near-optimal performance relative to the model for the target precision.2026-06-04T16:51:22ZMartin B. MarchioroAske N. RaahaugeMarc I. LøvenskjoldCosmin E. OanceaStephen M. Watthttp://arxiv.org/abs/2606.06344v1Equivariant Neural Belief Propagation2026-06-04T16:16:51ZProbabilistic inference over spatially embedded variables requires beliefs that respect $SE(3)$ symmetry, yet existing equivariant networks produce only scalars and vectors -- not the rank-2 precision tensors needed for anisotropic uncertainty, and single-component messages collapse multi-modal energy landscapes to physically meaningless averages. We introduce Equivariant Neural Belief Propagation (ENBP), a factor-graph framework whose messages are equivariant Gaussian mixture models with sufficient statistics that transform exactly under $SE(3)$. Rank-2 precision matrices are synthesised via equivariant outer products, ingested through differentiable spectral decomposition, and kept tractable by a greedy KL-based mixture reduction that provably commutes with $SE(3)$. On GEOM-QM9 and GEOM-Drugs, ENBP achieves 98.9% conformational coverage at 0.090 $\mathring{A}$ error with sub-second latency -- over $100\times$ faster than diffusion baselines at higher accuracy. On multi-body robotic inference, vanilla loopy BP diverges at 15+ agents while ENBP converges with near-zero collision rates and machine-precision equivariance error (${\sim}10^{-7}$ vs.\ $10^{-1}$ for augmented baselines).2026-06-04T16:16:51Z18 pagesZehua ChengWei DaiJiahao Sunhttp://arxiv.org/abs/2606.06136v1A Finite Certificate for the Positive $n=9$ Vasc Inequality2026-06-04T13:19:19ZWe prove the positive-real $n=9$ case of the Vasc cyclic inequality. The proof was obtained with human-guided assistance from the AI agent MechMath Agent Team: the human-readable part reduces the rational inequality to a homogeneous polynomial inequality, fixes a cyclic maximum, and parametrizes each sorted fixed-maximum cone by cumulative gaps; the finite part is a certificate covering all $8!=40320$ sorted cones. MechMath Agent Team generated the certificate verification workflow through Python tool calls, including the case split, verification programs, and terminal classifications. The published certificate has $36815$ coefficient leaves, $2236$ ordinary Polya multiplier leaves, and $1269$ AM-GM midpoint overlay leaves. Human authors audited the mathematical reductions and verification logic, and a separate artifact contains the certificate, an independent verifier, and a from-source rebuild route.2026-06-04T13:19:19ZDakai GuoRuichen QiuYichuan CaoRuyong Fenghttp://arxiv.org/abs/2606.05042v1In-Context Graphical Inference2026-06-03T16:04:00ZMarginal inference in discrete graphical models forces a choice between exactness and scalability: exact algorithms are intractable for high-treewidth graphs, while iterative approximations (Belief Propagation, variational methods) sacrifice convergence guarantees on frustrated topologies. We argue that this dichotomy stems from a mismatched inductive bias: iterative methods abandon the sequential elimination structure that makes exact inference correct. We introduce In-Context Graphical Inference (ICG-I), an autoregressive Graph Transformer that restores this structure by mimicking Variable Elimination with learned, Tensor- Train-compressed intermediate factors, paired with a Dirichlet output layer and Weighted Conformal Prediction for calibrated, distribution-free coverage guarantees under topological shift. We prove that TT compression errors propagate at most lincarly through the autoregressive chain, that the Dirichlet-Multinomial loss is a proper scoring rule, and that WCP maintains coverage with a quantifiable degradation under estimated density ratios. We conducted intensive experiments to evaluate ICG-I and achieved state-of-the-art performance across all benchmarks. ICG-I reduces MAE from 0.041 (best baseline) to 0.020 on standard instances and achieves 0.048 on N=500 frustrated spin glasses where BP diverges entirely.2026-06-03T16:04:00Z19 PagesZehua ChengWei DaiJiahao Sunhttp://arxiv.org/abs/2606.05030v1Imbuing Large Language Models with Bidirectional Logic for Robust Chain Repair2026-06-03T15:58:48ZAutoregressive chain-of-thought (CoT) reasoning in large language models (LLMs) is fundamentally forward-directed: each step conditions only on prior tokens. This unidirectional inductive bias renders even capable models susceptible to error snowballing, wherein a single logical or arithmetic mistake in an early step irreversibly corrupts the entire reasoning chain. We introduce Teleological Reasoning Infilling (\TRI{}), a training framework that endows decoder-only transformers with a native \emph{goal-conditioned bridging} capability. The key insight is to reframe erroneous reasoning segments as fill-in-the-middle (FIM) tasks: given a verified prefix premise $P$, a verified downstream milestone $S$, and the original query $Q$, the model must synthesise the logical bridge $M$ that connects $P$ to $S$ rigorously and completely. To achieve this with standard causal architectures, we introduce a Prefix-Suffix-Middle (PSM) sequence rearrangement with three non-overlapping sentinel tokens, enabling $M$ to attend to both $P$ and $S$ without any structural modification to the self-attention mechanism. Training proceeds in two stages: (i) Supervised Fine-Tuning (SFT) on symbolically verified $(P, S, M)$ triples extracted from formal mathematics corpora, and (ii) Direct Preference Optimisation (DPO) with a deterministic symbolic verifier (Lean 4 / Python) as the sole reward oracle, eliminating LLM-judge sycophancy. At inference, TRI operates as a surgical repair module within a dual-system loop: a causal draft model generates an initial trace, the verifier pinpoints failures, and TRI infills only the damaged segment, leaving verified sections intact. Comprehensive experiments on three benchmarks demonstrate that TRI achieves state-of-the-art performance across all tasks, while reducing per-problem token expenditure by 31.2%.2026-06-03T15:58:48Z25 PagesIn Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2026Zehua ChengWei DaiJiahao SunThomas Lukasiewiczhttp://arxiv.org/abs/2606.04858v1Integer points close to a transcendental curve: an algorithmic approach2026-06-03T13:25:50ZIn this article, we propose an algorithmic approach to determine the integer points located near a transcendental curve. This approach is closely related to a celebrated work by Bombieri and Pila and to the so-called Coppersmith's method. We establish the underlying theoretical foundations, prove the algorithms, study their complexity and present practical experiments; we also compare our approach with previously existing ones. From a practical point of view, we focus on an instance of our general problem, called the Table Maker's Dilemma, whose solving makes it possible to evaluate a given function with correct rounding. Our experiments show a significant speedup. In particular, our results show that the development of a correctly rounded mathematical library for the binary128 format is now possible at a much smaller cost than with previously existing approaches.2026-06-03T13:25:50ZNicolas BrisebarreGuillaume Hanrothttp://arxiv.org/abs/2601.23169v2Names Don't Matter: Symbol-Invariant Transformer for Open-Vocabulary Learning2026-06-02T13:42:55ZCurrent neural architectures lack a principled way to handle interchangeable tokens, i.e., symbols that are semantically equivalent yet distinguishable, such as bound variables. As a result, models trained on fixed vocabularies often struggle to generalize to unseen symbols, even when the underlying semantics remain unchanged. We propose a novel Transformer-based mechanism that is provably invariant to the renaming of interchangeable tokens. Our approach employs parallel embedding streams to isolate the contribution of each interchangeable token in the input, combined with an aggregated attention mechanism that enables structured information sharing across streams. Experimental results confirm the theoretical guarantees of our method and demonstrate substantial performance gains on open-vocabulary tasks that require generalization to novel symbols. Project page: https://bu-depend-lab.github.io/Symbol-Invariant-Transformer/2026-01-30T16:53:01ZICML 2026 Poster (Camera-Ready Version)İlker IşıkWenchao Lihttp://arxiv.org/abs/2606.03031v1AUDITFLOW: Executable Symbolic Environments for Structured Financial Reporting Verification2026-06-02T02:14:42ZStructured financial audit verification is difficult for language-model agents because correctness depends on structured evidence rather than text alone. A model must link reported facts to taxonomy concepts, traverse calculation or dimensional relations, and recompute expected values before applying an audit rule. We propose AuditFlow, a graph-grounded multi-agent framework that separates adaptive search from deterministic verification. AuditFlow builds a symbolic environment from a static US-GAAP taxonomy graph and a dynamic XBRL filing graph, and exposes it through typed tools for fact retrieval, taxonomy traversal, numerical checking, and rule evaluation. Two junior auditors inspect each case from regulatory and evidentiary views, while a senior auditor resolves disagreements and can request further investigation. The final reports are fused through evidential aggregation to produce an audit verdict, expected value, evidence trail, and trustworthiness score. On a FinAuditing-derived FinMR sample, AuditFlow reaches 82.09% joint audit accuracy under GPT-5.5, outperforming the strongest baseline by 14.93 points. Removing deterministic checks drops accuracy to 17.91%, showing that the symbolic environment performs the verification step that the model cannot reliably replace.2026-06-02T02:14:42ZYan WangXuguang AiJaisal PatelXueqing PengFengran MoYupeng CaoHaohang LiMingyu CaoLingfei QianVíctor Gutiérrez-Basultohttp://arxiv.org/abs/2407.15510v2Algebraic anti-unification2026-06-01T14:13:08ZAbstraction is key to human and artificial intelligence as it allows one to identify common structure in otherwise distinct objects or situations. Anti-unification (or generalization) is the branch of theoretical computer science and artificial intelligence that studies abstraction and has found applications in areas such as inductive logic programming, program synthesis, and analogy-making. To date, anti-unification has been studied almost exclusively from a syntactic perspective. In this paper, we initiate an algebraic (i.e.\ semantic) theory of anti-unification in the general setting of universal algebra, thereby extending anti-unification from term-based representations to arbitrary algebras and beyond equational theories. In particular, we introduce the notions of algebraic generalization ordering and minimally general generalization, establish basic structural properties, prove compatibility with homomorphisms and isomorphisms, and investigate computability in finite unary algebras and finite algebras via automata-theoretic methods.2024-07-22T09:49:46ZChristian Antićhttp://arxiv.org/abs/2602.08885v5Breaking the Simplification Bottleneck in Amortized Neural Symbolic Regression2026-05-29T09:26:40ZSymbolic regression (SR) aims to discover interpretable analytical expressions that accurately describe observed data. Amortized SR promises to be much more efficient than the predominant genetic programming SR methods, but currently struggles to scale to realistic scientific complexity. We find that a key obstacle is the lack of a fast reduction of equivalent expressions to a concise normalized form. Amortized SR has addressed this with general-purpose Computer Algebra Systems (CAS) like SymPy, but the high computational cost severely limits training and inference speed. We propose SimpliPy, a rule-based simplification engine achieving a 100-fold speed-up over SymPy at comparable quality. This enables substantial improvements in amortized SR, including scalability to much larger training sets, more efficient use of the per-expression token budget, and systematic training set decontamination with respect to equivalent test expressions. We demonstrate these advantages in our Flash-ANSR framework, which achieves much better accuracy than amortized baselines (NeSymReS, E2E) on the FastSRB benchmark. Moreover, it performs on par with state-of-the-art direct optimization (PySR) while recovering more concise rather than more complex expressions with increasing inference budget.2026-02-09T16:47:00Zmain text: 8 pages, 7 figures; appendix: 12 pages, 11 figures; code available at https://github.com/psaegert/simplipy and https://github.com/psaegert/flash-ansr; v2: Fixed rendering artifact in Figure 7; v3: Fixed Figure 3 title and formula; v4: Fixed Eq (1), example in App. M, Fig 13; v5: ICML 2026 Camera-Ready VersionPaul SaegertUllrich Köthehttp://arxiv.org/abs/2601.16366v2Post-Training Neural Network Pruning using Graph Curvature2026-05-28T20:33:53ZThis paper provides a fresh view of the neural network (NN) pruning problem through the lens of graph theory. To achieve effective pruning, we aim to identify the main NN data flows and the corresponding NN connections that are most and least important for the performance of the full model. Unlike the standard approach to NN data flow analysis, which is based on information theory, we employ the notion of graph curvature, specifically Ollivier-Ricci curvature (ORC). ORC has been successfully used to identify important graph edges in various domains such as road traffic analysis, biological networks, and social networks. In particular, edges with negative ORC are considered bottlenecks and are therefore critical to the graph's overall connectivity, whereas positive-ORC edges are less essential. We use this intuition for NNs to (1) construct a graph induced by the NN structure and introduce the notion of neural curvature (NC) based on ORC; (2) calculate curvatures based on activation patterns for a set of input examples; and (3) demonstrate that NC can be used to rank edges according to their importance for overall NN functionality. We evaluate our method through pruning experiments on a variety of small and medium size models trained on three image datasets: MNIST, CIFAR-10, and CIFAR-100. The results indicate that our method can identify a larger number of unimportant edges compared to existing pruning methods.2026-01-22T23:35:10ZShuhang TanJayson SiaPaul BogdanRadoslav Ivanovhttp://arxiv.org/abs/2605.04916v2A Foundation Model for Zero-Shot Logical Rule Induction2026-05-28T07:50:40ZInductive Logic Programming (ILP) learns interpretable logical rules from data. Existing methods are transductive: their learned parameters are bound to specific predicates and require retraining for each new task. We introduce Neural Rule Inducer (NRI), a pretrained model for zero-shot rule induction. Rather than encoding literal identities, NRI represents literals using domain-agnostic statistical properties such as class-conditional rates, entropy, and co-occurrence, which generalize across variable identities and counts without retraining. The model consists of a statistical encoder and a parallel slot-based decoder. Parallel decoding preserves the permutation invariance of logical disjunction; an autoregressive decoder would instead impose an arbitrary clause order. Product T-norm relaxation makes rule execution differentiable, allowing end-to-end training on prediction accuracy alone. We evaluate NRI on rule recovery, robustness to label noise and spurious correlations, and zero-shot transfer to real-world benchmarks, and we believe this work opens up the possibility of foundation models for symbolic reasoning. Code and the reference checkpoint are available at https://github.com/phuayj/neural-rule-inducer.2026-05-06T13:45:31ZCamera-ready version accepted at IJCAI 2026, with full appendicesYin Jun Phuahttp://arxiv.org/abs/2604.23256v2Architecture-Induced Recoverability Bias in Differentiable Symbolic Regression2026-05-27T19:15:50ZSymbolic regression aims to recover closed-form expressions from numerical data, but in differentiable symbolic regression the recovered expression depends not only on the grammar but also on the fixed architecture through which variables are routed during training. This is relevant to signal-processing settings in which closed-form models and interpretable nonlinear structure are useful. This architecture-specific effect has rarely been isolated directly, because existing comparisons often vary architecture together with operator family, grammar, or search procedure. Three depth-3 architectures are compared across twenty-four operator--shape--leaf combinations, holding operator family, grammar, and training protocol fixed as far as possible while varying the variable-routing architecture. Recovery changes from $0/64$ to $64/64$ trials on the same target under an architecture-plus-native-training-protocol comparison. The best architecture on one target is the worst on another, and trees with two equal-depth subtrees fail in every configuration tested ($0/3{,}776$). As a proof-of-concept mitigation, a small architecture set is trained and the hardened expression with the lowest held-out RMSE is selected. On the jointly-run subset, this improves recovery from $34.4\%$ for the only architecture present in all three configurations to $50.1\%$. On a Shockley diode target, the validation selector recovers cases missed by that baseline architecture, which by itself recovers $0/32$ seeds. Since the jointly-run subset contains only three configurations, the selector result is evidence that validation-based architecture selection is promising, not a complete benchmark. These results support treating architecture as a measurable design variable that should be reported, stress-tested, and selected using held-out validation rather than fixed a priori.2026-04-25T11:32:09Z6 pages, 4 figures, 3 tables; submitted to IEEE MLSP 2026Chakshu GuptaTheodore J. LaGrow