https://arxiv.org/api/MsHKXrsGJ7eP521yheyHlppiSJE 2026-06-23T17:49:26Z 9934 1065 15 http://arxiv.org/abs/2510.14558v1 HITrees: Higher-Order Interaction Trees 2025-10-16T11:06:04Z

Recent years have witnessed the rise of compositional semantics as a foundation for formal verification of complex systems. In particular, interaction trees have emerged as a popular denotational semantics. Interaction trees achieve compositionality by providing a reusable library of effects. However, their notion of effects does not support higher-order effects, i.e., effects that take or return monadic computations. Such effects are essential to model complex semantic features like parallel composition and call/cc. We introduce Higher-Order Interaction Trees (HITrees), the first variant of interaction trees to support higher-order effects in a non-guarded type theory. HITrees accomplish this through two key techniques: first, by designing the notion of effects such that the fixpoints of effects with higher-order input can be expressed as inductive types inside the type theory; and second, using defunctionalization to encode higher-order outputs into a first-order representation. We implement HITrees in the Lean proof assistant, accompanied by a comprehensive library of effects including concurrency, recursion, and call/cc. Furthermore, we provide two interpretations of HITrees, as state transition systems and as monadic programs. To demonstrate the expressiveness of HITrees, we apply them to define the semantics of a language with parallel composition and call/cc.

2025-10-16T11:06:04Z Amir Mohammad Fadaei Ayyam Michael Sammler http://arxiv.org/abs/2510.12269v3 Tensor Logic: The Language of AI 2025-10-16T07:40:28Z

Progress in AI is hindered by the lack of a programming language with all the requisite features. Libraries like PyTorch and TensorFlow provide automatic differentiation and efficient GPU implementation, but are additions to Python, which was never intended for AI. Their lack of support for automated reasoning and knowledge acquisition has led to a long and costly series of hacky attempts to tack them on. On the other hand, AI languages like LISP and Prolog lack scalability and support for learning. This paper proposes tensor logic, a language that solves these problems by unifying neural and symbolic AI at a fundamental level. The sole construct in tensor logic is the tensor equation, based on the observation that logical rules and Einstein summation are essentially the same operation, and all else can be reduced to them. I show how to elegantly implement key forms of neural, symbolic and statistical AI in tensor logic, including transformers, formal reasoning, kernel machines and graphical models. Most importantly, tensor logic makes new directions possible, such as sound reasoning in embedding space. This combines the scalability and learnability of neural networks with the reliability and transparency of symbolic reasoning, and is potentially a basis for the wider adoption of AI.

2025-10-14T08:24:08Z 17 pages, 0 figures Pedro Domingos http://arxiv.org/abs/2510.14279v1 Caruca: Effective and Efficient Specification Mining for Opaque Software Components 2025-10-16T04:02:45Z

A wealth of state-of-the-art systems demonstrate impressive improvements in performance, security, and reliability on programs composed of opaque components, such as Unix shell commands. To reason about commands, these systems require partial specifications. However, creating such specifications is a manual, laborious, and error-prone process, limiting the practicality of these systems. This paper presents Caruca, a system for automatic specification mining for opaque commands. To overcome the challenge of language diversity across commands, Caruca first instruments a large language model to translate a command's user-facing documentation into a structured invocation syntax. Using this representation, Caruca explores the space of syntactically valid command invocations and execution environments. Caruca concretely executes each command-environment pair, interposing at the system-call and filesystem level to extract key command properties such as parallelizability and filesystem pre- and post-conditions. These properties can be exported in multiple specification formats and are immediately usable by existing systems. Applying Caruca across 60 GNU Coreutils, POSIX, and third-party commands across several specification-dependent systems shows that Caruca generates correct specifications for all but one case, completely eliminating manual effort from the process and currently powering the full specifications for a state-of-the-art static analysis tool.

2025-10-16T04:02:45Z Evangelos Lamprou Seong-Heon Jung Mayank Keoliya Lukas Lazarek Konstantinos Kallas Michael Greenberg Nikos Vasilakis http://arxiv.org/abs/2504.18920v3 The Algebra of Patterns (Extended Version) 2025-10-15T13:33:29Z

Pattern matching is a popular feature in functional, imperative and object-oriented programming languages. Language designers should therefore invest effort in a good design for pattern matching. Most languages choose a first-match semantics for pattern matching; that is, clauses are tried in the order in which they appear in the program until the first one matches. As a consequence, the order in which the clauses appear cannot be arbitrarily changed, which results in a less declarative programming model. The declarative alternative to this is an order-independent semantics for pattern matching, which is not implemented in most programming languages since it requires more verbose patterns. The reason for this verbosity is that the syntax of patterns is usually not expressive enough to express the complement of a pattern. In this paper, we show a principled way to make order-independent pattern matching practical. Our solution consists of two parts: First, we introduce a boolean algebra of patterns which can express the complement of a pattern. Second, we introduce default clauses to pattern matches. These default clauses capture the essential idea of a fallthrough case without sacrificing the property of order-independence.

2025-04-26T13:33:07Z This revision makes it clear that chapter 5 of the book was written by Philip Wadler David Binder Lean Ermantraut 10.4230/LIPIcs.ECOOP.2025.2 http://arxiv.org/abs/2510.13426v1 Fast Trigonometric Functions using the RLIBM Approach 2025-10-15T11:24:14Z

This paper describes our experience developing polynomial approximations for trigonometric functions that produce correctly rounded results for multiple representations and rounding modes using the RLIBM approach. A key challenge with trigonometric functions concerns range reduction with "pi", which reduces a given input in the domain of a 32-bit float to a small domain. Any rounding error in the value of "pi" is amplified during range reduction, which can result in wrong results. We describe our experience implementing fast range reduction techniques that maintain a large number of bits of "pi" both with floating-point and integer computations. The resulting implementations for trigonometric functions are fast and produce correctly rounded results for all inputs for multiple representations up to 32-bits with a single implementation.

2025-10-15T11:24:14Z In Proceedings VSS 2025, arXiv:2510.12314 EPTCS 432, 2025, pp. 76-89 Sehyeok Park Rutgers University Santosh Nagarakatte Rutgers University 10.4204/EPTCS.432.9 http://arxiv.org/abs/2510.13236v1 Extensibility in Programming Languages: An overview 2025-10-15T07:40:06Z

I here conduct an exploration of programming language extensibility, making an argument for an often overlooked component of conventional language design. Now, this is not a technical detailing of these components, rather, I attempt to provide an overview as I myself have lacked during my time investigating programming languages. Thus, read this as an introduction to the magical world of extensibility. Through a literature review, I identify key extensibility themes - Macros, Modules, Types, and Reflection - highlighting diverse strategies for fostering extensibility. The analysis extends to cross-theme properties such as Parametricism and First-class citizen behaviour, introducing layers of complexity by highlighting the importance of customizability and flexibility in programming language constructs. By outlining these facets of existing programming languages and research, I aim to inspire future language designers to assess and consider the extensibility of their creations critically.

2025-10-15T07:40:06Z Sebastian mateos Nicolajsen http://arxiv.org/abs/2510.13082v1 Imperative Quantum Programming with Ownership and Borrowing in Guppy 2025-10-15T01:54:12Z

Linear types enforce no-cloning and no-deleting theorems in functional quantum programming. However, in imperative quantum programming, they have not gained widespread adoption. This work aims to develop a quantum type system that combines ergonomic linear typing with imperative semantics and maintains safety guarantees. All ideas presented here have been implemented in Quantinuum's Guppy programming language.

2025-10-15T01:54:12Z Presented at the Fifth International Workshop on Programming Languages for Quantum Computing (PLanQC 2025) Mark Koch Agustín Borgna Craig Roy Alan Lawrence Kartik Singhal Seyon Sivarajah Ross Duncan http://arxiv.org/abs/2410.02232v2 The Long Way to Deforestation (Technical Report): A Type Inference and Elaboration Technique for Removing Intermediate Data Structures 2025-10-14T20:51:10Z

Deforestation is a compiler optimization that removes intermediate data structure allocations from functional programs to improve their efficiency. This is an old idea, but previous approaches have proved limited or impractical: they either only worked on compositions of predefined combinators (shortcut fusion), or involved the aggressive unfolding of recursive definitions until a depth limit was reached or a reoccurring pattern was found to tie the recursive knot, resulting in impractical algorithmic complexity and large amounts of code duplication. We present Lumberhack, a general-purpose deforestation approach for purely functional call-by-value programs. Lumberhack uses subtype inference to reason about data structure production and consumption and uses an elaboration pass to fuse the corresponding recursive definitions. It fuses large classes of mutually recursive definitions while avoiding much of the unproductive (and sometimes counter-productive) code duplication inherent in previous approaches. We prove the soundness of Lumberhack using logical relations and experimentally demonstrate significant speedups in the standard nofib benchmark suite.

2024-10-03T06:03:40Z This is the technical report version of the paper published at ICFP 2024; v2 completes the proof for inference system Proceedings of the ACM on Programming Languages, Volume 8, Issue ICFP (August 2024) Yijia Chen Lionel Parreaux 10.1145/3674634 http://arxiv.org/abs/2510.12702v1 Beyond Postconditions: Can Large Language Models infer Formal Contracts for Automatic Software Verification? 2025-10-14T16:37:39Z

Automatic software verifiers have become increasingly effective at the task of checking software against (formal) specifications. Yet, their adoption in practice has been hampered by the lack of such specifications in real world code. Large Language Models (LLMs) have shown promise in inferring formal postconditions from natural language hints embedded in code such as function names, comments or documentation. Using the generated postconditions as specifications in a subsequent verification, however, often leads verifiers to suggest invalid inputs, hinting at potential issues that ultimately turn out to be false alarms. To address this, we revisit the problem of specification inference from natural language in the context of automatic software verification. In the process, we introduce NL2Contract, the task of employing LLMs to translate informal natural language into formal functional contracts, consisting of postconditions as well as preconditions. We introduce metrics to validate and compare different NL2Contract approaches, using soundness, bug discriminative power of the generated contracts and their usability in the context of automatic software verification as key metrics. We evaluate NL2Contract with different LLMs and compare it to the task of postcondition generation nl2postcond. Our evaluation shows that (1) LLMs are generally effective at generating functional contracts sound for all possible inputs, (2) the generated contracts are sufficiently expressive for discriminating buggy from correct behavior, and (3) verifiers supplied with LLM inferred functional contracts produce fewer false alarms than when provided with postconditions alone. Further investigations show that LLM inferred preconditions generally align well with developers intentions which allows us to use automatic software verifiers to catch real-world bugs.

2025-10-14T16:37:39Z under submission Cedric Richter Heike Wehrheim http://arxiv.org/abs/2510.12582v1 GUPPY: Pythonic Quantum-Classical Programming 2025-10-14T14:39:23Z

We present ongoing work on Guppy, a domain-specific language embedded in Python that allows users to write high-level hybrid quantum programs with complex control flow in Pythonic syntax, aiming to run them on actual quantum hardware.

2025-10-14T14:39:23Z Presented at the Fourth International Workshop on Programming Languages for Quantum Computing (PLanQC 2024) Mark Koch Alan Lawrence Kartik Singhal Seyon Sivarajah Ross Duncan http://arxiv.org/abs/2510.12297v1 Ground Stratification for a Logic of Definitions with Induction 2025-10-14T09:02:18Z

The logic underlying the Abella proof assistant includes mechanisms for interpreting atomic predicates through fixed point definitions that can additionally be treated inductively or co-inductively. However, the original formulation of the logic includes a strict stratification condition on definitions that is too restrictive for some applications such as those that use a logical relations based approach to semantic equivalence. Tiu has shown how this restriction can be eased by utilizing a weaker notion referred to as ground stratification. Tiu's results were limited to a version of the logic that does not treat inductive definitions. We show here that they can be extended to cover such definitions. While our results are obtained by using techniques that have been previously deployed in related ways in this context, their use is sensitive to the particular way in which we generalize the logic. In particular, although ground stratification may be used with arbitrary fixed-point definitions, we show that weakening stratification to this form for inductive definitions leads to inconsistency. The particular generalization we describe accords well with the way logical relations are used in practice. Our results are also a intermediate step to building a more flexible form for definitions into the full logic underlying Abella, which additionally includes co-induction, generic quantification, and a mechanism referred to as nominal abstraction for analyzing occurrences of objects in terms that are governed by generic quantifiers.

2025-10-14T09:02:18Z In Proceedings LFMTP 2025, arXiv:2510.11199 EPTCS 431, 2025, pp. 1-16 Nathan Guermond University of Minnesota Gopalan Nadathur University of Minnesota 10.4204/EPTCS.431.1 http://arxiv.org/abs/2510.12295v1 Operational methods in semantics 2025-10-14T08:56:28Z

The focus of these lecture notes is on abstract models and basic ideas and results that relate to the operational semantics of programming languages largely conceived. The approach is to start with an abstract description of the computation steps of programs and then to build on top semantic equivalences, specification languages, and static analyses. While other approaches to the semantics of programming languages are possible, it appears that the operational one is particularly effective in that it requires a moderate level of mathematical sophistication and scales reasonably well to a large variety of programming features. In practice, operational semantics is a suitable framework to build portable language implementations and to specify and test program properties. It is also used routinely to tackle more ambitious tasks such as proving the correctness of a compiler or a static analyzer.

2025-10-14T08:56:28Z Roberto M. Amadio http://arxiv.org/abs/2501.04183v3 Decompiling for Constant-Time Analysis 2025-10-14T08:35:32Z

Cryptographic libraries are a main target of timing side-channel attacks. A practical means to protect against these attacks is to adhere to the constant-time (CT) policy. However, it is hard to write constant-time code, and even constant-time code can be turned vulnerable by mainstream compilers. So how can we verify that binary code is constant-time? The obvious answer is to use binary-level CT tools. To do so, a common approach is to use decompilers or lifters as a front-end for CT analysis tools operating on source code or IR. Unfortunately, this approach is problematic with current decompilers. To illustrate this fact, we use the recent Clangover vulnerability and other constructed examples to show that five popular decompilers eliminate CT violations, rendering them not applicable with the approach. In this paper, we develop foundations to asses whether a decompiler is fit for the Decompile-then-Analyze approach. We propose CT transparency, which states that a transformation neither eliminates nor introduces CT violations, and a general method for proving that a program transformation is CT transparent. Then, we build CT-RetDec, a CT analysis tool based on a modified version of the LLVM-based decompiler RetDec. We evaluate CT-RetDec on a benchmark of real-world vulnerabilities in binaries, and show that the modifications had significant impact on CT-RetDec's performance. As a contribution of independent interest, we found that popular tools for binary-level CT analysis rely on decompiler-like transformations before analysis. We show that two such tools employ transformations that are not CT transparent, and, consequently, that they incorrectly accept non-CT programs. While our examples are very specific and do not invalidate the general approach of these tools, we advocate that tool developers counter such potential issues by proving the transparency of such transformations.

2025-01-07T23:29:01Z Santiago Arranz-Olmos Gilles Barthe Lionel Blatter Youcef Bouzid Sören van der Wall Zhiyuan Zhang http://arxiv.org/abs/2510.12131v1 Functional Reasoning for Distributed Systems with Failures 2025-10-14T04:15:51Z

Distributed system theory literature often argues for correctness using an informal, Hoare-like style of reasoning. While these arguments are intuitive, they have not all been foolproof, and whether they directly correspond to formal proofs is in question. We formally ground this kind of reasoning and connect it to standard formal approaches through language design and meta-analysis, which leads to a functional style of compositional formal reasoning for a class of distributed systems, including cases involving Byzantine faults. The core of our approach is twin languages: Sync and Async, which formalize the insight from distributed system theory that an asynchronous system can be reduced to a synchronous system for more straightforward reasoning under certain conditions. Sync describes a distributed system as a single, synchronous, data-parallel program. It restricts programs syntactically and has a functional denotational semantics suitable for Hoare-style formal reasoning. Async models a distributed system as a collection of interacting monadic programs, one for each non-faulty node in the system. It has a standard trace-based operational semantics, modeling asynchrony with interleaving. Sync compiles to Async and can then be extracted to yield executable code. We prove that any safety property proven for a Sync program in its denotational semantics is preserved in the operational semantics of its compiled Async programs. We implement the twin languages in Rocq and verify the safety properties of two fault-tolerant consensus protocols: BOSCO and SeqPaxos.

2025-10-14T04:15:51Z Haobin Ni Robbert van Renesse Greg Morrisett http://arxiv.org/abs/2312.07946v2 Incremental Computation: What Is the Essence? 2025-10-13T19:55:33Z

Incremental computation aims to compute more efficiently on changed input by reusing previously computed results. We give a high-level overview of works on incremental computation, and highlight the essence underlying all of them, which we call incrementalization -- the discrete counterpart of differentiation in calculus. We present the gist of a systematic method for incrementalization, and a systematic method centered around it -- called Iterate-Incrementalize-Implement -- for program design and optimization, as well as algorithm design and optimization. We illustrate the methods with example applications in arithmetic computations, recursive functions, graph analysis, and distributed algorithms. At a meta-level, with historical contexts and for future directions, we stress the power of high-level data, control, and module abstractions in developing new and better algorithms and programs as well as their precise complexities.

2023-12-13T07:51:21Z v2: Invited and to appear in Foundations and Trends in Programming Languages, Now Publishers v1: Incremental computation: What is the essence? (Invited contribution). In PEPM 2024: ACM SIGPLAN Intl Workshop on Partial Evaluation and Program Manipulation Yanhong A. Liu 10.1145/3635800.3637447