https://arxiv.org/api/nnQZ4wWQ6UE0bj+peRetSyS6Nmo 2026-06-29T02:44:49Z 9954 1755 15 http://arxiv.org/abs/2503.07489v1 Destination Calculus: A Linear λ-Calculus for Purely Functional Memory Writes 2025-03-10T16:08:47Z

Destination passing -- aka. out parameters -- is taking a parameter to fill rather than returning a result from a function. Due to its apparently imperative nature, destination passing has struggled to find its way to pure functional programming. In this paper, we present a pure functional calculus with destinations at its core. Our calculus subsumes all the similar systems, and can be used to reason about their correctness or extension. In addition, our calculus can express programs that were previously not known to be expressible in a pure language. This is guaranteed by a modal type system where modes are used to manage both linearity and scopes. Type safety of our core calculus was proved formally with the Coq proof assistant.

2025-03-10T16:08:47Z 29 pages, 11 figures Preprint of article 89 from PACMPL Volume 9, OOPSLA1, April 2025 Thomas Bagrel Arnaud Spiwack 10.1145/3720423 http://arxiv.org/abs/2503.06812v1 Can Proof Assistants Verify Multi-Agent Systems? 2025-03-10T00:24:29Z

This paper presents the Soda language for verifying multi-agent systems. Soda is a high-level functional and object-oriented language that supports the compilation of its code not only to Scala, a strongly statically typed high-level programming language, but also to Lean, a proof assistant and programming language. Given these capabilities, Soda can implement multi-agent systems, or parts thereof, that can then be integrated into a mainstream software ecosystem on the one hand and formally verified with state-of-the-art tools on the other hand. We provide a brief and informal introduction to Soda and the aforementioned interoperability capabilities, as well as a simple demonstration of how interaction protocols can be designed and verified with Soda. In the course of the demonstration, we highlight challenges with respect to real-world applicability.

2025-03-10T00:24:29Z Julian Alfredo Mendez Timotheus Kampik http://arxiv.org/abs/2505.21506v1 Conformance Checking for Less: Efficient Conformance Checking for Long Event Sequences 2025-03-09T16:42:59Z

Long event sequences (termed traces) and large data logs that originate from sensors and prediction models are becoming increasingly common in our data-rich world. In such scenarios, conformance checking-validating a data log against an expected system behavior (the process model) can become computationally infeasible due to the exponential complexity of finding an optimal alignment. To alleviate scalability challenges for this task, we propose ConLES, a sliding-window conformance checking approach for long event sequences that preserves the interpretability of alignment-based methods. ConLES partitions traces into manageable subtraces and iteratively aligns each against the expected behavior, leading to significant reduction of the search space while maintaining overall accuracy. We use global information that captures structural properties of both the trace and the process model, enabling informed alignment decisions and discarding unpromising alignments, even if they appear locally optimal. Performance evaluations across multiple datasets highlight that ConLES outperforms the leading optimal and heuristic algorithms for long traces, consistently achieving the optimal or near-optimal solution. Unlike other conformance methods that struggle with long event sequences, ConLES significantly reduces the search space, scales efficiently, and uniquely supports both predefined and discovered process models, making it a viable and leading option for conformance checking of long event sequences.

2025-03-09T16:42:59Z 17 pages, 4 figures Eli Bogdanov Izack Cohen Avigdor Gal http://arxiv.org/abs/2312.05657v2 PerfRL: A Small Language Model Framework for Efficient Code Optimization 2025-03-09T05:01:42Z

Code optimization is a challenging task requiring a substantial level of expertise from developers. Nonetheless, this level of human capacity is not sufficient considering the rapid evolution of new hardware architectures and software environments. In light of this, recent research proposes adopting machine learning and artificial intelligence techniques to automate the code optimization process. In this paper, we introduce PerfRL, an innovative framework designed to tackle the problem of code optimization. Our framework leverages the capabilities of small language models (SLMs) and reinforcement learning (RL), facilitating a system where SLMs can assimilate feedback from their environment during the fine-tuning phase, notably through unit tests. When benchmarked against existing models, PerfRL demonstrates superior efficiency in terms of speed and computational resource usage, attributed to its reduced need for training steps and its compatibility with SLMs. Furthermore, it substantially diminishes the risk of logical and syntactical errors. To evaluate our framework, we conduct experiments on the PIE dataset using a lightweight large language model (i.e., CodeT5) and a new reinforcement learning algorithm, namely RRHF. For evaluation purposes, we use a list of evaluation metrics related to optimization quality and speedup. The evaluation results show that our approach achieves similar or better results compared to state-of-the-art models using shorter training times and smaller pre-trained models.

2023-12-09T19:50:23Z Shukai Duan Nikos Kanakaris Xiongye Xiao Heng Ping Chenyu Zhou Nesreen K. Ahmed Guixiang Ma Mihai Capota Theodore L. Willke Shahin Nazarian Paul Bogdan http://arxiv.org/abs/2503.06162v1 Functional Reactive Programming with Effects, A More Permissive Approach 2025-03-08T10:54:21Z

We introduce a functional reactive programming language that extends WORMHOLES, an enhancement of YAMPA with support for effects. Our proposal relaxes the constraint in WORMHOLES that restricts all resources to single-use. Resources are categorized into two kinds: input/output resources and internal resources. Input/output resources model interactions with the environment and follow constraints similar to those in WORMHOLES. Internal resources, on the other hand, enable communication between program components and can be used multiple times. We demonstrate that programs written in our language can be translated into equivalent effect-free YAMPA programs, ensuring that our approach remains compatible with existing functional reactive paradigms.

2025-03-08T10:54:21Z Frédéric Dabrowski Jordan Ischard http://arxiv.org/abs/2410.17045v4 Abstract Operational Methods for Call-by-Push-Value 2025-03-07T21:18:43Z

Levy's call-by-push-value is a comprehensive programming paradigm that combines elements from functional and imperative programming, supports computational effects and subsumes both call-by-value and call-by-name evaluation strategies. In the present work, we develop modular methods to reason about program equivalence in call-by-push-value, and in fine-grain call-by-value, which is a popular lightweight call-by-value sublanguage of the former. Our approach is based on the fundamental observation that presheaf categories of sorted sets are suitable universes to model call-by-(push)-value languages, and that natural, coalgebraic notions of program equivalence such as applicative similarity and logical relations can be developed within. Starting from this observation, we formalize fine-grain call-by-value and call-by-push-value in the higher-order abstract GSOS framework, reduce their key congruence properties to simple syntactic conditions by leveraging existing theory and argue that introducing changes to either language incurs minimal proof overhead.

2024-10-22T14:22:38Z Sergey Goncharov Stelios Tsampas Henning Urbat 10.1145/3704871 http://arxiv.org/abs/2503.05924v1 Satire: Computing Rigorous Bounds for Floating-Point Rounding Error in Mixed-Precision Loop-Free Programs 2025-03-07T20:46:38Z

Techniques that rigorously bound the overall rounding error exhibited by a numerical program are of significant interest for communities developing numerical software. However, there are few available tools today that can be used to rigorously bound errors in programs that employ conditional statements (a basic need) as well as mixed-precision arithmetic (a direction of significant future interest) employing global optimization in error analysis. In this paper, we present a new tool that fills this void while also employing an abstraction-guided optimization approach to allow designers to trade error-bound tightness for gains in analysis time -- useful when searching for design alternatives. We first present the basic rigorous analysis framework of Satire and then show how to extend it to incorporate abstractions, conditionals, and mixed-precision arithmetic. We begin by describing Satire's design and its performance on a collection of benchmark examples. We then describe these aspects of Satire: (1) how the error-bound and tool execution time vary with the abstraction level; (2) the additional machinery to handle conditional expression branches, including defining the concepts of instability jumps and instability window widths and measuring these quantities; and (3) how the error changes when a mix of precision values are used. To showcase how \satire can add value during design, we start with a Conjugate Gradient solver and demonstrate how its step size and search direction are affected by different precision settings. Satire is freely available for evaluation, and can be used during the design of numerical routines to effect design tradeoffs guided by rigorous empirical error guarantees.

2025-03-07T20:46:38Z 22 pgs, 8 figures, 4 tables Tanmay Tirpankar Arnab Das Ganesh Gopalakrishnan http://arxiv.org/abs/2106.05421v4 Data-Driven Invariant Learning for Probabilistic Programs 2025-03-07T16:58:50Z

Morgan and McIver's weakest pre-expectation framework is one of the most well-established methods for deductive verification of probabilistic programs. Roughly, the idea is to generalize binary state assertions to real-valued expectations, which can measure expected values of probabilistic program quantities. While loop-free programs can be analyzed by mechanically transforming expectations, verifying loops usually requires finding an invariant expectation, a difficult task. We propose a new view of invariant expectation synthesis as a regression problem: given an input state, predict the average value of the post-expectation in the output distribution. Guided by this perspective, we develop the first data-driven invariant synthesis method for probabilistic programs. Unlike prior work on probabilistic invariant inference, our approach can learn piecewise continuous invariants without relying on template expectations, and also works with black-box access to the program. We also develop a data-driven approach to learn sub-invariants from data, which can be used to upper- or lower-bound expected values. We implement our approaches and demonstrate their effectiveness on a variety of benchmarks from the probabilistic programming literature.

2021-06-09T22:27:11Z 37 pages Formal Methods in System Design 2024 (CAV Collection) Jialu Bao Nitesh Trivedi Drashti Pathak Justin Hsu Subhajit Roy 10.1007/s10703-024-00466-x http://arxiv.org/abs/2503.05849v1 Establishing tool support for a concept DSL 2025-03-07T09:18:31Z

The quality of software products tends to correlate with the quality of the abstractions adopted early in the design process. Acknowledging this tendency has led to the development of various tools and methodologies for modeling systems thoroughly before implementing them. However, creating effective abstract models of domain problems is difficult, especially if the models are also expected to exhibit qualities such as intuitiveness, being seamlessly integrable with other models, or being easily translatable into code. This thesis describes Conceptual, a DSL for modeling the behavior of software systems using self-contained and highly reusable units of functionally known as concepts. The language's syntax and semantics are formalized based on previous work. Additionally, the thesis proposes a strategy for mapping language constructs from Conceptual into the Alloy modeling language. The suggested strategy is then implemented with a simple compiler, allowing developers to access and utilize Alloy's existing analysis tools for program reasoning. The utility and expressiveness of Conceptual is demonstrated qualitatively through several practical case studies. Using the implemented compiler, a few erroneous specifications are identified in the literature. Moreover, the thesis establishes preliminary tool support in the Visual Studio Code IDE.

2025-03-07T09:18:31Z 84 pages, 13 figures, 25 listings Nikolaj Kühne Jakobsen http://arxiv.org/abs/2503.05163v1 Vbox: Efficient Black-Box Serializability Verification 2025-03-07T06:01:09Z

Verifying the serializability of transaction histories is essential for users to know if the DBMS ensures the claimed serializable isolation level without potential bugs. Black-box serializability verification is a promising approach. Existing verification methods often have one or more limitations such as incomplete detection of data anomalies, long verification time, high memory usage, or dependence on specific concurrency control protocols. In this paper, a new black-box serializability verification method called \textsf{Vbox} is proposed. \textsf{Vbox} is powered by a number of new techniques, including the support for predicate database operations, comprehensive applications of transactions' time information in the verification process, and a simplified satisfiability (SAT) problem formulation and its efficient solver. In this paper, \textsf{Vbox} is verified to be correct, efficient, and capable of detecting more data anomalies, while not relying on any specific concurrency control protocols.

2025-03-07T06:01:09Z Weihua Sun Zhaonian Zou http://arxiv.org/abs/2503.04389v1 Pydrofoil: accelerating Sail-based instruction set simulators 2025-03-06T12:38:32Z

We present Pydrofoil, a multi-stage compiler that generates instruction set simulators (ISSs) from processor instruction set architectures (ISAs) expressed in the high-level, verification-oriented ISA specification language Sail. Pydrofoil shows a > 230x speedup over the C-based ISS generated by Sail on our benchmarks, and is based on the following insights. (i) An ISS is effectively an interpreter loop, and tracing just-in-time (JIT) compilers have proven effective at accelerating those, albeit mostly for dynamically typed languages. (ii) ISS workloads are highly atypical, dominated by intensive bit manipulation operations. Conventional compiler optimisations for general-purpose programming languages have limited impact for speeding up such workloads. We develop suitable domain-specific optimisations. (iii) Neither tracing JIT compilers, nor ahead-of-time (AOT) compilation alone, even with domain-specific optimisations, suffice for the generation of performant ISSs. Pydrofoil therefore implements a hybrid approach, pairing an AOT compiler with a tracing JIT built on the meta-tracing PyPy framework. AOT and JIT use domain-specific optimisations. Our benchmarks demonstrate that combining AOT and JIT compilers provides significantly greater performance gains than using either compiler alone.

2025-03-06T12:38:32Z 32 pages Carl Friedrich Bolz-Tereick Luke Panayi Ferdia McKeogh Tom Spink Martin Berger http://arxiv.org/abs/2503.04001v1 Binomial Tabulation: A Short Story 2025-03-06T01:26:58Z

We reconstruct some of the development in Richard Bird's [2008] paper Zippy Tabulations of Recursive Functions, using dependent types and string diagrams rather than mere simple types. This paper serves as an intuitive introduction to and demonstration of these concepts for the curious functional programmer, who ideally already has some exposure to dependent types and category theory, is not put off by basic concepts like indexed types and functors, and wants to see a more practical example. The paper is presented in the form of a short story, narrated from the perspective of a functional programmer trying to follow the development in Bird's paper. The first section recaps the original simply typed presentation. The second section explores a series of refinements that can be made using dependent types. The third section uses string diagrams to simplify arguments involving functors and naturality. The short story ends there, but the paper concludes with a discussion and reflection in the afterword.

2025-03-06T01:26:58Z Hsiang-Shang Ko Shin-Cheng Mu Jeremy Gibbons http://arxiv.org/abs/2503.03698v1 AEGIS: Towards Formalized and Practical Memory-Safe Execution of C programs via MSWASM 2025-03-05T17:50:43Z

Programs written in unsafe languages such as C are prone to memory safety errors, which can lead to program compromises and serious real-world security consequences. Recently, Memory-Safe WebAssembly (MSWASM) is introduced as a general-purpose intermediate bytecode with built-in memory safety semantics. Programs written in C can be compiled into MSWASM to get complete memory safety protection. In this paper, we present our extensions on MSWASM, which improve its semantics and practicality. First, we formalize MSWASM semantics in Coq/Iris, extending it with inter-module interaction, showing that MSWASM provides fine-grained isolation guarantees analogous to WASM's coarse-grained isolation via linear memory. Second, we present Aegis, a system to adopt the memory safety of MSWASM for C programs in an interoperable way. Aegis pipeline generates Checked C source code from MSWASM modules to enforce spatial memory safety. Checked C is a recent binary-compatible extension of C which can provide guaranteed spatial safety. Our design allows Aegis to protect C programs that depend on legacy C libraries with no extra dependency and with low overhead. Aegis pipeline incurs 67% runtime overhead and near-zero memory overhead on PolyBenchC programs compared to native.

2025-03-05T17:50:43Z Shahram Esmaeilsabzali Arayi Khalatyan Zhijun Mo Sruthi Venkatanarayanan Shengjie Xu http://arxiv.org/abs/2405.10089v2 Do You Even Lift? Strengthening Compiler Security Guarantees Against Spectre Attacks 2025-03-05T17:20:50Z

Mainstream compilers implement different countermeasures to prevent specific classes of speculative execution attacks. Unfortunately, these countermeasures either lack formal guarantees or come with proofs restricted to speculative semantics capturing only a subset of the speculation mechanisms supported by modern CPUs, thereby limiting their practical applicability. Ideally, these security proofs should target a speculative semantics capturing the effects of all speculation mechanisms implemented in modern CPUs. However, this is impractical and requires new secure compilation proofs to support additional speculation mechanisms. In this paper, we address this problem by proposing a novel secure compilation framework that allows lifting the security guarantees provided by Spectre countermeasures from weaker speculative semantics (ignoring some speculation mechanisms) to stronger ones (accounting for the omitted mechanisms) without requiring new secure compilation proofs. Using our lifting framework, we performed the most comprehensive security analysis of Spectre countermeasures implemented in mainstream compilers to date. Our analysis spans 9 different countermeasures against 5 classes of Spectre attacks, which we proved secure against a speculative semantics accounting for five different speculation mechanisms.

2024-05-16T13:34:00Z Xaver Fabian Marco Patrignani Marco Guarnieri Michael Backes http://arxiv.org/abs/2503.03359v1 Iterating Pointers: Enabling Static Analysis for Loop-based Pointers 2025-03-05T10:39:23Z

Pointers are an integral part of C and other programming languages. They enable substantial flexibility from the programmer's standpoint, allowing the user fine, unmediated control over data access patterns. However, accesses done through pointers are often hard to track, and challenging to understand for optimizers, compilers, and sometimes, even for the developers themselves because of the direct memory access they provide. We alleviate this problem by exposing additional information to analyzers and compilers. By separating the concept of a pointer into a data container and an offset, we can optimize C programs beyond what other state-of-the-art approaches are capable of, in some cases even enabling auto-parallelization. Using this process, we are able to successfully analyze and optimize code from OpenSSL, the Mantevo benchmark suite, and the Lempel-Ziv-Oberhumer compression algorithm. We provide the only automatic approach able to find all parallelization opportunities in the HPCCG benchmark from the Mantevo suite the developers identified and even outperform the reference implementation by up to 18%, as well as speed up the PBKDF2 algorithm implementation from OpenSSL by up to 11x.

2025-03-05T10:39:23Z Andrea Lepori ETH Zurich Alexandru Calotoiu ETH Zurich Torsten Hoefler ETH Zurich 10.1145/3701993