https://arxiv.org/api/usKbvgMuhgLyMNj8eIqN3ynTdyg 2026-06-30T04:44:23Z 9958 1770 15 http://arxiv.org/abs/2503.04001v1 Binomial Tabulation: A Short Story 2025-03-06T01:26:58Z

We reconstruct some of the development in Richard Bird's [2008] paper Zippy Tabulations of Recursive Functions, using dependent types and string diagrams rather than mere simple types. This paper serves as an intuitive introduction to and demonstration of these concepts for the curious functional programmer, who ideally already has some exposure to dependent types and category theory, is not put off by basic concepts like indexed types and functors, and wants to see a more practical example. The paper is presented in the form of a short story, narrated from the perspective of a functional programmer trying to follow the development in Bird's paper. The first section recaps the original simply typed presentation. The second section explores a series of refinements that can be made using dependent types. The third section uses string diagrams to simplify arguments involving functors and naturality. The short story ends there, but the paper concludes with a discussion and reflection in the afterword.

2025-03-06T01:26:58Z Hsiang-Shang Ko Shin-Cheng Mu Jeremy Gibbons http://arxiv.org/abs/2503.03698v1 AEGIS: Towards Formalized and Practical Memory-Safe Execution of C programs via MSWASM 2025-03-05T17:50:43Z

Programs written in unsafe languages such as C are prone to memory safety errors, which can lead to program compromises and serious real-world security consequences. Recently, Memory-Safe WebAssembly (MSWASM) is introduced as a general-purpose intermediate bytecode with built-in memory safety semantics. Programs written in C can be compiled into MSWASM to get complete memory safety protection. In this paper, we present our extensions on MSWASM, which improve its semantics and practicality. First, we formalize MSWASM semantics in Coq/Iris, extending it with inter-module interaction, showing that MSWASM provides fine-grained isolation guarantees analogous to WASM's coarse-grained isolation via linear memory. Second, we present Aegis, a system to adopt the memory safety of MSWASM for C programs in an interoperable way. Aegis pipeline generates Checked C source code from MSWASM modules to enforce spatial memory safety. Checked C is a recent binary-compatible extension of C which can provide guaranteed spatial safety. Our design allows Aegis to protect C programs that depend on legacy C libraries with no extra dependency and with low overhead. Aegis pipeline incurs 67% runtime overhead and near-zero memory overhead on PolyBenchC programs compared to native.

2025-03-05T17:50:43Z Shahram Esmaeilsabzali Arayi Khalatyan Zhijun Mo Sruthi Venkatanarayanan Shengjie Xu http://arxiv.org/abs/2405.10089v2 Do You Even Lift? Strengthening Compiler Security Guarantees Against Spectre Attacks 2025-03-05T17:20:50Z

Mainstream compilers implement different countermeasures to prevent specific classes of speculative execution attacks. Unfortunately, these countermeasures either lack formal guarantees or come with proofs restricted to speculative semantics capturing only a subset of the speculation mechanisms supported by modern CPUs, thereby limiting their practical applicability. Ideally, these security proofs should target a speculative semantics capturing the effects of all speculation mechanisms implemented in modern CPUs. However, this is impractical and requires new secure compilation proofs to support additional speculation mechanisms. In this paper, we address this problem by proposing a novel secure compilation framework that allows lifting the security guarantees provided by Spectre countermeasures from weaker speculative semantics (ignoring some speculation mechanisms) to stronger ones (accounting for the omitted mechanisms) without requiring new secure compilation proofs. Using our lifting framework, we performed the most comprehensive security analysis of Spectre countermeasures implemented in mainstream compilers to date. Our analysis spans 9 different countermeasures against 5 classes of Spectre attacks, which we proved secure against a speculative semantics accounting for five different speculation mechanisms.

2024-05-16T13:34:00Z Xaver Fabian Marco Patrignani Marco Guarnieri Michael Backes http://arxiv.org/abs/2503.03359v1 Iterating Pointers: Enabling Static Analysis for Loop-based Pointers 2025-03-05T10:39:23Z

Pointers are an integral part of C and other programming languages. They enable substantial flexibility from the programmer's standpoint, allowing the user fine, unmediated control over data access patterns. However, accesses done through pointers are often hard to track, and challenging to understand for optimizers, compilers, and sometimes, even for the developers themselves because of the direct memory access they provide. We alleviate this problem by exposing additional information to analyzers and compilers. By separating the concept of a pointer into a data container and an offset, we can optimize C programs beyond what other state-of-the-art approaches are capable of, in some cases even enabling auto-parallelization. Using this process, we are able to successfully analyze and optimize code from OpenSSL, the Mantevo benchmark suite, and the Lempel-Ziv-Oberhumer compression algorithm. We provide the only automatic approach able to find all parallelization opportunities in the HPCCG benchmark from the Mantevo suite the developers identified and even outperform the reference implementation by up to 18%, as well as speed up the PBKDF2 algorithm implementation from OpenSSL by up to 11x.

2025-03-05T10:39:23Z Andrea Lepori ETH Zurich Alexandru Calotoiu ETH Zurich Torsten Hoefler ETH Zurich 10.1145/3701993 http://arxiv.org/abs/2502.04157v2 Censor Resistant Instruction Independent Obfuscation for Multiple Programs 2025-03-05T07:57:33Z

This work builds upon and optimizes our prior research on obfuscation as instruction decorrelation which achieves multiple program obfuscation. Leveraging this infrastructure, we further achieve the property of sensor-resistant computation.

2025-02-06T15:40:58Z Ali Ajorian http://arxiv.org/abs/2503.03153v1 Substructural Parametricity 2025-03-05T03:52:59Z

Ordered, linear, and other substructural type systems allow us to expose deep properties of programs at the syntactic level of types. In this paper, we develop a family of unary logical relations that allow us to prove consequences of parametricity for a range of substructural type systems. A key idea is to parameterize the relation by an algebra, which we exemplify with a monoid and commutative monoid to interpret ordered and linear type systems, respectively. We prove the fundamental theorem of logical relations and apply it to deduce extensional properties of inhabitants of certain types. Examples include demonstrating that the ordered types for list append and reversal are inhabited by exactly one function, as are types of some tree traversals. Similarly, the linear type of the identity function on lists is inhabited only by permutations of the input. Our most advanced example shows that the ordered type of the list fold function is inhabited only by the fold function.

2025-03-05T03:52:59Z C. B. Aberlé Chris Martens Frank Pfenning http://arxiv.org/abs/2411.11856v3 Automatically Improving LLM-based Verilog Generation using EDA Tool Feedback 2025-03-04T21:25:03Z

Traditionally, digital hardware designs are written in the Verilog hardware description language (HDL) and debugged manually by engineers. This can be time-consuming and error-prone for complex designs. Large Language Models (LLMs) are emerging as a potential tool to help generate fully functioning HDL code, but most works have focused on generation in the single-shot capacity: i.e., run and evaluate, a process that does not leverage debugging and, as such, does not adequately reflect a realistic development process. In this work, we evaluate the ability of LLMs to leverage feedback from electronic design automation (EDA) tools to fix mistakes in their own generated Verilog. To accomplish this, we present an open-source, highly customizable framework, AutoChip, which combines conversational LLMs with the output from Verilog compilers and simulations to iteratively generate and repair Verilog. To determine the success of these LLMs we leverage the VerilogEval benchmark set. We evaluate four state-of-the-art conversational LLMs, focusing on readily accessible commercial models. EDA tool feedback proved to be consistently more effective than zero-shot prompting only with GPT-4o, the most computationally complex model we evaluated. In the best case, we observed a 5.8% increase in the number of successful designs with a 34.2% decrease in cost over the best zero-shot results. Mixing smaller models with this larger model at the end of the feedback iterations resulted in equally as much success as with GPT-4o using feedback, but incurred 41.9% lower cost (corresponding to an overall decrease in cost over zero-shot by 89.6%).

2024-11-01T17:33:28Z Accepted for publication in TODAES Special Issue on Large Language Models for Electronic System Design Automation Jason Blocklove Shailja Thakur Benjamin Tan Hammond Pearce Siddharth Garg Ramesh Karri http://arxiv.org/abs/2502.15031v2 Notions of Stack-manipulating Computation and Relative Monads (Extended Version) 2025-03-04T16:51:32Z

Monads provide a simple and concise interface to user-defined computational effects in functional programming languages. This enables equational reasoning about effects, abstraction over monadic interfaces and the development of monad transformer stacks to combine different effects. Compiler implementors and assembly code programmers similarly virtualize effects, and would benefit from similar abstractions if possible. However, the implementation details of effects seem disconnected from the high-level monad interface: at this lower level much of the design is in the layout of the runtime stack, which is not accessible in a high-level programming language. We demonstrate that the monadic interface can be faithfully adapted from high-level functional programming to a lower level setting with explicit stack manipulation. We use a polymorphic call-by-push-value (CBPV) calculus as a setting that captures the essence of stack-manipulation, with a type system that allows programs to define domain-specific stack structures. Within this setting, we show that the existing category-theoretic notion of a relative monad can be used to model the stack-based implementation of computational effects. To demonstrate generality, we adapt a variety of standard monads to relative monads. Additionally, we show that stack-manipulating programs can benefit from a generalization of do-notation we call "monadic blocks" that allow all CBPV code to be reinterpreted to work with an arbitrary relative monad. As an application, we show that all relative monads extend automatically to relative monad transformers, a process which is not automatic for monads in pure languages.

2025-02-20T20:38:40Z Yuchen Jiang Runze Xue Max S. New 10.1145/3720434 http://arxiv.org/abs/2503.02557v1 Mimosa: A Language for Asynchronous Implementation of Embedded Systems Software 2025-03-04T12:32:57Z

This paper introduces the Mimosa language, a programming language for the design and implementation of asynchronous reactive systems, describing them as a collection of time-triggered processes which communicate through FIFO buffers. Syntactically, Mimosa builds upon the Lustre data-flow language, augmenting it with a new semantics to allow for the expression of side-effectful computations, and extending it with an asynchronous coordination layer which orchestrates the communication between processes. A formal semantics is given to both the process and coordination layer through a textual and graphical rewriting calculus, respectively, and a prototype interpreter for simulation is provided.

2025-03-04T12:32:57Z Nikolaus Huber Susanne Graf Philipp Rümmer Wang Yi 10.1007/978-3-031-95589-1_5 http://arxiv.org/abs/2409.14591v3 Non-Cartesian Guarded Recursion with Daggers 2025-03-04T10:22:03Z

Guarded recursion is a framework allowing for a formalisation of streams in classical programming languages. The latter take their semantics in cartesian closed categories. However, some programming paradigms do not take their semantics in a cartesian setting; this is the case for concurrency, reversible and quantum programming for example. In this paper, we focus on reversible programming through its categorical model in dagger categories, which are categories that contain an involutive operator on morphisms. After presenting classical guarded recursion, we show how to introduce this framework into dagger categories with sufficient structure for data types, also called dagger rig categories. First, given an arbitrary category, we build another category shown to be suitable for guarded recursion in multiple ways, via enrichment and fixed point theorems. We then study the interaction between this construction and the structure of dagger rig categories, aiming for reversible programming. Finally, we show that our construction is suitable as a model of higher-order reversible programming languages, such as symmetric pattern-matching.

2024-09-22T20:55:30Z Louis Lemonnier http://arxiv.org/abs/2503.04811v1 Introducing Support for Move Operations in Melda CRDT 2025-03-04T09:47:09Z

In this paper, we present an extension to Melda (a library which implements a general purpose delta state JSON CRDT) to support move operations. This enhancement relies on minimal changes to the underlying logic of the data structure, has virtually no runtime overhead and zero storage overhead compared to the original version of the library, ensuring simplicity while addressing multiple use cases. Although concurrent reordering of the elements in a list was already supported in the original version of the library, moving objects between different containers lead to undesired outcomes, namely duplicate entries. To address this problem we revisited the original approach and introduced the necessary changes to support for relocating elements within a JSON structure. We detail those changes and provide some examples.

2025-03-04T09:47:09Z Amos Brocco http://arxiv.org/abs/2207.03418v4 Parallel Dual-Numbers Reverse AD 2025-03-03T14:33:33Z

Where dual-numbers forward-mode automatic differentiation (AD) pairs each scalar value with its tangent value, dual-numbers reverse-mode AD attempts to achieve reverse AD using a similarly simple idea: by pairing each scalar value with a backpropagator function. Its correctness and efficiency on higher-order input languages have been analysed by Brunel, Mazza and Pagani, but this analysis used a custom operational semantics for which it is unclear whether it can be implemented efficiently. We take inspiration from their use of linear factoring to optimise dual-numbers reverse-mode AD to an algorithm that has the correct complexity and enjoys an efficient implementation in a standard functional language with support for mutable arrays, such as Haskell. Aside from the linear factoring ingredient, our optimisation steps consist of well-known ideas from the functional programming community. We demonstrate the use of our technique by providing a practical implementation that differentiates most of Haskell98. Where previous work on dual numbers reverse AD has required sequentialisation to construct the reverse pass, we demonstrate that we can apply our technique to task-parallel source programs and generate a task-parallel derivative computation.

2022-07-07T16:43:14Z This is the journal version for JFP. For the shorter conference version (POPL'23), see arXiv:2207.03418v2. For an earlier preprint, see arXiv:2205.11368 Tom Smeding Matthijs Vákár http://arxiv.org/abs/2404.00338v2 Polymorphic Records for Dynamic Languages 2025-03-03T12:05:48Z

We define and study "row polymorphism" for a type system with set-theoretic types, specifically union, intersection, and negation types. We consider record types that embed row variables and define a subtyping relation by interpreting types into sets of record values and by defining subtyping as the containment of interpretations. We define a functional calculus equipped with operations for field extension, selection, and deletion, its operational semantics, and a type system that we prove to be sound. We provide algorithms for deciding the typing and subtyping relations. This research is motivated by the current trend of defining static type system for dynamic languages and, in our case, by an ongoing effort of endowing the Elixir programming language with a gradual type system.

2024-03-30T12:23:57Z Proceedings of the ACM on Programming Languages, Volume 9, Issue OOPSLA1 Article No.: 132, Pages 1464 - 1491, 2025 Giuseppe Castagna Loïc Peyrot 10.1145/3720497 http://arxiv.org/abs/2503.01390v1 Scalable and Accurate Application-Level Crash-Consistency Testing via Representative Testing 2025-03-03T10:41:57Z

Crash consistency is essential for applications that must persist data. Crash-consistency testing has been commonly applied to find crash-consistency bugs in applications. The crash-state space grows exponentially as the number of operations in the program increases, necessitating techniques for pruning the search space. However, state-of-the-art crash-state space pruning is far from ideal. Some techniques look for known buggy patterns or bound the exploration for efficiency, but they sacrifice coverage and may miss bugs lodged deep within applications. Other techniques eliminate redundancy in the search space by skipping identical crash states, but they still fail to scale to larger applications. In this work, we propose representative testing: a new crash-state space reduction strategy that achieves high scalability and high coverage. Our key observation is that the consistency of crash states is often correlated, even if those crash states are not identical. We build Pathfinder, a crash-consistency testing tool that implements an update behaviors-based heuristic to approximate a small set of representative crash states. We evaluate Pathfinder on POSIX-based and MMIO-based applications, where it finds 18 (7 new) bugs across 8 production-ready systems. Pathfinder scales more effectively to large applications than prior works and finds 4x more bugs in POSIX-based applications and 8x more bugs in MMIO-based applications compared to state-of-the-art systems.

2025-03-03T10:41:57Z OOPSLA 2025 Yile Gu Ian Neal Jiexiao Xu Shaun Christopher Lee Ayman Said Musa Haydar Jacob Van Geffen Rohan Kadekodi Andrew Quinn Baris Kasikci 10.1145/3720431 http://arxiv.org/abs/2503.00822v1 Scalable Memory Recycling for Large Quantum Programs 2025-03-02T09:56:39Z

As quantum computing technology advances, the complexity of quantum algorithms increases, necessitating a shift from low-level circuit descriptions to high-level programming paradigms. This paper addresses the challenges of developing a compilation algorithm that optimizes memory management and scales well for bigger, more complex circuits. Our approach models the high-level quantum code as a control flow graph and presents a workflow that searches for a topological sort that maximizes opportunities for qubit reuse. Various heuristics for qubit reuse strategies handle the trade-off between circuit width and depth. We also explore scalability issues in large circuits, suggesting methods to mitigate compilation bottlenecks. By analyzing the structure of the circuit, we are able to identify sub-problems that can be solved separately, without a significant effect on circuit quality, while reducing runtime significantly. This method lays the groundwork for future advancements in quantum programming and compiler optimization by incorporating scalability into quantum memory management.

2025-03-02T09:56:39Z Israel Reichental Ravid Alon Lior Preminger Matan Vax Amir Naveh