https://arxiv.org/api/6BDywRKVHt0e//EUim3SfMecjFo 2026-06-23T19:42:01Z 9934 1095 15 http://arxiv.org/abs/2510.09073v1 Literate Tracing 2025-10-10T07:17:51Z

As computer systems grow ever larger and more complex, a crucial task in software development is for one person (the system expert) to communicate to another (the system novice) how a certain program works. This paper reports on the author's experiences with a paradigm for program documentation that we call literate tracing. A literate trace explains a software system using annotated, concrete execution traces of the system. Literate traces complement both in-code comments (which often lack global context) and out-of-band design docs (which often lack a concrete connection to the code). We also describe TReX, our tool for making literate traces that are interactive, visual, and guaranteed by construction to be faithful to the program semantics. We have used TReX to write literate traces explaining components of large systems software including the Linux kernel, Git source control system, and GCC compiler.

2025-10-10T07:17:51Z examples at https://lair.masot.net/trex . SPLASH Onward 2025 Matthew Sotoudeh http://arxiv.org/abs/2510.09037v1 Repairing Regex Vulnerabilities via Localization-Guided Instructions 2025-10-10T06:15:43Z

Regular expressions (regexes) are foundational to modern computing for critical tasks like input validation and data parsing, yet their ubiquity exposes systems to regular expression denial of service (ReDoS), a vulnerability requiring automated repair methods. Current approaches, however, are hampered by a trade-off. Symbolic, rule-based system are precise but fails to repair unseen or complex vulnerability patterns. Conversely, large language models (LLMs) possess the necessary generalizability but are unreliable for tasks demanding strict syntactic and semantic correctness. We resolve this impasse by introducing a hybrid framework, localized regex repair (LRR), designed to harness LLM generalization while enforcing reliability. Our core insight is to decouple problem identification from the repair process. First, a deterministic, symbolic module localizes the precise vulnerable subpattern, creating a constrained and tractable problem space. Then, the LLM invoked to generate a semantically equivalent fix for this isolated segment. This combined architecture successfully resolves complex repair cases intractable for rule-based repair while avoiding the semantic errors of LLM-only approaches. Our work provides a validated methodology for solving such problems in automated repair, improving the repair rate by 15.4%p over the state-of-the-art. Our code is available at https://github.com/cdltlehf/LRR.

2025-10-10T06:15:43Z 14 pages, 4 figures, 4 tables Sicheol Sung Joonghyuk Hahn Yo-Sub Han http://arxiv.org/abs/2510.08939v1 Free to Move: Reachability Types with Flow-Sensitive Effects for Safe Deallocation and Ownership Transfer 2025-10-10T02:41:04Z

We present a flow-sensitive effect system for reachability types that supports explicit memory management, including Rust-style move semantics, in higher-order impure functional languages. Our system refines the existing reachability qualifier with polymorphic \emph{use} and \emph{kill} effects that record how references are read, written, transferred, and deallocated. The effect discipline tracks operations performed on each resource using qualifiers, enabling the type system to express ownership transfer, contextual freshness, and destructive updates without regions or linearity. We formalize the calculus, its typing and effect rules, and a compositional operational semantics that validates use-after-free safety. All metatheoretic results, including preservation, progress, and effect soundness, are mechanized. The system models idioms such as reference deallocation, move semantics, reference swapping, while exposing precise safety guarantee. Together, these contributions integrate reachability-based reasoning with explicit resource control, advancing the state of the art in safe manual memory management for higher-order functional languages.

2025-10-10T02:41:04Z Haotian Deng Siyuan He Songlin Jia Yuyan Bao Tiark Rompf http://arxiv.org/abs/2510.08969v1 Concept-Based Generic Programming in C++ 2025-10-09T15:27:30Z

We present programming techniques to illustrate the facilities and principles of C++ generic programming using concepts. Concepts are C++'s way to express constraints on generic code. As an initial example, we provide a simple type system that eliminates narrowing conversions and provides range checking without unnecessary notational or run-time overheads. Concepts are used throughout to provide user-defined extensions to the type system. The aim is to show their utility and the fundamental ideas behind them, rather than to provide a detailed or complete explanation of C++'s language support for generic programming or the extensive support provided by the standard library. Generic programming is an integral part of C++, rather than an isolated sub-language. In particular, key facilities support general programming as well as generic programming (e.g., uniform notation for types, lambdas, variadic templates, and C++26 static reflection). Finally, we give design rationales and origins for key parts of the concept design, including use patterns, the relationship to Object-Oriented Programming, value arguments, notation, concept type-matching, and definition checking.

2025-10-09T15:27:30Z Bjarne Stroustrup http://arxiv.org/abs/2510.07582v1 Type, Ability, and Effect Systems: Perspectives on Purity, Semantics, and Expressiveness 2025-10-08T22:01:00Z

Programming benefits from a clear separation between pure, mathematical computation and impure, effectful interaction with the world. Existing approaches to enforce this separation include monads, type-and-effect systems, and capability systems. All share a tension between precision and usability, and each one has non-obvious strengths and weaknesses. This paper aims to raise the bar in assessing such systems. First, we propose a semantic definition of purity, inspired by contextual equivalence, as a baseline independent of any specific typing discipline. Second, we propose that expressiveness should be measured by the degree of completeness, i.e., how many semantically pure terms can be typed as pure. Using this measure, we focus on minimal meaningful effect and capability systems and show that they are incomparable, i.e., neither subsumes the other in terms of expressiveness. Based on this result, we propose a synthesis and show that type, ability, and effect systems combine their respective strengths while avoiding their weaknesses. As part of our formal model, we provide a logical relation to facilitate proofs of purity and other properties for a variety of effect typing disciplines.

2025-10-08T22:01:00Z Yuyan Bao Tiark Rompf http://arxiv.org/abs/2309.05885v3 Modeling Reachability Types with Logical Relations 2025-10-08T21:32:33Z

Reachability types are a recent proposal to bring Rust-style reasoning about memory properties to higher-level languages, with a focus on higher-order functions, parametric types, and shared mutable state -- features that are only partially supported by current techniques as employed in Rust. While prior work has established key type soundness results for reachability types using the usual syntactic techniques of progress and preservation, stronger metatheoretic properties have so far been unexplored. This paper presents an alternative semantic model of reachability types using logical relations, providing a framework in which we study key properties of interest: (1) semantic type soundness, including of not syntactically well-typed code fragments, (2) termination, especially in the presence of higher-order state, (3) effect safety, especially the absence of observable mutation, and, finally, (4) program equivalence, especially reordering of non-interfering expressions for parallelization or compiler optimization.

2023-09-12T00:13:53Z Yuyan Bao Songlin Jia Guannan Wei Oliver Bračevac Tiark Rompf http://arxiv.org/abs/2510.07051v1 A Duality Theorem for Classical-Quantum States with Applications to Complete Relational Program Logics 2025-10-08T14:19:03Z

Duality theorems play a fundamental role in convex optimization. Recently, it was shown how duality theorems for countable probability distributions and finite-dimensional quantum states can be leveraged for building relatively complete relational program logics for probabilistic and quantum programs, respectively. However, complete relational logics for classical-quantum programs, which combine classical and quantum computations and operate over classical as well as quantum variables, have remained out of reach. The main gap is that while prior duality theorems could readily be derived using optimal transport and semidefinite programming methods, respectively, the combined setting falls out of the scope of these methods and requires new ideas. In this paper, we overcome this gap and establish the desired duality theorem for classical-quantum states. Our argument relies critically on a novel dimension-independent analysis of the convex optimization problem underlying the finite-dimensional quantum setting, which, in particular, allows us to take the limit where the classical state space becomes infinite. Using the resulting duality theorem, we establish soundness and completeness of a new relational program logic, called $\mathsf{cqOTL}$, for classical-quantum programs. In addition, we lift prior restrictions on the completeness of two existing program logics: $\mathsf{eRHL}$ for probabilistic programs (Avanzini et al., POPL 2025) and $\mathsf{qOTL}$ for quantum programs (Barthe et al., LICS 2025).

2025-10-08T14:19:03Z 63 pages, 11 figures, 4 tables Gilles Barthe Minbo Gao Jam Kabeer Ali Khan Matthijs Muis Ivan Renison Keiya Sakabe Michael Walter Yingte Xu Li Zhou http://arxiv.org/abs/2402.12108v2 Weak-Linear Types 2025-10-08T11:15:13Z

Computational interpretations of linear logic allow static control of memory resources: the data produced by the program are endowed through its type with attributes that determine its life cycle, and guarantee safe deallocation. The use of linear types encounters limitations in practice, since linear data, in the traditional sense, do not so often appear in actual programs. Several alternatives have been proposed in the attempt to relax the condition of linearity, adding coercions to the language to allow linear objects to be temporarily aliased. In this work we propose a new alternative, whose virtue is to preserve the simplicity and elegance of the original system.

2024-02-19T12:56:44Z Hector Gramaglia http://arxiv.org/abs/2306.08127v2 Friend or Foe Inside? Exploring In-Process Isolation to Maintain Memory Safety for Unsafe Rust 2025-10-08T07:10:47Z

Rust is a popular memory-safe systems programming language. In order to interact with hardware or call into non-Rust libraries, Rust provides \emph{unsafe} language features that shift responsibility for ensuring memory safety to the developer. Failing to do so, may lead to memory safety violations in unsafe code which can violate safety of the entire application. In this work we explore in-process isolation with Memory Protection Keys as a mechanism to shield safe program sections from safety violations that may happen in unsafe sections. Our approach is easy to use and comprehensive as it prevents heap and stack-based violations. We further compare process-based and in-process isolation mechanisms and the necessary requirements for data serialization, communication, and context switching. Our results show that in-process isolation can be effective and efficient, permits for a high degree of automation, and also enables a notion of application rewinding where the safe program section may detect and safely handle violations in unsafe code.

2023-06-13T20:48:13Z Author's version of manuscript to appear in 2023 IEEE Secure Development Conference (SecDev) Merve Gülmez Thomas Nyman Christoph Baumann Jan Tobias Mühlberg 10.1109/SecDev56634.2023.00020 http://arxiv.org/abs/2510.06420v1 Automated Repeatable Adversary Threat Emulation with Effects Language (EL) 2025-10-07T20:00:27Z

The emulation of multi-step attacks attributed to advanced persistent threats is valuable for training defenders and evaluating defense tools. In this paper, we discuss the numerous challenges and desired attributes associated with such automation. Additionally, we introduce the use of Effects Language (EL), a visual programming language with graph-based operational semantics, as a solution to address many of these challenges and requirements. We formally define the execution semantics of EL, and prove important execution properties. Furthermore, we showcase the application of EL to codify attacks using an example from one of the publicly available attack scenarios. We also demonstrate how EL can be utilized to provide proof-of-attack of complex multi-step attacks. Our results highlight the improvements in time and resource efficiency achieved through the use of EL for repeatable automation.

2025-10-07T20:00:27Z Suresh K. Damodaran Paul D. Rowe http://arxiv.org/abs/2508.21593v2 Growing Mathlib: maintenance of a large scale mathematical library 2025-10-07T17:39:13Z

The Lean mathematical library Mathlib is one of the fastest-growing libraries of formalised mathematics. We describe various strategies to manage this growth, while allowing for change and avoiding maintainer overload. This includes dealing with breaking changes via a deprecation system, using code quality analysis tools (linters) to provide direct user feedback about common pitfalls, speeding up compilation times through conscious library (re-)design, dealing with technical debt as well as writing custom tooling to help with the review and triage of new contributions.

2025-08-29T12:49:58Z 21 pages, 1 figure. To appear at Conference on Intelligent Computer Mathematics (CICM) 2025 in v2: Minor copy-edits, added one more related reference Anne Baanen Matthew Robert Ballard Johan Commelin Bryan Gin-ge Chen Michael Rothgang Damiano Testa http://arxiv.org/abs/2511.11581v1 The Anatomy of a Triton Attention Kernel 2025-10-07T13:34:51Z

A long-standing goal in both industry and academia is to develop an LLM inference platform that is portable across hardware architectures, eliminates the need for low-level hand-tuning, and still delivers best-in-class efficiency. In this work, we demonstrate that portable, efficient cross-platform LLM inference is indeed possible and share our experience. We develop a state-of-the-art paged attention kernel, the core performance-critical component of many LLM deployments, that builds exclusively on the domain-specific just-in-time compiled language Triton to achieve state-of-the-art performance on both NVIDIA and AMD GPUs. We describe our high-level approach, the key algorithmic and system-level improvements, the parameter auto-tuning required to unlock efficiency, and the integrations into a popular inference server that are necessary to bring the performance of a generic Triton attention kernel from 19.7% of the state-of-the-art to 105.9%. Our results highlight how open-source domain-specific languages can be leveraged to unlock model portability across different GPU vendors.

2025-10-07T13:34:51Z Burkhard Ringlein Jan van Lunteren Radu Stoica Thomas Parnell http://arxiv.org/abs/2309.08673v2 A Two-Level Linear Dependent Type Theory 2025-10-06T23:14:30Z

We present a type theory combining both linearity and dependency by stratifying typing rules into a level for logics and a level for programs. The distinction between logics and programs decouples their semantics, allowing the type system to assume tight resource bounds. A natural notion of irrelevancy is established where all proofs and types occurring inside programs are fully erasable without compromising their operational behavior. Through a heap-based operational semantics, we show that extracted programs always make computational progress and run memory clean. Additionally, programs can be freely reflected into the logical level for conducting deep proofs in the style of standard dependent type theories. This enables one to write resource safe programs and verify their correctness using a unified language.

2023-09-15T18:04:35Z Qiancheng Fu Hongwei Xi http://arxiv.org/abs/2510.05376v1 Constraint-Level Design of zkEVMs: Architectures, Trade-offs, and Evolution 2025-10-06T21:10:11Z

Zero-knowledge Ethereum Virtual Machines (zkEVMs) must reconcile a fundamental contradiction: the Ethereum Virtual Machine was designed for transparent sequential execution, while zero-knowledge proofs require algebraic circuit representations. This survey provides the first systematic analysis of how existing major production zkEVM implementations resolve this tension through distinct constraint engineering strategies. We develop a comparative framework that maps the design space across three architectural dimensions. First, arithmetization schemes reveal stark trade-offs: R1CS requires compositional gadget libraries, PLONKish achieves elegance through custom gates that capture complex EVM opcodes in single constraints, while the homogeneous structure of AIR fundamentally mismatches the irregular instruction set of EVM. Second, dispatch mechanisms determine constraint activation patterns: selector-based systems waste trace width on inactive constraints, while ROM-based approaches trade memory lookups for execution flexibility. Third, the Type 1-4 spectrum quantifies an inescapable trade-off: the bit-level EVM compatibility of Type 1 demands significantly higher constraint complexity than the custom instruction sets of Type 4. Beyond cataloging implementations, we identify critical open problems across multiple domains: performance barriers preventing sub-second proving, absence of formal verification for constraint-to-EVM semantic equivalence, lack of standardized benchmarking frameworks, and architectural gaps in hybrid zkEVM/zkVM designs, decentralized prover coordination, privacy preservation, and interoperability.

2025-10-06T21:10:11Z Yahya Hassanzadeh-Nazarabadi Sanaz Taheri-Boshrooyeh http://arxiv.org/abs/2504.06421v2 We've Got You Covered: Type-Guided Repair of Incomplete Input Generators 2025-10-06T16:42:18Z

Property-based testing (PBT) is a popular technique for automatically testing semantic properties of a program, specified as a pair of pre- and post-conditions. The efficacy of this approach depends on being able to quickly generate inputs that meet the precondition, in order to maximize the set of program behaviors that are probed. For semantically rich preconditions, purely random generation is unlikely to produce many valid inputs; when this occurs, users are forced to manually write their own specialized input generators. One common problem with handwritten generators is that they may be incomplete, i.e., they are unable to generate some values meeting the target precondition. This paper presents a novel program repair technique that patches an incomplete generator so that its range includes every valid input. Our approach uses a novel enumerative synthesis algorithm that leverages the recently developed notion of coverage types to characterize the set of missing test values as well as the coverage provided by candidate repairs. We have implemented a repair tool for OCaml generators, called Cobb, and used it to repair a suite of benchmarks drawn from the PBT literature.

2025-04-08T20:40:16Z Extended version of OOPSLA 2025 paper Patrick LaFontaine Zhe Zhou Ashish Mishra Suresh Jagannathan Benjamin Delaware 10.1145/3763158