https://arxiv.org/api/aQVzceBucXuxjLHfwbabYYHx2ZE 2026-06-26T14:53:01Z 9951 1290 15 http://arxiv.org/abs/2310.06959v7 Proof Repair across Quotient Type Equivalences 2025-08-22T18:39:40Z

Proofs in proof assistants like Rocq can be brittle, breaking easily in response to changes. To address this, recent work introduced an algorithm and tool in Rocq to automatically repair broken proofs in response to changes that correspond to type equivalences. However, many changes remained out of the scope of this algorithm and tool -- especially changes in underlying \emph{behavior}. We extend this proof repair algorithm so that it can express certain changes in behavior that were previously out of scope. We focus in particular on equivalences between \emph{quotient types} -- types equipped with a relation that describes what it means for any two elements of that type to be equal. Quotient type equivalences can be used to express interesting changes in representations of mathematical structures, as well as changes in the implementations of data structures. We extend this algorithm and tool to support quotient type equivalences in Rocq. Notably, since Rocq lacks quotient types entirely, our extensions use Rocq's setoid machinery in place of quotients. Specifically, (1) our extension to the algorithm supports new changes corresponding to setoids, and (2) our extension to the tool supports this new class of changes and further automates away some of the new proof obligations. We demonstrate our extensions on proof repair case studies for previously unsupported changes. We also perform manual proof repair in Cubical Agda, a language with a univalent metatheory, which allows us to construct the first ever internal proofs of correctness for proof repair.

2023-10-10T19:21:53Z 26 pages, 22 figures, for associated code, see https://github.com/uwplse/pumpkin-pi/releases/tag/oopsla2025 Cosmo Viola Max Fan Talia Ringer http://arxiv.org/abs/2508.16522v1 On the Duality of Task and Actor Programming Models 2025-08-22T16:45:03Z

Programming models for distributed and heterogeneous machines are rapidly growing in popularity to meet the demands of modern workloads. Task and actor models are common choices that offer different trade-offs between development productivity and achieved performance. Task-based models offer better productivity and composition of software, whereas actor-based models routinely deliver better peak performance due to lower overheads. While task-based and actor-based models appear to be different superficially, we demonstrate these programming models are duals of each other. Importantly, we show that this duality extends beyond functionality to performance, and elucidate techniques that let task-based systems deliver performance competitive with actor-based systems without compromising productivity. We apply these techniques to both Realm, an explicitly parallel task-based runtime, as well as Legion, an implicitly parallel task-based runtime. We show these techniques reduce Realm's overheads by between 1.7-5.3x, coming within a factor of two of the overheads imposed by heavily optimized actor-based systems like Charm++ and MPI. We further show that our techniques enable between 1.3-5.0x improved strong scaling of unmodified Legion applications.

2025-08-22T16:45:03Z Rohan Yadav Joseph Guman Sean Treichler Michael Garland Alex Aiken Fredrik Kjolstad Michael Bauer http://arxiv.org/abs/2508.16517v1 ARSP: Automated Repair of Verilog Designs via Semantic Partitioning 2025-08-22T16:40:17Z

Debugging functional Verilog bugs consumes a significant portion of front-end design time. While Large Language Models (LLMs) have demonstrated great potential in mitigating this effort, existing LLM-based automated debugging methods underperform on industrial-scale modules. A major reason for this is bug signal dilution in long contexts, where a few bug-relevant tokens are overwhelmed by hundreds of unrelated lines, diffusing the model's attention. To address this issue, we introduce ARSP, a two-stage system that mitigates dilution via semantics-guided fragmentation. A Partition LLM splits a module into semantically tight fragments; a Repair LLM patches each fragment; edits are merged without altering unrelated logic. A synthetic data framework generates fragment-level training pairs spanning bug types, design styles, and scales to supervise both models. Experiments show that ARSP achieves 77.92% pass@1 and 83.88% pass@5, outperforming mainstream commercial LLMs including Claude-3.7 and SOTA automated Verilog debugging tools Strider and MEIC. Also, semantic partitioning improves pass@1 by 11.6% and pass@5 by 10.2% over whole-module debugging, validating the effectiveness of fragment-level scope reduction in LLM-based Verilog debugging.

2025-08-22T16:40:17Z Bingkun Yao Ning Wang Xiangfeng Liu Yuxin Du Yuchen Hu Hong Gao Zhe Jiang Nan Guan http://arxiv.org/abs/2508.15898v1 Automated Formal Verification of a Software Fault Isolation System 2025-08-21T18:00:19Z

Software fault isolation (SFI) is a popular way to sandbox untrusted software. A key component of SFI is the verifier that checks the untrusted code is written in a subset of the machine language that guarantees it never reads or writes outside of a region of memory dedicated to the sandbox. Soundness bugs in the SFI verifier would break the SFI security model and allow the supposedly sandboxed code to read protected memory. In this paper, we address the concern of SFI verifier bugs by performing an automated formal verification of a recent SFI system called Lightweight Fault Isolation (LFI). In particular, we formally verify that programs accepted by the LFI verifier never read or write to memory outside of a designated sandbox region.

2025-08-21T18:00:19Z Short paper to appear at FMCAD 2025, https://fmcad.org/ Matthew Sotoudeh Zachary Yedidia http://arxiv.org/abs/2410.14906v7 Structural Temporal Logic for Mechanized Program Verification 2025-08-21T16:35:36Z

Mechanized verification of liveness properties for infinite programs with effects and nondeterminism is challenging. Existing temporal reasoning frameworks operate at the level of models such as traces and automata. Reasoning happens at a very low-level, requiring complex nested (co-)inductive proof techniques and familiarity with proof assistant mechanics (e.g., the guardedness checker). Further, reasoning at the level of models instead of program constructs creates a verification gap that loses the benefits of modularity and composition enjoyed by structural program logics such as Hoare Logic. To address this verification gap, and the lack of compositional proof techniques for temporal specifications, we propose Ticl, a new structural temporal logic. Using ticl, we encode complex (co-)inductive proof techniques as structural lemmas and focus our reasoning on variants and invariants. We show that it is possible to perform compositional proofs of general temporal properties in a proof assistant, while working at a high level of abstraction. We demonstrate the benefits of Ticl by giving mechanized proofs of safety and liveness properties for programs with scheduling, concurrent shared memory, and distributed consensus, demonstrating a low proof-to-code ratio.

2024-10-18T23:19:29Z Proc. ACM Program. Lang., Vol. 9, No. OOPSLA2, Article 313. Publication date: October 2025 Eleftherios Ioannidis Yannick Zakowski Steve Zdancewic Sebastian Angel 10.1145/3763091 http://arxiv.org/abs/2507.05234v2 React-tRace: A Semantics for Understanding React Hooks 2025-08-21T14:24:35Z

React has become the most widely used web front-end framework, enabling the creation of user interfaces in a declarative and compositional manner. Hooks are a set of APIs that manage side effects in function components in React. However, their semantics are often seen as opaque to developers, leading to UI bugs. We introduce React-tRace, a formalization of the semantics of the essence of React Hooks, providing a semantics that clarifies their behavior. We demonstrate that our model captures the behavior of React, by theoretically showing that it embodies essential properties of Hooks and empirically comparing our React-tRace-definitional interpreter against a test suite. Furthermore, we showcase a practical visualization tool based on the formalization to demonstrate how developers can better understand the semantics of Hooks.

2025-07-07T17:42:17Z To be published in OOPSLA2 2025 Jay Lee Joongwon Ahn Kwangkeun Yi 10.1145/3763067 http://arxiv.org/abs/2508.15333v1 Fair Termination for Resource-Aware Active Objects 2025-08-21T07:56:23Z

Active object systems are a model of distributed computation that has been adopted for modelling distributed systems and business process workflows. This field of modelling is, in essence, concurrent and resource-aware, motivating the development of resource-aware formalisations on the active object model. The contributions of this work are the development of a core calculus for resource-aware active objects together with a type system ensuring that well-typed programs are fairly terminating, i.e., they can always eventually terminate. To achieve this, we combine techniques from graded semantics and type systems, which are quite well understood for sequential programs, with those for fair termination, which have been developed for synchronous~sessions.

2025-08-21T07:56:23Z 18 pages, 12 pages of appendix, 12 figures, APLAS 2025 Francesco Dagnino Paola Giannini Violet Ka I Pun Ulises Torrella http://arxiv.org/abs/2504.17444v2 Encode the $\forall\exists$ Relational Hoare Logic into Standard Hoare Logic 2025-08-21T05:06:36Z

Verifying a real-world program's functional correctness can be decomposed into (1) a refinement proof showing that the program implements a more abstract high-level program and (2) an algorithm correctness proof at the high level. Relational Hoare logic serves as a powerful tool to establish refinement but often necessitates formalization beyond standard Hoare logic. Particularly in the nondeterministic setting, the $\forall\exists$ relational Hoare logic is required. Existing approaches encode this logic into a Hoare logic with ghost states and invariants, yet these extensions significantly increase formalization complexity and soundness proof overhead. This paper proposes a generic encoding theory that reduces the $\forall\exists$ relational Hoare logic to standard (unary) Hoare logic. Precisely, we propose to redefine the validity of relational Hoare triples while preserving the original proof rules and then encapsulate the $\forall\exists$ pattern within assertions. We have proved that the validity of encoded standard Hoare triples is equivalent to the validity of the desired relational Hoare triples. Moreover, the encoding theory demonstrates how common relational Hoare logic proof rules are indeed special cases of standard Hoare logic proof rules, and relational proof steps correspond to standard proof steps. Our theory enables standard Hoare logic to prove $\forall\exists$ relational properties by defining a predicate Exec, without requiring modifications to the logic framework or re-verification of soundness.

2025-04-24T11:13:54Z Extended version of the paper accepted at OOPSLA 2025 R2 Shushu Wu Xiwei Wu Qinxiang Cao 10.1145/3763138 http://arxiv.org/abs/2508.15166v1 Probabilistic Inference for Datalog with Correlated Inputs 2025-08-21T02:10:14Z

Probabilistic extensions of logic programming languages, such as ProbLog, integrate logical reasoning with probabilistic inference to evaluate probabilities of output relations; however, prior work does not account for potential statistical correlations among input facts. This paper introduces Praline, a new extension to Datalog designed for precise probabilistic inference in the presence of (partially known) input correlations. We formulate the inference task as a constrained optimization problem, where the solution yields sound and precise probability bounds for output facts. However, due to the complexity of the resulting optimization problem, this approach alone often does not scale to large programs. To address scalability, we propose a more efficient $δ$-exact inference algorithm that leverages constraint solving, static analysis, and iterative refinement. Our empirical evaluation on challenging real-world benchmarks, including side-channel analysis, demonstrates that our method not only scales effectively but also delivers tight probability bounds.

2025-08-21T02:10:14Z Accepted for publication at OOPSLA 2025 (R2) Jingbo Wang Shashin Halalingaiah Weiyi Chen Chao Wang Isil Dillig http://arxiv.org/abs/2508.15109v1 Homomorphism Calculus for User-Defined Aggregations 2025-08-20T22:56:38Z

Data processing frameworks like Apache Spark and Flink provide built-in support for user-defined aggregation functions (UDAFs), enabling the integration of domain-specific logic. However, for these frameworks to support \emph{efficient} UDAF execution, the function needs to satisfy a \emph{homomorphism property}, which ensures that partial results from independent computations can be merged correctly. Motivated by this problem, this paper introduces a novel \emph{homomorphism calculus} that can both verify and refute whether a UDAF is a dataframe homomorphism. If so, our calculus also enables the construction of a corresponding merge operator which can be used for incremental computation and parallel execution. We have implemented an algorithm based on our proposed calculus and evaluate it on real-world UDAFs, demonstrating that our approach significantly outperforms two leading synthesizers.

2025-08-20T22:56:38Z Proc. ACM Program. Lang. 9, OOPSLA2, Article 294 (October 2025), 27 pages Ziteng Wang Ruijie Fang Linus Zheng Dixin Tang Isil Dillig 10.1145/3763072 http://arxiv.org/abs/2506.10913v2 Choreographic Quick Changes: First-Class Location (Set) Polymorphism 2025-08-20T21:50:16Z

Choreographic programming is a promising new paradigm for programming concurrent systems where a developer writes a single centralized program that compiles to individual programs for each node. Existing choreographic languages, however, lack critical features integral to modern systems, like the ability of one node to dynamically compute who should perform a computation and send that decision to others. This work addresses this gap with $λ_{QC}$, the first typed choreographic language with \emph{first class process names} and polymorphism over both types and (sets of) locations. $λ_{QC}$ also improves expressive power over previous work by supporting algebraic and recursive data types as well as multiply-located values. We formalize and mechanically verify our results in Rocq, including the standard choreographic guarantee of deadlock freedom.

2025-06-12T17:27:48Z 36 pages, 11 figures Ashley Samuelson Andrew K. Hirsch Ethan Cecchetti http://arxiv.org/abs/2508.15866v1 Correctness-Guaranteed Code Generation via Constrained Decoding 2025-08-20T20:48:18Z

Language Models (LMs) are increasingly being used for code generation, but ensuring the correctness of generated programs remains a significant challenge. Although imperfect code may be acceptable during software development with human oversight, domains such as video games and robotics require one-shot correctness for runtime-critical components. We present a constrained decoding algorithm for generating semantically correct programs that incorporates a context-sensitive parser, which, at each step, outputs a regular expression that satisfies a critical non-extensible property to guide the generation of the next token sequence that can continue to a correct program. To build such a context-sensitive parser, we propose a framework of a dynamic tree of parsers (ToP) during parsing, where each parser corresponds to a modular context-free grammar enriched with contextual information such as variable scopes and type constraints, with tree branches representing ambiguity in the future code segment. We demonstrate our approach through sLua, a strongly typed variant of Lua, showing that our method can generate semantically correct programs conforming to any prescribed scripting API. We further show that, with careful design, our semantic guarantees extend to runtime correctness, as validated in the application of generating game mechanics for a roguelike video game.

2025-08-20T20:48:18Z Published at COLM 2025 Lingxiao Li Salar Rahili Yiwei Zhao http://arxiv.org/abs/2501.14421v2 Reasoning about Weak Isolation Levels in Separation Logic 2025-08-20T18:30:22Z

Consistency guarantees among concurrently executing transactions in local- and distributed systems, commonly referred to as isolation levels, have been formalized in a number of models. Thus far, no model can reason about executable implementations of databases or local transaction libraries providing weak isolation levels. Weak isolation levels are characterized by being highly concurrent and, unlike their stronger counterpart serializability, they are not equivalent to the consistency guarantees provided by a transaction library implemented using a global lock. Industrial-strength databases almost exclusively implement weak isolation levels as their default level. This calls for formalism as numerous bugs violating isolation have been detected in these databases. In this paper, we formalize three weak isolation levels in separation logic, namely read uncommitted, read committed, and snapshot isolation. We define modular separation logic specifications that are independent of the underlying transaction library implementation. Historically, isolation levels have been specified using examples of executions between concurrent transactions that are not allowed to occur, and we demonstrate that our specifications correctly prohibit such examples. To show that our specifications are realizable, we formally verify that an executable implementation of a key-value database running the multi-version concurrency control algorithm from the original snapshot isolation paper satisfies our specification of snapshot isolation. Moreover, we prove implications between the specifications -- snapshot isolation implies read committed and read committed implies read uncommitted -- and thus the verification effort of the database serves as proof that all of our specifications are realizable. All results are mechanized in the Rocq proof assistant on top of the Iris separation logic framework.

2025-01-24T11:46:38Z Proceedings of the ACM on Programming Languages, Volume 9, Issue ICFP, August 2025, Article No. 246, Pages 306 - 340 Anders Alnor Mathiasen Léon Gondelman Léon Ducruet Amin Timany Lars Birkedal 10.1145/3747515 http://arxiv.org/abs/2506.09043v3 Gradual Metaprogramming 2025-08-20T13:01:01Z

Data engineers increasingly use domain-specific languages (DSLs) to generate the code for data pipelines. Such DSLs are often embedded in Python. Unfortunately, there are challenges in debugging the generation of data pipelines: an error in a Python DSL script is often detected too late, after the execution of the script, and the source code location that triggers the error is hard to pinpoint. In this paper, we focus on the scenario where a DSL embedded in Python (so it is dynamically-typed) generates data pipeline description code that is statically-typed. We propose gradual metaprogramming to (1) provide a migration path toward statically typed DSLs, (2) immediately provide earlier detection of code generation type errors, and (3) report the source code location responsible for the type error. Gradual metaprogramming accomplishes this by type checking code fragments and incrementally performing runtime checks as they are spliced together. We define MetaGTLC, a metaprogramming calculus in which a gradually-typed metalanguage manipulates a statically-typed object language, and give semantics to it by translation to the cast calculus MetaCC. We prove that successful metaevaluation always generates a well-typed object program and mechanize the proof in Agda.

2025-06-10T17:58:32Z 14 pages, 10 figures Tianyu Chen Darshal Shetty Jeremy G. Siek Chao-Hong Chen Weixi Ma Arnaud Venet Rocky Liu 10.1145/3759538.3759650 http://arxiv.org/abs/2410.22261v2 Proto-Quipper with Reversing and Control 2025-08-20T12:43:27Z

The quantum programming language Quipper supports circuit operations such as reversing and controlling certain quantum circuits. Additionally, Quipper provides a function called with-computed, which can be used to program circuits of the form g; f; g-dagger. The latter is a common pattern in quantum circuit design. One benefit of using with-computed, as opposed to constructing the circuit g ; f; g-dagger directly from g, f, and g-dagger, is that it facilitates an important optimization. Namely, if the resulting circuit is later controlled, only f needs to be controlled; the circuits g and g-dagger need not even be controllable. In this paper, we formalize a semantics for reversible and controllable circuits, using a dagger symmetric monoidal category R to interpret reversible circuits, and a new notion we call a controllable category N, which encompasses the control and with-computed operations in Quipper. We extend the language Proto-Quipper with reversing, control and the with-computed operation. Since not all circuits are reversible and/or controllable, we use a type system with modalities to track reversibility and controllability. This generalizes the modality of Fu-Kishida-Ross-Selinger 2023. We give an abstract categorical semantics, and show that the type system and operational semantics are sound with respect to this semantics.

2024-10-29T17:22:01Z In Proceedings QPL 2025, arXiv:2508.13619 EPTCS 426, 2025, pp. 1-22 Peng Fu University of South Carolina Kohei Kishida University of Illinois Urbana-Champaign Neil J. Ross Dalhousie University Peter Selinger Dalhousie University 10.4204/EPTCS.426.1