https://arxiv.org/api/aQVzceBucXuxjLHfwbabYYHx2ZE2026-06-26T14:53:01Z9951129015http://arxiv.org/abs/2310.06959v7Proof Repair across Quotient Type Equivalences2025-08-22T18:39:40ZProofs in proof assistants like Rocq can be brittle, breaking easily in response to changes. To address this, recent work introduced an algorithm and tool in Rocq to automatically repair broken proofs in response to changes that correspond to type equivalences. However, many changes remained out of the scope of this algorithm and tool -- especially changes in underlying \emph{behavior}. We extend this proof repair algorithm so that it can express certain changes in behavior that were previously out of scope. We focus in particular on equivalences between \emph{quotient types} -- types equipped with a relation that describes what it means for any two elements of that type to be equal. Quotient type equivalences can be used to express interesting changes in representations of mathematical structures, as well as changes in the implementations of data structures.
We extend this algorithm and tool to support quotient type equivalences in Rocq. Notably, since Rocq lacks quotient types entirely, our extensions use Rocq's setoid machinery in place of quotients. Specifically, (1) our extension to the algorithm supports new changes corresponding to setoids, and (2) our extension to the tool supports this new class of changes and further automates away some of the new proof obligations. We demonstrate our extensions on proof repair case studies for previously unsupported changes. We also perform manual proof repair in Cubical Agda, a language with a univalent metatheory, which allows us to construct the first ever internal proofs of correctness for proof repair.2023-10-10T19:21:53Z26 pages, 22 figures, for associated code, see https://github.com/uwplse/pumpkin-pi/releases/tag/oopsla2025Cosmo ViolaMax FanTalia Ringerhttp://arxiv.org/abs/2508.16522v1On the Duality of Task and Actor Programming Models2025-08-22T16:45:03ZProgramming models for distributed and heterogeneous machines are rapidly growing in popularity to meet the demands of modern workloads. Task and actor models are common choices that offer different trade-offs between development productivity and achieved performance. Task-based models offer better productivity and composition of software, whereas actor-based models routinely deliver better peak performance due to lower overheads. While task-based and actor-based models appear to be different superficially, we demonstrate these programming models are duals of each other. Importantly, we show that this duality extends beyond functionality to performance, and elucidate techniques that let task-based systems deliver performance competitive with actor-based systems without compromising productivity. We apply these techniques to both Realm, an explicitly parallel task-based runtime, as well as Legion, an implicitly parallel task-based runtime. We show these techniques reduce Realm's overheads by between 1.7-5.3x, coming within a factor of two of the overheads imposed by heavily optimized actor-based systems like Charm++ and MPI. We further show that our techniques enable between 1.3-5.0x improved strong scaling of unmodified Legion applications.2025-08-22T16:45:03ZRohan YadavJoseph GumanSean TreichlerMichael GarlandAlex AikenFredrik KjolstadMichael Bauerhttp://arxiv.org/abs/2508.16517v1ARSP: Automated Repair of Verilog Designs via Semantic Partitioning2025-08-22T16:40:17ZDebugging functional Verilog bugs consumes a significant portion of front-end design time. While Large Language Models (LLMs) have demonstrated great potential in mitigating this effort, existing LLM-based automated debugging methods underperform on industrial-scale modules. A major reason for this is bug signal dilution in long contexts, where a few bug-relevant tokens are overwhelmed by hundreds of unrelated lines, diffusing the model's attention. To address this issue, we introduce ARSP, a two-stage system that mitigates dilution via semantics-guided fragmentation. A Partition LLM splits a module into semantically tight fragments; a Repair LLM patches each fragment; edits are merged without altering unrelated logic. A synthetic data framework generates fragment-level training pairs spanning bug types, design styles, and scales to supervise both models. Experiments show that ARSP achieves 77.92% pass@1 and 83.88% pass@5, outperforming mainstream commercial LLMs including Claude-3.7 and SOTA automated Verilog debugging tools Strider and MEIC. Also, semantic partitioning improves pass@1 by 11.6% and pass@5 by 10.2% over whole-module debugging, validating the effectiveness of fragment-level scope reduction in LLM-based Verilog debugging.2025-08-22T16:40:17ZBingkun YaoNing WangXiangfeng LiuYuxin DuYuchen HuHong GaoZhe JiangNan Guanhttp://arxiv.org/abs/2508.15898v1Automated Formal Verification of a Software Fault Isolation System2025-08-21T18:00:19ZSoftware fault isolation (SFI) is a popular way to sandbox untrusted software. A key component of SFI is the verifier that checks the untrusted code is written in a subset of the machine language that guarantees it never reads or writes outside of a region of memory dedicated to the sandbox. Soundness bugs in the SFI verifier would break the SFI security model and allow the supposedly sandboxed code to read protected memory. In this paper, we address the concern of SFI verifier bugs by performing an automated formal verification of a recent SFI system called Lightweight Fault Isolation (LFI). In particular, we formally verify that programs accepted by the LFI verifier never read or write to memory outside of a designated sandbox region.2025-08-21T18:00:19ZShort paper to appear at FMCAD 2025, https://fmcad.org/Matthew SotoudehZachary Yedidiahttp://arxiv.org/abs/2410.14906v7Structural Temporal Logic for Mechanized Program Verification2025-08-21T16:35:36ZMechanized verification of liveness properties for infinite programs with effects and nondeterminism is challenging. Existing temporal reasoning frameworks operate at the level of models such as traces and automata. Reasoning happens at a very low-level, requiring complex nested (co-)inductive proof techniques and familiarity with proof assistant mechanics (e.g., the guardedness checker). Further, reasoning at the level of models instead of program constructs creates a verification gap that loses the benefits of modularity and composition enjoyed by structural program logics such as Hoare Logic. To address this verification gap, and the lack of compositional proof techniques for temporal specifications, we propose Ticl, a new structural temporal logic. Using ticl, we encode complex (co-)inductive proof techniques as structural lemmas and focus our reasoning on variants and invariants. We show that it is possible to perform compositional proofs of general temporal properties in a proof assistant, while working at a high level of abstraction. We demonstrate the benefits of Ticl by giving mechanized proofs of safety and liveness properties for programs with scheduling, concurrent shared memory, and distributed consensus, demonstrating a low proof-to-code ratio.2024-10-18T23:19:29ZProc. ACM Program. Lang., Vol. 9, No. OOPSLA2, Article 313. Publication date: October 2025Eleftherios IoannidisYannick ZakowskiSteve ZdancewicSebastian Angel10.1145/3763091http://arxiv.org/abs/2507.05234v2React-tRace: A Semantics for Understanding React Hooks2025-08-21T14:24:35ZReact has become the most widely used web front-end framework, enabling the creation of user interfaces in a declarative and compositional manner. Hooks are a set of APIs that manage side effects in function components in React. However, their semantics are often seen as opaque to developers, leading to UI bugs. We introduce React-tRace, a formalization of the semantics of the essence of React Hooks, providing a semantics that clarifies their behavior. We demonstrate that our model captures the behavior of React, by theoretically showing that it embodies essential properties of Hooks and empirically comparing our React-tRace-definitional interpreter against a test suite. Furthermore, we showcase a practical visualization tool based on the formalization to demonstrate how developers can better understand the semantics of Hooks.2025-07-07T17:42:17ZTo be published in OOPSLA2 2025Jay LeeJoongwon AhnKwangkeun Yi10.1145/3763067http://arxiv.org/abs/2508.15333v1Fair Termination for Resource-Aware Active Objects2025-08-21T07:56:23ZActive object systems are a model of distributed computation that has been adopted for modelling distributed systems and business process workflows. This field of modelling is, in essence, concurrent and resource-aware, motivating the development of resource-aware formalisations on the active object model. The contributions of this work are the development of a core calculus for resource-aware active objects together with a type system ensuring that well-typed programs are fairly terminating, i.e., they can always eventually terminate. To achieve this, we combine techniques from graded semantics and type systems, which are quite well understood for sequential programs, with those for fair termination, which have been developed for synchronous~sessions.2025-08-21T07:56:23Z18 pages, 12 pages of appendix, 12 figures, APLAS 2025Francesco DagninoPaola GianniniViolet Ka I PunUlises Torrellahttp://arxiv.org/abs/2504.17444v2Encode the $\forall\exists$ Relational Hoare Logic into Standard Hoare Logic2025-08-21T05:06:36ZVerifying a real-world program's functional correctness can be decomposed into (1) a refinement proof showing that the program implements a more abstract high-level program and (2) an algorithm correctness proof at the high level. Relational Hoare logic serves as a powerful tool to establish refinement but often necessitates formalization beyond standard Hoare logic. Particularly in the nondeterministic setting, the $\forall\exists$ relational Hoare logic is required. Existing approaches encode this logic into a Hoare logic with ghost states and invariants, yet these extensions significantly increase formalization complexity and soundness proof overhead. This paper proposes a generic encoding theory that reduces the $\forall\exists$ relational Hoare logic to standard (unary) Hoare logic. Precisely, we propose to redefine the validity of relational Hoare triples while preserving the original proof rules and then encapsulate the $\forall\exists$ pattern within assertions. We have proved that the validity of encoded standard Hoare triples is equivalent to the validity of the desired relational Hoare triples. Moreover, the encoding theory demonstrates how common relational Hoare logic proof rules are indeed special cases of standard Hoare logic proof rules, and relational proof steps correspond to standard proof steps. Our theory enables standard Hoare logic to prove $\forall\exists$ relational properties by defining a predicate Exec, without requiring modifications to the logic framework or re-verification of soundness.2025-04-24T11:13:54ZExtended version of the paper accepted at OOPSLA 2025 R2Shushu WuXiwei WuQinxiang Cao10.1145/3763138http://arxiv.org/abs/2508.15166v1Probabilistic Inference for Datalog with Correlated Inputs2025-08-21T02:10:14ZProbabilistic extensions of logic programming languages, such as ProbLog, integrate logical reasoning with probabilistic inference to evaluate probabilities of output relations; however, prior work does not account for potential statistical correlations among input facts. This paper introduces Praline, a new extension to Datalog designed for precise probabilistic inference in the presence of (partially known) input correlations. We formulate the inference task as a constrained optimization problem, where the solution yields sound and precise probability bounds for output facts. However, due to the complexity of the resulting optimization problem, this approach alone often does not scale to large programs. To address scalability, we propose a more efficient $δ$-exact inference algorithm that leverages constraint solving, static analysis, and iterative refinement. Our empirical evaluation on challenging real-world benchmarks, including side-channel analysis, demonstrates that our method not only scales effectively but also delivers tight probability bounds.2025-08-21T02:10:14ZAccepted for publication at OOPSLA 2025 (R2)Jingbo WangShashin HalalingaiahWeiyi ChenChao WangIsil Dillighttp://arxiv.org/abs/2508.15109v1Homomorphism Calculus for User-Defined Aggregations2025-08-20T22:56:38ZData processing frameworks like Apache Spark and Flink provide built-in support for user-defined aggregation functions (UDAFs), enabling the integration of domain-specific logic. However, for these frameworks to support \emph{efficient} UDAF execution, the function needs to satisfy a \emph{homomorphism property}, which ensures that partial results from independent computations can be merged correctly. Motivated by this problem, this paper introduces a novel \emph{homomorphism calculus} that can both verify and refute whether a UDAF is a dataframe homomorphism. If so, our calculus also enables the construction of a corresponding merge operator which can be used for incremental computation and parallel execution. We have implemented an algorithm based on our proposed calculus and evaluate it on real-world UDAFs, demonstrating that our approach significantly outperforms two leading synthesizers.2025-08-20T22:56:38ZProc. ACM Program. Lang. 9, OOPSLA2, Article 294 (October 2025), 27 pagesZiteng WangRuijie FangLinus ZhengDixin TangIsil Dillig10.1145/3763072http://arxiv.org/abs/2506.10913v2Choreographic Quick Changes: First-Class Location (Set) Polymorphism2025-08-20T21:50:16ZChoreographic programming is a promising new paradigm for programming concurrent systems where a developer writes a single centralized program that compiles to individual programs for each node. Existing choreographic languages, however, lack critical features integral to modern systems, like the ability of one node to dynamically compute who should perform a computation and send that decision to others. This work addresses this gap with $λ_{QC}$, the first typed choreographic language with \emph{first class process names} and polymorphism over both types and (sets of) locations. $λ_{QC}$ also improves expressive power over previous work by supporting algebraic and recursive data types as well as multiply-located values. We formalize and mechanically verify our results in Rocq, including the standard choreographic guarantee of deadlock freedom.2025-06-12T17:27:48Z36 pages, 11 figuresAshley SamuelsonAndrew K. HirschEthan Cecchettihttp://arxiv.org/abs/2508.15866v1Correctness-Guaranteed Code Generation via Constrained Decoding2025-08-20T20:48:18ZLanguage Models (LMs) are increasingly being used for code generation, but ensuring the correctness of generated programs remains a significant challenge. Although imperfect code may be acceptable during software development with human oversight, domains such as video games and robotics require one-shot correctness for runtime-critical components. We present a constrained decoding algorithm for generating semantically correct programs that incorporates a context-sensitive parser, which, at each step, outputs a regular expression that satisfies a critical non-extensible property to guide the generation of the next token sequence that can continue to a correct program. To build such a context-sensitive parser, we propose a framework of a dynamic tree of parsers (ToP) during parsing, where each parser corresponds to a modular context-free grammar enriched with contextual information such as variable scopes and type constraints, with tree branches representing ambiguity in the future code segment. We demonstrate our approach through sLua, a strongly typed variant of Lua, showing that our method can generate semantically correct programs conforming to any prescribed scripting API. We further show that, with careful design, our semantic guarantees extend to runtime correctness, as validated in the application of generating game mechanics for a roguelike video game.2025-08-20T20:48:18ZPublished at COLM 2025Lingxiao LiSalar RahiliYiwei Zhaohttp://arxiv.org/abs/2501.14421v2Reasoning about Weak Isolation Levels in Separation Logic2025-08-20T18:30:22ZConsistency guarantees among concurrently executing transactions in local- and distributed systems, commonly referred to as isolation levels, have been formalized in a number of models. Thus far, no model can reason about executable implementations of databases or local transaction libraries providing weak isolation levels. Weak isolation levels are characterized by being highly concurrent and, unlike their stronger counterpart serializability, they are not equivalent to the consistency guarantees provided by a transaction library implemented using a global lock. Industrial-strength databases almost exclusively implement weak isolation levels as their default level. This calls for formalism as numerous bugs violating isolation have been detected in these databases. In this paper, we formalize three weak isolation levels in separation logic, namely read uncommitted, read committed, and snapshot isolation. We define modular separation logic specifications that are independent of the underlying transaction library implementation. Historically, isolation levels have been specified using examples of executions between concurrent transactions that are not allowed to occur, and we demonstrate that our specifications correctly prohibit such examples. To show that our specifications are realizable, we formally verify that an executable implementation of a key-value database running the multi-version concurrency control algorithm from the original snapshot isolation paper satisfies our specification of snapshot isolation. Moreover, we prove implications between the specifications -- snapshot isolation implies read committed and read committed implies read uncommitted -- and thus the verification effort of the database serves as proof that all of our specifications are realizable. All results are mechanized in the Rocq proof assistant on top of the Iris separation logic framework.2025-01-24T11:46:38ZProceedings of the ACM on Programming Languages, Volume 9, Issue ICFP, August 2025, Article No. 246, Pages 306 - 340Anders Alnor MathiasenLéon GondelmanLéon DucruetAmin TimanyLars Birkedal10.1145/3747515http://arxiv.org/abs/2506.09043v3Gradual Metaprogramming2025-08-20T13:01:01ZData engineers increasingly use domain-specific languages (DSLs) to generate the code for data pipelines. Such DSLs are often embedded in Python. Unfortunately, there are challenges in debugging the generation of data pipelines: an error in a Python DSL script is often detected too late, after the execution of the script, and the source code location that triggers the error is hard to pinpoint.
In this paper, we focus on the scenario where a DSL embedded in Python (so it is dynamically-typed) generates data pipeline description code that is statically-typed. We propose gradual metaprogramming to (1) provide a migration path toward statically typed DSLs, (2) immediately provide earlier detection of code generation type errors, and (3) report the source code location responsible for the type error. Gradual metaprogramming accomplishes this by type checking code fragments and incrementally performing runtime checks as they are spliced together. We define MetaGTLC, a metaprogramming calculus in which a gradually-typed metalanguage manipulates a statically-typed object language, and give semantics to it by translation to the cast calculus MetaCC. We prove that successful metaevaluation always generates a well-typed object program and mechanize the proof in Agda.2025-06-10T17:58:32Z14 pages, 10 figuresTianyu ChenDarshal ShettyJeremy G. SiekChao-Hong ChenWeixi MaArnaud VenetRocky Liu10.1145/3759538.3759650http://arxiv.org/abs/2410.22261v2Proto-Quipper with Reversing and Control2025-08-20T12:43:27ZThe quantum programming language Quipper supports circuit operations such as reversing and controlling certain quantum circuits. Additionally, Quipper provides a function called with-computed, which can be used to program circuits of the form g; f; g-dagger. The latter is a common pattern in quantum circuit design. One benefit of using with-computed, as opposed to constructing the circuit g ; f; g-dagger directly from g, f, and g-dagger, is that it facilitates an important optimization. Namely, if the resulting circuit is later controlled, only f needs to be controlled; the circuits g and g-dagger need not even be controllable.
In this paper, we formalize a semantics for reversible and controllable circuits, using a dagger symmetric monoidal category R to interpret reversible circuits, and a new notion we call a controllable category N, which encompasses the control and with-computed operations in Quipper. We extend the language Proto-Quipper with reversing, control and the with-computed operation. Since not all circuits are reversible and/or controllable, we use a type system with modalities to track reversibility and controllability. This generalizes the modality of Fu-Kishida-Ross-Selinger 2023. We give an abstract categorical semantics, and show that the type system and operational semantics are sound with respect to this semantics.2024-10-29T17:22:01ZIn Proceedings QPL 2025, arXiv:2508.13619EPTCS 426, 2025, pp. 1-22Peng FuUniversity of South CarolinaKohei KishidaUniversity of Illinois Urbana-ChampaignNeil J. RossDalhousie UniversityPeter SelingerDalhousie University10.4204/EPTCS.426.1