http://arxiv.org/abs/2606.08465v1 An Empirical Comparison of General Context-Free Parsers 2026-06-07T05:58:58Z

Parsing underpins a vast range of software engineering tasks, from compilers and static analyzers to language servers and fuzz testing tools. Yet most parsers deployed in practice are deterministic (LL or LR), forcing developers not only to contort their grammars to fit the parser, but also to simplify the very languages they design, sacrificing expressiveness for the sake of parseability. General context-free parsers eliminate this constraint. However, despite decades of algorithmic development, no rigorous head-to-head comparison exists across the major families of parsing algorithms. We present the first unified, controlled benchmark of six generalized parsing algorithms (CYK, Valiant, Earley, GLL, RNGLR, and BRNGLR) plus deterministic LL(1) and LR(1) baselines, all implemented in Rust with shared data structures and parse-tree extraction, and evaluated across 22 grammars ranging from simple expressions to full C++ and Java. Our results show that the cost of generality is lower than widely assumed. On deterministic grammars, the GLR family incurs only a 3x median slowdown over LR(1), with narrow and predictable variance. GLR is the clear performance winner among generalized parsers and a practical default choice for software engineering tools.
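
For readers unfamiliar with general context-free parsing, the following is a minimal sketch of a CYK recognizer, the simplest of the six algorithms named above. It is not the paper's Rust implementation; the grammar encoding (`terminal_rules`, `binary_rules`) and the toy grammar are assumptions made for illustration, and the grammar must already be in Chomsky normal form.

```python
from itertools import product


def cyk_recognize(tokens, terminal_rules, binary_rules, start="S"):
    """Return True if `tokens` can be derived from `start` (grammar in CNF)."""
    n = len(tokens)
    if n == 0:
        return False  # this sketch does not handle the empty string
    # table[i][j] holds the nonterminals that derive tokens[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][0] = set(terminal_rules.get(tok, ()))
    for span in range(2, n + 1):            # length of the substring
        for i in range(n - span + 1):       # start of the substring
            for split in range(1, span):    # length of the left part
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for b, c in product(left, right):
                    table[i][span - 1] |= binary_rules.get((b, c), set())
    return start in table[0][n - 1]


# Hypothetical ambiguous grammar E -> E + E | a, rewritten in CNF as
# E -> E R, R -> P E, P -> '+', E -> 'a'.
TERMINALS = {"a": {"E"}, "+": {"P"}}
BINARIES = {("E", "R"): {"E"}, ("P", "E"): {"R"}}

print(cyk_recognize("a + a + a".split(), TERMINALS, BINARIES, start="E"))
```

The triple loop over span, start, and split point is what gives CYK its cubic running time in the input length, regardless of whether the grammar is ambiguous.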

2026-06-07T05:58:58Z Huan Vo Danushka Liyanage Hong Jin Kang Sasha Rubin Rahul Gopinath http://arxiv.org/abs/2412.19946v2 Comparing semantic frameworks for dependently-sorted algebraic theories 2026-06-05T15:27:19Z

Algebraic theories with dependency between sorts form the structural core of Martin-Löf type theory and similar systems. Their denotational semantics are typically studied using categorical techniques; many different categorical structures have been introduced to model them (contextual categories, categories with families, display map categories, etc.). Comparisons of these models are scattered throughout the literature, and a detailed, big-picture analysis of their relationships has been lacking. We aim to provide a clear and comprehensive overview of the relationships between as many such models as possible. Specifically, we take *comprehension categories* as a unifying language and show how almost all established notions of model embed as sub-2-categories (usually full) of the 2-category of comprehension categories.

2024-12-27T22:55:39Z 23 pages. Presented at APLAS 2024; arXiv version lightly revised and expanded. v2: Corrected statement of Prop. 46 (missing discreteness hypothesis); various minor cosmetic edits; numbering unchanged Programming Languages and Systems (Proc. APLAS 2024), Oleg Kiselyov (ed.), 2025, Springer Nature Singapore, pp. 3-22 Benedikt Ahrens Peter LeFanu Lumsdaine Paige Randall North 10.1007/978-981-97-8943-6_1 http://arxiv.org/abs/2604.15290v4 Pure Borrow: Linear Haskell Meets Rust-Style Borrowing 2026-06-05T14:38:16Z

A promising approach to unifying functional and imperative programming paradigms is to localize mutation using linear or affine types. Haskell, a purely functional language, was recently extended with linear types by Bernardy et al., under the name Linear Haskell. However, it remained unknown whether such a pure language could safely support non-local borrowing in the style of Rust, where each borrower can be freely split and dropped without direct communication of ownership back to the lender. We answer this question affirmatively with Pure Borrow, a novel framework that realizes Rust-style borrowing in Linear Haskell while preserving purity. Notably, it features parallel state mutation with affine mutable references inside pure computation, unlike the IO and ST monads and existing Linear Haskell APIs. It also enjoys purity, lazy evaluation, first-class polymorphism and leak freedom, unlike Rust. We implement Pure Borrow simply as a library in Linear Haskell and demonstrate its power with a case study in parallel computing. We formalize the core of Pure Borrow and build a metatheory that works toward establishing safety, leak freedom and confluence, with a new, history-based model of borrowing.

2026-04-16T17:53:51Z Extended version of the PLDI 2026 paper Yusuke Matsushita Hiromi Ishii 10.1145/3808259 http://arxiv.org/abs/2506.20356v4 Deadlock-free Context-free Session Types 2026-06-05T13:33:37Z

We tackle the problem of statically ensuring that message-passing programs never run into deadlocks. We focus on concurrent functional programs governed by context-free session types, which can express rich tree-like structures not expressible in standard session types. We propose a new type system based on context-free session types: it enforces both protocol conformance and deadlock freedom, even for programs implementing cyclic communication topologies with recursion and polymorphism. We show how the priority-based approach to deadlock freedom can be extended to this expressive setting. We prove that well-typed concurrent programs respect their protocols and never deadlock.

2025-06-25T12:11:47Z Andreia Mordido Jorge A. Pérez http://arxiv.org/abs/2602.18602v2 Package Managers à la Carte: A Formal Model of Dependency Resolution 2026-06-04T21:57:28Z

Package managers are legion. Every programming language and operating system has its own solution, each with subtly different semantics for dependency resolution. This fragmentation prevents multilingual projects from expressing precise dependencies across language ecosystems; it leaves external system dependencies implicit and unversioned; and it obscures the full dependency graph that supply-chain analysis depends on. We present the Package Calculus, a formalism for dependency resolution that unifies the core semantics of package managers. Through a series of formal reductions, we show how this core is expressive enough to model the diversity of real-world dependency expression languages. The calculus provides the theoretical foundation for future cross-ecosystem tooling, as a lingua franca of dependency expression.

2026-02-20T20:32:04Z Ryan Gibb Patrick Ferris David Allsopp Thomas Gazagnaire Anil Madhavapeddy http://arxiv.org/abs/2602.19973v3 Misquoted No More: Securely Extracting F* Programs with IO 2026-06-04T12:13:40Z

Shallow embeddings that use monads to represent effects are popular in proof-oriented languages because they are convenient for formal verification. Once shallowly embedded programs are verified, they are often extracted to mainstream languages like OCaml or C and linked into larger codebases. The extraction process is not fully verified because it often involves quotation -- turning the shallowly embedded program into a deeply embedded one -- and verifying quotation remains a major open challenge. Instead, some prior work obtains formal correctness guarantees using translation validation to certify individual extraction results. We build on this idea, but limit the use of translation validation to a first extraction step that we call relational quotation and that uses a metaprogram to construct a typing derivation for the given shallowly embedded program. This metaprogram is simple, since the typing derivation follows the structure of the original program. Once we validate that the typing derivation is valid for the original program, we pass it to a verified syntax-generation function that produces code guaranteed to be semantically related to the original program. We apply this general idea to build SEIO*, a framework for extracting shallowly embedded F* programs with IO and refinement types to a deeply embedded simply typed lambda-calculus while providing formal secure compilation guarantees. Using two cross-language logical relations, we devise a machine-checked proof in F* that SEIO* guarantees Robust Relational Hyperproperty Preservation (RrHP), a very strong secure compilation criterion that implies full abstraction as well as preservation of trace properties and hyperproperties against arbitrary linked adversarial code. This goes beyond the state of the art in verified and certifying extraction, which so far has focused on correctness rather than security.

2026-02-23T15:37:18Z Conditionally accepted at ICFP'26, improved version including new section on refinement types Cezar-Constantin Andrici Abigail Pribisova Danel Ahman Catalin Hritcu Exequiel Rivas Théo Winterhalter http://arxiv.org/abs/2605.30507v2 A Virtual Processor brings back the Free Lunch 2026-06-04T08:39:06Z

This work introduces a self-optimizing virtual processor (VP) for numerical array programs that shifts parallelization from a manual developer task to a cooperative, agent-like runtime mechanism. Instead of relying on centralized task-graph scheduling, static compiler optimization, or explicitly annotated parallel constructs, the VP uses a decentralized network of cooperative execution segments, derived from the stream of numerical instructions and their data dependencies at runtime. Each segment makes only local decisions about when, where, and how to prepare and execute its computation, including task placement, kernel preparation, and data movement. No central scheduler or mapper instance determines the execution globally; instead, scheduling itself is parallelized and distributed over time, asynchronously and strictly dependency-driven. The overall execution strategy emerges from concurrently executing local segments, continuously responding to data availability, cost estimates, system state, hardware capabilities, and problem size. While preserving the sequential semantics of the program, our VP automatically exploits parallelism across large program regions rather than being limited to individual loop bodies, modules, or explicitly marked parallel sections; developers are not required to design or encode a parallelization strategy. The current VP primarily targets low-latency strong scaling on local heterogeneous hardware, covering workloads from small, latency-sensitive array operations to large data-parallel computations. The current implementation targets the predefined array instruction set of the ILNumerics ONAL domain-specific language, accessible at https://github.com/ILNumerics/ILNumerics.ONAL, while the underlying concept is applicable to general array-based numerical programming models such as MATLAB and NumPy.

2026-05-28T19:43:49Z 10 pages + appendix (3 pages), 7 figures, 4 benchmarks at https://github.com/hokb/decentralized-array-execution-artifacts2026 (GitHub) or https://doi.org/10.5281/zenodo.20407801 (DOI Zenodo) Haymo Kutschbach http://arxiv.org/abs/2606.05680v1 CASS-RTL: Correctness-Aware Subspace Steering for RTL Generation with LLMs 2026-06-04T04:02:51Z

Recent advances in large language models (LLMs) have enabled the automatic synthesis (generation) of register-transfer level (RTL) code from natural language instructions, offering a promising pathway to accelerate chip design. Unlike typical natural language (and software coding) tasks, LLM-based RTL code generation demands strict cycle accuracy with concurrency, where minor logical errors can render a circuit unusable or insecure. While prior work has explored hallucination mitigation via external verification, self-evaluation prompts, retrieval-augmented prompting, domain-specific fine-tuning, agentic solutions, and reasoning, these approaches largely overlook the attention-oriented internal mechanisms of LLMs that may inherently correlate with RTL correctness. This work proposes CASS-RTL, a first-of-its-kind framework for discovering and leveraging LLMs' correctness-aware components to guide RTL generation toward functionally accurate outputs. We (i) identify attention heads whose activation patterns consistently differentiate correct from incorrect RTL; (ii) construct a low-dimensional subspace capturing correctness-relevant signals; and (iii) design a lightweight, geometry-aware intervention that steers the model at inference time. CASS-RTL is fully model-agnostic, requires no additional supervision or retraining, and readily integrates into existing models. Empirically, we evaluate CASS-RTL on multiple models and observe 10%-20% improvement in pass@1/5/10 accuracy on VerilogEval and 5% improvement on CVDP, demonstrating the effectiveness of our method in enhancing reliability without sacrificing model efficiency or requiring a large labeled dataset for fine-tuning.
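
The pass@1/5/10 figures above refer to the pass@k metric. The abstract does not say how it is computed; the sketch below assumes the commonly used unbiased estimator, in which `n` samples are drawn per problem, `c` of them pass the tests, and pass@k is the probability that at least one of `k` samples chosen without replacement is correct. The function name and arguments are illustrative.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem: 1 - C(n-c, k) / C(n, k).

    n: samples generated, c: samples that passed, k: budget (k <= n).
    """
    if n - c < k:
        return 1.0  # every size-k subset must contain a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)


# With 10 samples of which 3 pass, pass@1 is simply the pass rate.
print(pass_at_k(10, 3, 1))
```

A benchmark score is then the mean of this per-problem value over all problems.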

2026-06-04T04:02:51Z Accepted to the IEEE International Conference on LLM-Aided Design (LAD '26) Mohammad Akyash Nowfel Mashnoor Kimia Azar Hadi Kamali http://arxiv.org/abs/2606.07665v1 AgentCompile: An LLM-Guided Compiler for Direct CUDA Inference 2026-06-04T03:49:56Z

Transformer inference increasingly depends on specialized compiler and runtime support, but real model graphs still require semantic decisions about which regions are worth specializing and which CUDA implementation families are plausible. We present AgentCompile, an LLM-guided CUDA inference compiler that uses LLM outputs only as advisory search metadata. Given compiler-derived region summaries and bounded candidate spaces, the LLM proposes semantic labels, candidate priorities, parameter hints, and risk annotations; the compiler materializes CUDA candidates through templates, checks interface and hardware constraints, validates candidates empirically, selects implementations by measured latency, and falls back when specialization is unsupported or unprofitable. In end-to-end autoregressive generation, AgentCompile averages 5.66x, 4.05x, and 4.26x speedup over PyTorch eager on Qwen3-1.7B, Qwen3-4B, and Llama-3.2-1B-Instruct, respectively, across five representative workloads. We will open-source the project.
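
The validate-measure-fall-back loop described above can be sketched as follows. This is an illustration only, not AgentCompile's code: `select_implementation`, the `(priority, impl)` candidate format, the `measure` callback, and the `budget` parameter are all hypothetical, and plain Python callables stand in for CUDA kernels.

```python
def select_implementation(candidates, reference, test_inputs, measure,
                          fallback, budget=8):
    """Return the fastest validated candidate, or `fallback` if none wins.

    candidates: list of (priority, impl); priority is advisory metadata
                (for example, proposed by an LLM) and only orders the search.
    measure:    callable returning a latency for an implementation.
    """
    best, best_latency = fallback, measure(fallback)
    ranked = sorted(candidates, key=lambda c: -c[0])[:budget]
    for _priority, impl in ranked:
        # Empirical validation: outputs must match the reference.
        if any(impl(x) != reference(x) for x in test_inputs):
            continue
        latency = measure(impl)
        if latency < best_latency:
            best, best_latency = impl, latency
    return best
```

The point of the design is that a wrong or unhelpful priority can only waste search budget: correctness comes from validation against the reference, and the final choice comes from measured latency.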

2026-06-04T03:49:56Z 11 pages, 3 figures Xuanzhe Li Ziyan Weng Zhiyu Zhu Junhui Hou http://arxiv.org/abs/2511.15000v3 Bonsai: Compiling Queries to Pruned Tree Traversals 2026-06-03T22:03:16Z

Trees can accelerate queries that search or aggregate values over large collections. They achieve this by storing metadata that enables quick pruning (or inclusion) of subtrees when predicates on that metadata can prove that none (or all) of the data in a subtree affect the query result. Existing systems implement this pruning logic manually for each query predicate and data structure. We generalize and mechanize this class of optimization. Our method derives conditions for when subtrees can be pruned (or included wholesale), expressed in terms of the metadata available at each node. We efficiently generate these conditions using symbolic interval analysis, extended with new rules to handle geometric predicates (e.g., intersection, containment). Additionally, our compiler fuses compound queries (e.g., reductions on filters) into a single tree traversal. These techniques enable the automatic derivation of generalized single-index and dual-index tree joins that support a wide class of join predicates beyond standard equality and range predicates. The generated traversals match the behavior of expert-written code that implements query-specific traversals, and can asymptotically outperform the linear scans and nested-loop joins that existing systems fall back to when hand-written cases do not apply.
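
As an illustration of the kind of traversal described above, here is a hand-written sketch of a range-count query over a tree whose nodes store min/max/count metadata. It is exactly the sort of query-specific code the paper aims to generate automatically; the `Node` layout, `build`, and `count_in_range` are assumptions for this example, not Bonsai's output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    lo: int                          # smallest value in this subtree
    hi: int                          # largest value in this subtree
    count: int                       # number of values in this subtree
    values: Optional[list] = None    # stored on leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(sorted_values, leaf_size=4):
    """Build a balanced tree over a non-empty sorted list."""
    if len(sorted_values) <= leaf_size:
        return Node(sorted_values[0], sorted_values[-1],
                    len(sorted_values), values=list(sorted_values))
    mid = len(sorted_values) // 2
    l = build(sorted_values[:mid], leaf_size)
    r = build(sorted_values[mid:], leaf_size)
    return Node(l.lo, r.hi, l.count + r.count, left=l, right=r)


def count_in_range(node, a, b, stats):
    """Count values v with a <= v <= b, using metadata to skip subtrees."""
    stats["visited"] += 1
    if node.hi < a or node.lo > b:
        return 0                     # prune: no value can satisfy the predicate
    if a <= node.lo and node.hi <= b:
        return node.count            # include wholesale: every value satisfies it
    if node.values is not None:
        return sum(a <= v <= b for v in node.values)
    return (count_in_range(node.left, a, b, stats)
            + count_in_range(node.right, a, b, stats))


tree = build(list(range(1000)))
stats = {"visited": 0}
print(count_in_range(tree, 100, 899, stats), stats["visited"])
```

The two early returns are the pruning and inclusion conditions: both are decided from the interval `[lo, hi]` alone, so only nodes that straddle a query boundary are opened.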

2025-11-19T00:50:20Z Proc. ACM Program. Lang. 10, PLDI, Article 178 (June 2026) Alexander J Root Christophe Gyurgyik Purvi Goel Kayvon Fatahalian Jonathan Ragan-Kelley Andrew Adams Fredrik Kjolstad 10.1145/3808256 http://arxiv.org/abs/2606.05466v1 Look Before You Leap: Checking in on Type Tag Checking 2026-06-03T21:44:02Z

Tagging of generic dynamic values is important in symbolic-computation and dynamic-language systems, but the trade-offs change as machine architectures and workloads evolve. In particular, old folklore about boxed values, immediate values, and type tags must be recalibrated from time to time. We revisit the performance of badged object headers, low-bit tagging, and two NaN-boxing layouts on a range of platforms in use today, including AArch64 and x86-64 architectures from different manufacturers. The experiments isolate two distinct effects: the cost avoided by not heap-allocating common scalar values, and the cost avoided by obtaining tag information from the value word rather than by performing a heap read. The results show that several local bit operations are often cheaper than opening a heap object to obtain a tag or small value. Low-bit tagging remains the simplest and usually fastest choice for mostly symbolic workloads, while NaN-boxing is close in access cost and avoids the time and space of heap allocation for ordinary floating-point values.
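
To make the two value-word encodings concrete, the sketch below models a 64-bit word in Python. The specific bit assignments (tag bit 1 for small integers, `BOX_TAG` for boxed 32-bit integers) are assumptions chosen for illustration and are not necessarily the layouts measured in the paper.

```python
import struct

MASK64 = (1 << 64) - 1

# Low-bit tagging: small integers carry tag bit 1; aligned heap
# pointers end in 0, so one AND distinguishes the two cases.
def tag_int(n):
    return ((n << 1) | 1) & MASK64

def is_tagged_int(word):
    return (word & 1) == 1

def untag_int(word):
    if word >> 63:                 # reinterpret the word as signed
        word -= 1 << 64
    return word >> 1               # arithmetic shift drops the tag bit

# NaN-boxing: ordinary doubles are stored as their raw bits; other
# values hide in the payload of a quiet NaN.
QNAN = 0x7FF8_0000_0000_0000       # all-ones exponent plus quiet bit
BOX_TAG = 0x0004_0000_0000_0000    # hypothetical marker for a boxed int32
PAYLOAD_MASK = 0xFFFF_FFFF

def box_double(x):
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def unbox_double(word):
    return struct.unpack("<d", struct.pack("<Q", word))[0]

def box_int32(n):
    return QNAN | BOX_TAG | (n & PAYLOAD_MASK)

def is_boxed_int32(word):
    return (word & (QNAN | BOX_TAG)) == (QNAN | BOX_TAG)

def unbox_int32(word):
    payload = word & PAYLOAD_MASK
    return payload - (1 << 32) if payload >= (1 << 31) else payload
```

In both schemes the type test is a mask and a compare on the word itself, with no heap read, which is the effect the experiments isolate. The NaN-boxing variant additionally keeps floating-point values unboxed, at the cost of reserving part of the NaN space.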

2026-06-03T21:44:02Z Stephen M. Watt http://arxiv.org/abs/2512.03086v2 Beyond Code Pairs: Dialogue-Based Data Generation for LLM Code Translation 2026-06-03T19:05:35Z

Large language models (LLMs) have shown remarkable capabilities in code translation, yet their performance deteriorates in low-resource programming domains such as Fortran and emerging frameworks like CUDA, where high-quality parallel data are scarce. We present an automated dataset generation pipeline featuring a dual-LLM Questioner-Solver design that incorporates external knowledge from compilers and runtime feedback. Beyond traditional source-target code pair datasets, our approach additionally generates (1) verified translations with unit tests for assessing functional consistency and (2) multi-turn dialogues that capture the reasoning process behind translation refinement. Applied to Fortran-to-C++ and C++-to-CUDA, the pipeline yields 3.64k and 3.93k dialogues, respectively. Fine-tuning on this data yields dramatic improvements in functional correctness, boosting unit test success rates by over 56% on the challenging C++-to-CUDA task. We show that the generated data enables a 7B open-weight model to significantly outperform larger proprietary systems on key metrics like compilation success.

2025-11-29T05:26:53Z Le Chen Nuo Xu Winson Chen Bin Lei Pei-Hung Lin Dunzhi Zhou Rajeev Thakur Caiwen Ding Ali Jannesari Chunhua Liao http://arxiv.org/abs/2606.05348v1 Incremental Computation for Efficient Programmable Inference in Probabilistic Programs 2026-06-03T18:49:07Z

Inference in probabilistic programs generally requires evaluating many possible program executions to find those of high posterior density. To scale inference to large datasets, it is crucial that expensive intermediate results are shared across these many evaluations, rather than recomputed from scratch. This paper presents a new approach to realizing this sharing, based on *incremental computation*, a technique for efficiently recomputing (deterministic) program outputs when program inputs change. First, we show how expressive probabilistic programs can be compiled to deterministic ones that compute their density functions. Then, building on the incremental λ-calculus, we develop a general technique for compositionally incrementalizing expressive functional programs, and apply it to these densities. The resulting incremental densities can be used to accelerate a broad range of Monte Carlo inference algorithms, including for nonparametric models not well supported by existing systems. Furthermore, our decomposition of incremental density computation into separate density and incrementalization steps allows for modular reasoning about correctness -- a key pain point in existing systems, where ad-hoc incrementalization features are a known source of soundness bugs. We develop denotational logical relations arguments for the correctness of each step independently, and implement the approach in a Julia prototype, finding that it leads to asymptotic runtime improvements in the size of the dataset on a range of models and inference algorithms.
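
A hand-written toy shows the kind of saving incremental computation offers for densities. The sketch assumes a fixed-variance Gaussian mixture with per-point cluster assignments; when one assignment changes, only that point's term of the log density is recomputed. The class and function names are hypothetical, and this is hand-coded Python, not the compositional incrementalization or the Julia prototype described in the paper.

```python
import math


def log_normal(x, mu, sigma=1.0):
    """Log density of Normal(mu, sigma) at x."""
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (x - mu) ** 2 / (2 * sigma ** 2))


def density_from_scratch(data, means, assignments):
    """O(n) recomputation of the joint log likelihood."""
    return sum(log_normal(x, means[z]) for x, z in zip(data, assignments))


class IncrementalDensity:
    """Maintains the same log likelihood under single-point changes."""

    def __init__(self, data, means, assignments):
        self.data, self.means = data, means
        self.assignments = list(assignments)
        self.terms = [log_normal(x, means[z])
                      for x, z in zip(data, self.assignments)]
        self.total = sum(self.terms)

    def reassign(self, i, new_cluster):
        """Apply the input change z[i] := new_cluster in O(1)."""
        new_term = log_normal(self.data[i], self.means[new_cluster])
        self.total += new_term - self.terms[i]   # output change
        self.terms[i] = new_term
        self.assignments[i] = new_cluster
        return self.total
```

A Metropolis-Hastings step that proposes a single reassignment then costs constant time instead of a pass over the dataset. Long chains of floating-point updates can drift, so a practical version might recompute `total` from `terms` occasionally.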

2026-06-03T18:49:07Z Full version of the PLDI 2026 article, including proofs and other supplementary material Fabian Zaiser Jack Czenszak Martin C. Rinard Vikash K. Mansinghka Alexander K. Lew 10.1145/3808316 http://arxiv.org/abs/2606.04903v1 Provably Auditable and Safe LLM Agents from Human-Authored Ontologies 2026-06-03T14:01:33Z

We introduce the LLM agent architecture Agentic Redux, intended for use with nontrivial problem domains that require linear auditability. Using the typed lambda calculus, we prove that, run on appropriate domains, Agentic Redux executions are semantically guaranteed to be correct, with all decisions recorded in an append-only ledger. We present two appropriate production-grade domains: healthcare billing compliance and security vulnerability disclosure. Working code for Agentic Redux run on both domains is available in a supporting code repository. We also introduce Ontology-First Agent Design, a methodology for creating agent frameworks for a problem domain, in which a human expert ontologizes the problem domain with Basic Formal Ontology, and then assigns an LLM to derive roles that agents and humans-in-the-loop can fill, in order to work the problems in the domain.

2026-06-03T14:01:33Z Aaron Sterling http://arxiv.org/abs/2606.04877v1 Abduction Prover in Isabelle/HOL 2026-06-03T13:41:08Z

Proof assistants based on expressive logics suffer from limited automation for proof search, raising the cost of formal verification based on proof assistants. We address this problem by introducing the Abduction Prover for Isabelle/HOL. Given a challenging proof goal, the Abduction Prover constructs a proof script for the goal by identifying useful conjectures using abductive reasoning.

2026-06-03T13:41:08Z Accepted to Isabelle2026 Yutaka Nagashima Daniel Sebastian Goc