https://arxiv.org/api/LemII7nfr7RH/HgIzEDamo+7Z602026-06-13T12:11:58Z98853015http://arxiv.org/abs/2606.08465v1An Empirical Comparison of General Context-Free Parsers2026-06-07T05:58:58ZParsing underpins a vast range of software engineering tasks, from compilers and static analyzers to language servers and fuzz testing tools. Yet most parsers deployed in practice are deterministic (LL or LR), forcing developers not only to contort their grammars to fit the parser, but to simplify the very languages they design sacrificing expressiveness for the sake of parseability. General context-free parsers eliminate this constraint. Yet, despite decades of algorithmic development, no rigorous head-to-head comparison exists across the major families of parsing algorithms.
We present the first unified, controlled benchmark of six generalized parsing algorithms: CYK, Valiant, Earley, GLL, RNGLR, and BRNGLR, plus deterministic LL(1) and LR(1) baselines, all implemented in Rust with shared data structures and parse-tree extraction, and evaluated across 22 grammars ranging from simple expressions to full C++ and Java. Our results show that the cost of generality is lower than widely assumed. On deterministic grammars, the GLR family incurs only a 3x median slowdown over LR(1), with a narrow and predictable variance. GLR is the clear performance winner among generalized parsers and a practical default choice for software engineering tools.2026-06-07T05:58:58ZHuan VoDanushka LiyanageHong Jin KangSasha RubinRahul Gopinathhttp://arxiv.org/abs/2412.19946v2Comparing semantic frameworks for dependently-sorted algebraic theories2026-06-05T15:27:19ZAlgebraic theories with dependency between sorts form the structural core of Martin-Löf type theory and similar systems. Their denotational semantics are typically studied using categorical techniques; many different categorical structures have been introduced to model them (contextual categories, categories with families, display map categories, etc.) Comparisons of these models are scattered throughout the literature, and a detailed, big-picture analysis of their relationships has been lacking.
We aim to provide a clear and comprehensive overview of the relationships between as many such models as possible. Specifically, we take *comprehension categories* as a unifying language and show how almost all established notions of model embed as sub-2-categories (usually full) of the 2-category of comprehension categories.2024-12-27T22:55:39Z23 pages. Presented at APLAS 2024; arXiv version lightly revised and expanded. v2: Corrected statement of Prop. 46 (missing discreteness hypothesis); various minor cosmetic edits; numbering unchangedProgramming Languages and Systems (Proc. APLAS 2024), Oleg Kiselyov (ed.), 2025, Springer Nature Singapore, pp. 3-22Benedikt AhrensPeter LeFanu LumsdainePaige Randall North10.1007/978-981-97-8943-6_1http://arxiv.org/abs/2604.15290v4Pure Borrow: Linear Haskell Meets Rust-Style Borrowing2026-06-05T14:38:16ZA promising approach to unifying functional and imperative programming paradigms is to localize mutation using linear or affine types. Haskell, a purely functional language, was recently extended with linear types by Bernardy et al., in the name of Linear Haskell. However, it remained unknown whether such a pure language could safely support non-local borrowing in the style of Rust, where each borrower can be freely split and dropped without direct communication of ownership back to the lender.
We answer this question affirmatively with Pure Borrow, a novel framework that realizes Rust-style borrowing in Linear Haskell with purity. Notably, it features parallel state mutation with affine mutable references inside pure computation, unlike the IO and ST monads and existing Linear Haskell APIs. It also enjoys purity, lazy evaluation, first-class polymorphism and leak freedom, unlike Rust. We implement Pure Borrow simply as a library in Linear Haskell and demonstrate its power with a case study in parallel computing. We formalize the core of Pure Borrow and build a metatheory that works toward establishing safety, leak freedom and confluence, with a new, history-based model of borrowing.2026-04-16T17:53:51ZExtended version of the PLDI 2026 paperYusuke MatsushitaHiromi Ishii10.1145/3808259http://arxiv.org/abs/2506.20356v4Deadlock-free Context-free Session Types2026-06-05T13:33:37ZWe tackle the problem of statically ensuring that message-passing programs never run into deadlocks. We focus on concurrent functional programs governed by context-free session types, which can express rich tree-like structures not expressible in standard session types. We propose a new type system based on context-free session types: it enforces both protocol conformance and deadlock freedom, also for programs implementing cyclic communication topologies with recursion and polymorphism. We show how the priority-based approach to deadlock freedom can be extended to this expressive setting. We prove that well-typed concurrent programs respect their protocols and never deadlock.2025-06-25T12:11:47ZAndreia MordidoJorge A. Pérezhttp://arxiv.org/abs/2602.18602v2Package Managers à la Carte: A Formal Model of Dependency Resolution2026-06-04T21:57:28ZPackage managers are legion. Every programming language and operating system has its own solution, each with subtly different semantics for dependency resolution. This fragmentation prevents multilingual projects from expressing precise dependencies across language ecosystems; it leaves external system dependencies implicit and unversioned; and it obscures the full dependency graph that supply-chain analysis depends on. We present the Package Calculus, a formalism for dependency resolution that unifies the core semantics of package managers. Through a series of formal reductions, we show how this core is expressive enough to model the diversity of real-world dependency expression languages. The calculus provides the theoretical foundation for future cross-ecosystem tooling, as a lingua franca of dependency expression.2026-02-20T20:32:04ZRyan GibbPatrick FerrisDavid AllsoppThomas GazagnaireAnil Madhavapeddyhttp://arxiv.org/abs/2602.19973v3Misquoted No More: Securely Extracting F* Programs with IO2026-06-04T12:13:40ZShallow embeddings that use monads to represent effects are popular in proof-oriented languages because they are convenient for formal verification. Once shallowly embedded programs are verified, they are often extracted to mainstream languages like OCaml or C and linked into larger codebases. The extraction process is not fully verified because it often involves quotation -- turning the shallowly embedded program into a deeply embedded one -- and verifying quotation remains a major open challenge. Instead, some prior work obtains formal correctness guarantees using translation validation to certify individual extraction results. We build on this idea, but limit the use of translation validation to a first extraction step that we call relational quotation and that uses a metaprogram to construct a typing derivation for the given shallowly embedded program. This metaprogram is simple, since the typing derivation follows the structure of the original program. Once we validate that the typing derivation is valid for the original program, we pass it to a verified syntax-generation function that produces code guaranteed to be semantically related to the original program.
We apply this general idea to build SEIO*, a framework for extracting shallowly embedded F* programs with IO and refinement types to a deeply embedded simply typed lambda-calculus while providing formal secure compilation guarantees. Using two cross-language logical relations, we devise a machine-checked proof in F* that SEIO* guarantees Robust Relational Hyperproperty Preservation (RrHP), a very strong secure compilation criterion that implies full abstraction as well as preservation of trace properties and hyperproperties against arbitrary linked adversarial code. This goes beyond the state of the art in verified and certifying extraction, which so far has focused on correctness rather than security.2026-02-23T15:37:18ZConditionally accepted at ICFP'26, improved version including new section on refinement typesCezar-Constantin AndriciAbigail PribisovaDanel AhmanCatalin HritcuExequiel RivasThéo Winterhalterhttp://arxiv.org/abs/2605.30507v2A Virtual Processor brings back the Free Lunch2026-06-04T08:39:06ZThis work introduces a self-optimizing virtual processor (VP) for numerical array programs that shifts parallelization from a manual developer task to a cooperative, agent-like runtime mechanism. Instead of relying on centralized task-graph scheduling, static compiler optimization, or explicitly annotated parallel constructs, the VP uses a decentralized network of cooperative execution segments, derived from the stream of numerical instructions and their data dependencies at runtime. Each segment makes only local decisions about when, where, and how to prepare and execute its computation, including task placement, kernel preparation, and data movement. No central scheduler or mapper instance determines the execution globally; instead, scheduling itself is parallelized and distributed over time - asynchronously and strictly dependency driven. The overall execution strategy emerges from concurrently executing local segments, continuously responding to data availability, cost estimates, system state, hardware capabilities, and problem size. While preserving the sequential semantics of the program our VP automatically exploits parallelism across large program regions rather than being limited to individual loop bodies, modules, or explicitly marked parallel sections; developers are not required to design or encode a parallelization strategy. The current VP primarily targets low-latency strong scaling on local heterogeneous hardware, covering workloads from small, latency-sensitive array operations to large data-parallel computations. The current implementation targets the predefined array instruction set of the ILNumerics ONAL domain-specific language, accessible https://github.com/ILNumerics/ILNumerics.ONAL , while the underlying concept is applicable to general array-based numerical programming models such as MATLAB and NumPy.2026-05-28T19:43:49Z10 pages + appendix (3 pages), 7 figures, 4 benchmarks at https://github.com/hokb/decentralized-array-execution-artifacts2026 (GitHub) or https://doi.org/10.5281/zenodo.20407801 (DOI Zenodo)Haymo Kutschbachhttp://arxiv.org/abs/2606.05680v1CASS-RTL: Correctness-Aware Subspace Steering for RTL Generation with LLMs2026-06-04T04:02:51ZRecent advances in large language models (LLMs) have enabled the automatic synthesis (generation) of register-transfer level (RTL) code from natural language instructions, offering a promising pathway to accelerate chip design. Unlike typical natural language (and software coding) tasks, LLM-based RTL code generation demands strict cycle accuracy with concurrency, where minor logical errors can render a circuit unusable or insecure. While prior work has explored hallucination mitigation via external verification, self-evaluation prompts, retrieval-augmented prompting, domain specific fine-tuning, agentic solutions, and reasoning, these approaches largely overlook the attention-oriented internal mechanisms of LLMs that may inherently correlate with RTL correctness. This work proposes CASS-RTL, a first-of-its-kind framework for discovering and leveraging LLMs' correctness-aware components to guide RTL generation toward functionally accurate outputs. We (i) identify attention heads whose activation patterns consistently differentiate correct from incorrect RTL; (ii) construct a low-dimensional subspace capturing correctness-relevant signals; and (iii) design a lightweight, geometry-aware intervention that steers the model at inference time. CASS-RTL is fully model-agnostic, requires no additional supervision or retraining, and readily integrates into existing models. Empirically, we evaluate CASS-RTL on multiple models and observe 10%-20% improvement in pass@1/5/10 accuracy on VerilogEval and 5% improvement on CVDP, demonstrating the effectiveness of our method in enhancing reliability without sacrificing model efficiency or requiring a large labeled dataset for fine-tuning.2026-06-04T04:02:51ZAccepted to the IEEE International Conference on LLM-Aided Design (LAD '26)Mohammad AkyashNowfel MashnoorKimia AzarHadi Kamalihttp://arxiv.org/abs/2606.07665v1AgentCompile: An LLM-Guided Compiler for Direct CUDA Inference2026-06-04T03:49:56ZTransformer inference increasingly depends on specialized compiler and runtime support, but real model graphs still require semantic decisions about which regions are worth specializing and which CUDA implementation families are plausible. We present AgentCompile, an LLM-guided CUDA inference compiler that uses LLM outputs only as advisory search metadata. Given compiler-derived region summaries and bounded candidate spaces, the LLM proposes semantic labels, candidate priorities, parameter hints, and risk annotations; the compiler materializes CUDA candidates through templates, checks interface and hardware constraints, validates candidates empirically, selects implementations by measured latency, and falls back when specialization is unsupported or unprofitable. In end-to-end autoregressive generation, AgentCompile averages 5.66x, 4.05x, and 4.26x speedup over PyTorch eager on Qwen3-1.7B, Qwen3-4B, and Llama-3.2-1B-Instruct, respectively, across five representative workloads. We will open-source the project.2026-06-04T03:49:56Z11 pages, 3 figuresXuanzhe LiZiyan WengZhiyu ZhuJunhui Houhttp://arxiv.org/abs/2511.15000v3Bonsai: Compiling Queries to Pruned Tree Traversals2026-06-03T22:03:16ZTrees can accelerate queries that search or aggregate values over large collections. They achieve this by storing metadata that enables quick pruning (or inclusion) of subtrees when predicates on that metadata can prove that none (or all) of the data in a subtree affect the query result. Existing systems implement this pruning logic manually for each query predicate and data structure. We generalize and mechanize this class of optimization. Our method derives conditions for when subtrees can be pruned (or included wholesale), expressed in terms of the metadata available at each node. We efficiently generate these conditions using symbolic interval analysis, extended with new rules to handle geometric predicates (e.g., intersection, containment). Additionally, our compiler fuses compound queries (e.g., reductions on filters) into a single tree traversal. These techniques enable the automatic derivation of generalized single-index and dual-index tree joins that support a wide class of join predicates beyond standard equality and range predicates. The generated traversals match the behavior of expert-written code that implements query-specific traversals, and can asymptotically outperform the linear scans and nested-loop joins that existing systems fall back to when hand-written cases do not apply.2025-11-19T00:50:20ZProc. ACM Program. Lang. 10, PLDI, Article 178 (June 2026)Alexander J RootChristophe GyurgyikPurvi GoelKayvon FatahalianJonathan Ragan-KelleyAndrew AdamsFredrik Kjolstad10.1145/3808256http://arxiv.org/abs/2606.05466v1Look Before You Leap: Checking in on Type Tag Checking2026-06-03T21:44:02ZTagging of generic dynamic values is important in symbolic-computation and dynamic-language systems, but the trade-offs change as machine architectures and workloads evolve. In particular, old folklore about boxed values, immediate values, and type tags must be recalibrated from time to time. We revisit the performance of badged object headers, low-bit tagging, and two NaN-boxing layouts on a range of platforms in use today, including AArch64 and x86-64 architectures from different manufacturers. The experiments isolate two distinct effects: the cost avoided by not heap-allocating common scalar values, and the cost avoided by obtaining tag information from the value word rather than by performing a heap read. The results show that several local bit operations are often cheaper than opening a heap object to obtain a tag or small value. Low-bit tagging remains the simplest and usually fastest choice for mostly symbolic workloads, while NaN-boxing is close in access cost and avoids the time and space of heap allocation for ordinary floating-point values.2026-06-03T21:44:02ZStephen M. Watthttp://arxiv.org/abs/2512.03086v2Beyond Code Pairs: Dialogue-Based Data Generation for LLM Code Translation2026-06-03T19:05:35ZLarge language models (LLMs) have shown remarkable capabilities in code translation, yet their performance deteriorates in low-resource programming domains such as Fortran and emerging frameworks like CUDA, where high-quality parallel data are scarce. We present an automated dataset generation pipeline featuring a dual-LLM Questioner-Solver design that incorporates external knowledge from compilers and runtime feedback. Beyond traditional source-target code pair datasets, our approach additionally generates (1) verified translations with unit tests for assessing functional consistency and (2) multi-turn dialogues that capture the reasoning process behind translation refinement. Applied to Fortran-to-C++ and C++-to-CUDA, the pipeline yields 3.64k and 3.93k dialogues, respectively. Fine-tuning on this data yields dramatic improvements in functional correctness, boosting unit test success rates by over 56% on the challenging C++-to-CUDA task. We show that the generated data enables a 7B open-weight model to significantly outperform larger proprietary systems on key metrics like compilation success.2025-11-29T05:26:53ZLe ChenNuo XuWinson ChenBin LeiPei-Hung LinDunzhi ZhouRajeev ThakurCaiwen DingAli JannesariChunhua Liaohttp://arxiv.org/abs/2606.05348v1Incremental Computation for Efficient Programmable Inference in Probabilistic Programs2026-06-03T18:49:07ZInference in probabilistic programs generally requires evaluating many possible program executions to find those of high posterior density. To scale inference to large datasets, it is crucial that expensive intermediate results are shared across these many evaluations, rather than recomputed from scratch. This paper presents a new approach to realizing this sharing, based on \textit{incremental computation}, a technique for efficiently recomputing (deterministic) program outputs when program inputs change. First, we show how expressive probabilistic programs can be compiled to deterministic ones that compute their density functions. Then, building on the incremental $λ$-calculus, we develop a general technique for compositionally incrementalizing expressive functional programs, and apply it to these densities. The resulting incremental densities can be used to accelerate a broad range of Monte Carlo inference algorithms, including for nonparametric models not well supported by existing systems. Furthermore, our decomposition of incremental density computation into separate density and incrementalization steps allows for modular reasoning about correctness -- a key pain point in existing systems, where ad-hoc incrementalization features are a known source of soundness bugs. We develop denotational logical relations arguments for the correctness of each step independently, and implement the approach in a Julia prototype, finding that it leads to asymptotic runtime improvements in the size of the dataset on a range of models and inference algorithms.2026-06-03T18:49:07ZFull version of the PLDI 2026 article, including proofs and other supplementary materialFabian ZaiserJack CzenszakMartin C. RinardVikash K. MansinghkaAlexander K. Lew10.1145/3808316http://arxiv.org/abs/2606.04903v1Provably Auditable and Safe LLM Agents from Human-Authored Ontologies2026-06-03T14:01:33ZWe introduce the LLM agent architecture Agentic Redux, intended for use with nontrivial problem domains that require linear auditability. Using the typed lambda calculus, we prove that, run on appropriate domains, Agentic Redux executions are semantically guaranteed to be correct, with all decisions recorded in an append-only ledger. We present two production-grade appropriate domains, in healthcare billing compliance, and security vulnerability disclosure. Working code for Agentic Redux run on both domains is available in a supporting code repository. We also introduce Ontology-First Agent Design, a methodology for creation of agent frameworks on a problem domain, in which a human expert ontologizes the problem domain with Basic Formal Ontology, and then assigns an LLM to derive roles that agents and humans-in-the-loop can fill, in order to work the problems in the domain.2026-06-03T14:01:33ZAaron Sterlinghttp://arxiv.org/abs/2606.04877v1Abduction Prover in Isabelle/HOL2026-06-03T13:41:08ZProof assistants based on expressive logics suffer limited automation for proof search, raising the cost of formal verification based on proof assistants. We address this problem by introducing the Abduction Prover for Isabelle/HOL. Given a challenging proof goal, the Abduction Prover constructs a proof script for the goal by identifying useful conjectures using abductive reasoning.2026-06-03T13:41:08ZAccepted to Isabelle2026Yutaka NagashimaDaniel Sebastian Goc