https://arxiv.org/api/Gibr/fRaNn6I5E9cd2ah+XhxGbw2026-06-14T20:38:31Z988548015http://arxiv.org/abs/2507.13792v4Don't exhaust, don't waste2026-03-22T14:27:18ZWe extend the semantics and type system of a lambda calculus equipped with common constructs to be "resource-aware". That is, the semantics keeps track of the usage of resources, and is stuck, besides in case of type errors, if either a needed resource is exhausted, or a provided resource would be wasted. In such way, the type system guarantees, besides standard soundness, that for well-typed programs there is a computation where no resource gets either exhausted or wasted. The extension is parametric on an arbitrary "grade algebra", modeling an assortment of possible usages, and does not require ad-hoc changes to the underlying language. To this end, the semantics needs to be formalized in big-step style; as a consequence, expressing and proving (resource-aware) soundness is challenging, and is achieved by applying recent techniques based on coinductive reasoning.2025-07-18T10:06:08ZPreprint submitted to JFP (Journal of Functional Programming)Riccardo BianchiniFrancesco DagninoPaola GianniniElena Zuccahttp://arxiv.org/abs/2603.20942v1Accompanist: A Runtime for Resilient Choreographic Programming2026-03-21T20:50:25ZIn service-oriented architecture, services coordinate in one of two ways: directly, using point-to-point communication, or indirectly, through an intermediary called the orchestrator. Orchestrators tend to be more popular because their local state is a 'single source of truth' for the status of ongoing workflows, which simplifies fault recovery and rollback for distributed transactions that use the 'saga' pattern. But orchestration is not always an option because of hardware constraints and security policies. Without a central orchestrator, resilient saga transactions are hard to implement correctly.
A natural idea is to use choreographic programming, a paradigm that brings the 'global view' of orchestrators to a decentralised setting. Unfortunately, choreographic programming relies on strong assumptions about network reliability and service uptime that often do not hold. Recent work weakens some of these assumptions with 'failure-aware' language features, but these features make programs more complex. We propose a complementary approach: to co-design the programming interface with a customizable runtime that can replay computation to mask faults. Our approach keeps programs simple, does not require modifying the compiler, and lends itself to a clean separation of concerns in formal proofs.
We present Accompanist, a resilient runtime for the Choral choreographic programming language. With Accompanist, programmers can implement decentralised saga transactions as choreographic programs and deploy the compiled code to 'sidecars' that run alongside services in a pre-existing codebase. Our key assumptions are that choreographic programs should be deterministic, transactions within a saga should be idempotent, and messages should be written to a durable message queue. Based on these assumptions, we present a formal model and prove that target code is correct-by-construction.2026-03-21T20:50:25Z23 pages, 16 figures, submitted to OOPSLA 2026Viktor Strate KløvedalDan PlyukhinMarco PeressottiFabrizio Montesihttp://arxiv.org/abs/2603.20933v1AC4A: Access Control for Agents2026-03-21T20:23:06ZLarge Language Model (LLM) agents combine the chat interaction capabilities of LLMs with the power to interact with external tools and APIs. This enables them to perform complex tasks and act autonomously to achieve user goals. However, current agent systems operate on an all-or-nothing basis: an agent either has full access to an API's capabilities and a web page's content, or it has no access at all. This coarse-grained approach forces users to trust agents with more capabilities than they actually need for a given task.
In this paper, we introduce AC4A, an access control framework for agents. As agents become more capable and autonomous, users need a way to limit what APIs or portions of web pages these agents can access, eliminating the need to trust them with everything an API or web page allows. Our goal with AC4A is to provide a framework for defining permissions that lets agents access only the resources they are authorized to access. AC4A works across both API-based and browser-based agents. It does not prescribe what permissions should be, but offers a flexible way to define and enforce them, making it practical for real-world systems.
AC4A works by creating permissions granting access to resources, drawing inspiration from established access control frameworks like the one for the Unix file system. Applications define their resources as hierarchies and provide a way to compute the necessary permissions at runtime needed for successful resource access. We demonstrate the usefulness of AC4A in enforcing permissions over real-world APIs and web pages through case studies. The source code of AC4A is available at https://github.com/reSHARMA/AC4A2026-03-21T20:23:06ZReshabh K SharmaDan Grossmanhttp://arxiv.org/abs/2603.20623v1MINISA: Minimal Instruction Set Architecture for Next-gen Reconfigurable Inference Accelerator2026-03-21T03:36:35ZModern reconfigurable AI accelerators rely on rich mapping and data-layout flexibility to sustain high utilization across matrix multiplication, convolution, and emerging applications beyond AI. However, exposing this flexibility through fine-grained micro-control results in prohibitive control overhead of fetching configuration bits from off-chip memory. This paper presents MINISA, a minimal instruction set that programs a reconfigurable accelerator at the granularity of Virtual Neurons (VNs), the coarsest control granularity that retains flexibility of hardware and the finest granularity that avoids unnecessary control costs. First, we introduce FEATHER+, a modest refinement of FEATHER, that eliminates redundant on-chip replication needed for runtime dataflow/layout co-switching and supports dynamic cases where input and weight data are unavailable before execution for offline layout manipulation. MINISA then abstracts control of FEATHER+ into three layout-setting instructions for input, weight, and output VNs and a single mapping instruction for setting dataflow. This reduces the control and instruction footprint while preserving the legal mapping and layout space supported by the FEATHER+. Our results show that MINISA reduces geometric mean off-chip instruction traffic by factors ranging from 35x to (4x10^5)x under various sizes under 50 GEMM workloads spanning AI (GPT-oss), FHE, and ZKP. This eliminates instruction-fetch stalls that consume 96.9% of micro-instruction cycles, yielding up to 31.6x end-to-end speedup for 16x256 FEATHER+. Our code: https://github.com/maeri-project/FEATHER/tree/main/minisa.2026-03-21T03:36:35ZISPASS 2026, 13 pages, 13 figures2026 IEEE International Symposium on Performance Analysis of Systems and SoftwareJianming TongDevansh JainYujie LiCharith MendisTushar Krishnahttp://arxiv.org/abs/2603.20127v1Analyzing Decoders for Quantum Error Correction2026-03-20T16:50:00ZQuantum error correction (QEC) enables reliable computation on noisy hardware by encoding logical information across many physical qubits and periodically measuring parities to detect errors. A decoder is the classical algorithm that uses these measurements to infer which error most likely occurred, so that the system can correct it. The decoder's accuracy-how rarely it makes the wrong guess-directly determines the scale of quantum computation that can be reliably executed. With a wealth of competing decoding algorithms, a QEC system designer needs reliable methods to evaluate them. Today, the dominant approach is to evaluate decoders using Monte Carlo simulation. However, simulation has several drawbacks such as requiring many samples to produce low variance estimates.
In this work, we develop a new systematic analysis for evaluating decoders. We introduce a novel formal semantics of a core language for QEC programs that captures the de facto standard Stim circuit format, providing a principled theoretical foundation for the emerging space of fault-tolerant quantum systems design. Given a QEC program and a decoder, our verifier can quantify both the decoder accuracy and the decoder robustness to drift in physical error rate. Our approach has two key components: (i) a structured search over the space of possible errors; and (ii) a constrained polynomial optimization kernel. A thorough empirical evaluation of our approach suggests that it can outperform simulation, especially in low error rate regimes, and that it can be deployed to quantify decoder robustness over an interval of physical error rates.2026-03-20T16:50:00ZAbtin MolaviFeras SaadAws Albarghouthihttp://arxiv.org/abs/2603.20001v1Sound State Encodings in Translational Separation Logic Verifiers (Extended Version)2026-03-20T14:48:57ZAutomated program verifiers are often organized into a front-end, which encodes an input program into an intermediate verification language (IVL), and a back-end, which proves that the IVL program is correct. Soundness of such translational verifiers requires that the back-end verification is sound and that correctness of the IVL program implies correctness of the input program. Existing formalizations for translational verifiers based on separation logic target the former, but support the latter only under the strong assumption that there exists a separation logic for the input program with the same state model as the IVL. This assumption is unrealistic in practice, especially since the state model also defines the supported separation logic resources.
We present the first formal framework for proving the soundness of translational separation logic verifiers with non-trivial state encodings. To be applicable to various front-ends and IVLs, our framework only assumes the existence of a homomorphic encoding relation between the front-end and IVL state models. At the core of our framework is a novel condition, backward satisfiability, which is crucial to guarantee the soundness of the front-end translation. We formalize our framework for front-end verifiers based on concurrent separation logic and separation logic IVLs, such as Raven, VeriFast, and Viper. We demonstrate its expressiveness by proving soundness for three common state encodings. Our framework and all proofs are formalized in Isabelle/HOL.2026-03-20T14:48:57ZHongyi LingThibault DardinierEllen ArltPeter Müllerhttp://arxiv.org/abs/2502.02711v4Tensor Network Structure Search with Program Synthesis2026-03-20T14:25:34ZTensor networks provide a powerful framework for compressing multi-dimensional data. The optimal tensor network structure for a given data tensor depends on both data characteristics and specific optimality criteria, making tensor network structure search a difficult problem. Existing solutions typically rely on sampling and compressing numerous candidate structures; these procedures are computationally expensive and therefore limiting for practical applications. We address this challenge by viewing tensor network structure search as a program synthesis problem and introducing an efficient constraint-based assessment method that avoids costly tensor decomposition. Specifically, we establish a correspondence between transformation programs and network structures. We also design a novel operation named output-directed splits to reduce the search space without hindering expressiveness. We then propose a synthesis algorithm to identify promising network candidates through constraint solving, and avoid tensor decomposition for all but the most promising candidates. Experimental results show that our approach improves search speed by up to $10\times$ and achieves compression ratios $1.5\times$ to $3\times$ better than state-of-the-art. Notably, our approach scales to larger tensors that are unattainable by prior work. Furthermore, the discovered topologies generalize well to similar data, yielding compression ratios up to $ 2.4\times$ better than a generic structure while the runtime remains around $110$ seconds.2025-02-04T20:46:37ZZheng GuoAditya DeshpandeBrian KiedrowskiXinyu WangAlex Gorodetskyhttp://arxiv.org/abs/2603.19856v1Is It a Good Idea to Build an HLS Tool on Top of MLIR? Experience from Building the Dynamatic HLS Compiler2026-03-20T11:16:14ZWhen the MLIR project was first introduced, it promised to address the issues that the HLS community had with the LLVM project. But is this really the case, and is MLIR the "right"/"best" compiler infrastructure for HLS? We here share our experiences based on the development of Dynamatic (github.com/EPFL-LAP/dynamatic).2026-03-20T11:16:14ZAccepted at the Workshop on Languages, Tools, and Techniques for Accelerator Design (LATTE '26)Jiahui XuEmmet MurphyLana Josipovichttp://arxiv.org/abs/2603.19560v1Incremental Live Programming via Shortcut Memoization2026-03-20T01:56:32ZLive programming systems aim to quickly show programmers the dynamic impacts of program edits. To do so, they re-execute the program whenever it is edited, which poses a computational challenge when programs become large or complex. This has led to the need for incrementality in the implementation of live program interpreters. This paper introduces Chordata, an incremental program interpreter based on shortcut memoization, which learns repeated patterns of computation, called shortcuts, by observing executions of previous versions of a program. It can then apply these shortcuts when the same or a structurally similar program fragment is re-executed. This paper contributes a formal semantics of shortcut memoization for any language with a rewrite-based semantics, with mechanized proofs of key correctness properties. We then express a variant of the Hazel live programming system, expressed as a CEK machine, in Chordata, and develop a number of practical heuristics to learn high-value shortcuts. We evaluate the resulting system on editing traces of students solving simple programming problems. Chordata achieves a speedup of 13.03\times compared to baseline with a 19.97\times memory overhead. For smaller changes and for more complex programs, Chordata achieves even greater speedups. Furthermore, we show that Chordata is capable of providing a speedup even within a single execution, with a faster speedup on a larger input.2026-03-20T01:56:32ZMarisa KirisameThomas J. PorterRuqing YangJianqiu ZhaoYudi WuIvan WeiCyrus OmarPavel Panchekhahttp://arxiv.org/abs/2411.14802v3Enhancing a Hierarchical Graph Rewriting Language based on MELL Cut Elimination2026-03-19T15:11:05ZHierarchical graph rewriting is a highly expressive computational formalism that manipulates graphs enhanced with box structures for representing hierarchies. It has provided the foundations of various graph-based modeling tools, but the design of high-level declarative languages based on hierarchical graph rewriting is still a challenge. For a solid design choice, well-established formalisms with backgrounds other than graph rewriting would provide useful guidelines. Proof nets of Multiplicative Exponential Linear Logic (MELL) is such a framework because its original formulation of cut elimination is essentially graph rewriting involving box structures, where so-called Promotion Boxes with an indefinite number of non-local edges may be cloned, migrated and deleted. This work builds on LMNtal as a declarative language based on hierarchical (port) graph rewriting, and discusses how it can be extended to support the above operations on Promotion Boxes of MELL proof nets. LMNtal thus extended turns out to be a practical graph rewriting language that has strong affinity with MELL proof nets. The language features provided are general enough to encode other well-established models of concurrency. Using the toolchain of LMNtal that provides state-space search and model checking, we implemented cut elimination rules of MELL proof nets in extended LMNtal and demonstrated that the platform could serve as a useful workbench for proof nets.2024-11-22T09:03:09ZExtended version of the PADL 2025 paper (LNCS, Springer, 2025). This version incorporates into the main text details omitted from the conference version and includes some additional materialKento TakyuKazunori Uedahttp://arxiv.org/abs/2603.18477v1Leveraging Large Language Models for Generalizing Peephole Optimizations2026-03-19T04:19:51ZPeephole optimizations are a core component of modern optimizing compilers. It rewrites specific instruction into semantically equivalent but more efficient forms. In practice, creating a new peephole optimization often starts from a concrete optimization instance and requires lifting it into a more general rewrite rule that matches a wider range of instruction patterns. This generalization step is critical to optimization effectiveness, but it is also difficult: producing rules that are both correct and sufficiently general typically demands substantial manual effort and domain expertise. Existing approaches such as Hydra attempt to automate this task with program synthesis, but their generalization capability is often limited by search-space explosion, under-generalization, and restricted support for diverse instruction domains.
We present LPG, large language model aided peephole optimization generalization, a framework that uses large language models (LLMs) to generalize peephole optimizations. The design of LPG is motivated by the observation that LLMs are effective at semantic abstraction and exploratory reasoning, while formal analyses are necessary to ensure that generated rules are sound and profitable. Based on this observation, LPG adopts a closed-loop workflow that integrates LLM-driven symbolic constant generalization, structural generalization, constraint relaxation, and bitwidth/precision generalization with feedback from syntactic validation, semantic verification, and profitability checking.
We evaluate LPG on real-world peephole optimization issues drawn from the LLVM ecosystem. Overall, LPG successfully generalizes 90 out of 102 optimizations. On the integer-focused subset that is directly comparable to Hydra, LPG generalizes 74 out of 81 optimizations, whereas Hydra generalizes 35.2026-03-19T04:19:51ZChunhao LiaoHongxu XuXintong ZhouZhenyang XuChengnian Sunhttp://arxiv.org/abs/2603.18372v1TENSURE: Fuzzing Sparse Tensor Compilers (Registered Report)2026-03-19T00:13:14ZSparse Tensor Compilers (STCs) have emerged as critical infrastructure for optimizing high-dimensional data analytics and machine learning workloads. The STCs must synthesize complex, irregular control flow for various compressed storage formats directly from high-level declarative specifications, thereby making them highly susceptible to subtle correctness defects. Existing testing frameworks, which rely on mutating computation graphs restricted to a standard vocabulary of operators, fail to exercise the arbitrary loop synthesis capabilities of these compilers. Furthermore, generic grammar-based fuzzers struggle to generate valid inputs due to the strict rules governing how indices are reused across multiple tensors.
In this paper, we present TENSURE, the first extensible black-box fuzzing framework specifically designed for the testing of STCs. TENSURE leverages Einstein Summation (Einsum) notation as a general input abstraction, enabling the generation of complex, unconventional tensor contractions that expose corner cases in the code-generation phases of STCs. We propose a novel constraint-based generation algorithm that guarantees 100% semantic validity of synthesized kernels, significantly outperforming the ~3.3% validity rate of baseline grammar fuzzers. To enable metamorphic testing without a trusted reference, we introduce a set of semantic-preserving mutation operators that exploit algebraic commutativity and heterogeneity in storage formats. Our evaluation on two state-of-the-art systems, TACO and Finch, reveals widespread fragility, particularly in TACO, where TENSURE exposed crashes or silent miscompilations in a majority of generated test cases. These findings underscore the critical need for specialized testing tools in the sparse compilation ecosystem.2026-03-19T00:13:14ZKabilan MahathevanYining ZhangMuhammad Ali GulzarKirshanthan Sundararajah10.14722/fuzzing.2026.23006http://arxiv.org/abs/2505.06193v4Ohana trees, linear approximation and multi-types for the $λ$I-calculus: No variable gets left behind or forgotten!2026-03-18T17:08:35ZAlthough the $λ$I-calculus is a natural fragment of the $λ$-calculus, obtained by forbidding the erasure of arguments, its equational theories did not receive much attention. The reason is that all proper denotational models studied in the literature equate all non-normalizable $λ$I-terms, whence the associated theory is not very informative. The goal of this paper is to introduce a previously unknown theory of the $λ$I-calculus, induced by a notion of evaluation trees that we call "Ohana trees". The Ohana tree of a $λ$I-term is an annotated version of its Böhm tree, remembering all free variables that are hidden within its meaningless subtrees, or pushed into infinity along its infinite branches.
We develop the associated theories of program approximation: the first approach -- more classic -- is based on finite trees and continuity, the second adapts Ehrhard and Regnier's Taylor expansion. We then prove a Commutation Theorem stating that the normal form of the Taylor expansion of a $λ$I-term coincides with the Taylor expansion of its Ohana tree. As a corollary, we obtain that the equality induced by Ohana trees is compatible with abstraction and application.
Subsequently, we introduce a denotational model designed to capture the equality induced by Ohana trees. Although presented as a non-idempotent type system, our model is based on a suitably modified version of the relational semantics of the $λ$-calculus, which is known to yield proper models of the $λ$I-calculus when restricted to non-empty finite multisets. To track variables occurring in subterms that are hidden or pushed to infinity in the evaluation trees, we generalize the system in two ways: first, we reintroduce annotated versions of the empty multiset indexed by sets of variables; second, (...)2025-05-09T17:13:18ZThis is the (submitted) long version of the (published) conference version v2, see https://doi.org/10.4230/LIPIcs.FSCD.2025.12Rémy CerdaGiulio ManzonettoAlexis Saurinhttp://arxiv.org/abs/2603.18122v1Don't Vibe Code, Do Skele-Code: Interactive No-Code Notebooks for Subject Matter Experts to Build Lower-Cost Agentic Workflows2026-03-18T16:37:29ZSkele-Code is a natural-language and graph-based interface for building workflows with AI agents, designed especially for less or non-technical users. It supports incremental, interactive notebook-style development, and each step is converted to code with a required set of functions and behavior to enable incremental building of workflows. Agents are invoked only for code generation and error recovery, not orchestration or task execution. This agent-supported, but code-first approach to workflows, along with the context-engineering used in Skele-Code, can help reduce token costs compared to the multi-agent system approach to executing workflows. Skele-Code produces modular, easily extensible, and shareable workflows. The generated workflows can also be used as skills by agents, or as steps in other workflows.2026-03-18T16:37:29ZMain paper 9 pages. Topics: Agentic Coding, HCI, LLMs, WorkflowsSriram Gopalakrishnanhttp://arxiv.org/abs/2502.05310v5Oracular Programming: A Modular Foundation for Building LLM-Enabled Software2026-03-18T11:56:17ZLarge Language Models (LLMs) can solve previously intractable tasks given only natural-language instructions and a few examples, but they remain difficult to steer precisely and lack a key capability for building reliable software at scale: the modular composition of computations under enforceable contracts. As a result, they are often embedded in larger software pipelines that use domain-specific knowledge to decompose tasks and improve reliability through validation and search. Yet the complexity of writing, tuning, and maintaining such pipelines has so far limited their sophistication. We propose oracular programming: a foundational paradigm for integrating traditional, explicit computations with inductive oracles such as LLMs. It rests on two directing principles: the full separation of core and search logic (allowing the latter to freely evolve without breaking the former), and the treatment of few-shot examples as grounded and evolvable program components. Within this paradigm, programmers express high-level problem-solving strategies as programs with unresolved choice points. These choice points are resolved at runtime by LLMs, which generalize from user-provided examples of correct and incorrect decisions. An oracular program is composed of three orthogonal components: a strategy that consists of a nondeterministic program with choice points that can be reified into a search tree, a policy that specifies how to navigate this tree with the help of LLM oracles, and a set of demonstrations that describe successful and unsuccessful tree navigation scenarios across diverse problem instances. Each component is expressed in a dedicated programming language. We address the key programming language design challenges of modularly composing oracular programs and enforcing consistency between their components as they evolve.2025-02-07T20:24:43ZJonathan LaurentAndré Platzer