https://arxiv.org/api/fj7BPM8DucORTv0iP4E055+Hi+Y 2026-06-15T06:35:29Z 9888 615 15 http://arxiv.org/abs/2602.14717v1 Optimal Program Synthesis via Abstract Interpretation 2026-02-16T13:02:53Z

We consider the problem of synthesizing programs with numerical constants that optimize a quantitative objective, such as accuracy, over a set of input-output examples. We propose a general framework for optimal synthesis of such programs in a given domain specific language (DSL), with provable optimality guarantees. Our framework enumerates programs in a general search graph, where nodes represent subsets of concrete programs. To improve scalability, it uses A* search in conjunction with a search heuristic based on abstract interpretation; intuitively, this heuristic establishes upper bounds on the value of subtrees in the search graph, enabling the synthesizer to identify and prune subtrees that are provably suboptimal. In addition, we propose a natural strategy for constructing abstract transformers for monotonic semantics, which is a common property for components in DSLs for data classification. Finally, we implement our approach in the context of two such existing DSLs, demonstrating that our algorithm is more scalable than existing optimal synthesizers.

2026-02-16T13:02:53Z Proc. ACM Program. Lang. 8, POPL, Article 16 (January 2024) Stephen Mell Steve Zdancewic Osbert Bastani 10.1145/3632858 http://arxiv.org/abs/2602.14573v1 Polar: An Algebraic Analyzer for (Probabilistic) Loops 2026-02-16T09:05:42Z

We present the Polar framework for fully automating the analysis of classical and probabilistic loops using algebraic reasoning. The central theme in Polar comes with handling algebraic recurrences that precisely capture the loop semantics. To this end, our work implements a variety of techniques to compute exact closed-forms of recurrences over higher-order moments of variables, infer invariants, and derive loop sensitivities with respect to unknown parameters. Polar can analyze probabilistic loops containing if-statements, polynomial arithmetic, and common probability distributions. By translating loop analysis into linear recurrence solving, Polar uses the derived closed-forms of recurrences to compute the strongest polynomial invariant or to infer parameter sensitivity. Polar is both sound and complete within well-defined programming model restrictions. Lifting any of these restrictions results in significant hardness limits of computation. To overcome computational burdens for the sake of efficiency, Polar also provides incomplete but sound techniques to compute moments of combinations of variables.

2026-02-16T09:05:42Z Published in "Principles of Verification: Cycling the Probabilistic Landscape" Marcel Moosbrugger Julian Müllner Ezio Bartocci Laura Kovács 10.1007/978-3-031-75783-9_8 http://arxiv.org/abs/2602.15078v1 Computer Science as Infrastructure: the Spine of the Lean Computer Science Library (CSLib) 2026-02-16T07:41:16Z

Following in the footsteps of the success of Mathlib - the centralised library of formalised mathematics in Lean - CSLib is a rapidly-growing centralised library of formalised computer science and software. In this paper, we present its founding technical principles, operation, abstractions, and semantic framework. We contribute reusable semantic interfaces (reduction and labelled transition systems), proof automation, CI/testing support for maintaining automation and compatibility with Mathlib, and the first substantial developments of languages and models.

2026-02-16T07:41:16Z Christopher Henson Fabrizio Montesi http://arxiv.org/abs/2605.04050v1 LCM: Lossless Context Management 2026-02-14T20:46:47Z

We introduce Lossless Context Management (LCM), a deterministic architecture for LLM memory that outperforms Claude Code on long-context tasks. When benchmarked using Opus 4.6, our LCM-augmented coding agent, Volt, achieves higher scores than Claude Code on the OOLONG long-context eval, including at every context length between 32K and 1M tokens. LCM may be considered both a vindication and extension of the recursive paradigm pioneered by Recursive Language Models (RLMs). Our results demonstrate that recursive context manipulation can outperform not just conventional LLMs, but frontier coding agents with native file-system access. LCM departs from RLM by decomposing symbolic recursion into two deterministic, engine-managed mechanisms: recursive context compression, in which a hierarchical summary DAG automatically compacts older messages while retaining lossless pointers to every original; and recursive task partitioning, in which engine-managed parallel primitives like LLM-Map replace model-written loops. This trade-off, analogous to the move from GOTO to structured control flow in program-ming language design, sacrifices maximal flexibility for termination guarantees, zero-cost continuity on short tasks, and lossless retrievability of all prior state.

2026-02-14T20:46:47Z Clint Ehrlich Theodore Blackman http://arxiv.org/abs/2602.07672v2 Debugging code world models 2026-02-14T20:46:17Z

Code World Models (CWMs) are language models trained to simulate program execution by predicting explicit runtime state after every executed command. This execution-based world modeling enables internal verification within the model, offering an alternative to natural language chain-of-thought reasoning. However, the sources of errors and the nature of CWMs' limitations remain poorly understood. We study CWMs from two complementary perspectives: local semantic execution and long-horizon state tracking. On real-code benchmarks, we identify two dominant failure regimes. First, dense runtime state reveals produce token-intensive execution traces, leading to token-budget exhaustion on programs with long execution histories. Second, failures disproportionately concentrate in string-valued state, which we attribute to limitations of subword tokenization rather than program structure. To study long-horizon behavior, we use a controlled permutation-tracking benchmark that isolates state propagation under action execution. We show that long-horizon degradation is driven primarily by incorrect action generation: when actions are replaced with ground-truth commands, a Transformer-based CWM propagates state accurately over long horizons, despite known limitations of Transformers in long-horizon state tracking. These findings suggest directions for more efficient supervision and state representations in CWMs that are better aligned with program execution and data types.

2026-02-07T19:32:15Z 8 pages, 4 figures, under review in conference Babak Rahmani http://arxiv.org/abs/2602.13400v1 InEx-Bug: A Human Annotated Dataset of Intrinsic and Extrinsic Bugs in the NPM Ecosystem 2026-02-13T19:03:28Z

Understanding the causes of software defects is essential for reliable software maintenance and ecosystem stability. However, existing bug datasets do not distinguish between issues originating within a project from those caused by external dependencies or environmental factors. In this paper we present InEx-Bug, a manually annotated dataset of 377 GitHub issues from 103 NPM repositories, categorizing issues as Intrinsic (internal defect), Extrinsic (dependency/environment issue), Not-a-Bug, or Unknown. Beyond labels, the dataset includes rich temporal and behavioral metadata such as maintainer participation, code changes, and reopening patterns. Analyses show Intrinsic bugs resolve faster (median 8.9 vs 10.2 days), are close more often (92% vs 78%), and require code changes more frequently (57% vs 28%) compared to Extrinsic bugs. While Extrinsic bugs exhibit higher reopen rates (12% vs 4%) and delayed recurrence (median 157 vs 87 days). The dataset provides a foundation for further studying Intrinsic and Extrinsic defects in the NPM ecosystem.

2026-02-13T19:03:28Z Tanner Wright Adams Chen Gema Rodríguez-Pérez http://arxiv.org/abs/2602.11481v1 Compiler-Guided Inference-Time Adaptation: Improving GPT-5 Programming Performance in Idris 2026-02-12T01:48:52Z

GPT-5, a state of the art large language model from OpenAI, demonstrates strong performance in widely used programming languages such as Python, C++, and Java; however, its ability to operate in low resource or less commonly used languages remains underexplored. This work investigates whether GPT-5 can effectively acquire proficiency in an unfamiliar functional programming language, Idris, through iterative, feedback driven prompting. We first establish a baseline showing that with zero shot prompting the model solves only 22 out of 56 Idris exercises using the platform Exercism, substantially underperforming relative to higher resource languages (45 out of 50 in Python and 35 out of 47 in Erlang). We then evaluate several refinement strategies, including iterative prompting based on platform feedback, augmenting prompts with documentation and error classification guides, and iterative prompting using local compilation errors and failed test cases. Among these approaches, incorporating local compilation errors yields the most substantial improvements. Using this structured, error guided refinement loop, GPT-5 performance increased to an impressive 54 solved problems out of 56. These results suggest that while large language models may initially struggle in low resource settings, structured compiler level feedback can play a critical role in unlocking their capabilities.

2026-02-12T01:48:52Z Minda Li Bhaskar Krishnamachari http://arxiv.org/abs/2602.11232v1 Yaksha-Prashna: Understanding eBPF Bytecode Network Function Behavior 2026-02-11T16:15:24Z

Many cloud infrastructure organizations increasingly rely on third-party eBPF-based network functions for use cases like security, observability, and load balancing, so that not everyone requires a team of highly skilled eBPF experts. However, the network functions from third parties (e.g., F5, Palo Alto) are available in bytecode format to cloud operators, giving little or no understanding of their functional correctness and interaction with other network functions in a chain. Also, eBPF developers want to provide proof of functional correctness for their developed network functions without disclosing the source code to the operators. We design Yaksha-Prashna, a system that allows operators/developers to assert and query bytecode's conformance to its specification and dependencies on other bytecodes. Our work builds domain-specific models that enable us to employ scalable program analysis to extract and model eBPF programs. Using Yaksha-Prashna language, we express 24 properties on standard and non-standard eBPF-based network functions with 200-1000x speedup over the state-of-the-art work.

2026-02-11T16:15:24Z Animesh Singh K Shiv Kumar S. VenkataKeerthy Pragna Mamidipaka R V B R N Aaseesh Sayandeep Sen Palanivel Kodeswaran Theophilus A. Benson Ramakrishna Upadrasta Praveen Tammana http://arxiv.org/abs/2604.09558v1 VTC: DNN Compilation with Virtual Tensors for Data Movement Elimination 2026-02-11T06:23:10Z

With the widening gap between compute and memory operation latencies, data movement optimizations have become increasingly important for DNN compilation. Current optimizations such as layout transformations and operator fusion only target a subset of tensor operators and consequently miss important opportunities for reducing data movement in contemporary DNN workloads, including large language models. We introduce VTC, a novel tensor compilation framework that for the first time eliminates all unnecessary data movement by targeting the full spectrum of data movement operators. VTC proposes the concept of virtual tensors to track data movement between compute operators via index mappings rather than expensive physical data transfers to and from global memory, which can seamlessly interoperate with existing computation kernels and handle arbitrary tensor operator compositions. We also introduce a novel data movement elimination algorithm to automatically identify a profitable virtual tensor creation strategy. Evaluation on a variety of DNNs shows that VTC can outperform existing ML compilers by up to 1.93x (1.28x on average) on NVIDIA GPUs with up to 60% (17.5% on average) inference memory savings.

2026-02-11T06:23:10Z Accepted to OSDI'26 Muyan Hu Ahan Gupta Jiachen Yuan Vima Gupta Taeksang Kim Xin Xu Janardhan Kulkarni Ofer Dekel Vikram Adve Charith Mendis http://arxiv.org/abs/2512.02371v2 Pushing Tensor Accelerators Beyond MatMul in a User-Schedulable Language 2026-02-11T03:04:20Z

Tensor accelerators now represent a growing share of compute resources in modern CPUs and GPUs. However, they are hard to program, leading developers to use vendor-provided kernel libraries that support tensor accelerators. As a result, the usage of tensor accelerators is limited to the provided interface, mainly designed for traditional ML and scientific computing workloads. In this paper, we show that tensor accelerators can improve the performance of applications beyond simple variants of MatMul. For example, many image processing pipelines are linear transformations over matrices in disguise and can therefore utilize such specialized hardware. This is nonetheless hindered by the difficulties in programming tensor accelerators. We tackle this problem with compiler-based techniques. We use the Halide user-schedulable language and express operations as Halide algorithms succinctly. To this end, we implement a flexible tensor instruction selector based on equality saturation. The tensor instruction selector supports both CPU- and GPU-attached tensor accelerators and works with existing scheduling operations (e.g., producer-consumer fusion). Together, this enables developers to write diverse accelerator-leveraging applications in a few dozen lines. Using our system, we demonstrate the potential of tensor accelerators beyond their traditional domains. We implement several image processing pipelines (e.g., filtering, resampling, and denoising) in our system and evaluate them against non-accelerator-leveraging baselines. We show that these pipelines can achieve significant speedups. For example, a downsampling routine is sped up by $6.1\times$ by utilizing Tensor Cores on an Nvidia RTX 4070 GPU.

2025-12-02T03:22:21Z CGO 2026 Yihong Zhang Derek Gerstmann Andrew Adams Maaz Bin Safeer Ahmad http://arxiv.org/abs/2602.10320v1 Implementability of Global Distributed Protocols modulo Network Architectures 2026-02-10T21:58:45Z

Global protocols specify distributed, message-passing protocols from a birds-eye view, and are used as a specification for synthesizing local implementations. Implementability asks whether a given global protocol admits a distributed implementation. We present the first comprehensive investigation of global protocol implementability modulo network architectures. We propose a set of network-parametric Coherence Conditions, and exhibit sufficient assumptions under which it precisely characterizes implementability. We further reduce these assumptions to a minimal set of operational axioms describing insert and remove behavior of individual message buffers. Our reduction immediately establishes that five commonly studied asynchronous network architectures, namely peer-to-peer FIFO, mailbox, senderbox, monobox and bag, are instances of our network-parametric result. We use our characterization to derive optimal complexity results for implementability modulo networks, relationships between classes of implementable global protocols, and symbolic algorithms for deciding implementability modulo networks. We implement the latter in the first network-parametric tool Sprout(A), and show that it achieves network generality without sacrificing performance and modularity.

2026-02-10T21:58:45Z Elaine Li Thomas Wies http://arxiv.org/abs/2412.15768v2 Complete Fusion for Stateful Streams: Equational Theory of Stateful Streams and Fusion as Normalization-by-Evaluation 2026-02-10T11:56:16Z

Processing large amounts of data fast, in constant and small space is the point of stream processing and the reason for its increasing use. Alas, the most performant, imperative processing code tends to be almost impossible to read, let alone modify, reuse -- or write correctly. We present both a stream compilation theory and its implementation as a portable stream processing library Strymonas that lets us assemble complex stream pipelines just by plugging in simple combinators, and yet attain the performance of hand-written imperative loops and state machines. The library supports finite and infinite streams and offers a rich set of combinators. They may be freely composed, and yet the resulting convoluted imperative code has no traces of combinator abstractions: no closures or intermediate objects. The high-performance is portable and statically guaranteed, without relying on compiler or black-box optimizations. We greatly exceed in performance the available stream processing libraries in OCaml. The library generates C and OCaml code. The declaratively built Strymonas pipelines are all stateful. The stream state introduced in the library is not directly observable. Therefore, the Strymonas API looks like the familiar interface of `pure functional' combinators. Programmers may introduce their own stream state and share it across the pipeline. Strymonas has been developed in tandem with the equational theory of stateful streams. Our theoretical model represents all desired pipelines and guarantees the existence of unique normal forms, which are mappable to (fused) state machines. We describe the normalization algorithm, as a form of normalization-by-evaluation. The equational theory lets us state and prove the correctness of the complete fusion optimization.

2024-12-20T10:40:27Z Revised based on many comments Oleg Kiselyov Tomoaki Kobayashi Nick Palladinos http://arxiv.org/abs/2602.09197v1 Global Protocols under Rendezvous Synchrony: From Realizability to Type Checking 2026-02-09T21:05:13Z

Global protocol specifications are the starting point of top-down verification methodologies, and serve as a blueprint for synthesizing local specifications that guarantee the correctness of distributed implementations. In this work, we study global protocol specifications targeting distributed processes that communicate via rendezvous synchrony. We obtain the following positive results for the synchronous realizability problem: (a) realizability is decidable for global protocols over a transitive concurrency alphabet in 2-EXPTIME in the size of the protocol, and in 3-EXPTIME in the size of the alphabet, and (b) realizability is decidable in EXPTIME for global protocols that unambiguously represent their trace language. Unambiguous global protocols encompass fragments of directed and sender-driven choice protocols studied in prior work. Further, our reductions admit the corollary that the synchronous verification problem is solvable with the same complexity, where a candidate realization is included as part of the input. We then prove a negative result stating that synchronous realizability is undecidable in general. Finally, we propose a type system to check pi-calculus processes against local specifications in the form of synchronous communicating state machines. Our type system is the first to support processes with local mixed choice in the presence of session interleaving and~delegation.

2026-02-09T21:05:13Z Elaine Li Felix Stutz http://arxiv.org/abs/2602.08846v1 Impredicativity in Linear Dependent Type Theory 2026-02-09T16:12:57Z

We construct a realizability model of linear dependent type theory from a linear combinatory algebra. Our model motivates a number of additions to the type theory. In particular, we add a universe with two decoding operations: one takes codes to cartesian types and the other takes codes to linear types. The universe is impredicative in the sense that it is closed under both large cartesian dependent products and large linear dependent products. We also add a rule for injectivity of the modality turning linear terms into cartesian terms. With all of the additions, we are able to encode (linear) inductive types. As a case study, we consider the type of lists over a linear type, and demonstrate that our encoding has the relevant uniqueness principle. The construction of the realizability model is fully formalized in the proof assistant Rocq.

2026-02-09T16:12:57Z 20 pages, 2 figures Sam Speight Niels van der Weide http://arxiv.org/abs/2601.02563v3 Compressed code: the hidden effects of quantization and distillation on programming tokens 2026-02-08T20:13:10Z

Large Language Models (LLMs) have demonstrated exceptional code generation capabilities, yet their token-level mechanisms remain underexplored, particularly in compressed models. Through systematic analysis of programming language token representations, we characterize how programming languages are encoded in LLM tokenizers by analyzing their vocabulary distribution and keyword coverage patterns. We introduce a novel cold-start probability analysis method that provides insights into model behavior without requiring explicit prompts. Additionally, we present a comprehensive evaluation of how different model optimization techniques - including quantization, distillation, model scaling, and task-specific fine-tuning - affect token-level representations and code generation quality. Our experiments, supported by comprehensive probability distribution analysis and evaluation metrics, reveal critical insights into token-level behavior and provide empirically-validated guidelines for maintaining code generation quality under various optimization constraints. These findings advance both theoretical understanding of LLM code generation and practical implementation of optimized models in production environments.

2026-01-05T21:32:47Z 18 pages, 1 figure and 6 tables Viacheslav Siniaev Iaroslav Chelombitko Aleksey Komissarov