https://arxiv.org/api/HaU64OnWOk/RvjkWwWYsCJfWaDA 2026-06-21T10:16:20Z 9918 675 15 http://arxiv.org/abs/2602.17688v1 AnCoder: Anchored Code Generation via Discrete Diffusion Models 2026-02-05T22:46:43Z

Diffusion language models offer a compelling alternative to autoregressive code generation, enabling global planning and iterative refinement of complex program logic. However, existing approaches fail to respect the rigid structure of programming languages and, as a result, often produce broken programs that fail to execute. To address this, we introduce AnchorTree, a framework that explicitly anchors the diffusion process using structured, hierarchical priors native to code. Specifically, AnchorTree uses the abstract syntax tree to prioritize resolving syntactically and semantically salient tokens, such as keywords (e.g., if, while) and identifiers (e.g., variable names), thereby establishing a structural scaffold that guides the remaining generation. We validate this framework via AnCoder, a family of models showing that structurally anchored diffusion offers a parameter-efficient path to high-quality code generation.

2026-02-05T22:46:43Z Anton Xue Litu Rout Constantine Caramanis Sanjay Shakkottai http://arxiv.org/abs/2512.23496v2 Fancy Some Chips for Your TeaStore? Modeling the Control of an Adaptable Discrete System 2026-02-05T16:55:41Z

When designing new web applications, developers must cope with different kinds of constraints relative to the resources they rely on: software, hardware, network, online micro-services, or any combination of the mentioned entities. Together, these entities form a complex system of communicating interdependent processes, physical or logical. It is very desirable that such system ensures its robustness to provide a good quality of service. In this paper we introduce Chips, a language that aims at facilitating the design of models made of various entwined components. It allows the description of applications in the form of functional blocks. Chips mixes notions from control theory and general purpose programming languages to generate robust component-based models. This paper presents how to use Chips to systematically design, model and analyse a complex system project, using a variation of the Adaptable TeaStore application as running example.

2025-12-29T14:35:03Z In Proceedings WACA 2025, arXiv:2512.22054. This work was supported by the ANR grant ANR-23-CE25-0004 (ADAPT). O. Kouchnarenko was supported by the EIPHI Graduate School (grant number ANR-17-EURE-0002) EPTCS 438, 2025, pp. 58-78 Anna Gallone Université Marie et Louis Pasteur, CNRS UMR6174, Institut FEMTO-ST, Besançon, France Simon Bliudze Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, Lille, France Sophie Cerf Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, Lille, France Olga Kouchnarenko Université Marie et Louis Pasteur, CNRS UMR6174, Institut FEMTO-ST, Besançon, France 10.4204/EPTCS.438.4 http://arxiv.org/abs/2602.05850v1 An Equational Axiomatization of Dynamic Threads via Algebraic Effects: Presheaves on Finite Relations, Labelled Posets, and Parameterized Algebraic Theories 2026-02-05T16:36:20Z

We use the theory of algebraic effects to give a complete equational axiomatization for dynamic threads. Our method is based on parameterized algebraic theories, which give a concrete syntax for strong monads on functor categories, and are a convenient framework for names and binding. Our programs are built from the key primitives `fork' and `wait'. `Fork' creates a child thread and passes its name (thread ID) to the parent thread. `Wait' allows us to wait for given child threads to finish. We provide a parameterized algebraic theory built from fork and wait, together with basic atomic actions and laws such as associativity of `fork'. Our equational axiomatization is complete in two senses. First, for closed expressions, it completely captures equality of labelled posets (pomsets), an established model of concurrency: model complete. Second, any two open expressions are provably equal if they are equal under all closing substitutions: syntactically complete. The benefit of algebraic effects is that the semantic analysis can focus on the algebraic operations of fork and wait. We then extend the analysis to a simple concurrent programming language by giving operational and denotational semantics. The denotational semantics is built using the methods of parameterized algebraic theories and we show that it is sound, adequate, and fully abstract at first order for labelled-poset observations.

2026-02-05T16:36:20Z Published at POPL 2026 Ohad Kammar Jack Liell-Cock Sam Lindley Cristina Matache Sam Staton http://arxiv.org/abs/2405.11361v4 Opportunistically Parallel Lambda Calculus 2026-02-05T16:20:04Z

Scripting languages are widely used to compose external calls such as native libraries and network services. In such scripts, execution time is often dominated by waiting for these external calls, rendering traditional single-language optimizations ineffective. To address this, we propose a novel opportunistic evaluation strategy for scripting languages based on a core lambda calculus that automatically dispatches independent external calls in parallel and streams their results. We prove that our approach is confluent, ensuring that it preserves the programmer's original intent, and that it eventually executes every external call. We implement this approach in a scripting language called Opal. We demonstrate the versatility and performance of Opal, focusing on programs that invoke heavy external computation through the use of large language models (LLMs) and other APIs. Across five scripts, we compare to several state-of-the-art baselines and show that opportunistic evaluation improves total running time (up to $6.2\times$) and latency (up to $12.7\times$) compared to standard sequential Python, while performing very close (between $1.3\%$ and $18.5\%$ running time overhead) to hand-tuned manually optimized asynchronous Rust. For Tree-of-Thoughts, a prominent LLM reasoning approach, we achieve a $6.2\times$ performance improvement over the authors' own implementation.

2024-05-18T18:13:31Z Proc. ACM Program. Lang. 9, OOPSLA2, Article 365 (October 2025) Stephen Mell Konstantinos Kallas Steve Zdancewic Osbert Bastani 10.1145/3763143 http://arxiv.org/abs/2602.04846v1 CSLib: The Lean Computer Science Library 2026-02-04T18:36:49Z

We introduce CSLib, an open-source framework for proving computer-science-related theorems and writing formally verified code in the Lean proof assistant. CSLib aims to be for computer science what Lean's Mathlib is for mathematics. Mathlib has been tremendously impactful: it is a key reason for Lean's popularity within the mathematics research community, and it has also played a critical role in the training of AI systems for mathematical reasoning. However, the base of computer science knowledge in Lean is currently quite limited. CSLib will vastly enhance this knowledge base and provide infrastructure for using this knowledge in real-world verification projects. By doing so, CSLib will (1) enable the broad use of Lean in computer science education and research, and (2) facilitate the manual and AI-aided engineering of large-scale formally verified systems.

2026-02-04T18:36:49Z Clark Barrett Swarat Chaudhuri Fabrizio Montesi Jim Grundy Pushmeet Kohli Leonardo de Moura Alexandre Rademaker Sorrachai Yingchareonthawornchai http://arxiv.org/abs/2506.01374v5 REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving 2026-02-04T07:26:24Z

While model serving has unlocked unprecedented capabilities, the high cost of serving large-scale models continues to be a significant barrier to widespread accessibility and rapid innovation. Compiler optimizations have long driven substantial performance improvements, but existing compilers struggle with neural workloads due to the exponentially large and highly interdependent space of possible transformations. Although existing stochastic search techniques can be effective, they are often sample-inefficient and fail to leverage the structural context underlying compilation decisions. We set out to investigate the research question of whether reasoning with large language models (LLMs), without any retraining, can leverage the context-aware decision space of compiler optimizations to significantly improve sample efficiency. To that end, we introduce a novel compilation framework (dubbed REASONING COMPILER) that formulates optimization as a sequential, context-aware decision process guided by a large language model and structured Monte Carlo tree search (MCTS). The LLM acts as a proposal mechanism, suggesting hardware-informed transformations that reflect the current program state and accumulated performance feedback. MCTS incorporates the LLM-generated proposals to balance exploration and exploitation, facilitating a structured, context-sensitive traversal of the expansive compiler optimization space. By achieving substantial speedups with markedly fewer samples than leading neural compilers, our approach demonstrates the potential of LLM-guided reasoning to transform the landscape of compiler optimization.

2025-06-02T07:02:46Z NeurIPS 2025 Annabelle Sujun Tang Christopher Priebe Rohan Mahapatra Lianhui Qin Hadi Esmaeilzadeh http://arxiv.org/abs/2602.03777v1 From Separate Compilation to Sound Language Composition 2026-02-03T17:38:34Z

The development of programming languages involves complex theoretical and practical challenges, particularly when addressing modularity and reusability through language extensions. While language workbenches aim to enable modular development under the constraints of the language extension problem, one critical constraint -- separate compilation -- is often relaxed due to its complexity. However, this relaxation undermines artifact reusability and integration with common dependency systems. A key difficulty under separate compilation arises from managing attribute grammars, as extensions may introduce new attributes that invalidate previously generated abstract syntax tree structures. Existing approaches, such as the use of dynamic maps in the Neverlang workbench, favor flexibility at the cost of compile-time correctness, leading to potential runtime errors due to undefined attributes. This work addresses this issue by introducing nlgcheck, a theoretically sound static analysis tool based on data-flow analysis for the Neverlang language workbench. nlgcheck detects potential runtime errors -- such as undefined attribute accesses -- at compile time, preserving separate compilation while maintaining strong static correctness guarantees. Experimental evaluation using mutation testing on Neverlang-based projects demonstrates that nlgcheck effectively enhances robustness without sacrificing modularity or flexibility and with a level of performance that does not impede its adoption in daily development activities.

2026-02-03T17:38:34Z 43 pages, 1 figure, 5 Listing Federico Bruzzone Walter Cazzola Luca Favalli http://arxiv.org/abs/2602.03588v1 Efficient Algorithms for Partial Constraint Satisfaction Problems over Control-flow Graphs 2026-02-03T14:38:10Z

In this work, we focus on the Partial Constraint Satisfaction Problem (PCSP) over control-flow graphs (CFGs) of programs. PCSP serves as a generalization of the well-known Constraint Satisfaction Problem (CSP). In the CSP framework, we define a set of variables, a set of constraints, and a finite domain $D$ that encompasses all possible values for each variable. The objective is to assign a value to each variable in such a way that all constraints are satisfied. In the graph variant of CSP, an underlying graph is considered and we have one variable corresponding to each vertex of the graph and one or several constraints corresponding to each edge. In PCSPs, we allow for certain constraints to be violated at a specified cost, aiming to find a solution that minimizes the total cost. Numerous classical compiler optimization tasks can be framed as PCSPs over control-flow graphs. Examples include Register Allocation, Lifetime-optimal Speculative Partial Redundancy Elimination (LOSPRE), and Optimal Placement of Bank Selection Instructions. On the other hand, it is well-known that control-flow graphs of structured programs are sparse and decomposable in a variety of ways. In this work, we rely on the Series-Parallel-Loop (SPL) decompositions as introduced by~\cite{RegisterAllocation}. Our main contribution is a general algorithm for PCSPs over SPL graphs with a time complexity of $O(|G| \cdot |D|^6)$, where $|G|$ represents the size of the control-flow graph. Note that for any fixed domain $D,$ this yields a linear-time solution. Our algorithm can be seen as a generalization and unification of previous SPL-based approaches for register allocation and LOSPRE. In addition, we provide experimental results over another classical PCSP task, i.e. Optimal Bank Selection, achieving runtimes four times better than the previous state of the art.

2026-02-03T14:38:10Z Already accepted by SETTA'25. https://www.setta2025.uk/accepted-papers. arXiv admin note: substantial text overlap with arXiv:2507.16660 Xuran Cai Amir Goharshady http://arxiv.org/abs/2601.19065v2 The Opaque Pointer Design Pattern in Python: Towards a Pythonic PIMPL for Modularity, Encapsulation, and Stability 2026-02-03T02:40:48Z

Python libraries often need to maintain a stable public API even as internal implementations evolve, gain new backends, or depend on heavy optional libraries. In Python, where internal objects are easy to inspect and import, users can come to rely on "reachable internals" that were never intended to be public, making refactoring risky and slowing long-term maintenance. This paper revisits the pointer-to-implementation (PIMPL) idiom from C++ and reinterprets it as a Pythonic pattern of opaque delegation: a small public object (or module) that delegates its behavior to a separate implementation object treated as internal. We situate this pattern within a broader taxonomy of encapsulation techniques in Python, relate it to existing practices such as module-level indirection, facade objects, and backend dispatch, and identify PIMPL-like structures already used in the standard library and the scientific Python ecosystem. We then show how a Pythonic PIMPL can be used in existing codebases to isolate heavy dependencies, support lazy imports, and enable runtime selection of alternative backends without changing the public API. Finally, we discuss the benefits and trade-offs of the approach and offer practical guidance on when the pattern is appropriate and how to apply it in large, long-lived Python libraries.

2026-01-27T01:02:26Z Antonios Saravanos New York University John Pazarzis Independent Researcher Stavros Zervoudakis New York University Dongnanzi Zheng New York University http://arxiv.org/abs/2512.23552v2 Beyond Per-Thread Lock Sets: Multi-Thread Critical Sections and Dynamic Deadlock Prediction 2026-02-02T17:59:57Z

Lock sets are commonly used for dynamic analysis of deadlocks. The standard per-thread lock set construction only considers locks acquired in the same thread, but is unaware of locks acquired in another thread. This leads to false positives and false negatives. The underlying issue is that the commonly used notion of a critical section on which the lock set construction relies ignores events from other threads. We give a trace-based characterization of critical sections that drops this restriction. Critical sections are no longer restricted to a single thread and can cover multiple threads. Such forms of critical sections exist, are natural, and correct the standard formulation. We show how to soundly approximate the trace-based characterization via partial order relations. Thus, we obtain an improved lock set construction that can still be efficiently computed and allows us to remove false positives reported by the DIRK deadlock predictor and remove false negatives by extending the SPDOffline deadlock predictor. We integrate various lock set constructions with increased precision in an extension of SPDOffline. Our extensions remain sound (no false positives) but are more complete (fewer false negatives) w.r.t. SPDOffline. For an extensive standard benchmark suite we can also show that the performance is not affected.

2025-12-29T15:50:26Z Fixed typo in Fig 3 (x => l1, y => l2) Martin Sulzmann http://arxiv.org/abs/2603.06588v1 vLLM Hook v0: A Plug-in for Programming Model Internals on vLLM 2026-02-02T16:34:13Z

Modern artificial intelligence (AI) models are deployed on inference engines to optimize runtime efficiency and resource allocation, particularly for transformer-based large language models (LLMs). The vLLM project is a major open-source library to support model serving and inference. However, the current implementation of vLLM limits programmability of the internal states of deployed models. This prevents the use of popular test-time model alignment and enhancement methods. For example, it prevents the detection of adversarial prompts based on attention patterns or the adjustment of model responses based on activation steering. To bridge this critical gap, we present vLLM Hook, an opensource plug-in to enable the programming of internal states for vLLM models. Based on a configuration file specifying which internal states to capture, vLLM Hook provides seamless integration to vLLM and supports two essential features: passive programming and active programming. For passive programming, vLLM Hook probes the selected internal states for subsequent analysis, while keeping the model generation intact. For active programming, vLLM Hook enables efficient intervention of model generation by altering the selected internal states. In addition to presenting the core functions of vLLM Hook, in version 0, we demonstrate 3 use cases including prompt injection detection, enhanced retrieval-augmented retrieval (RAG), and activation steering. Finally, we welcome the community's contribution to improve vLLM Hook via https://github.com/ibm/vllm-hook.

2026-02-02T16:34:13Z Ching-Yun Ko Pin-Yu Chen http://arxiv.org/abs/2602.01720v1 Phoenix: A Modular and Versatile Framework for C/C++ Pointer Analysis 2026-02-02T06:58:36Z

We present Phoenix, a modular pointer analysis framework for C/C++ that unifies multiple state-of-the-art alias analysis algorithms behind a single, stable interface. Phoenix addresses the fragmentation of today's C/C++ pointer analysis ecosystem by cleanly separating IR construction, constraint generation, solver backends, and client-facing queries, making analyses easy to compare, swap, and compose while exposing explicit precision-performance trade-offs. We evaluate Phoenix against SVF under two representative configurations: a flow- and context-insensitive setting and a more precise flow- and context-sensitive setting, on 28 GNU coreutils programs. Phoenix delivers robust speedups in the baseline configuration (up to 2.88x) and remains competitive, and often faster, even in the stronger precision regime (up to 2.91x), without a systematic runtime penalty. In production, Phoenix serves as the analysis substrate for static analysis and fuzzing tools that have uncovered hundreds of new bugs and enabled deployments reporting more than 1000 bugs found in an industrial toolchain.

2026-02-02T06:58:36Z Peisen Yao Zinan Gu Qingkai Shi http://arxiv.org/abs/2601.03708v2 MHRC-Bench: A Multilingual Hardware Repository-Level Code Completion benchmark 2026-02-01T15:06:35Z

Large language models (LLMs) have achieved strong performance on code completion tasks in general-purpose programming languages. However, existing repository-level code completion benchmarks focus almost exclusively on software code and largely overlook hardware description languages. In this work, we present \textbf{MHRC-Bench}, consisting of \textbf{MHRC-Bench-Train} and \textbf{MHRC-Bench-Eval}, the first benchmark designed for multilingual hardware code completion at the repository level. Our benchmark targets completion tasks and covers three major hardware design coding styles. Each completion target is annotated with code-structure-level and hardware-oriented semantic labels derived from concrete syntax tree analysis. We conduct a comprehensive evaluation of models on MHRC-Bench-Eval. Comprehensive evaluation results and analysis demonstrate the effectiveness of MHRC-Bench.

2026-01-07T08:46:10Z Qingyun Zou Jiahao Cui Nuo Chen Bingsheng He Weng-Fai Wong http://arxiv.org/abs/2504.01847v3 Confluence of Conditional Rewriting Modulo 2026-01-31T19:58:23Z

Sets of equations E play an important computational role in rewriting-based systems R by defining an equivalence relation =E inducing a partition of terms into E-equivalence classes on which rewriting computations, denoted ->R/E and called *rewriting modulo E*, are issued. This paper investigates *confluence of ->R/E*, usually called *E-confluence*, for *conditional* rewriting-based systems, where rewriting steps are determined by conditional rules. We rely on Jouannaud and Kirchner's framework to investigate confluence of an abstract relation R modulo an abstract equivalence relation E on a set A. We show how to particularize the framework to be used with conditional systems. Then, we show how to define appropriate finite sets of *conditional pairs* to prove and disprove E-confluence. In particular, we introduce *Logic-based Conditional Critical Pairs* which do not require the use of (often infinitely many) E-unifiers to provide a finite representation of the *local peaks* considered in the abstract framework. We also introduce *parametric Conditional Variable Pairs* which are essential to deal with conditional rules in the analysis of E-confluence. Our results apply to well-known classes of rewriting-based systems. In particular, to *Equational (Conditional) Term Rewriting Systems*. In this realm, our E-confluence results strictly subsume previous approaches by Huet and Jouannaud and Kirchner.

2025-04-02T15:55:06Z 83 pages. 17 figures. 8 tables Salvador Lucas http://arxiv.org/abs/2508.03435v2 StoneDetector: Conventional and versatile code clone detection for Java 2026-01-31T12:13:35Z

Copy & paste is a widespread practice when developing software and, thus, duplicated and subsequently modified code occurs frequently in software projects. Since such code clones, i.e., identical or similar fragments of code, can bloat software projects and cause issues like bug or vulnerability propagation, their identification is of importance. In this paper, we present StoneDetector and its underlying method for finding code clones in Java source and Bytecode. StoneDetector implements a conventional clone detection approach based upon the textual comparison of paths derived from the code's representation by dominator trees. In this way, the tool does not only find exact and syntactically similar near-miss code clones, but also code clones that are harder to detect due to their larger variety in the syntax. We demonstrate StoneDetector's versatility as a conventional clone detection tool and analyze its various available configuration parameters, including the usage of different string metrics, hashing algorithms, etc. In our exhaustive evaluation with other conventional clone detectors on several state-of-the-art benchmarks, we can show StoneDetector's performance and scalability in finding code clones in both, Java source and Bytecode.

2025-08-05T13:23:13Z This is the accepted manuscript of a paper accepted for publication in The Journal of Systems and Software (Elsevier), the final published version will be available via the journal Thomas S. Heinze André Schäfer Wolfram Amme 10.1016/j.jss.2026.112799