https://arxiv.org/api/HaU64OnWOk/RvjkWwWYsCJfWaDA2026-06-21T10:16:20Z991867515http://arxiv.org/abs/2602.17688v1AnCoder: Anchored Code Generation via Discrete Diffusion Models2026-02-05T22:46:43ZDiffusion language models offer a compelling alternative to autoregressive code generation, enabling global planning and iterative refinement of complex program logic. However, existing approaches fail to respect the rigid structure of programming languages and, as a result, often produce broken programs that fail to execute. To address this, we introduce AnchorTree, a framework that explicitly anchors the diffusion process using structured, hierarchical priors native to code. Specifically, AnchorTree uses the abstract syntax tree to prioritize resolving syntactically and semantically salient tokens, such as keywords (e.g., if, while) and identifiers (e.g., variable names), thereby establishing a structural scaffold that guides the remaining generation. We validate this framework via AnCoder, a family of models showing that structurally anchored diffusion offers a parameter-efficient path to high-quality code generation.2026-02-05T22:46:43ZAnton XueLitu RoutConstantine CaramanisSanjay Shakkottaihttp://arxiv.org/abs/2512.23496v2Fancy Some Chips for Your TeaStore? Modeling the Control of an Adaptable Discrete System2026-02-05T16:55:41ZWhen designing new web applications, developers must cope with different kinds of constraints relative to the resources they rely on: software, hardware, network, online micro-services, or any combination of the mentioned entities. Together, these entities form a complex system of communicating interdependent processes, physical or logical. It is very desirable that such system ensures its robustness to provide a good quality of service. In this paper we introduce Chips, a language that aims at facilitating the design of models made of various entwined components. It allows the description of applications in the form of functional blocks. Chips mixes notions from control theory and general purpose programming languages to generate robust component-based models. This paper presents how to use Chips to systematically design, model and analyse a complex system project, using a variation of the Adaptable TeaStore application as running example.2025-12-29T14:35:03ZIn Proceedings WACA 2025, arXiv:2512.22054. This work was supported by the ANR grant ANR-23-CE25-0004 (ADAPT). O. Kouchnarenko was supported by the EIPHI Graduate School (grant number ANR-17-EURE-0002)EPTCS 438, 2025, pp. 58-78Anna GalloneUniversité Marie et Louis Pasteur, CNRS UMR6174, Institut FEMTO-ST, Besançon, FranceSimon BliudzeUniv. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, Lille, FranceSophie CerfUniv. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, Lille, FranceOlga KouchnarenkoUniversité Marie et Louis Pasteur, CNRS UMR6174, Institut FEMTO-ST, Besançon, France10.4204/EPTCS.438.4http://arxiv.org/abs/2602.05850v1An Equational Axiomatization of Dynamic Threads via Algebraic Effects: Presheaves on Finite Relations, Labelled Posets, and Parameterized Algebraic Theories2026-02-05T16:36:20ZWe use the theory of algebraic effects to give a complete equational axiomatization for dynamic threads. Our method is based on parameterized algebraic theories, which give a concrete syntax for strong monads on functor categories, and are a convenient framework for names and binding. Our programs are built from the key primitives `fork' and `wait'. `Fork' creates a child thread and passes its name (thread ID) to the parent thread. `Wait' allows us to wait for given child threads to finish. We provide a parameterized algebraic theory built from fork and wait, together with basic atomic actions and laws such as associativity of `fork'. Our equational axiomatization is complete in two senses. First, for closed expressions, it completely captures equality of labelled posets (pomsets), an established model of concurrency: model complete. Second, any two open expressions are provably equal if they are equal under all closing substitutions: syntactically complete. The benefit of algebraic effects is that the semantic analysis can focus on the algebraic operations of fork and wait. We then extend the analysis to a simple concurrent programming language by giving operational and denotational semantics. The denotational semantics is built using the methods of parameterized algebraic theories and we show that it is sound, adequate, and fully abstract at first order for labelled-poset observations.2026-02-05T16:36:20ZPublished at POPL 2026Ohad KammarJack Liell-CockSam LindleyCristina MatacheSam Statonhttp://arxiv.org/abs/2405.11361v4Opportunistically Parallel Lambda Calculus2026-02-05T16:20:04ZScripting languages are widely used to compose external calls such as native libraries and network services. In such scripts, execution time is often dominated by waiting for these external calls, rendering traditional single-language optimizations ineffective. To address this, we propose a novel opportunistic evaluation strategy for scripting languages based on a core lambda calculus that automatically dispatches independent external calls in parallel and streams their results. We prove that our approach is confluent, ensuring that it preserves the programmer's original intent, and that it eventually executes every external call. We implement this approach in a scripting language called Opal. We demonstrate the versatility and performance of Opal, focusing on programs that invoke heavy external computation through the use of large language models (LLMs) and other APIs. Across five scripts, we compare to several state-of-the-art baselines and show that opportunistic evaluation improves total running time (up to $6.2\times$) and latency (up to $12.7\times$) compared to standard sequential Python, while performing very close (between $1.3\%$ and $18.5\%$ running time overhead) to hand-tuned manually optimized asynchronous Rust. For Tree-of-Thoughts, a prominent LLM reasoning approach, we achieve a $6.2\times$ performance improvement over the authors' own implementation.2024-05-18T18:13:31ZProc. ACM Program. Lang. 9, OOPSLA2, Article 365 (October 2025)Stephen MellKonstantinos KallasSteve ZdancewicOsbert Bastani10.1145/3763143http://arxiv.org/abs/2602.04846v1CSLib: The Lean Computer Science Library2026-02-04T18:36:49ZWe introduce CSLib, an open-source framework for proving computer-science-related theorems and writing formally verified code in the Lean proof assistant. CSLib aims to be for computer science what Lean's Mathlib is for mathematics. Mathlib has been tremendously impactful: it is a key reason for Lean's popularity within the mathematics research community, and it has also played a critical role in the training of AI systems for mathematical reasoning. However, the base of computer science knowledge in Lean is currently quite limited. CSLib will vastly enhance this knowledge base and provide infrastructure for using this knowledge in real-world verification projects. By doing so, CSLib will (1) enable the broad use of Lean in computer science education and research, and (2) facilitate the manual and AI-aided engineering of large-scale formally verified systems.2026-02-04T18:36:49ZClark BarrettSwarat ChaudhuriFabrizio MontesiJim GrundyPushmeet KohliLeonardo de MouraAlexandre RademakerSorrachai Yingchareonthawornchaihttp://arxiv.org/abs/2506.01374v5REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving2026-02-04T07:26:24ZWhile model serving has unlocked unprecedented capabilities, the high cost of serving large-scale models continues to be a significant barrier to widespread accessibility and rapid innovation. Compiler optimizations have long driven substantial performance improvements, but existing compilers struggle with neural workloads due to the exponentially large and highly interdependent space of possible transformations. Although existing stochastic search techniques can be effective, they are often sample-inefficient and fail to leverage the structural context underlying compilation decisions. We set out to investigate the research question of whether reasoning with large language models (LLMs), without any retraining, can leverage the context-aware decision space of compiler optimizations to significantly improve sample efficiency. To that end, we introduce a novel compilation framework (dubbed REASONING COMPILER) that formulates optimization as a sequential, context-aware decision process guided by a large language model and structured Monte Carlo tree search (MCTS). The LLM acts as a proposal mechanism, suggesting hardware-informed transformations that reflect the current program state and accumulated performance feedback. MCTS incorporates the LLM-generated proposals to balance exploration and exploitation, facilitating a structured, context-sensitive traversal of the expansive compiler optimization space. By achieving substantial speedups with markedly fewer samples than leading neural compilers, our approach demonstrates the potential of LLM-guided reasoning to transform the landscape of compiler optimization.2025-06-02T07:02:46ZNeurIPS 2025Annabelle Sujun TangChristopher PriebeRohan MahapatraLianhui QinHadi Esmaeilzadehhttp://arxiv.org/abs/2602.03777v1From Separate Compilation to Sound Language Composition2026-02-03T17:38:34ZThe development of programming languages involves complex theoretical and practical challenges, particularly when addressing modularity and reusability through language extensions. While language workbenches aim to enable modular development under the constraints of the language extension problem, one critical constraint -- separate compilation -- is often relaxed due to its complexity. However, this relaxation undermines artifact reusability and integration with common dependency systems. A key difficulty under separate compilation arises from managing attribute grammars, as extensions may introduce new attributes that invalidate previously generated abstract syntax tree structures. Existing approaches, such as the use of dynamic maps in the Neverlang workbench, favor flexibility at the cost of compile-time correctness, leading to potential runtime errors due to undefined attributes. This work addresses this issue by introducing nlgcheck, a theoretically sound static analysis tool based on data-flow analysis for the Neverlang language workbench. nlgcheck detects potential runtime errors -- such as undefined attribute accesses -- at compile time, preserving separate compilation while maintaining strong static correctness guarantees. Experimental evaluation using mutation testing on Neverlang-based projects demonstrates that nlgcheck effectively enhances robustness without sacrificing modularity or flexibility and with a level of performance that does not impede its adoption in daily development activities.2026-02-03T17:38:34Z43 pages, 1 figure, 5 ListingFederico BruzzoneWalter CazzolaLuca Favallihttp://arxiv.org/abs/2602.03588v1Efficient Algorithms for Partial Constraint Satisfaction Problems over Control-flow Graphs2026-02-03T14:38:10ZIn this work, we focus on the Partial Constraint Satisfaction Problem (PCSP) over control-flow graphs (CFGs) of programs. PCSP serves as a generalization of the well-known Constraint Satisfaction Problem (CSP). In the CSP framework, we define a set of variables, a set of constraints, and a finite domain $D$ that encompasses all possible values for each variable. The objective is to assign a value to each variable in such a way that all constraints are satisfied. In the graph variant of CSP, an underlying graph is considered and we have one variable corresponding to each vertex of the graph and one or several constraints corresponding to each edge. In PCSPs, we allow for certain constraints to be violated at a specified cost, aiming to find a solution that minimizes the total cost. Numerous classical compiler optimization tasks can be framed as PCSPs over control-flow graphs. Examples include Register Allocation, Lifetime-optimal Speculative Partial Redundancy Elimination (LOSPRE), and Optimal Placement of Bank Selection Instructions. On the other hand, it is well-known that control-flow graphs of structured programs are sparse and decomposable in a variety of ways. In this work, we rely on the Series-Parallel-Loop (SPL) decompositions as introduced by~\cite{RegisterAllocation}. Our main contribution is a general algorithm for PCSPs over SPL graphs with a time complexity of \(O(|G| \cdot |D|^6)\), where \(|G|\) represents the size of the control-flow graph. Note that for any fixed domain $D,$ this yields a linear-time solution. Our algorithm can be seen as a generalization and unification of previous SPL-based approaches for register allocation and LOSPRE. In addition, we provide experimental results over another classical PCSP task, i.e. Optimal Bank Selection, achieving runtimes four times better than the previous state of the art.2026-02-03T14:38:10ZAlready accepted by SETTA'25. https://www.setta2025.uk/accepted-papers. arXiv admin note: substantial text overlap with arXiv:2507.16660Xuran CaiAmir Goharshadyhttp://arxiv.org/abs/2601.19065v2The Opaque Pointer Design Pattern in Python: Towards a Pythonic PIMPL for Modularity, Encapsulation, and Stability2026-02-03T02:40:48ZPython libraries often need to maintain a stable public API even as internal implementations evolve, gain new backends, or depend on heavy optional libraries. In Python, where internal objects are easy to inspect and import, users can come to rely on "reachable internals" that were never intended to be public, making refactoring risky and slowing long-term maintenance. This paper revisits the pointer-to-implementation (PIMPL) idiom from C++ and reinterprets it as a Pythonic pattern of opaque delegation: a small public object (or module) that delegates its behavior to a separate implementation object treated as internal. We situate this pattern within a broader taxonomy of encapsulation techniques in Python, relate it to existing practices such as module-level indirection, facade objects, and backend dispatch, and identify PIMPL-like structures already used in the standard library and the scientific Python ecosystem. We then show how a Pythonic PIMPL can be used in existing codebases to isolate heavy dependencies, support lazy imports, and enable runtime selection of alternative backends without changing the public API. Finally, we discuss the benefits and trade-offs of the approach and offer practical guidance on when the pattern is appropriate and how to apply it in large, long-lived Python libraries.2026-01-27T01:02:26ZAntonios SaravanosNew York UniversityJohn PazarzisIndependent ResearcherStavros ZervoudakisNew York UniversityDongnanzi ZhengNew York Universityhttp://arxiv.org/abs/2512.23552v2Beyond Per-Thread Lock Sets: Multi-Thread Critical Sections and Dynamic Deadlock Prediction2026-02-02T17:59:57ZLock sets are commonly used for dynamic analysis of deadlocks. The standard per-thread lock set construction only considers locks acquired in the same thread, but is unaware of locks acquired in another thread. This leads to false positives and false negatives. The underlying issue is that the commonly used notion of a critical section on which the lock set construction relies ignores events from other threads. We give a trace-based characterization of critical sections that drops this restriction. Critical sections are no longer restricted to a single thread and can cover multiple threads. Such forms of critical sections exist, are natural, and correct the standard formulation.
We show how to soundly approximate the trace-based characterization via partial order relations. Thus, we obtain an improved lock set construction that can still be efficiently computed and allows us to remove false positives reported by the DIRK deadlock predictor and remove false negatives by extending the SPDOffline deadlock predictor. We integrate various lock set constructions with increased precision in an extension of SPDOffline. Our extensions remain sound (no false positives) but are more complete (fewer false negatives) w.r.t. SPDOffline. For an extensive standard benchmark suite we can also show that the performance is not affected.2025-12-29T15:50:26ZFixed typo in Fig 3 (x => l1, y => l2)Martin Sulzmannhttp://arxiv.org/abs/2603.06588v1vLLM Hook v0: A Plug-in for Programming Model Internals on vLLM2026-02-02T16:34:13ZModern artificial intelligence (AI) models are deployed on inference engines to optimize runtime efficiency and resource allocation, particularly for transformer-based large language models (LLMs). The vLLM project is a major open-source library to support model serving and inference. However, the current implementation of vLLM limits programmability of the internal states of deployed models. This prevents the use of popular test-time model alignment and enhancement methods. For example, it prevents the detection of adversarial prompts based on attention patterns or the adjustment of model responses based on activation steering. To bridge this critical gap, we present vLLM Hook, an opensource plug-in to enable the programming of internal states for vLLM models. Based on a configuration file specifying which internal states to capture, vLLM Hook provides seamless integration to vLLM and supports two essential features: passive programming and active programming. For passive programming, vLLM Hook probes the selected internal states for subsequent analysis, while keeping the model generation intact. For active programming, vLLM Hook enables efficient intervention of model generation by altering the selected internal states. In addition to presenting the core functions of vLLM Hook, in version 0, we demonstrate 3 use cases including prompt injection detection, enhanced retrieval-augmented retrieval (RAG), and activation steering. Finally, we welcome the community's contribution to improve vLLM Hook via https://github.com/ibm/vllm-hook.2026-02-02T16:34:13ZChing-Yun KoPin-Yu Chenhttp://arxiv.org/abs/2602.01720v1Phoenix: A Modular and Versatile Framework for C/C++ Pointer Analysis2026-02-02T06:58:36ZWe present Phoenix, a modular pointer analysis framework for C/C++ that unifies multiple state-of-the-art alias analysis algorithms behind a single, stable interface. Phoenix addresses the fragmentation of today's C/C++ pointer analysis ecosystem by cleanly separating IR construction, constraint generation, solver backends, and client-facing queries, making analyses easy to compare, swap, and compose while exposing explicit precision-performance trade-offs. We evaluate Phoenix against SVF under two representative configurations: a flow- and context-insensitive setting and a more precise flow- and context-sensitive setting, on 28 GNU coreutils programs. Phoenix delivers robust speedups in the baseline configuration (up to 2.88x) and remains competitive, and often faster, even in the stronger precision regime (up to 2.91x), without a systematic runtime penalty. In production, Phoenix serves as the analysis substrate for static analysis and fuzzing tools that have uncovered hundreds of new bugs and enabled deployments reporting more than 1000 bugs found in an industrial toolchain.2026-02-02T06:58:36ZPeisen YaoZinan GuQingkai Shihttp://arxiv.org/abs/2601.03708v2MHRC-Bench: A Multilingual Hardware Repository-Level Code Completion benchmark2026-02-01T15:06:35ZLarge language models (LLMs) have achieved strong performance on code completion tasks in general-purpose programming languages. However, existing repository-level code completion benchmarks focus almost exclusively on software code and largely overlook hardware description languages. In this work, we present \textbf{MHRC-Bench}, consisting of \textbf{MHRC-Bench-Train} and \textbf{MHRC-Bench-Eval}, the first benchmark designed for multilingual hardware code completion at the repository level. Our benchmark targets completion tasks and covers three major hardware design coding styles. Each completion target is annotated with code-structure-level and hardware-oriented semantic labels derived from concrete syntax tree analysis. We conduct a comprehensive evaluation of models on MHRC-Bench-Eval. Comprehensive evaluation results and analysis demonstrate the effectiveness of MHRC-Bench.2026-01-07T08:46:10ZQingyun ZouJiahao CuiNuo ChenBingsheng HeWeng-Fai Wonghttp://arxiv.org/abs/2504.01847v3Confluence of Conditional Rewriting Modulo2026-01-31T19:58:23ZSets of equations E play an important computational role in rewriting-based systems R by defining an equivalence relation =E inducing a partition of terms into E-equivalence classes on which rewriting computations, denoted ->R/E and called *rewriting modulo E*, are issued. This paper investigates *confluence of ->R/E*, usually called *E-confluence*, for *conditional* rewriting-based systems, where rewriting steps are determined by conditional rules. We rely on Jouannaud and Kirchner's framework to investigate confluence of an abstract relation R modulo an abstract equivalence relation E on a set A. We show how to particularize the framework to be used with conditional systems. Then, we show how to define appropriate finite sets of *conditional pairs* to prove and disprove E-confluence. In particular, we introduce *Logic-based Conditional Critical Pairs* which do not require the use of (often infinitely many) E-unifiers to provide a finite representation of the *local peaks* considered in the abstract framework. We also introduce *parametric Conditional Variable Pairs* which are essential to deal with conditional rules in the analysis of E-confluence. Our results apply to well-known classes of rewriting-based systems. In particular, to *Equational (Conditional) Term Rewriting Systems*. In this realm, our E-confluence results strictly subsume previous approaches by Huet and Jouannaud and Kirchner.2025-04-02T15:55:06Z83 pages. 17 figures. 8 tablesSalvador Lucashttp://arxiv.org/abs/2508.03435v2StoneDetector: Conventional and versatile code clone detection for Java2026-01-31T12:13:35ZCopy & paste is a widespread practice when developing software and, thus, duplicated and subsequently modified code occurs frequently in software projects. Since such code clones, i.e., identical or similar fragments of code, can bloat software projects and cause issues like bug or vulnerability propagation, their identification is of importance. In this paper, we present StoneDetector and its underlying method for finding code clones in Java source and Bytecode. StoneDetector implements a conventional clone detection approach based upon the textual comparison of paths derived from the code's representation by dominator trees. In this way, the tool does not only find exact and syntactically similar near-miss code clones, but also code clones that are harder to detect due to their larger variety in the syntax. We demonstrate StoneDetector's versatility as a conventional clone detection tool and analyze its various available configuration parameters, including the usage of different string metrics, hashing algorithms, etc. In our exhaustive evaluation with other conventional clone detectors on several state-of-the-art benchmarks, we can show StoneDetector's performance and scalability in finding code clones in both, Java source and Bytecode.2025-08-05T13:23:13ZThis is the accepted manuscript of a paper accepted for publication in The Journal of Systems and Software (Elsevier), the final published version will be available via the journalThomas S. HeinzeAndré SchäferWolfram Amme10.1016/j.jss.2026.112799