https://arxiv.org/api/IWexxe3lY8Ka/XL7WgLKe0OS8Ds2026-06-14T14:33:03Z988539015http://arxiv.org/abs/2604.06879v1Determinacy with Priorities up to Clocks2026-04-08T09:38:26ZIn Milner's seminal book on communication and concurrency introducing CCS, a process algebra inherently non-deterministic, chapter 11 was completely devoted to introduce the notion of determinacy and confluence in order to identify a subcalculus of CCS in which all definable agents are confluent. At the same time, or shortly later, determinate semantics were given for programming languages that reconcile concurrency and determinacy, such as Esterel by Berry and Gonthier, or SL by Boussinot and de Simone. These dedicated semantics do not easily map to Milner's confluence theory for CCS, which is unable to express causality and shared memory multi-threading with reaction to absence in a compositional way. We present an extension of CCS with priority-guarded actions and clocks, and we exploit the added expressiveness to enrich Milner's original notion of confluence by the new concept of coherence which permits us to encode, in a compositional fashion, synchronous programming languages such as Esterel.2026-04-08T09:38:26ZIn Proceedings PLACES 2026, arXiv:2604.05737EPTCS 444, 2026, pp. 79-89Luigi LiquoriCentre Inria de l'Université Côte d'AzurMichael MendlerUniversity of BambergClaude StolzeUniversity of Bamberg10.4204/EPTCS.444.8http://arxiv.org/abs/2604.06878v1Towards Multiparty Session Types for Highly-Concurrent and Fault-Tolerant Web Applications2026-04-08T09:38:11ZModern web applications combine persistent state updates, concurrent interactions, and unreliable communication with external services. Failures such as timeouts can occur after partial state changes, producing temporary inconsistencies whose resolution depends on liveness properties that are often not verified in practice. Although formal methods offer rigorous guarantees for reasoning about complex software, they remain rarely adopted in enterprise settings due to their perceived complexity and lack of practical automation. Multiparty Session Types (MPST) offer strong guarantees for communication safety, yet they do not account for the interplay between state evolution, dynamic workflow structure, and failure behaviour that are essential for reasoning about the correctness of real web applications.
This paper introduces a global-type framework that equips MPST with explicit failure semantics and dynamic participation. We define the syntax and operational semantics of these enriched global types and establish core properties, including coherence preservation. This foundation enables formal reasoning about communications in web applications where failures may occur, and lays the groundwork for future stateful extensions and automated verification of liveness properties.2026-04-08T09:38:11ZIn Proceedings PLACES 2026, arXiv:2604.05737EPTCS 444, 2026, pp. 68-78Richard CasettaBNP Paribas, Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIGNils GesbertUniv. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIGPierre GenevèsUniv. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG10.4204/EPTCS.444.7http://arxiv.org/abs/2604.06875v1Branching Out: Existential External Choice in Effpi2026-04-08T09:37:23ZEffpi is a framework for writing strongly-typed message-passing programs in Scala, where the compiler enforces the conformance of process implementations to specified protocol types. A compiler plugin is provided to verify properties of protocols, such as deadlock-freedom and liveness, by encoding the behavioural types into a variant of CCS.
To address limitations in the expressiveness of the existing toolkit, we extend Effpi with external choice by introducing a branching operation. Upon accepting a message via a branch, protocols enforce a continuation which depends on the label (type) of the received message. We equip the branching operation with the ability to accept messages over more than one channel. Additionally, we introduce a "catch timeout" operation to allow processes to gracefully handle a lack of incoming messages. The enhanced expressiveness of Effpi is demonstrated through a number of examples, including an implementation of the Raft consensus algorithm.2026-04-08T09:37:23ZIn Proceedings PLACES 2026, arXiv:2604.05737EPTCS 444, 2026, pp. 34-44Benjamin RobinsonUniversity of OxfordNobuko YoshidaUniversity of Oxford10.4204/EPTCS.444.4http://arxiv.org/abs/2604.06874v1Modelling Distributed Applications with Mixed-Choice Stateful Typestates2026-04-08T09:37:08ZDistributed systems have become increasingly prevalent in the software industry. Due to their intrinsic complexity, much research has focused on the verification of their behaviour. An active research line is around behaviour models that capture these protocols - e.g., session types, or typestates - allowing their static verification.
Correctly designing distributed protocols is not trivial. Their communication behaviour is typically implicitly defined via asynchronous message handlers, making errors harder to detect until execution. While typestates can ease the design process by explicitly defining correct sequences of operations, they struggle in two ways: they lack the expressiveness to define quantitative constraints that govern distributed protocols (i.e., number of acknowledgements for a quorum); and they assume strict sequencing of operations, failing to capture concurrent input/output actions in a state, typical of the distributed setting. Furthermore, runtime network failures cannot be statically verified.
We present a probabilistic runtime solution extending typestates with: (i) an internal mutable state for the expression of quantitative constraints; (ii) mixed sessions to represent concurrent input and output actions; (iii) expected ratios for the number of actions in a state, with monitoring semantics to detect deviations from an expected behaviour at runtime.
We demonstrate the suitability of our solution with two examples that motivated our approach: an acknowledgement protocol with a participant that sends several messages while waiting for a response, effectively modelling input and output operations in a state; and a voting protocol whose participants try to achieve consensus on a single bit using a quorum, thus, requiring an internal mutable state, while respecting a pre-defined distribution for the volume of exchanged messages.
2026-04-08T09:37:08ZIn Proceedings PLACES 2026, arXiv:2604.05737EPTCS 444, 2026, pp. 23-33Francisco ParrinhaNOVA LINCS and NOVA FCT, Lisbon, PortugalJoão MotaNOVA LINCS and NOVA FCT, Lisbon, PortugalAntónio RavaraNOVA LINCS and NOVA FCT, Lisbon, Portugal10.4204/EPTCS.444.3http://arxiv.org/abs/2407.18220v2Detecting and Explaining (In-)equivalence of Context-Free Grammars2026-04-08T09:30:37ZWe propose a scalable framework for deciding, proving, and explaining (in-)equivalence of context-free grammars. We present an implementation of the framework and evaluate it on large data sets collected within educational support systems. Even though the equivalence problem for context-free languages is undecidable in general, the framework is able to handle a large portion of these datasets. It introduces and combines techniques from several areas, such as an abstract grammar transformation language to identify equivalent grammars as well as sufficiently similar inequivalent grammars, theory-based comparison algorithms for a large class of context-free languages, and a graph-theory-inspired grammar canonization that allows to efficiently identify isomorphic grammars.2024-07-25T17:36:18ZExtended version of article published in Proc. ACM Program. Lang., Vol. 9, No. OOPSLA2, Article 378 (Oct 2025)Marko SchmellenkampThomas ZeumeSven ArgoSandra KieferCedric SiemsFynn Stebel10.1145/3763156http://arxiv.org/abs/2508.20340v4Once4All: Skeleton-Guided SMT Solver Fuzzing with LLM-Synthesized Generators2026-04-08T09:06:50ZSatisfiability Modulo Theory (SMT) solvers are foundational to modern systems and programming languages research, providing the foundation for tasks like symbolic execution and automated verification. Because these solvers sit on the critical path, their correctness is essential, and high-quality test formulas are key to uncovering bugs. However, while prior testing techniques performed well on earlier solver versions, they struggle to keep pace with rapidly evolving features. Recent approaches based on Large Language Models (LLMs) show promise in exploring advanced solver capabilities, but two obstacles remain: nearly half of the generated formulas are syntactically invalid, and iterative interactions with LLMs introduce substantial computational overhead. In this study, we present Once4All, a novel LLM-assisted fuzzing framework that addresses both issues by shifting from direct formula generation to the synthesis of generators for reusable terms (i.e., logical expressions). Specifically, Once4All uses LLMs to (1) automatically extract context-free grammars (CFGs) for SMT theories, including solver-specific extensions, from documentation, and (2) synthesize composable Boolean term generators that adhere to these grammars. During fuzzing, Once4All populates structural skeletons derived from existing formulas with the terms iteratively produced by the LLM-synthesized generators. This design ensures syntactic validity while promoting semantic diversity. Notably, Once4All requires only one-time LLM interaction investment, dramatically reducing runtime cost. We evaluated Once4All on two leading SMT solvers: Z3 and cvc5. Our experiments show that Once4All has identified 43 confirmed bugs, 40 of which have already been fixed by developers.2025-08-28T01:21:26ZAccepted at ASPLOS 2026Maolin SunYibiao YangYuming Zhouhttp://arxiv.org/abs/2604.06533v1Parametrizing Reads-From Equivalence for Predictive Monitoring2026-04-08T00:09:57ZPredictive runtime monitoring asks whether an execution $σ$ of a concurrent program can be used to \emph{soundly predict} the existence of a reordering $ρ$ of $σ$ that satisfies a property $\varphi$.
Its effectiveness and efficiency depend on two factors: (a) the complexity of $\varphi$, and (b) the expressive power of the reorderings considered. At one extreme, allowing all reorderings induced by \emph{reads-from equivalence} makes predictive monitoring intractable, even for simple properties such as data races. At the other extreme, restricting to commutativity-based reorderings (Mazurkiewicz trace equivalence) yields efficient algorithms for simple properties, but remains intractable for general regular specifications and offers limited predictive power.
We address this tradeoff via \emph{parametrization}. We introduce \emph{sliced reorderings} and their generalization, \emph{$k$-sliced reorderings}. Informally, $ρ$ is a $k$-sliced reordering of $σ$ if $σ$ can be partitioned into $k+1$ ordered subsequences whose concatenation yields $ρ$, while preserving program order and reads-from constraints.
Our results are twofold. First, $k$-sliced reorderings form a strictly increasing hierarchy that converges to reads-from equivalence as $k$ grows. Second, for any fixed $k$, predictive monitoring modulo $k$-sliced reorderings against any regular specification admits a constant-space streaming algorithm.
Together, these results establish $k$-sliced reorderings as a principled alternative to existing equivalences, enabling a uniform parametrized framework where expressive power can be systematically traded off against computational cost.2026-04-08T00:09:57ZAzadeh FarzanUmang Mathurhttp://arxiv.org/abs/2510.23642v2VisCoder2: Building Multi-Language Visualization Coding Agents2026-04-07T20:59:49ZLarge language models (LLMs) have recently enabled coding agents capable of generating, executing, and revising visualization code. However, existing models often fail in practical workflows due to limited language coverage, unreliable execution, and lack of iterative correction mechanisms. Progress has been constrained by narrow datasets and benchmarks that emphasize single-round generation and single-language tasks. To address these challenges, we introduce three complementary resources for advancing visualization coding agents. VisCode-Multi-679K is a large-scale, supervised dataset containing 679K validated and executable visualization samples with multi-turn correction dialogues across 12 programming languages. VisPlotBench is a benchmark for systematic evaluation, featuring executable tasks, rendered outputs, and protocols for both initial generation and multi-round self-debug. Finally, we present VisCoder2, a family of multi-language visualization models trained on VisCode-Multi-679K. Experiments show that VisCoder2 significantly outperforms strong open-source baselines and approaches the performance of proprietary models like GPT-4.1, with further gains from iterative self-debug, reaching 82.4% overall execution pass rate at the 32B scale, particularly in symbolic or compiler-dependent languages.2025-10-24T18:03:57ZYuansheng NiSongcheng CaiXiangchao ChenJiarong LiangZhiheng LyuJiaqi DengKai ZouPing NieFei YuanXiang YueWenhu Chenhttp://arxiv.org/abs/2509.03318v2Semantically Reflected Programs2026-04-07T18:17:03ZThis paper addresses the dichotomy between the formalization of structural and the formalization of behavioral knowledge by means of semantically lifted programs, which explore an intuitive connection between programs and knowledge graphs. While knowledge graphs and ontologies are eminently useful to represent formal knowledge about a system's individuals and universals, programming languages are designed to describe the system's evolution. To address this dichotomy, we introduce a semantic lifting of the program states of an executing program into a knowledge graph, for an object-oriented programming language. The resulting graph is exposed as a semantic reflection layer within the programming language, allowing programmers to leverage knowledge of the application domain in their programs. In this paper, we formalize semantic lifting and semantic reflection for a small programming language, SMOL, explain the operational aspects of the language, and consider type correctness and virtualisation for runtime program queries through the semantic reflection layer. We illustrate semantic lifting and semantic reflection through a case study of geological modelling and discuss different applications of the technique. The language implementation is open source and available online.2025-09-03T13:52:50ZTransactions on Graph Data and Knowledge (TGDK), Volume 4, Issue 1, pp. 3:1-3:52, Schloss Dagstuhl(2026)Eduard KamburjanVidar Norstein KlungreYuanwei QuRudolf SchlatteEgor V. KostylevMartin GieseEinar Broch Johnsen10.4230/TGDK.4.1.3http://arxiv.org/abs/2503.19797v2Fail Faster: Staging and Fast Randomness for High-Performance PBT2026-04-07T17:18:58ZProperty-based testing (PBT) relies on generators for random test cases, often constructed using embedded domain specific languages, which provide expressive combinators for building and composing generators. The effectiveness of PBT depends critically on the speed of these generators. However, careful measurements show that the generator performance of widely used PBT libraries falls well short of what is possible, due principally to (1) the abstraction overhead of their combinator-heavy style and (2) suboptimal sources of randomness. We characterize, quantify, and address these bottlenecks.
To eliminate abstraction overheads, we propose a technique based on multi-stage programming, dubbed Allegro. We apply this technique to leading generator libraries in OCaml and Scala 3, significantly improving performance. To quantify the performance impact of the randomness source, we carry out a controlled experiment, replacing the randomness in the OCaml PBT library with an optimized version. Both interventions exactly preserve the semantics of generators, enabling precise, pointwise comparisons. Together, these improvements find bugs up to $13\times$ faster.2025-03-25T16:04:45Z25 pages, 18 figures, under review at OOPSLACynthia RicheyJoseph W. CutlerHarrison GoldsteinBenjamin C. Piercehttp://arxiv.org/abs/2604.05865v1JTON: A Token-Efficient JSON Superset with Zen Grid Tabular Encoding for Large Language Models2026-04-07T13:26:23ZWhen LLMs process structured data, the serialization format directly affects cost and context utilization. Standard JSON wastes tokens repeating key names in every row of a tabular array--overhead that scales linearly with row count. This paper presents JTON (JSON Tabular Object Notation), a strict JSON superset whose main idea, Zen Grid, factors column headers into a single row and encodes values with semicolons, preserving JSON's type system while cutting redundancy. Across seven real-world domains, Zen Grid reduces token counts by 15-60% versus JSON compact (28.5% average; 32% with bare_strings). Comprehension tests on 10 LLMs show a net +0.3 pp accuracy gain over JSON: four models improve, three hold steady, and three dip slightly. Generation tests on 12 LLMs yield 100% syntactic validity in both few-shot and zero-shot settings. A Rust/PyO3 reference implementation adds SIMD-accelerated parsing at 1.4x the speed of Python's json module. Code, a 683-vector test suite, and all experimental data are publicly available.2026-04-07T13:26:23Z20 pages, 13 figures, 14 tables. Code and test suite available at https://github.com/gowthamkumar-nandakishore/JTONGowthamkumar Nandakishorehttp://arxiv.org/abs/2603.13142v4Critical Sections Are Not Per-Thread: A Trace Semantics for Lock-Based Concurrency2026-04-07T12:32:04ZLocks are a standard mechanism for synchronizing concurrent threads. The standard lock set construction assumes that critical sections are confined to a single thread, and therefore only accounts for locks acquired within that thread. Traditional definitions of critical sections implicitly assume that protected events belong to the same thread. We demonstrate that this assumption does not hold for general C/Pthread executions.
Using a trace model that captures the essence of C/Pthread programs, we give a trace-based characterization of critical sections that does not impose a per-thread restriction. As a result, critical sections may span multiple threads. Such \emph{multi-thread} critical sections arise naturally in real programs and close a semantic gap in the standard lock set construction.2026-03-13T16:32:16Z- clear distinction between open and closed critical sections (reflected in the definitions, statements and proofs) - some grammar and latex fixesMartin Sulzmannhttp://arxiv.org/abs/2604.05737v1Proceedings 17th Workshop on Programming Language Approaches to Concurrency and Communication-cEntric Software2026-04-07T11:39:22ZThis volume contains the proceedings of PLACES 2026, the 17th edition of the Workshop on Programming Language Approaches to Concurrency and Communication-cEntric Software. The workshop is scheduled to take place in Turin, Italy, on April 11, 2026, as a satellite event of ETAPS, the European Joint Conferences on Theory and Practice of Software. PLACES offers a forum for exchanging new ideas on how to address the challenges of concurrent and distributed programming and how to improve the foundations of modern and future computer applications. PLACES welcomes researchers from various fields, and its topics include the design of new programming languages, models for concurrent and distributed systems, type systems, program verification, and applications in various areas (e.g., microservices, sensor networks, blockchains, event processing, business process management). 2026-04-07T11:39:22ZEPTCS 444, 2026Kirstin PetersLorenzo Gheri10.4204/EPTCS.444http://arxiv.org/abs/2507.13091v3Formal Verification for JavaScript Regular Expressions: a Proven Semantics and its Applications (Extended Version)2026-04-07T08:10:14ZWe present the first mechanized, succinct, practical, complete, and proven-faithful semantics for a modern regular expression language with backtracking semantics. We ensure its faithfulness by proving it equivalent to a preexisting line-by-line embedding of the official ECMAScript specification of JavaScript regular expressions. We demonstrate its practicality by presenting two real-world applications. First, a new notion of contextual equivalence for modern regular expressions, which we use to prove or disprove rewrites drawn from previous work. Second, the first formal proof of the PikeVM algorithm used in many real-world engines. In contrast with the specification and other formalization work, our semantics captures not only the top-priority match, but a full backtracking tree recording all possible matches and their respective priority. All our definitions and results have been mechanized in the Rocq proof assistant.2025-07-17T13:02:00Z27 pages, 3 pages of references, 6 pages of appendixAurèle BarrièreVictor DengClément Pit-Claudelhttp://arxiv.org/abs/2402.09664v6CodeMind: Evaluating Large Language Models for Code Reasoning2026-04-07T05:36:18ZLarge Language Models (LLMs) have been widely used to automate programming tasks. Their capabilities have been evaluated by assessing the quality of generated code through tests or proofs. The extent to which they can reason about code is a critical question revealing important insights about their true capabilities. This paper introduces CodeMind, a framework designed to gauge the code reasoning abilities of LLMs through the following explicit and implicit code reasoning tasks: Independent Execution Reasoning (IER), Specification Reasoning (SR) and Dynamic Semantics Reasoning (DSR). The first evaluates the abilities of LLMs to simulate the execution of given inputs to a code and predict the output (IER). The second assesses the abilities of LLMs to incorporate the simulation of test data in the specification into code generation (SR). Finally, CodeMind evaluates LLMs' abilities to understand overall code semantics only given a specific input/output (DSR). Our extensive evaluation of ten LLMs across four widely used benchmarks using CodeMind shows that LLMs, depending on their size and training strategy, can reason about some dynamic aspects of code. However, their performance drops for code with higher complexity, non-trivial logical and arithmetic operators, non-primitive types, and API calls. We show that these reasoning tasks evaluate LLMs differently, and a comprehensive evaluation of code reasoning requires them all. Finally, we show that the performance of LLMs in bug repair is not correlated with any of the code reasoning tasks, and except for advanced frontier models, other LLMs do not incorporate code reasoning when performing bug repair.2024-02-15T02:24:46ZChangshu LiuYang ChenReyhaneh Jabbarvand