https://arxiv.org/api/kGI3qJbDnK+g2YaAolRr9tJYQIg 2026-06-15T01:18:48Z 9886 540 15 http://arxiv.org/abs/2603.05645v1 Pitfalls in VM Implementation on CHERI: Lessons from Porting CRuby 2026-03-05T19:55:24Z

CHERI (Capability Hardware Enhanced RISC Instructions) is a novel hardware designed to address memory safety issues. By replacing traditional pointers with hardware capabilities, it enhances security in modern software systems. A Virtual Machine (VM) is one such system that can benefit from CHERI's protection, as it may contain latent memory vulnerabilities. However, developing and porting VMs to CHERI is a non-trivial task. There are many subtle pitfalls from the assumptions on the undefined behaviors of the C language made based on conventional architectures. Those assumptions conflict with CHERI's stricter memory safety model, causing unexpected failures. Although several prior works have discussed the process of porting VMs, they focus on the overall porting process instead of the pitfalls for VM implementation on CHERI. The guide for programming in CHERI exists, but it is for general programming, not addressing VM-specific issues. We have ported CRuby to CHERI as a case study and surveyed previous works on porting VMs to CHERI. We categorized and discussed the issues found based on their causes. In this paper, we illustrate the VM-specific pitfalls for each category. Most of the pitfalls arise from the undefined behaviors in the C language; in particular, implementation techniques and idioms of VMs often assume behaviors of traditional architectures that are invalid on CHERI. We also discuss workarounds for them and the impacts of those workarounds. We verified the validity of the workarounds by applying them to our CRuby port and by surveying the codebases of prior case studies. This work contributes to the body of knowledge on developing and porting VMs to CHERI and will help guide efforts toward constructing safer VMs.

2026-03-05T19:55:24Z The Art, Science, and Engineering of Programming, 2026, Vol. 11, Issue 1, Article 2 Hanhaotian Liu University of Tokyo, Japan Tetsuro Yamazaki University of Tokyo, Japan Tomoharu Ugawa University of Tokyo, Japan 10.22152/programming-journal.org/2026/11/2 http://arxiv.org/abs/2603.05644v1 Hybrid Structured Editing: Structures for Tools, Text for Users 2026-03-05T19:55:07Z

In programming, better tools often yield better results. For that, modern programming environments offer mechanisms to allow for their extensibility. The closer those tools are to the code, the easier it is for programmers to map the information provided by a tool to the code this information is about. However, existing extension mechanisms do not facilitate the close integration of tools with textual source code. Tools must be able to track program structures across edits to appear at the right positions but the parsing step of text complicates tracking structures. We propose hybrid structured editing, an approach that supports tool builders by providing structural guarantees while providing tool users with a familiar and consistent text editing interface. Hybrid structured editing allows tool builders to declare constraints on the structure that a program must conform to and ensures their observance. We present an implementation and several case studies of tools based on hybrid structured editing to demonstrate its effectiveness. Hybrid structured editing supports the safe extension of programming environments with tools that work on a structured representation of code and provide a consistent and reliable user experience.

2026-03-05T19:55:07Z The Art, Science, and Engineering of Programming, 2026, Vol. 11, Issue 1, Article 1 Tom Beckmann Hasso Plattner Institute, Germany / University of Potsdam, Germany Christoph Thiede Hasso Plattner Institute, Germany / University of Potsdam, Germany Jens Lincke Hasso Plattner Institute, Germany / University of Potsdam, Germany Robert Hirschfeld Hasso Plattner Institute, Germany / University of Potsdam, Germany 10.22152/programming-journal.org/2026/11/1 http://arxiv.org/abs/2603.04476v1 iScript: A Domain-Adapted Large Language Model and Benchmark for Physical Design Tcl Script Generation 2026-03-04T15:20:35Z

Modern EDA flows rely heavily on Tcl scripting, yet general LLMs perform poorly in this domain due to extreme data scarcity, domain-specific semantics, and the high reliability required in physical design. We present iScript, a domain-adapted Qwen3-8B model for Innovus Tcl script generation, and iScript-Bench, a comprehensive benchmark covering five task categories and three difficulty levels. To overcome the lack of training data, we introduce a multi-stage data synthesis pipeline that integrates command extraction, static linting, requirement back-inference, and Chain-of-Thought generation, producing a 10K-tuple (requirement, CoT, script) dataset. iScript is trained through a two-stage strategy combining domain-adaptive pretraining and supervised fine-tuning. To evaluate script correctness efficiently, we further propose a two-step verification framework consisting of static syntax verification and LLM-based functional evaluation. On our benchmark, iScript shows higher pass@k scores than currently state-of-the-art LLMs on average. These results demonstrate the effectiveness of domain adaptation and data synthesis for EDA scripting tasks.

2026-03-04T15:20:35Z Ning Xu Zhaoyang Zhang Senlin Shu Lei Qi Jiaqi Lv Wensuo Wang Tianhao Zhao Chao Zhang Zhaoliang Yang Xiangyu Li Zhaorui Su Jingshan Li Xin Geng http://arxiv.org/abs/2501.05259v2 Reversible Computation with Stacks and "Reversible Management of Failures" 2026-03-04T12:52:42Z

This work examines approaches to making computational models reversible. Broadly speaking, transforming a computational model into a reversible one, i.e. reversibilizing it, means extending its operational semantics conservatively in a way that each term of the model is interpretable as a bijection. We recall that the most common strategy to reversibilize a computational model yields operational semantics that halts computations whenever a computational state cannot be uniquely determined from its successor state, thereby allowing terms to be interpreted as partial bijective functions. We are interested in reversible computational models whose terms can be interpreted as total bijective functions. This is essential for studying aspects of computational complexity related to reversible computational models. We introduce SCORE, a language designed for manipulating variables and stacks. Notably, common reversibilization strategies naturally lead to interpreting the functions for stack manipulation as partial bijections. According to our interests, we demonstrate how to interpret SCORE in a state space where, using a proof-assistant, we certify that stack operations are total bijections. It follows that all SCORE terms can be interpreted as total bijections.

2025-01-09T14:13:59Z In Proceedings LTT 2026, arXiv:2603.02912 EPTCS 441, 2026, pp. 213-226 Matteo Palazzo Dip. di Informatica, Universita' di Torino Luca Roversi Dip. di Informatica, Universita' di Torino 10.4204/EPTCS.441.13 http://arxiv.org/abs/2603.04013v1 A Core Calculus for Type-safe Product Lines of C Programs 2026-03-04T12:51:23Z

In this paper we: (1) propose Lightweight C (LC), namely a core calculus that formalizes a proper subset of the ANSI C without preprocessor directives; (2) define Colored LC (CLC), namely LC endowed with ANSI C preprocessor directives; and (3) define a type system for CLC that guarantees that all programs to be generated by the C preprocessor are well-typed C programs. We believe that the simple formalization provided by CLC could be useful also for teaching purposes. Stefano Berardi spent most of his academic career at the Department of Computer Science of the University of Turin, where he conducts outstanding research on the logical foundations of computer science and on type-based program analyses. Over the years, he taught many courses, from BSc courses on programming with C to PhD courses on program analysis. Therefore, this paper fully falls within Stefano Berardi's research and teaching activities.

2026-03-04T12:51:23Z In Proceedings LTT 2026, arXiv:2603.02912 EPTCS 441, 2026, pp. 103-125 Ferruccio Damiani University of Turin Daisuke Kimura Toho University Luca Paolini University of Turin Makoto Tatsuta National Institute of Informatics 10.4204/EPTCS.441.7 http://arxiv.org/abs/2603.04008v1 Lambdas at the Far Edge: a Tale of Flying Lambdas and Lambdas on Wheels 2026-03-04T12:50:07Z

Aggregate Programming (AP) is a paradigm for programming the collective behaviour of sets of distributed devices, possibly situated at the network far edge, by relying on asynchronous proximity-based interactions. The eXchange Calculus (XC), a recently proposed foundational model for AP, is essentially a typed lambda calculus extended with an operator (the exchange operator) providing an implicit communication mechanism between neighbour devices. This paper provides a gentle introduction to XC and to its implementation as a C++ library, called FCPP. The FCPP library and toolchain has been mainly developed at the Department of Computer Science of the University of Turin, where Stefano Berardi spent most of his academic career conducting outstanding research about logical foundation of computer science and transmitting his passion for research to students and young researchers, often exploiting typed lambda calculi. An FCCP program is essentially a typed lambda term, and FCPP has been used to write code that has been deployed on devices at the far edge of the network, including rovers and (soon) Uncrewed Aerial Vehicles (UAVs); hence the title of the paper.

2026-03-04T12:50:07Z In Proceedings LTT 2026, arXiv:2603.02912 EPTCS 441, 2026, pp. 19-45 Giorgio Audrito Department of Computer Science Daniele Bortoluzzi Department of Computer Science Ferruccio Damiani Department of Computer Science Giordano Scarso Department of Computer Science Gianluca Torta Department of Computer Science Andrea Basso MITO Technology, Milan, Italy Monica Cochi Torino Airport Lorenzo Gusman Torino Airport Lorenzo Comba Department of Agricultural, Forest and Food Sciences Paolo Gay Department of Agricultural, Forest and Food Sciences Paola Dal Zovo Concept Engineering Reply, Turin, Italy Giada Galati Eurix, Turin, Italy Francesco Gallo Eurix, Turin, Italy Aljaž Grdadolnik Faculty of Computer and Information Science University of Ljubljana, Ljubljana, Slovenia Massimo Pescarollo Department of Economics and Statistics Cognetti de Martiis, University of Turin, Turin, Italy Paola Pisano Department of Economics and Statistics, Cognetti de Martiis, University of Turin, Turin, Italy 10.4204/EPTCS.441.2 http://arxiv.org/abs/2603.04006v1 Proving and Computing: The Infinite Pigeonhole Principle and Countable Choice 2026-03-04T12:49:51Z

Structural recursion is a common technique used by programmers in modern languages and is taught to introductory computer science students. But what about its dual, structural corecursion? Structural corecursion is an elegant technique, supported in languages like Haskell and proof assistants such as Rocq or Agda. It enables the design of compositional algorithms by decoupling the generation and consumption of potentially infinite or large data collections. Despite these strengths, structural corecursion is generally considered more advanced than structural recursion and is primarily studied in the context of pure functional programming. Our aim is to illustrate the expressive power of different notions of structural corecursion in the presence of classical reasoning. More specifically, we study coiteration and corecursion combined with the classical callcc operator, which provides a computational interpretation of classical reasoning. This combination enables interesting stream-processing algorithms. As an application, we present a corecursive, control-based proof of the Infinite Pigeonhole Principle and compare it with the continuation-passing proof of Escardó and Oliva in Agda. To further demonstrate the power of mixing corecursion and control, we give an implementation of the Axiom of Countable Choice. In contrast to the usual continuation-passing implementations of this axiom, which rely on general recursion whose termination is established externally, our approach justifies termination by coiteration alone.

2026-03-04T12:49:51Z In Proceedings LTT 2026, arXiv:2603.02912 EPTCS 441, 2026, pp. 1-18 Zena M. Ariola University of Oregon Paul Downen University of Massachusetts, Lowell Hugo Herbelin Université Paris Cité, Inria, CNRS, IRIF 10.4204/EPTCS.441.1 http://arxiv.org/abs/2603.03968v1 Nominal techniques as an Agda library 2026-03-04T12:00:54Z

Nominal techniques provide a mathematically principled approach to dealing with names and variable binding in programming languages. This paper explores an attempt to make nominal techniques accessible as an Agda library. We aim for a technical victory of implementing nominal ideas; we further require a moral victory that the overhead be acceptable for practical systems. The results of this paper have been mechanised and are publicly accessible at https://omelkonian.github.io/nominal-agda/.

2026-03-04T12:00:54Z Murdoch J. Gabbay Orestis Melkonian http://arxiv.org/abs/2604.09591v1 Simplicity Scales 2026-03-04T03:25:11Z

The dominant data interchange formats encode integers using a variable number of bytes or represent floating-point numbers as variable-length UTF-8 strings. The decoder must inspect each byte for a continuation bit or parse each character individually, producing data-dependent branches that stall modern CPU pipelines. Protocol Buffers pays this cost on every integer, field tag, and length prefix. JSON pays it on every value. We present Bebop, a serialization format where every data type uses a fixed number of bytes. A 32-bit integer is always four bytes. Decoding becomes a single memory read with no conditionals. Across 19 decode workloads, Bebop decodes 9--213$\times$ faster than Protocol Buffers. On a 1536-dimension embedding vector, Bebop decodes in 2.8 nanoseconds versus 111 nanoseconds for Protocol Buffers and 4.69 microseconds for simdjson, a 1,675$\times$ gap. On records above 64 KB, the decoder achieves 86% of peak memory bandwidth. The CPU is no longer the bottleneck. We also present a transport-agnostic RPC protocol built on the same wire format. The protocol introduces batch pipelining, where dependent cross-service calls execute in a single round trip with server-side dependency resolution. It deploys over HTTP/1.1, HTTP/2, and binary transports without proxies, removing the HTTP/2 requirement that limits gRPC on serverless platforms and in browsers.

2026-03-04T03:25:11Z Andrew Sampson 6OVER3 Institute Yuta Saito GoodNotes Ronny Chan 6OVER3 Institute http://arxiv.org/abs/2603.01896v2 Agentic Code Reasoning 2026-03-04T01:17:10Z

Can LLM agents explore codebases and reason about code semantics without executing the code? We study this capability, which we call agentic code reasoning, and introduce semi-formal reasoning: a structured prompting methodology that requires agents to construct explicit premises, trace execution paths, and derive formal conclusions. Unlike unstructured chain-of-thought, semi-formal reasoning acts as a certificate: the agent cannot skip cases or make unsupported claims. We evaluate across three tasks (patch equivalence verification, fault localization, and code question answering) and show that semi-formal reasoning consistently improves accuracy on all of them. For patch equivalence, accuracy improves from 78% to 88% on curated examples and reaches 93% on real-world agent-generated patches, approaching the reliability needed for execution-free RL reward signals. For code question answering on RubberDuckBench Mohammad et al. (2026), semi-formal reasoning achieves 87% accuracy. For fault localization on Defects4J Just et al. (2014), semi-formal reasoning improves Top-5 accuracy by 5 percentage points over standard reasoning. These results demonstrate that structured agentic reasoning enables meaningful semantic code analysis without execution, opening practical applications in RL training pipelines, code review, and static program analysis.

2026-03-02T14:17:06Z Shubham Ugare Satish Chandra http://arxiv.org/abs/2603.03141v1 Efficient Dynamic Algorithms to Predict Short Races 2026-03-03T16:28:31Z

We introduce and study the problem of detecting short races in an observed trace. Specifically, for a race type $R$, given a trace $σ$ and window size $w$, the task is to determine whether there exists an $R$-race $(e_1, e_2)$ in $σ$ such that the subtrace starting with $e_1$ and ending with $e_2$ contains at most $w$ events. We present a monitoring framework for short-race prediction and instantiate the framework for happens-before and sync-preserving races, yielding efficient detection algorithms. Our happens-before algorithm runs in the same time as FastTrack but uses space that scales with $\log w$ as opposed to $\log |σ|$. For sync-preserving races, our algorithm runs faster and consumes significantly less space than SyncP. Our experiments validate the effectiveness of these short-race detection algorithms: they run more efficiently, use less memory, and detect significantly more races under the same budget, offering a reasonable balance between resource usage and predictive power.

2026-03-03T16:28:31Z Manuscript under review Minjian Zhang Mahesh Viswanathan http://arxiv.org/abs/2603.03083v1 Bidirectional Interpolation for the Lambda-Calculus -- Revisiting and Formalising Craig-Čubrić Interpolation 2026-03-03T15:26:26Z

Craig's Interpolation theorem has a wide range of applications, from mathematical logic to computer science. Proof-theoretic techniques for establishing interpolation usually follow a method first introduced by Maehara for the Sequent Calculus and then adapted by Prawitz to Natural Deduction. The result can be strengthened to a proof-relevant version, taking proof terms into account: this was first established by Čubrić in the simply-typed lambda-calculus with sums and more recently in linear, classical and intuitionistic sequent calculi. We give a new proof of Čubrić's proof-relevant interpolation theorem by building on principles of bidirectional typing, and formalise it in Rocq.

2026-03-03T15:26:26Z Meven Lennon Bertrand Alexis Saurin http://arxiv.org/abs/2603.02895v1 SpecLoop: An Agentic RTL-to-Specification Framework with Formal Verification Feedback Loop 2026-03-03T11:45:00Z

RTL implementations frequently lack up-to-date or consistent specifications, making comprehension, maintenance, and verification costly and error-prone. While prior work has explored generating specifications from RTL using large language models (LLMs), ensuring that the generated documents faithfully capture design intent remains a major challenge. We present SpecLoop, an agentic framework for RTL-to-specification generation with a formal-verification-driven iterative feedback loop. SpecLoop first generates candidate specifications and then reconstructs RTL from these specifications; it uses formal equivalence checking tools between the reconstructed RTL and the original design to validate functional consistency. When mismatches are detected, counterexamples are fed back to iteratively refine the specifications until equivalence is proven or no further progress can be made. Experiments across multiple LLMs and RTL benchmarks show that incorporating formal verification feedback substantially improves specification correctness and robustness over LLM-only baselines, demonstrating the effectiveness of verification-guided specification generation.

2026-03-03T11:45:00Z Fu-Chieh Chang Yu-Hsin Yang Hung-Ming Huang Yun-Chia Hsu Yin-Yu Lin Ming-Fang Tsai Chun-Chih Yang Pei-Yuan Wu http://arxiv.org/abs/2602.24054v2 Speak Now: Safe Actor Programming with Multiparty Session Types (Extended Version) 2026-03-03T09:41:00Z

Actor languages such as Erlang and Elixir are widely used for implementing scalable and reliable distributed applications, but the informally-specified nature of actor communication patterns leaves systems vulnerable to costly errors such as communication mismatches and deadlocks. Multiparty session types (MPSTs) rule out communication errors early in the development process, but until now, the many-sender, single-receiver nature of actor communication has made it difficult for actor languages to benefit from session types. This paper introduces Maty, the first actor language design supporting both static multiparty session typing and the full power of actors taking part in multiple sessions. Maty therefore combines the error prevention mechanism of session types with the scalability and fault tolerance of actor languages. Our main insight is to enforce session typing through a flow-sensitive effect system, combined with an event-driven programming style and first-class message handlers. Using MPSTs allows us to guarantee communication safety: a process will never send or receive an unexpected message, nor will a session get stuck because an actor is waiting for a message that will never be sent. We extend Maty to support Erlang-style supervision and cascading failure, and show that this preserves Maty's strong metatheory. We implement Maty in Scala using an API generation approach, and demonstrate the expressiveness of our model by implementing a representative sample of the widely-used Savina actor benchmark suite; an industry-supplied factory scenario; and a chat server.

2026-02-27T14:43:01Z Extended version of paper accepted at OOPSLA'26 Simon Fowler Raymond Hu http://arxiv.org/abs/2603.02637v1 StitchCUDA: An Automated Multi-Agents End-to-End GPU Programing Framework with Rubric-based Agentic Reinforcement Learning 2026-03-03T06:04:49Z

Modern machine learning (ML) workloads increasingly rely on GPUs, yet achieving high end-to-end performance remains challenging due to dependencies on both GPU kernel efficiency and host-side settings. Although LLM-based methods show promise on automated GPU kernel generation, prior works mainly focus on single-kernel optimization and do not extend to end-to-end programs, hindering practical deployment. To address the challenge, in this work, we propose StitchCUDA, a multi-agent framework for end-to-end GPU program generation, with three specialized agents: a Planner to orchestrate whole system design, a Coder dedicated to implementing it step-by-step, and a Verifier for correctness check and performance profiling using Nsys/NCU. To fundamentally improve the Coder's ability in end-to-end GPU programming, StitchCUDA integrates rubric-based agentic reinforcement learning over two atomic skills, task-to-code generation and feedback-driven code optimization, with combined rubric reward and rule-based reward from real executions. Therefore, the Coder learns how to implement advanced CUDA programming techniques (e.g., custom kernel fusion, cublas epilogue), and we also effectively prevent Coder's reward hacking (e.g., just copy PyTorch code or hardcoding output) during benchmarking. Experiments on KernelBench show that StitchCUDA achieves nearly 100% success rate on end-to-end GPU programming tasks, with 1.72x better speedup over the multi-agent baseline and 2.73x than the RL model baselines.

2026-03-03T06:04:49Z Shiyang Li Zijian Zhang Winson Chen Yuebo Luo Mingyi Hong Caiwen Ding