https://arxiv.org/api/g1E/FobmTDgsdD1V4A//E1mP/oE 2026-06-28T19:01:45Z 9951 1650 15 http://arxiv.org/abs/2504.15144v1 C2RUST-BENCH: A Minimized, Representative Dataset for C-to-Rust Transpilation Evaluation 2025-04-21T14:48:45Z

Despite the effort in vulnerability detection over the last two decades, memory safety vulnerabilities continue to be a critical problem. Recent reports suggest that the key solution is to migrate to memory-safe languages. To this end, C-to-Rust transpilation becomes popular to resolve memory-safety issues in C programs. Recent works propose C-to-Rust transpilation frameworks; however, a comprehensive evaluation dataset is missing. Although one solution is to put together a large enough dataset, this increases the analysis time in automated frameworks as well as in manual efforts for some cases. In this work, we build a method to select functions from a large set to construct a minimized yet representative dataset to evaluate the C-to-Rust transpilation. We propose C2RUST-BENCH that contains 2,905 functions, which are representative of C-to-Rust transpilation, selected from 15,503 functions of real-world programs.

2025-04-21T14:48:45Z Melih Sirlanci Carter Yagemann Zhiqiang Lin http://arxiv.org/abs/2504.15036v1 Dynamic Robustness Verification Against Weak Memory (Extended Version) 2025-04-21T11:44:58Z

Dynamic race detection is a highly effective runtime verification technique for identifying data races by instrumenting and monitoring concurrent program runs. However, standard dynamic race detection is incompatible with practical weak memory models; the added instrumentation introduces extra synchronization, which masks weakly consistent behaviors and inherently misses certain data races. In response, we propose to dynamically verify program robustness-a property ensuring that a program exhibits only strongly consistent behaviors. Building on an existing static decision procedures, we develop an algorithm for dynamic robustness verification under a C11-style memory model. The algorithm is based on "location clocks", a variant of vector clocks used in standard race detection. It allows effective and easy-to-apply defense against weak memory on a per-program basis, which can be combined with race detection that assumes strong consistency. We implement our algorithm in a tool, called RSAN, and evaluate it across various settings. To our knowledge, this work is the first to propose and develop dynamic verification of robustness against weak memory models.

2025-04-21T11:44:58Z Extended version for PLDI'25 paper Roy Margalit Michalis Kokologiannakis Shachar Itzhaky Ori Lahav http://arxiv.org/abs/2504.14340v1 Omelets Need Onions: E-graphs Modulo Theories via Bottom-up E-matching 2025-04-19T16:15:11Z

E-graphs are a data structure for equational reasoning and optimization over ground terms. One of the benefits of e-graph rewriting is that it can declaratively handle useful but difficult to orient identities like associativity and commutativity (AC) in a generic way. However, using these generic mechanisms is more computationally expensive than using bespoke routines on terms containing sets, multi-sets, linear expressions, polynomials, and binders. A natural question arises: How can one combine the generic capabilities of e-graph rewriting with these specialized theories. This paper discusses a pragmatic approach to this e-graphs modulo theories (EMT) question using two key ideas: bottom-up e-matching and semantic e-ids.

2025-04-19T16:15:11Z Submitted to EGRAPHS 2025 Philip Zucker http://arxiv.org/abs/2504.14038v1 Flowco: Rethinking Data Analysis in the Age of LLMs 2025-04-18T19:01:27Z

Conducting data analysis typically involves authoring code to transform, visualize, analyze, and interpret data. Large language models (LLMs) are now capable of generating such code for simple, routine analyses. LLMs promise to democratize data science by enabling those with limited programming expertise to conduct data analyses, including in scientific research, business, and policymaking. However, analysts in many real-world settings must often exercise fine-grained control over specific analysis steps, verify intermediate results explicitly, and iteratively refine their analytical approaches. Such tasks present barriers to building robust and reproducible analyses using LLMs alone or even in conjunction with existing authoring tools (e.g., computational notebooks). This paper introduces Flowco, a new mixed-initiative system to address these challenges. Flowco leverages a visual dataflow programming model and integrates LLMs into every phase of the authoring process. A user study suggests that Flowco supports analysts, particularly those with less programming experience, in quickly authoring, debugging, and refining data analyses.

2025-04-18T19:01:27Z Stephen N. Freund Brooke Simon Emery D. Berger Eunice Jun http://arxiv.org/abs/2412.06754v2 Probability and Angelic Nondeterminism with Multiset Semantics 2025-04-17T18:38:14Z

We introduce a version of probabilistic Kleene algebra with angelic nondeterminism and a corresponding class of automata. Our approach implements semantics via distributions over multisets in order to overcome theoretical barriers arising from the lack of a distributive law between the powerset and Giry monads. We produce a full Kleene theorem and a coalgebraic theory, as well as both operational and denotational semantics and equational reasoning principles.

2024-12-09T18:44:54Z Shawn Ong Stephanie Ma Dexter Kozen 10.1145/3729286 http://arxiv.org/abs/2505.00718v1 Productive Quantum Programming Needs Better Abstract Machines 2025-04-17T17:00:06Z

An effective, accessible abstraction hierarchy has made using and programming computers possible for people across all disciplines. Establishing such a hierarchy for quantum programming is an outstanding challenge, especially due to a proliferation of different conventions and the rapid pace of innovation. One critical portion of the hierarchy is the abstract machine, the layer that separates a programmer's mental model of the hardware from its physical realization. Drawing on historical parallels in classical computing, we explain why having the "right" quantum abstract machine (QAM) is essential for making progress in the field and propose a novel framework for evaluating QAMs based on a set of desirable criteria. These criteria capture aspects of a QAM such as universality, compactness, expressiveness, and composability, which aid in the representation of quantum programs. By defining this framework we take steps toward defining an optimal QAM. We further apply our framework to survey the landscape of existing proposals, draw comparisons, and assess them based on our criteria. While these proposals share many common strengths, we find that each falls short of our ideal. Our framework and our findings set a direction for subsequent efforts to define a future QAM that is both straightforward to map to a variety of quantum computers, and provides a stable abstraction for quantum software development.

2025-04-17T17:00:06Z 11 pages, 2 figures, 1 table Santiago Núñez-Corrales Olivia Di Matteo John Dumbell Marcus Edwards Edoardo Giusto Scott Pakin Vlad Stirbu http://arxiv.org/abs/2409.11157v2 The Incredible Shrinking Context... in a Decompiler Near You 2025-04-17T14:34:36Z

Decompilation of binary code has arisen as a highly-important application in the space of Ethereum VM (EVM) smart contracts. Major new decompilers appear nearly every year and attain popularity, for a multitude of reverse-engineering or tool-building purposes. Technically, the problem is fundamental: it consists of recovering high-level control flow from a highly-optimized continuation-passing-style (CPS) representation. Architecturally, decompilers can be built using either static analysis or symbolic execution techniques. We present Shrknr, a static-analysis-based decompiler succeeding the state-of-the-art Elipmoc decompiler. Shrknr manages to achieve drastic improvements relative to the state of the art, in all significant dimensions: scalability, completeness, precision. Chief among the techniques employed is a new variant of static analysis context: shrinking context sensitivity. Shrinking context sensitivity performs deep cuts in the static analysis context, eagerly "forgetting" control-flow history, in order to leave room for further precise reasoning. We compare Shrnkr to state-of-the-art decompilers, both static-analysis- and symbolic-execution-based. In a standard benchmark set, Shrnkr scales to over 99.5% of contracts (compared to ~95%), covers (i.e., reaches and manages to decompile) 67% more code, and reduces key imprecision metrics by over 65%.

2024-09-17T13:10:38Z Full version of ISSTA 2025 paper Sifis Lagouvardos Yannis Bollanos Neville Grech Yannis Smaragdakis http://arxiv.org/abs/2504.08876v2 Is Productivity in Quantum Programming Equivalent to Expressiveness? 2025-04-17T14:17:26Z

The expressiveness of quantum programming languages plays a crucial role in the efficient and comprehensible representation of quantum algorithms. Unlike classical programming languages, which offer mature and well-defined abstraction mechanisms, quantum languages must integrate cognitively challenging concepts such as superposition, interference and entanglement while maintaining clarity and usability. However, identifying and characterizing differences in expressiveness between quantum programming paradigms remains an open area of study. Our work investigates the landscape of expressiveness through a comparative analysis of hosted quantum programming languages such as Qiskit, Cirq, Qrisp, and quAPL, and standalone languages including Q# and Qmod. We focused on evaluating how different quantum programming languages support the implementation of core quantum algorithms -- Deutsch-Jozsa, Simon, Bernstein-Vazirani, and Grover -- using expressiveness metrics: Lines of Code (LOC), Cyclomatic Complexity (CC), and Halstead Complexity (HC) metrics as proxies for developer productivity. Our findings suggest that different quantum programming paradigms offer distinct trade-offs between expressiveness and productivity, highlighting the importance of language design in quantum software development.

2025-04-11T15:04:16Z 11 pages, 6 figures Francini Corrales-Garro Danny Valerio-Ramírez Santiago Núñez-Corrales http://arxiv.org/abs/2311.11851v6 Crash-Stop Failures in Asynchronous Multiparty Session Types 2025-04-17T10:33:00Z

Session types provide a typing discipline for message-passing systems. However, their theory often assumes an ideal world: one in which everything is reliable and without failures. Yet this is in stark contrast with distributed systems in the real world. To address this limitation, we introduce a new asynchronous multiparty session types (MPST) theory with crash-stop failures, where processes may crash arbitrarily and cease to interact after crashing. We augment asynchronous MPST and processes with crash handling branches, and integrate crash-stop failure semantics into types and processes. Our approach requires no user-level syntax extensions for global types, and features a formalisation of global semantics, which captures complex behaviours induced by crashed/crash handling processes. Our new theory covers the entire spectrum, ranging from the ideal world of total reliability to entirely unreliable scenarios where any process may crash, using optional reliability assumptions. Under these assumptions, we demonstrate the sound and complete correspondence between global and local type semantics, which guarantee deadlock-freedom, protocol conformance, and liveness of well-typed processes by construction, even in the presence of crashes.

2023-11-20T15:38:04Z arXiv admin note: substantial text overlap with arXiv:2305.06238 Logical Methods in Computer Science, Volume 21, Issue 2 (April 18, 2025) lmcs:12622 Adam D. Barwell Ping Hou Nobuko Yoshida Fangyi Zhou 10.46298/lmcs-21(2:5)2025 http://arxiv.org/abs/2312.07263v2 A Saturation-Based Unification Algorithm for Higher-Order Rational Patterns 2025-04-16T20:50:24Z

Higher-order unification has been shown to be undecidable. Miller discovered the pattern fragment and subsequently showed that higher-order pattern unification is decidable and has most general unifiers. We extend the algorithm to higher-order rational terms (a.k.a. regular Böhm trees, a form of cyclic $λ$-terms) and show that pattern unification on higher-order rational terms is decidable and has most general unifiers. We prove the soundness and completeness of the algorithm.

2023-12-12T13:44:34Z Zhibo Chen Frank Pfenning http://arxiv.org/abs/2504.12464v1 Canonicity for Cost-Aware Logical Framework via Synthetic Tait Computability 2025-04-16T19:56:15Z

In the original work on the cost-aware logical framework by Niu et al., a dependent variant of the call-by-push-value language for cost analysis, the authors conjectured that the canonicity property of the type theory can be succinctly proved via Sterling's synthetic Tait computability. This work resolves the conjecture affirmatively.

2025-04-16T19:56:15Z Runming Li Robert Harper http://arxiv.org/abs/2311.05740v2 Generating Pragmatic Examples to Train Neural Program Synthesizers 2025-04-16T18:08:02Z

Programming-by-example is the task of synthesizing a program that is consistent with a set of user-provided input-output examples. As examples are often an under-specification of one's intent, a good synthesizer must choose the intended program from the many that are consistent with the given set of examples. Prior work frames program synthesis as a cooperative game between a listener (that synthesizes programs) and a speaker (a user choosing examples), and shows that models of computational pragmatic inference are effective in choosing the user intended programs. However, these models require counterfactual reasoning over a large set of programs and examples, which is infeasible in realistic program spaces. In this paper, we propose PraX, a novel way to amortize this search with neural networks. We sample pairs of programs and examples via self-play between listener and speaker models, and use pragmatic inference to choose informative training examples from this sample. We then use the informative dataset to train models to improve the synthesizer's ability to disambiguate user-provided examples without human supervision. We validate PraX on the challenging task of synthesizing regular expressions from example strings, and find that our method (1) outperforms models trained without choosing pragmatic examples by 23% (a 51% relative increase) (2) matches the performance of supervised learning on a dataset of pragmatic examples provided by humans, despite using no human data in training.

2023-11-09T20:53:00Z ICLR 2024 Saujas Vaduguru Daniel Fried Yewen Pu http://arxiv.org/abs/2503.16514v3 VeriMind: Agentic LLM for Automated Verilog Generation with a Novel Evaluation Metric 2025-04-16T14:58:48Z

Designing Verilog modules requires meticulous attention to correctness, efficiency, and adherence to design specifications. However, manually writing Verilog code remains a complex and time-consuming task that demands both expert knowledge and iterative refinement. Leveraging recent advancements in large language models (LLMs) and their structured text generation capabilities, we propose VeriMind, an agentic LLM framework for Verilog code generation that significantly automates and optimizes the synthesis process. Unlike traditional LLM-based code generators, VeriMind employs a structured reasoning approach: given a user-provided prompt describing design requirements, the system first formulates a detailed train of thought before the final Verilog code is generated. This multi-step methodology enhances interpretability, accuracy, and adaptability in hardware design. In addition, we introduce a novel evaluation metric-pass@ARC-which combines the conventional pass@k measure with Average Refinement Cycles (ARC) to capture both success rate and the efficiency of iterative refinement. Experimental results on diverse hardware design tasks demonstrated that our approach achieved up to $8.3\%$ improvement on pass@k metric and $8.1\%$ on pass@ARC metric. These findings underscore the transformative potential of agentic LLMs in automated hardware design, RTL development, and digital system synthesis.

2025-03-15T23:43:06Z Bardia Nadimi Ghali Omar Boutaib Hao Zheng http://arxiv.org/abs/2504.12031v1 Proof-Carrying Neuro-Symbolic Code 2025-04-16T12:42:18Z

This invited paper introduces the concept of "proof-carrying neuro-symbolic code" and explains its meaning and value, from both the "neural" and the "symbolic" perspectives. The talk outlines the first successes and challenges that this new area of research faces.

2025-04-16T12:42:18Z Invited paper at CiE 2025. arXiv admin note: text overlap with arXiv:2501.05867 Ekaterina Komendantskaya http://arxiv.org/abs/2502.15382v2 Verified Parameterized Choreographies Technical Report 2025-04-15T13:37:18Z

This technical report contains the full set of definitions and projection rules of the paper ``Verified Parameterized Choreographies'' by Rubbens et al. It also supplements the artefact.

2025-02-21T11:06:24Z Changes 2025-04-15: - Updated DOI to artefact Robert Rubbens Petra van den Bos Marieke Huisman