https://arxiv.org/api/g1E/FobmTDgsdD1V4A//E1mP/oE2026-06-28T19:01:45Z9951165015http://arxiv.org/abs/2504.15144v1C2RUST-BENCH: A Minimized, Representative Dataset for C-to-Rust Transpilation Evaluation2025-04-21T14:48:45ZDespite the effort in vulnerability detection over the last two decades, memory safety vulnerabilities continue to be a critical problem. Recent reports suggest that the key solution is to migrate to memory-safe languages. To this end, C-to-Rust transpilation becomes popular to resolve memory-safety issues in C programs. Recent works propose C-to-Rust transpilation frameworks; however, a comprehensive evaluation dataset is missing. Although one solution is to put together a large enough dataset, this increases the analysis time in automated frameworks as well as in manual efforts for some cases. In this work, we build a method to select functions from a large set to construct a minimized yet representative dataset to evaluate the C-to-Rust transpilation. We propose C2RUST-BENCH that contains 2,905 functions, which are representative of C-to-Rust transpilation, selected from 15,503 functions of real-world programs.2025-04-21T14:48:45ZMelih SirlanciCarter YagemannZhiqiang Linhttp://arxiv.org/abs/2504.15036v1Dynamic Robustness Verification Against Weak Memory (Extended Version)2025-04-21T11:44:58ZDynamic race detection is a highly effective runtime verification technique for identifying data races by instrumenting and monitoring concurrent program runs. However, standard dynamic race detection is incompatible with practical weak memory models; the added instrumentation introduces extra synchronization, which masks weakly consistent behaviors and inherently misses certain data races. In response, we propose to dynamically verify program robustness-a property ensuring that a program exhibits only strongly consistent behaviors. Building on an existing static decision procedures, we develop an algorithm for dynamic robustness verification under a C11-style memory model. The algorithm is based on "location clocks", a variant of vector clocks used in standard race detection. It allows effective and easy-to-apply defense against weak memory on a per-program basis, which can be combined with race detection that assumes strong consistency. We implement our algorithm in a tool, called RSAN, and evaluate it across various settings. To our knowledge, this work is the first to propose and develop dynamic verification of robustness against weak memory models.2025-04-21T11:44:58ZExtended version for PLDI'25 paperRoy MargalitMichalis KokologiannakisShachar ItzhakyOri Lahavhttp://arxiv.org/abs/2504.14340v1Omelets Need Onions: E-graphs Modulo Theories via Bottom-up E-matching2025-04-19T16:15:11ZE-graphs are a data structure for equational reasoning and optimization over ground terms. One of the benefits of e-graph rewriting is that it can declaratively handle useful but difficult to orient identities like associativity and commutativity (AC) in a generic way. However, using these generic mechanisms is more computationally expensive than using bespoke routines on terms containing sets, multi-sets, linear expressions, polynomials, and binders. A natural question arises: How can one combine the generic capabilities of e-graph rewriting with these specialized theories. This paper discusses a pragmatic approach to this e-graphs modulo theories (EMT) question using two key ideas: bottom-up e-matching and semantic e-ids.2025-04-19T16:15:11ZSubmitted to EGRAPHS 2025Philip Zuckerhttp://arxiv.org/abs/2504.14038v1Flowco: Rethinking Data Analysis in the Age of LLMs2025-04-18T19:01:27ZConducting data analysis typically involves authoring code to transform, visualize, analyze, and interpret data. Large language models (LLMs) are now capable of generating such code for simple, routine analyses. LLMs promise to democratize data science by enabling those with limited programming expertise to conduct data analyses, including in scientific research, business, and policymaking. However, analysts in many real-world settings must often exercise fine-grained control over specific analysis steps, verify intermediate results explicitly, and iteratively refine their analytical approaches. Such tasks present barriers to building robust and reproducible analyses using LLMs alone or even in conjunction with existing authoring tools (e.g., computational notebooks). This paper introduces Flowco, a new mixed-initiative system to address these challenges. Flowco leverages a visual dataflow programming model and integrates LLMs into every phase of the authoring process. A user study suggests that Flowco supports analysts, particularly those with less programming experience, in quickly authoring, debugging, and refining data analyses.2025-04-18T19:01:27ZStephen N. FreundBrooke SimonEmery D. BergerEunice Junhttp://arxiv.org/abs/2412.06754v2Probability and Angelic Nondeterminism with Multiset Semantics2025-04-17T18:38:14ZWe introduce a version of probabilistic Kleene algebra with angelic nondeterminism and a corresponding class of automata. Our approach implements semantics via distributions over multisets in order to overcome theoretical barriers arising from the lack of a distributive law between the powerset and Giry monads. We produce a full Kleene theorem and a coalgebraic theory, as well as both operational and denotational semantics and equational reasoning principles.2024-12-09T18:44:54ZShawn OngStephanie MaDexter Kozen10.1145/3729286http://arxiv.org/abs/2505.00718v1Productive Quantum Programming Needs Better Abstract Machines2025-04-17T17:00:06ZAn effective, accessible abstraction hierarchy has made using and programming computers possible for people across all disciplines. Establishing such a hierarchy for quantum programming is an outstanding challenge, especially due to a proliferation of different conventions and the rapid pace of innovation. One critical portion of the hierarchy is the abstract machine, the layer that separates a programmer's mental model of the hardware from its physical realization. Drawing on historical parallels in classical computing, we explain why having the "right" quantum abstract machine (QAM) is essential for making progress in the field and propose a novel framework for evaluating QAMs based on a set of desirable criteria. These criteria capture aspects of a QAM such as universality, compactness, expressiveness, and composability, which aid in the representation of quantum programs. By defining this framework we take steps toward defining an optimal QAM. We further apply our framework to survey the landscape of existing proposals, draw comparisons, and assess them based on our criteria. While these proposals share many common strengths, we find that each falls short of our ideal. Our framework and our findings set a direction for subsequent efforts to define a future QAM that is both straightforward to map to a variety of quantum computers, and provides a stable abstraction for quantum software development.2025-04-17T17:00:06Z11 pages, 2 figures, 1 tableSantiago Núñez-CorralesOlivia Di MatteoJohn DumbellMarcus EdwardsEdoardo GiustoScott PakinVlad Stirbuhttp://arxiv.org/abs/2409.11157v2The Incredible Shrinking Context... in a Decompiler Near You2025-04-17T14:34:36ZDecompilation of binary code has arisen as a highly-important application in the space of Ethereum VM (EVM) smart contracts. Major new decompilers appear nearly every year and attain popularity, for a multitude of reverse-engineering or tool-building purposes. Technically, the problem is fundamental: it consists of recovering high-level control flow from a highly-optimized continuation-passing-style (CPS) representation. Architecturally, decompilers can be built using either static analysis or symbolic execution techniques.
We present Shrknr, a static-analysis-based decompiler succeeding the state-of-the-art Elipmoc decompiler. Shrknr manages to achieve drastic improvements relative to the state of the art, in all significant dimensions: scalability, completeness, precision. Chief among the techniques employed is a new variant of static analysis context: shrinking context sensitivity. Shrinking context sensitivity performs deep cuts in the static analysis context, eagerly "forgetting" control-flow history, in order to leave room for further precise reasoning.
We compare Shrnkr to state-of-the-art decompilers, both static-analysis- and symbolic-execution-based. In a standard benchmark set, Shrnkr scales to over 99.5% of contracts (compared to ~95%), covers (i.e., reaches and manages to decompile) 67% more code, and reduces key imprecision metrics by over 65%.2024-09-17T13:10:38ZFull version of ISSTA 2025 paperSifis LagouvardosYannis BollanosNeville GrechYannis Smaragdakishttp://arxiv.org/abs/2504.08876v2Is Productivity in Quantum Programming Equivalent to Expressiveness?2025-04-17T14:17:26ZThe expressiveness of quantum programming languages plays a crucial role in the efficient and comprehensible representation of quantum algorithms. Unlike classical programming languages, which offer mature and well-defined abstraction mechanisms, quantum languages must integrate cognitively challenging concepts such as superposition, interference and entanglement while maintaining clarity and usability. However, identifying and characterizing differences in expressiveness between quantum programming paradigms remains an open area of study. Our work investigates the landscape of expressiveness through a comparative analysis of hosted quantum programming languages such as Qiskit, Cirq, Qrisp, and quAPL, and standalone languages including Q# and Qmod. We focused on evaluating how different quantum programming languages support the implementation of core quantum algorithms -- Deutsch-Jozsa, Simon, Bernstein-Vazirani, and Grover -- using expressiveness metrics: Lines of Code (LOC), Cyclomatic Complexity (CC), and Halstead Complexity (HC) metrics as proxies for developer productivity. Our findings suggest that different quantum programming paradigms offer distinct trade-offs between expressiveness and productivity, highlighting the importance of language design in quantum software development.2025-04-11T15:04:16Z11 pages, 6 figuresFrancini Corrales-GarroDanny Valerio-RamírezSantiago Núñez-Corraleshttp://arxiv.org/abs/2311.11851v6Crash-Stop Failures in Asynchronous Multiparty Session Types2025-04-17T10:33:00ZSession types provide a typing discipline for message-passing systems. However, their theory often assumes an ideal world: one in which everything is reliable and without failures. Yet this is in stark contrast with distributed systems in the real world. To address this limitation, we introduce a new asynchronous multiparty session types (MPST) theory with crash-stop failures, where processes may crash arbitrarily and cease to interact after crashing. We augment asynchronous MPST and processes with crash handling branches, and integrate crash-stop failure semantics into types and processes. Our approach requires no user-level syntax extensions for global types, and features a formalisation of global semantics, which captures complex behaviours induced by crashed/crash handling processes.
Our new theory covers the entire spectrum, ranging from the ideal world of total reliability to entirely unreliable scenarios where any process may crash, using optional reliability assumptions. Under these assumptions, we demonstrate the sound and complete correspondence between global and local type semantics, which guarantee deadlock-freedom, protocol conformance, and liveness of well-typed processes by construction, even in the presence of crashes.2023-11-20T15:38:04ZarXiv admin note: substantial text overlap with arXiv:2305.06238Logical Methods in Computer Science, Volume 21, Issue 2 (April 18, 2025) lmcs:12622Adam D. BarwellPing HouNobuko YoshidaFangyi Zhou10.46298/lmcs-21(2:5)2025http://arxiv.org/abs/2312.07263v2A Saturation-Based Unification Algorithm for Higher-Order Rational Patterns2025-04-16T20:50:24ZHigher-order unification has been shown to be undecidable. Miller discovered the pattern fragment and subsequently showed that higher-order pattern unification is decidable and has most general unifiers. We extend the algorithm to higher-order rational terms (a.k.a. regular Böhm trees, a form of cyclic $λ$-terms) and show that pattern unification on higher-order rational terms is decidable and has most general unifiers. We prove the soundness and completeness of the algorithm.2023-12-12T13:44:34ZZhibo ChenFrank Pfenninghttp://arxiv.org/abs/2504.12464v1Canonicity for Cost-Aware Logical Framework via Synthetic Tait Computability2025-04-16T19:56:15ZIn the original work on the cost-aware logical framework by Niu et al., a dependent variant of the call-by-push-value language for cost analysis, the authors conjectured that the canonicity property of the type theory can be succinctly proved via Sterling's synthetic Tait computability. This work resolves the conjecture affirmatively.2025-04-16T19:56:15ZRunming LiRobert Harperhttp://arxiv.org/abs/2311.05740v2Generating Pragmatic Examples to Train Neural Program Synthesizers2025-04-16T18:08:02ZProgramming-by-example is the task of synthesizing a program that is consistent with a set of user-provided input-output examples. As examples are often an under-specification of one's intent, a good synthesizer must choose the intended program from the many that are consistent with the given set of examples. Prior work frames program synthesis as a cooperative game between a listener (that synthesizes programs) and a speaker (a user choosing examples), and shows that models of computational pragmatic inference are effective in choosing the user intended programs. However, these models require counterfactual reasoning over a large set of programs and examples, which is infeasible in realistic program spaces. In this paper, we propose PraX, a novel way to amortize this search with neural networks. We sample pairs of programs and examples via self-play between listener and speaker models, and use pragmatic inference to choose informative training examples from this sample. We then use the informative dataset to train models to improve the synthesizer's ability to disambiguate user-provided examples without human supervision. We validate PraX on the challenging task of synthesizing regular expressions from example strings, and find that our method (1) outperforms models trained without choosing pragmatic examples by 23% (a 51% relative increase) (2) matches the performance of supervised learning on a dataset of pragmatic examples provided by humans, despite using no human data in training.2023-11-09T20:53:00ZICLR 2024Saujas VaduguruDaniel FriedYewen Puhttp://arxiv.org/abs/2503.16514v3VeriMind: Agentic LLM for Automated Verilog Generation with a Novel Evaluation Metric2025-04-16T14:58:48ZDesigning Verilog modules requires meticulous attention to correctness, efficiency, and adherence to design specifications. However, manually writing Verilog code remains a complex and time-consuming task that demands both expert knowledge and iterative refinement. Leveraging recent advancements in large language models (LLMs) and their structured text generation capabilities, we propose VeriMind, an agentic LLM framework for Verilog code generation that significantly automates and optimizes the synthesis process. Unlike traditional LLM-based code generators, VeriMind employs a structured reasoning approach: given a user-provided prompt describing design requirements, the system first formulates a detailed train of thought before the final Verilog code is generated. This multi-step methodology enhances interpretability, accuracy, and adaptability in hardware design. In addition, we introduce a novel evaluation metric-pass@ARC-which combines the conventional pass@k measure with Average Refinement Cycles (ARC) to capture both success rate and the efficiency of iterative refinement. Experimental results on diverse hardware design tasks demonstrated that our approach achieved up to $8.3\%$ improvement on pass@k metric and $8.1\%$ on pass@ARC metric. These findings underscore the transformative potential of agentic LLMs in automated hardware design, RTL development, and digital system synthesis.2025-03-15T23:43:06ZBardia NadimiGhali Omar BoutaibHao Zhenghttp://arxiv.org/abs/2504.12031v1Proof-Carrying Neuro-Symbolic Code2025-04-16T12:42:18ZThis invited paper introduces the concept of "proof-carrying neuro-symbolic code" and explains its meaning and value, from both the "neural" and the "symbolic" perspectives. The talk outlines the first successes and challenges that this new area of research faces.2025-04-16T12:42:18ZInvited paper at CiE 2025. arXiv admin note: text overlap with arXiv:2501.05867Ekaterina Komendantskayahttp://arxiv.org/abs/2502.15382v2Verified Parameterized Choreographies Technical Report2025-04-15T13:37:18ZThis technical report contains the full set of definitions and projection rules of the paper ``Verified Parameterized Choreographies'' by Rubbens et al. It also supplements the artefact.2025-02-21T11:06:24ZChanges 2025-04-15: - Updated DOI to artefactRobert RubbensPetra van den BosMarieke Huisman