https://arxiv.org/api/MyE0Tal32yPbjJca6slS4ymOK/82026-06-28T17:52:52Z9951163515http://arxiv.org/abs/2504.10323v2Rel: A Programming Language for Relational Data2025-04-24T14:33:37ZFrom the moment of their inception, languages for relational data have been described as sublanguages embedded in a host programming language. Rel is a new relational language whose key design goal is to go beyond this paradigm with features that allow for programming in the large, making it possible to fully describe end to end application semantics. With the new approach we can model the semantics of entire enterprise applications relationally, which helps significantly reduce architecture complexity and avoid the well-known impedance mismatch problem. This paradigm shift is enabled by 50 years of database research, making it possible to revisit the sublanguage/host language paradigm, starting from the fundamental principles. We present the main features of Rel: those that give it the power to express traditional query language operations and those that are designed to grow the language and allow programming in the large.2025-04-14T15:32:47ZMolham ArefPaolo GuagliardoGeorge KastrinisLeonid LibkinVictor MarsaultWim MartensMary McGrathFilip MurlakNathaniel NystromLiat PeterfreundAllison RogersCristina SirangeloDomagoj VrgocDavid ZhaoAbdul Zreika10.1145/3722212.3724450http://arxiv.org/abs/2504.17336v1Operational Semantics for Crystality: A Smart Contract Language for Parallel EVMs2025-04-24T07:53:30ZThe increasing demand for scalable blockchain has driven research into parallel execution models for smart contracts. Crystality is a novel smart contract programming language designed for parallel Ethereum Virtual Machines (EVMs), enabling fine-grained concurrency through Programmable Contract Scopes and Asynchronous Functional Relay. This paper presents the first formal structural operational semantics for Crystality, providing a rigorous framework to reason about its execution. We mechanize the syntax and semantics of Crystality in the theorem-proving assistant Coq, enabling formal verification of correctness properties. As a case study, we verify a simplified token transfer function, demonstrating the applicability of our semantics in ensuring smart contract correctness. Our work lays the foundation for formally verified parallel smart contracts, contributing to the security and scalability of blockchain systems.2025-04-24T07:53:30ZZiyun XuHao WangMeng Sunhttp://arxiv.org/abs/2412.02107v2Efficient, Portable, Census-Polymorphic Choreographic Programming2025-04-23T18:10:19ZChoreographic programming (CP) is a paradigm for implementing distributed systems that uses a single global program to define the actions and interactions of all participants. Library-level CP implementations, like HasChor, integrate well with mainstream programming languages but have several limitations: Their conditionals require extra communication; they require specific host-language features (e.g., monads); and they lack support for programming patterns that are essential for implementing realistic distributed applications.
We make three contributions to library-level CP to specifically address these challenges. First, we propose and formalize conclaves and multiply-located values, which enable efficient conditionals in library-level CP without redundant communication. Second, we propose end-point projection as dependency injection, a design pattern that enables library-level CP in host languages without support for monads. Third, we propose census polymorphism, a technique for abstracting over the number of participants in a choreography. We demonstrate these contributions via implementations in Haskell, Rust, and TypeScript.2024-12-03T03:00:52ZPresenting at PLDI25Mako BatesShun KashiwaSyed JafriGan ShenLindsey KuperJoseph P. Near10.1145/3729296http://arxiv.org/abs/2504.16775v1IsaBIL: A Framework for Verifying (In)correctness of Binaries in Isabelle/HOL (Extended Version)2025-04-23T14:47:38ZThis paper presents IsaBIL, a binary analysis framework in Isabelle/HOL that is based on the widely used Binary Analysis Platform (BAP). Specifically, in IsaBIL, we formalise BAP's intermediate language, called BIL and integrate it with Hoare logic (to enable proofs of correctness) as well as incorrectness logic (to enable proofs of incorrectness). IsaBIL inherits the full flexibility of BAP, allowing us to verify binaries for a wide range of languages (C, C++, Rust), toolchains (LLVM, Ghidra) and target architectures (x86, RISC-V), and can also be used when the source code for a binary is unavailable.
To make verification tractable, we develop a number of big-step rules that combine BIL's existing small-step rules at different levels of abstraction to support reuse. We develop high-level reasoning rules for RISC-V instructions (our main target architecture) to further optimise verification. Additionally, we develop Isabelle proof tactics that exploit common patterns in C binaries for RISC-V to discharge large numbers of proof goals (often in the 100s) automatically. IsaBIL includes an Isabelle/ML based parser for BIL programs, allowing one to automatically generate the associated Isabelle/HOL program locale from a BAP output. Taken together, IsaBIL provides a highly flexible proof environment for program binaries. As examples, we prove correctness of key examples from the Joint Strike Fighter coding standards and the MITRE database.2025-04-23T14:47:38ZMatt GriffinBrijesh DongolAzalea Raadhttp://arxiv.org/abs/1907.01257v5A robust graph-based approach to observational equivalence2025-04-23T13:57:46ZWe propose a new step-wise approach to proving observational equivalence, and in particular reasoning about fragility of observational equivalence. Our approach is based on what we call local reasoning. The local reasoning exploits the graphical concept of neighbourhood, and it extracts a new, formal, concept of robustness as a key sufficient condition of observational equivalence. Moreover, our proof methodology is capable of proving a generalised notion of observational equivalence. The generalised notion can be quantified over syntactically restricted contexts instead of all contexts, and also quantitatively constrained in terms of the number of reduction steps. The operational machinery we use is given by a hypergraph-rewriting abstract machine inspired by Girard's Geometry of Interaction. The behaviour of language features, including function abstraction and application, is provided by hypergraph-rewriting rules. We demonstrate our proof methodology using the call-by-value lambda-calculus equipped with (higher-order) state.2019-07-02T09:34:59ZLogical Methods in Computer Science, Volume 21, Issue 2 (April 24, 2025) lmcs:8524Dan R. GhicaKoko MuroyaTodd Waugh Ambridge10.46298/lmcs-21(2:8)2025http://arxiv.org/abs/2504.16579v1Optimization Framework for Reducing Mid-circuit Measurements and Resets2025-04-23T10:01:00ZThe paper addresses the optimization of dynamic circuits in quantum computing, with a focus on reducing the cost of mid-circuit measurements and resets. We extend the probabilistic circuit model (PCM) and implement an optimization framework that targets both mid-circuit measurements and resets. To overcome the limitation of the prior PCM-based pass, where optimizations are only possible on pure single-qubit states, we incorporate circuit synthesis to enable optimizations on multi-qubit states. With a parameter $n_{pcm}$, our framework balances optimization level against resource usage.We evaluate our framework using a large dataset of randomly generated dynamic circuits. Experimental results demonstrate that our method is highly effective in reducing mid-circuit measurements and resets. In our demonstrative example, when applying our optimization framework to the Bernstein-Vazirani algorithm after employing qubit reuse, we significantly reduce its runtime overhead by removing all of the resets.2025-04-23T10:01:00ZAccepted by The International Conference on Computational Science 2025Yanbin ChenInnocenzo FulginitiChristian B. Mendl10.1007/978-3-031-97570-7_13http://arxiv.org/abs/2504.12729v2Dead Gate Elimination2025-04-23T09:45:20ZHybrid quantum algorithms combine the strengths of quantum and classical computing. Many quantum algorithms, such as the variational quantum eigensolver (VQE), leverage this synergy. However, quantum circuits are executed in full, even when only subsets of measurement outcomes contribute to subsequent classical computations. In this manuscript, we propose a novel circuit optimization technique that identifies and removes dead gates. We prove that the removal of dead gates has no influence on the probability distribution of the measurement outcomes that contribute to the subsequent calculation result. We implemented and evaluated our optimization on a VQE instance, a quantum phase estimation (QPE) instance, and hybrid programs embedded with random circuits of varying circuit width, confirming its capability to remove a non-trivial number of dead gates in real-world algorithms. The effect of our optimization scales up as more measurement outcomes are identified as non-contributory, resulting in a proportionally greater reduction of dead gates.2025-04-17T08:10:59ZAccepted by 25th International Conference on Computational Science, 2025Yanbin ChenChristian B. MendlHelmut Seidl10.1007/978-3-031-97632-2_10http://arxiv.org/abs/2407.01204v2A Language for Smart Contracts with Secure Control Flow (Technical Report)2025-04-22T22:04:29ZSmart contracts are frequently vulnerable to control-flow attacks based on confused deputies, reentrancy, and incorrect error handling. These attacks exploit the complexity of interactions among multiple possibly unknown contracts. Existing best practices to prevent vulnerabilities rely on code patterns and heuristics that produce both false positives and false negatives. Even with extensive audits and heuristic tools, new vulnerabilities continue to arise, routinely costing tens of millions of dollars.
We introduce SCIF, a language for secure smart contracts, that addresses these classes of control-flow attacks. By extending secure information flow mechanisms in a principled way, SCIF enforces both classic end-to-end information flow security and new security restrictions on control flow, even when SCIF contracts interact with malicious non-SCIF code. SCIF is implemented as a compiler to Solidity. We show how SCIF can secure contracts with minimal overhead through case studies of applications with intricate security reasoning and a large corpus of insecure code.2024-07-01T11:51:21ZSiqiu YaoHaobin NiStephanie MaNoah SchiffAndrew C. MyersEthan Cecchettihttp://arxiv.org/abs/2504.16027v1Benchmarking LLM for Code Smells Detection: OpenAI GPT-4.0 vs DeepSeek-V32025-04-22T16:44:39ZDetermining the most effective Large Language Model for code smell detection presents a complex challenge. This study introduces a structured methodology and evaluation matrix to tackle this issue, leveraging a curated dataset of code samples consistently annotated with known smells. The dataset spans four prominent programming languages Java, Python, JavaScript, and C++; allowing for cross language comparison. We benchmark two state of the art LLMs, OpenAI GPT 4.0 and DeepSeek-V3, using precision, recall, and F1 score as evaluation metrics. Our analysis covers three levels of detail: overall performance, category level performance, and individual code smell type performance. Additionally, we explore cost effectiveness by comparing the token based detection approach of GPT 4.0 with the pattern-matching techniques employed by DeepSeek V3. The study also includes a cost analysis relative to traditional static analysis tools such as SonarQube. The findings offer valuable guidance for practitioners in selecting an efficient, cost effective solution for automated code smell detection2025-04-22T16:44:39ZAhmed R. SadikSiddhata Govindhttp://arxiv.org/abs/2504.15936v1An effectful object calculus2025-04-22T14:24:59ZWe show how to smoothly incorporate in the object-oriented paradigm constructs to raise, compose, and handle effects in an arbitrary monad. The underlying pure calculus is meant to be a representative of the last generation of OO languages, and the effectful extension is manageable enough for ordinary programmers; notably, constructs to raise effects are just special methods. We equip the calculus with an expressive type-and-effect system, which, again by relying on standard features such as inheritance and generic types, allows a simple form of effect polymorphism. The soundness of the type-and-effect system is expressed and proved by a recently introduced technique, where the semantics is formalized by a one-step reduction relation from language expressions into monadic ones, so that it is enough to prove progress and subject reduction properties on this relation.2025-04-22T14:24:59ZFrancesco DagninoPaola GianniniElena Zuccahttp://arxiv.org/abs/2504.07483v2Program Skeletons for Automated Program Translation2025-04-22T12:49:11ZTranslating software between programming languages is a challenging task, for which automated techniques have been elusive and hard to scale up to larger programs. A key difficulty in cross-language translation is that one has to re-express the intended behavior of the source program into idiomatic constructs of a different target language. This task needs abstracting away from the source language-specific details, while keeping the overall functionality the same. In this work, we propose a novel and systematic approach for making such translation amenable to automation based on a framework we call program skeletons. A program skeleton retains the high-level structure of the source program by abstracting away and effectively summarizing lower-level concrete code fragments, which can be mechanically translated to the target programming language. A skeleton, by design, permits many different ways of filling in the concrete implementation for fragments, which can work in conjunction with existing data-driven code synthesizers. Most importantly, skeletons can conceptually enable sound decomposition, i.e., if each individual fragment is correctly translated, taken together with the mechanically translated skeleton, the final translated program is deemed to be correct as a whole. We present a prototype system called Skel embodying the idea of skeleton-based translation from Python to JavaScript. Our results show promising scalability compared to prior works. For 9 real-world Python programs, some with more than about 1k lines of code, 95% of their code fragments can be automatically translated, while about 5% require manual effort. All the final translations are correct with respect to whole-program test suites.2025-04-10T06:25:17ZAccepted by PLDI 2025 (46th ACM SIGPLAN Conference on Programming Language Design and Implementation)Bo WangTianyu LiRuishi LiUmang MathurPrateek Saxena10.1145/3729287http://arxiv.org/abs/2501.07472v2LitmusKt: Concurrency Stress Testing for Kotlin2025-04-22T09:57:15ZWe present LitmusKt - the first tool for litmus testing concurrent programs in Kotlin. The tool's novelty also lies in the fact that Kotlin is a multiplatform language, i.e., it compiles into multiple platforms, which means that the concurrency has to be tested on several of them. Our tool allows writing litmus tests in a single custom DSL, and these tests are then run in Kotlin/Native and Kotlin/JVM, two main platforms for concurrent programming in Kotlin. Using LitmusKt, we discovered novel bugs in the Kotlin compiler, which we then fixed and they are no longer present. Moreover, LitmusKt was integrated into the CI pipeline for Kotlin. LitmusKt is available on GitHub: https://github.com/JetBrains-Research/litmuskt. The demo is available on YouTube: https://youtu.be/oWCZp_Huwss.2025-01-13T16:40:34ZAccepted to FSE'25 Demonstrations, 5 pages, 3 figuresDenis LochmelisEvgenii MoiseenkoYaroslav GolubevAnton Podkopaev10.1145/3696630.3728584http://arxiv.org/abs/2504.15637v1DR.FIX: Automatically Fixing Data Races at Industry Scale2025-04-22T06:56:15ZData races are a prevalent class of concurrency bugs in shared-memory parallel programs, posing significant challenges to software reliability and reproducibility. While there is an extensive body of research on detecting data races and a wealth of practical detection tools across various programming languages, considerably less effort has been directed toward automatically fixing data races at an industrial scale. In large codebases, data races are continuously introduced and exhibit myriad patterns, making automated fixing particularly challenging.
In this paper, we tackle the problem of automatically fixing data races at an industrial scale. We present Dr.Fix, a tool that combines large language models (LLMs) with program analysis to generate fixes for data races in real-world settings, effectively addressing a broad spectrum of racy patterns in complex code contexts. Implemented for Go--the programming language widely used in modern microservice architectures where concurrency is pervasive and data races are common--Dr.Fix seamlessly integrates into existing development workflows. We detail the design of Dr.Fix and examine how individual design choices influence the quality of the fixes produced. Over the past 18 months, Dr.Fix has been integrated into developer workflows at Uber demonstrating its practical utility. During this period, Dr.Fix produced patches for 224 (55%) from a corpus of 404 data races spanning various categories; 193 of these patches (86%) were accepted by more than a hundred developers via code reviews and integrated into the codebase.2025-04-22T06:56:15ZTo appear in PLDI 2025Farnaz BehrangZhizhou ZhangGeorgian-Vlad SaiocPeng LiuMilind Chabbihttp://arxiv.org/abs/2504.15550v1Smooth, Integrated Proofs of Cryptographic Constant Time for Nondeterministic Programs and Compilers2025-04-22T03:12:31ZFormal verification of software and compilers has been used to rule out large classes of security-critical issues, but risk of unintentional information leakage has received much less consideration. It is a key requirement for formal specifications to leave some details of a system's behavior unspecified so that future implementation changes can be accommodated, and yet it is nonetheless expected that these choices would not be made based on confidential information the system handles. This paper formalizes that notion using omnisemantics and plain single-copy assertions, giving for the first time a specification of what it means for a nondeterministic program to be constant-time or more generally to avoid leaking (a part of) its inputs. We use this theory to prove data-leak-free execution of core cryptographic routines compiled from Bedrock2 C to RISC-V machine code, showing that the smooth specification and proof experience omnisemantics provides for nondeterminism extends to constant-time properties in the same setting. We also study variants of the key program-compiler contract, highlighting pitfalls of tempting simplifications and subtle consequences of how inputs to nondeterministic choices are constrained. Our results are backed by modular program-logic and compiler-correctness theorems, and they integrate into a neat end-to-end theorem in the Coq proof assistant.2025-04-22T03:12:31Z34 pages, 1 table, 0 figures. to be published in PLDI 2025 proceedingsOwen ConolyAndres ErbsenAdam Chlipalahttp://arxiv.org/abs/2503.16588v3A Unified Framework for Quantitative Cache Analysis2025-04-21T15:36:53ZIn this work we unify two existing lines of work towards cache analysis for non-LRU policies. To this end, we extend the notion of competitiveness to block competitiveness and systematically analyze the competitiveness and block competitiveness of FIFO and MRU relative to LRU for arbitrary associativities. We show how competitiveness and block competitiveness can be exploited in state-of-the-art WCET analysis based on the results of existing persistence analyses for LRU. Unlike prior work, our approach is applicable to microarchitectures that exhibit timing anomalies. We experimentally evaluate the precision and cost of our approach on benchmarks from TACLeBench. The experiments demonstrate that quantitative cache analysis for FIFO and MRU comes close to the precision of LRU.2025-03-20T17:37:15ZExtended version of RTAS 2025 paperSophie KahlenJan Reineke