https://arxiv.org/api/7AwtTWOmeNlU6igBugeeoIbtoJI 2026-06-30T11:46:40Z 9958 1875 15 http://arxiv.org/abs/2501.15002v1 A Proof-Producing Compiler for Blockchain Applications 2025-01-25T00:31:47Z

CairoZero is a programming language for running decentralized applications (dApps) at scale. Programs written in the CairoZero language are compiled to machine code for the Cairo CPU architecture and cryptographic protocols are used to verify the results of execution efficiently on blockchain. We explain how we have extended the CairoZero compiler with tooling that enables users to prove, in the Lean 3 proof assistant, that compiled code satisfies high-level functional specifications. We demonstrate the success of our approach by verifying primitives for computation with the secp256k1 and secp256r1 curves over a large finite field as well as the validation of cryptographic signatures using the former. We also verify a mechanism for simulating a read-write dictionary data structure in a read-only setting. Finally, we reflect on our methodology and discuss some of the benefits of our approach.

2025-01-25T00:31:47Z Jeremy Avigad Lior Goldberg David Levit Yoav Seginer Alon Titelman http://arxiv.org/abs/2501.14702v1 Multiparty Session Types with a Bang! 2025-01-24T18:24:20Z

Replication is an alternative construct to recursion for describing infinite behaviours in the pi-calculus. In this paper we explore the implications of including type-level replication in Multiparty Session Types (MPST), a behavioural type theory for message-passing programs. We introduce MPST!, a session-typed multiparty process calculus with replication and first-class roles. We show that replication is not an equivalent alternative to recursion in MPST, and that using both replication and recursion in one type system in fact allows us to express both context-free protocols and protocols that support mutual exclusion and races. We demonstrate the expressiveness of MPST! on examples including binary tree serialisation, dining philosophers, and a model of an auction, and explore the implications of replication on the decidability of typechecking.

2025-01-24T18:24:20Z Matthew Alan Le Brun Simon Fowler Ornela Dardha http://arxiv.org/abs/2403.10144v3 NLP Verification: Towards a General Methodology for Certifying Robustness 2025-01-24T15:43:41Z

Machine Learning (ML) has exhibited substantial success in the field of Natural Language Processing (NLP). For example large language models have empirically proven to be capable of producing text of high complexity and cohesion. However, they are prone to inaccuracies and hallucinations. As these systems are increasingly integrated into real-world applications, ensuring their safety and reliability becomes a primary concern. There are safety critical contexts where such models must be robust to variability or attack, and give guarantees over their output. Computer Vision had pioneered the use of formal verification of neural networks for such scenarios and developed common verification standards and pipelines, leveraging precise formal reasoning about geometric properties of data manifolds. In contrast, NLP verification methods have only recently appeared in the literature. While presenting sophisticated algorithms, these papers have not yet crystallised into a common methodology. They are often light on the pragmatical issues of NLP verification and the area remains fragmented. In this paper, we attempt to distil and evaluate general components of an NLP verification pipeline, that emerges from the progress in the field to date. Our contributions are two-fold. Firstly, we propose a general methodology to analyse the effect of the embedding gap, a problem that refers to the discrepancy between verification of geometric subspaces and the semantic meaning of sentences, which the geometric subspaces are supposed to represent. We propose a number of practical NLP methods that can help to quantify the effects of the embedding gap. Secondly, we give a general method for training and verification of neural networks that leverages a more precise geometric estimation of semantic similarity of sentences in the embedding space and helps to overcome the effects of the embedding gap in practice.

2024-03-15T09:43:52Z Marco Casadio Tanvi Dinkar Ekaterina Komendantskaya Luca Arnaboldi Matthew L. Daggitt Omri Isac Guy Katz Verena Rieser Oliver Lemon 10.1017/S0956792525000099 http://arxiv.org/abs/2501.14555v1 Exploring Answer Set Programming for Provenance Graph-Based Cyber Threat Detection: A Novel Approach 2025-01-24T14:57:27Z

Provenance graphs are useful and powerful tools for representing system-level activities in cybersecurity; however, existing approaches often struggle with complex queries and flexible reasoning. This paper presents a novel approach using Answer Set Programming (ASP) to model and analyze provenance graphs. We introduce an ASP-based representation that captures intricate relationships between system entities, including temporal and causal dependencies. Our model enables sophisticated analysis capabilities such as attack path tracing, data exfiltration detection, and anomaly identification. The declarative nature of ASP allows for concise expression of complex security patterns and policies, facilitating both real-time threat detection and forensic analysis. We demonstrate our approach's effectiveness through case studies showcasing its threat detection capabilities. Experimental results illustrate the model's ability to handle large-scale provenance graphs while providing expressive querying. The model's extensibility allows for incorporation of new system behaviors and security rules, adapting to evolving cyber threats. This work contributes a powerful, flexible, and explainable framework for reasoning about system behaviors and security incidents, advancing the development of effective threat detection and forensic investigation tools.

2025-01-24T14:57:27Z The 27th International Symposium on Practical Aspects of Declarative Languages (PADL 2025) Fang Li Fei Zuo Gopal Gupta http://arxiv.org/abs/2501.14438v1 Data-efficient Performance Modeling via Pre-training 2025-01-24T12:14:53Z

Performance models are essential for automatic code optimization, enabling compilers to predict the effects of code transformations on performance and guide search for optimal transformations. Building state-of-the-art performance models with deep learning, however, requires vast labeled datasets of random programs -- an expensive and time-consuming process, stretching over months. This paper introduces a self-supervised pre-training scheme with autoencoders to reduce the need for labeled data. By pre-training on a large dataset of random programs, the autoencoder learns representations of code and transformations, which are then used to embed programs for the performance model. Implemented in the Tiramisu autoscheduler, our approach improves model accuracy with less data. For example, to achieve a MAPE of 20.72%, the original model requires 18 million data points, whereas our method achieves a similar MAPE of 22.44% with only 3.6 million data points, reducing data requirements by 5x.

2025-01-24T12:14:53Z Chunting Liu Riyadh Baghdadi http://arxiv.org/abs/2410.18212v2 CUTECat: Concolic Execution for Computational Law 2025-01-23T10:10:41Z

Many legal computations, including the amount of tax owed by a citizen, whether they are eligible to social benefits, or the wages due to civil state servants, are specified by computational laws. Their application, however, is performed by expert computer programs intended to faithfully transcribe the law into computer code. Bugs in these programs can lead to dramatic societal impact, e.g., paying employees incorrect amounts, or not awarding benefits to families in need. To address this issue, we consider concolic unit testing, a combination of concrete execution with SMT-based symbolic execution, and propose CUTECat, a concolic execution tool targeting implementations of computational laws. Such laws typically follow a pattern where a base case is later refined by many exceptions in following law articles, a pattern that can be formally modeled using default logic. We show how to handle default logic inside a concolic execution tool, and implement our approach in the context of Catala, a recent domain-specific language tailored to implement computational laws. We evaluate CUTECat on several programs, including the Catala implementation of the French housing benefits and Section 132 of the US tax code. We show that CUTECat can successfully generate hundreds of thousands of testcases covering all branches of these bodies of law. Through several heuristics, we improve CUTECat's scalability and usability, making the testcases understandable by lawyers and programmers alike. We believe CUTECat thus paves the way for the use of formal methods during legislative processes.

2024-10-23T18:30:23Z Pierre Goutagny Aymeric Fromherz Raphaël Monat http://arxiv.org/abs/2105.12077v2 A beginner guide to Iris, Coq and separation logic 2025-01-23T04:59:44Z

Creating safe concurrent algorithms is challenging and error-prone. For this reason, a formal verification framework is necessary especially when those concurrent algorithms are used in safety-critical systems. The goal of this guide is to provide resources for beginners to get started in their journey of formal verification using the powerful tool Iris. The difference between this guide and many others is that it provides (i) an in-depth explanation of examples and tactics, (ii) an explicit discussion of separation logic, and (iii) a thorough coverage of Iris and Coq. References to other guides and to papers are included throughout to provide readers with resources through which to continue their learning.

2021-05-25T16:57:25Z Elizabeth Dietrich http://arxiv.org/abs/2501.13262v1 ASDF: A Compiler for Qwerty, a Basis-Oriented Quantum Programming Language 2025-01-22T22:54:13Z

Qwerty is a high-level quantum programming language built on bases and functions rather than circuits. This new paradigm introduces new challenges in compilation, namely synthesizing circuits from basis translations and automatically specializing adjoint or predicated forms of functions. This paper presents ASDF, an open-source compiler for Qwerty that answers these challenges in compiling basis-oriented languages. Enabled with a novel high-level quantum IR implemented in the MLIR framework, our compiler produces OpenQASM 3 or QIR for either simulation or execution on hardware. Our compiler is evaluated by comparing the fault-tolerant resource requirements of generated circuits with other compilers, finding that ASDF produces circuits with comparable cost to prior circuit-oriented compilers.

2025-01-22T22:54:13Z 20 pages, 14 figures. To appear in CGO '25 Austin J. Adams Sharjeel Khan Arjun S. Bhamra Ryan R. Abusaada Anthony M. Cabrera Cameron C. Hoechst Travis S. Humble Jeffrey S. Young Thomas M. Conte http://arxiv.org/abs/2501.13194v1 Corecursive Coding of High Computational Derivatives and Power Series 2025-01-22T19:55:43Z

We discuss the functional lazy techniques in generation and handling of arbitrarily long sequences of derivatives of numerical expressions in one ``variable''; the domain to which the paper belongs is usually nicknamed ``Automatic differentiation''. Two models thereof are considered, the chains of ``pure'' derivatives, and the infinite power series, similar, but algorithmically a bit different. We deal with their arithmetic/algebra, and with more convoluted procedures, such as composition and reversion. Some more specific applications of these structures are also presented.

2025-01-22T19:55:43Z 22 pages, 2 figures. Category: cs.PL Jerzy Karczmarczuk http://arxiv.org/abs/2501.13002v1 Constructive characterisations of the must-preorder for asynchrony 2025-01-22T16:42:03Z

De Nicola and Hennessy's must-preorder is a contextual refinement which states that a server q refines a server p if all clients satisfied by p are also satisfied by q. Owing to the universal quantification over clients, this definition does not yield a practical proof method for the must-preorder, and alternative characterisations are necessary to reason over it. Finding these characterisations for asynchronous semantics, i.e. where outputs are non-blocking, has thus far proven to be a challenge, usually tackled via ad-hoc definitions. We show that the standard characterisations of the must-preorder carry over as they stand to asynchronous communication, if servers are enhanced to act as forwarders, i.e. they can input any message as long as they store it back into the shared buffer. Our development is constructive, is completely mechanised in Coq, and is independent of any calculus: our results pertain to Selinger output-buffered agents with feedback. This is a class of Labelled Transition Systems that captures programs that communicate via a shared unordered buffer, as in asynchronous CCS or the asynchronous pi-calculus. We show that the standard coinductive characterisation lets us prove in Coq that concrete programs are related by the must-preorder. Finally, our proofs show that Brouwer's bar induction principle is a useful technique to reason on liveness preserving program transformations.

2025-01-22T16:42:03Z Giovanni Bernardi Ilaria Castellani Paul Laforgue Léo Stefanesco http://arxiv.org/abs/2502.00028v1 VRank: Enhancing Verilog Code Generation from Large Language Models via Self-Consistency 2025-01-22T08:36:29Z

Large Language Models (LLMs) have demonstrated promising capabilities in generating Verilog code from module specifications. To improve the quality of such generated Verilog codes, previous methods require either time-consuming manual inspection or generation of multiple Verilog codes, from which the one with the highest quality is selected with manually designed testbenches. To enhance the generation efficiency while maintaining the quality of the generated codes, we propose VRank, an automatic framework that generates Verilog codes with LLMs. In our framework, multiple code candidates are generated with LLMs by leveraging their probabilistic nature. Afterwards, we group Verilog code candidates into clusters based on identical outputs when tested against the same testbench, which is also generated by LLMs. Clusters are ranked based on the consistency they show on testbench. To determine the best candidate, Chain-of-Thought is further applied to select the best candidate from the top-ranked clusters. By systematically analyzing diverse outputs of generated codes, VRank reduces errors and enhances the overall quality of the generated Verilog code. Experimental results on the VerilogEval-Human benchmark demonstrate a significant 10.5% average increase in functional correctness (passl1) across multiple LLMs, demonstrating VRank's effectiveness in improving the accuracy of automated hardware description language generation for complex design tasks.

2025-01-22T08:36:29Z accepted by ISQED2025 Zhuorui Zhao Ruidi Qiu Ing-Chao Lin Grace Li Zhang Bing Li Ulf Schlichtmann http://arxiv.org/abs/2501.12702v1 Paradigm-Based Automatic HDL Code Generation Using LLMs 2025-01-22T08:18:37Z

While large language models (LLMs) have demonstrated the ability to generate hardware description language (HDL) code for digital circuits, they still face the hallucination problem, which can result in the generation of incorrect HDL code or misinterpretation of specifications. In this work, we introduce a human-expert-inspired method to mitigate the hallucination of LLMs and enhance their performance in HDL code generation. We begin by constructing specialized paradigm blocks that consist of several steps designed to divide and conquer generation tasks, mirroring the design methodology of human experts. These steps include information extraction, human-like design flows, and the integration of external tools. LLMs are then instructed to classify the type of circuit in order to match it with the appropriate paradigm block and execute the block to generate the HDL codes. Additionally, we propose a two-phase workflow for multi-round generation, aimed at effectively improving the testbench pass rate of the generated HDL codes within a limited number of generation and verification rounds. Experimental results demonstrate that our method significantly enhances the functional correctness of the generated Verilog code

2025-01-22T08:18:37Z accepted by ISQED2025. arXiv admin note: text overlap with arXiv:2407.18326 Wenhao Sun Bing Li Grace Li Zhang Xunzhao Yin Cheng Zhuo Ulf Schlichtmann http://arxiv.org/abs/2501.12313v1 Correctness Witnesses with Function Contracts 2025-01-21T17:27:59Z

Software verification witnesses are a common exchange format for software verification tools. They were developed to provide arguments supporting the verification result, allowing other tools to reproduce the verification results. Correctness witnesses in the current format (version 2.0) allow only for the encoding of loop and location invariants using C expressions. This limits the correctness arguments that verifiers can express in the witness format. One particular limitation is the inability to express function contracts, which consist of a pre-condition and a post-condition for a function. We propose an extension to the existing witness format 2.0 to allow for the specification of function contracts. Our extension includes support for several features inspired by ACSL (\result, \old, \at). This allows for the export of more information from tools and for the exchange of information with tools that require function contracts.

2025-01-21T17:27:59Z 9 pages, 3 figures, 1 table Matthias Heizmann Dominik Klumpp Marian Lingsch-Rosenfeld Frank Schüssele http://arxiv.org/abs/2501.11501v1 FLAT: Formal Languages as Types 2025-01-20T14:16:24Z

Programmers regularly use strings to encode many types of data, such as Unix file paths, URLs, and email addresses, that are conceptually different. However, existing mainstream programming languages use a unified string type to represent them all. As a result, their type systems will keep quiet when a function requiring an email address is instead fed an HTML text, which may cause unexceptional failures or vulnerabilities. To let the type system distinguish such conceptually different string types, in this paper, we propose to regard \emph{formal languages as types} (FLAT), thereby restricting the set of valid strings by context-free grammars and semantic constraints if needed. To this end, email addresses and HTML text are treated as different types. We realize this idea in Python as a testing framework FLAT-PY. It contains user annotations, all directly attached to the user's code, to (1) define such \emph{language types}, (2) specify pre-/post-conditions serving as \emph{semantic oracles} or contracts for functions, and (3) fuzz functions via random string inputs generated from a \emph{language-based fuzzer}. From these annotations, FLAY-PY \emph{automatically} checks type correctness at runtime via \emph{code instrumentation}, and reports any detected type error as soon as possible, preventing bugs from flowing deeply into other parts of the code. Case studies on real Python code fragments show that FLAT-PY is enable to catch logical bugs from random inputs, requiring a reasonable amount of user annotations.

2025-01-20T14:16:24Z Fengmin Zhu Andreas Zeller http://arxiv.org/abs/2404.04132v2 Accurate and Extensible Symbolic Execution of Binary Code based on Formal ISA Semantics 2025-01-20T11:56:45Z

Symbolic execution is an SMT-based software verification and testing technique. Symbolic execution requires tracking performed computations during software simulation to reason about branches in the software under test. The prevailing approach on symbolic execution of binary code tracks computations by transforming the code to be tested to an architecture-independent IR and then symbolically executes this IR. However, the resulting IR must be semantically equivalent to the binary code, making this process complex and error-prone. The semantics of the binary code are specified by the targeted ISA, commonly given in natural language and requiring a manual implementation of the transformation to an IR. In recent years, the use of formal languages to describe ISA semantics in a machine-readable way has gained increased popularity. We investigate the utilization of such formal semantics for symbolic execution of binary code, achieving an accurate representation of instruction semantics. We present a prototype for the RISC-V ISA and conduct a case study to demonstrate that it can be easily extended to additional instructions. Furthermore, we perform an experimental comparison with prior work which resulted in the discovery of five previously unknown bugs in the ISA implementation of the popular IR-based symbolic executor angr.

2024-04-05T14:29:39Z To be published in the proceedings of the 2025 Design, Automation and Test in Europe Conference (DATE'25) Sören Tempel Tobias Brandt Christoph Lüth Christian Dietrich Rolf Drechsler 10.23919/DATE64628.2025.10993257