Thread and Memory-Safe Programming with CLASS

2025-05-27T08:02:01Z

CLASS is a proof-of-concept general purpose linear programming language, flexibly supporting realistic concurrent programming idioms, and featuring an expressive linear type system ensuring that programs (1) never misuse or leak stateful resources or memory, (2) never deadlock, and (3) always terminate. The design of CLASS and the strong static guarantees of its type system originates in its Linear Logic and proposition-as-types foundations. However, instead of focusing on its theoretical foundations, this paper briefly illustrates, in a tutorial form, an identifiable CLASS session-based programming style where strong correctness properties are automatically ensured by type-checking. Our more challenging examples include concurrent thread and memory-safe mutable ADTs, lazy stream programming, and manipulation of linear digital assets as used in smart contracts.

Choreographies as Macros

2025-05-27T08:01:26Z

Concurrent programming often entails meticulous pairing of sends and receives between participants to avoid deadlock. Choreographic programming alleviates this burden by specifying the system as a single program. However, there are more applications than implementations of choreographies, and developing new implementations takes a lot of time and effort. Our work uses Racket to expedite building a new choreographic language called Choret. Racket has a powerful macro system which allows Choret to reuse much of its infrastructure for greater functionality and correctness.

GPUMC: A Stateless Model Checker for GPU Weak Memory Concurrency

2025-05-26T16:47:44Z

GPU computing is embracing weak memory concurrency for performance improvement. However, compared to CPUs, modern GPUs provide more fine-grained concurrency features such as scopes, have additional properties like divergence, and thereby follow different weak memory consistency models. These features and properties make concurrent programming on GPUs more complex and error-prone. To this end, we present GPUMC, a stateless model checker to check the correctness of GPU shared-memory concurrent programs under scoped-RC11 weak memory concurrency model. GPUMC explores all possible executions in GPU programs to reveal various errors - races, barrier divergence, and assertion violations. In addition, GPUMC also automatically repairs these errors in the appropriate cases. We evaluate GPUMC with benchmarks and real-life GPU programs. GPUMC is efficient both in time and memory in verifying large GPU programs where state-of-the-art tools are timed out. In addition, GPUMC identifies all known errors in these benchmarks compared to the state-of-the-art tools.

The CodeInverter Suite: Control-Flow and Data-Mapping Augmented Binary Decompilation with LLMs

2025-05-26T15:58:11Z

Binary decompilation plays a vital role in various cybersecurity and software engineering tasks. Recently, end-to-end decompilation methods powered by large language models (LLMs) have garnered significant attention due to their ability to generate highly readable source code with minimal human intervention. However, existing LLM-based approaches face several critical challenges, including limited capability in reconstructing code structure and logic, low accuracy in data recovery, concerns over data security and privacy, and high computational resource requirements. To address these issues, we develop the CodeInverter Suite, making three contributions: (1) the CodeInverter Workflow (CIW) is a novel prompt engineering workflow that incorporates control flow graphs (CFG) and explicit data mappings to improve LLM-based decompilation. (2) Using CIW on well-known source code datasets, we curate the CodeInverter Dataset (CID), a domain-specific dataset containing 8.69 million samples that contains CFGs and data mapping tables. (3) We train the CoderInverter Models (CIMs) on CID, generating two lightweight LLMs (with 1.3B and 6.7B parameters) intended for efficient inference in privacy-sensitive or resource-constrained environments. Extensive experiments on two benchmarks demonstrate that the CIW substantially enhances the performance of various LLMs across multiple metrics. Our CIM-6.7B can achieve state-of-the-art decompilation performance, outperforming existing LLMs even with over 100x more parameters in decompilation tasks, an average improvement of 11.03% in re-executability, 6.27% in edit similarity.

Abstractions-of-Thought: Intermediate Representations for LLM Reasoning in Hardware Design

2025-05-26T15:41:42Z

Large language models (LLMs) have achieved impressive proficiency on logic and programming tasks, often rivaling expert-level performance. However, generating functionally correct hardware description language (HDL) code from natural language specifications remains challenging, primarily in data-scarce domains. Therefore, we present Abstractions-of-Thought (AoT) - a training-free, inference-only prompting framework to mitigate misinterpretations and reasoning pitfalls of LLMs through a series of task-based abstractions within the prompting procedure, assisting in the transition from high-level to low-level representations of hardware. Furthermore, AoT consists of the following stages: (1) an LLM-based classification of hardware design patterns, (2) a structured intermediate representation (IR) to separate functional decomposition from code syntax, and (3) a line-by-line pseudocode solution enabling a more direct mapping to the final Verilog implementation. Experimental results on the VerilogEval benchmark depict that AoT demonstrates improvements in functionality when applied to large non-reasoning models (such as GPT-4o, outperforming all baseline techniques (including 1-shot, Chain-of-Thought, and Tree-of-Thought) while significantly reducing the generated tokens by 1.8-5.2x compared to popular Tree-of-Thought prompting.

LEGO-Compiler: Enhancing Neural Compilation Through Translation Composability

2025-05-26T07:07:54Z

Large language models (LLMs) have the potential to revolutionize how we design and implement compilers and code translation tools. However, existing LLMs struggle to handle long and complex programs. We introduce LEGO-Compiler, a novel neural compilation system that leverages LLMs to translate high-level languages into assembly code. Our approach centers on three key innovations: LEGO translation, which decomposes the input program into manageable blocks; breaking down the complex compilation process into smaller, simpler verifiable steps by organizing it as a verifiable LLM workflow by external tests; and a feedback mechanism for self-correction. Supported by formal proofs of translation composability, LEGO-Compiler demonstrates high accuracy on multiple datasets, including over 99% on ExeBench and 97.9% on industrial-grade AnsiBench. Additionally, LEGO-Compiler has also acheived near one order-of-magnitude improvement on compilable code size scalability. This work opens new avenues for applying LLMs to system-level tasks, complementing traditional compiler technologies.

Fully Randomized Pointers

2025-05-26T03:01:16Z

Memory errors continue to be a critical concern for programs written in low-level programming languages such as C and C++. Many different memory error defenses have been proposed, each with varying trade-offs in terms of overhead, compatibility, and attack resistance. Some defenses are highly compatible but only provide minimal protection, and can be easily bypassed by knowledgeable attackers. On the other end of the spectrum, capability systems offer very strong (unforgeable) protection, but require novel software and hardware implementations that are incompatible by definition. The challenge is to achieve both very strong protection and high compatibility. In this paper, we propose {\em Fully Randomized Pointers} FRP as a strong memory error defense that also maintains compatibility with existing binary software. The key idea behind FRP is to design a new pointer encoding scheme that allows for the full randomization of most pointer bits, rendering even brute force attacks impractical. We design a FRP encoding that is: (1) compatible with existing binary code (recompilation not needed); and (2) decoupled from the underlying object layout. FRP is prototyped as: (i) a software implementation (BlueFat) to test security and compatibility; and (ii) a proof-of-concept hardware implementation (GreenFat) to evaluate performance. We show FRP is secure, practical, and compatible at the binary level, while our hardware implementation achieves low performance overheads (< 4%).

Proceedings 16th International Workshop on Programming Language Approaches to Concurrency and Communication-cEntric Software

2025-05-25T10:23:50Z

This volume contains the proceedings of PLACES 2025, the 16th edition of the Workshop on Programming Language Approaches to Concurrency and Communication-cEntric Software. The workshop is scheduled to take place in Hamilton, Canada, on May 4, 2025, as a satellite event of ETAPS, the European Joint Conferences on Theory and Practice of Software. PLACES offers a forum for exchanging new ideas on how to address the challenges of concurrent and distributed programming and how to improve the foundations of modern and future computer applications. PLACES welcomes researchers from various fields, and its topics include the design of new programming languages, models for concurrent and distributed systems, type systems, program verification, and applications in various areas (e.g., microservices, sensor networks, blockchains, event processing, business process management).

Algorithmic Language Models with Neurally Compiled Libraries

2025-05-24T23:33:15Z

Important tasks such as reasoning and planning are fundamentally algorithmic, meaning that solving them robustly requires acquiring true reasoning or planning algorithms, rather than shortcuts. Large Language Models lack true algorithmic ability primarily because of the limitations of neural network optimization algorithms, their optimization data and optimization objective, but also due to architectural inexpressivity. To solve this, our paper proposes augmenting LLMs with a library of fundamental operations and sophisticated differentiable programs, so that common algorithms do not need to be learned from scratch. We add memory, registers, basic operations, and adaptive recurrence to a transformer architecture built on LLaMA3. Then, we define a method for directly compiling algorithms into a differentiable starting library, which is used natively and propagates gradients for optimization. In this preliminary study, we explore the feasability of augmenting LLaMA3 with a differentiable computer, for instance by fine-tuning small transformers on simple algorithmic tasks with variable computational depth.

Automated Verification of Monotonic Data Structure Traversals in C

2025-05-24T18:25:22Z

Bespoke data structure operations are common in real-world C code. We identify one common subclass, monotonic data structure traversals (MDSTs), that iterate monotonically through the structure. For example, strlen iterates from start to end of a character array until a null byte is found, and a binary search tree insert iterates from the tree root towards a leaf. We describe a new automated verification tool, Shrinker, to verify MDSTs written in C. Shrinker uses a new program analysis strategy called scapegoating size descent, which is designed to take advantage of the fact that many MDSTs produce very similar traces when executed on an input (e.g., some large list) as when executed on a 'shrunk' version of the input (e.g., the same list but with its first element deleted). We introduce a new benchmark set containing over one hundred instances proving correctness, equivalence, and memory safety properties of dozens of MDSTs found in major C codebases including Linux, NetBSD, OpenBSD, QEMU, Git, and Musl. Shrinker significantly increases the number of monotonic string and list traversals that can be verified vs. a portfolio of state-of-the-art tools.

ZeroML: A Next Generation AutoML Language

2025-05-23T16:01:49Z

ZeroML is a new generation programming language for AutoML to drive the ML pipeline in a compiled and multi-paradigm way, with a pure functional core. Meeting the shortcomings introduced by Python, R, or Julia such as slow-running time, brittle pipelines or high dependency cost ZeroML brings the Microservices-based architecture adding the modular, reusable pieces such as DataCleaner, FeatureEngineer or ModelSelector. As a native multithread and memory-aware search optimized toolkit, and with one command deployability ability, ZeroML ensures non-coders and ML professionals to create high-accuracy models super fast and in a more reproducible way. The verbosity of the language ensures that when it comes to dropping into the backend, the code we will be creating is extremely clear but the level of repetition and boilerplate required when developing on the front end is now removed.

Active Learning of Symbolic NetKAT Automata

2025-05-23T05:46:34Z

NetKAT is a domain-specific programming language and logic that has been successfully used to specify and verify the behavior of packet-switched networks. This paper develops techniques for automatically learning NetKAT models of unknown networks using active learning. Prior work has explored active learning for a wide range of automata (e.g., deterministic, register, Büchi, timed etc.) and also developed applications, such as validating implementations of network protocols. We present algorithms for learning different types of NetKAT automata, including symbolic automata proposed in recent work. We prove the soundness of these algorithms, build a prototype implementation, and evaluate it on a standard benchmark. Our results highlight the applicability of symbolic NetKAT learning for realistic network configurations and topologies.

AutoMCQ -- Automatically Generate Code Comprehension Questions using GenAI

2025-05-22T09:14:41Z

Students often do not fully understand the code they have written. This sometimes does not become evident until later in their education, which can mean it is harder to fix their incorrect knowledge or misunderstandings. In addition, being able to fully understand code is increasingly important in a world where students have access to generative artificial intelligence (GenAI) tools, such as GitHub Copilot. One effective solution is to utilise code comprehension questions, where a marker asks questions about a submission to gauge understanding, this can also have the side effect of helping to detect plagiarism. However, this approach is time consuming and can be difficult and/or expensive to scale. This paper introduces AutoMCQ, which uses GenAI for the automatic generation of multiple-choice code comprehension questions. This is integrated with the CodeRunner automated assessment platform.

Data-driven Verification of Procedural Programs with Integer Arrays

2025-05-21T19:19:21Z

We address the problem of verifying automatically procedural programs manipulating parametric-size arrays of integers, encoded as a constrained Horn clauses solving problem. We propose a new algorithmic method for synthesizing loop invariants and procedure pre/post-conditions represented as universally quantified first-order formulas constraining the array elements and program variables. We adopt a data-driven approach that extends the decision tree Horn-ICE framework to handle arrays. We provide a powerful learning technique based on reducing a complex classification problem of vectors of integer arrays to a simpler classification problem of vectors of integers. The obtained classifier is generalized to get universally quantified invariants and procedure pre/post-conditions. We have implemented our method and shown its efficiency and competitiveness w.r.t. state-of-the-art tools on a significant benchmark.

An Intermediate Program Representation for Optimizing Stream-Based Languages

2025-05-21T10:50:41Z

Stream-based runtime monitors are safety assurance tools that check at runtime whether the system's behavior satisfies a formal specification. Specifications consist of stream equations, which relate input streams, containing sensor readings and other incoming information, to output streams, representing filtered and aggregated data. This paper presents a framework for the stream-based specification language RTLola. We introduce a new intermediate representation for stream-based languages, the StreamIR, which, like the specification language, operates on streams of unbounded length; while the stream equations are replaced by imperative programs. We developed a set of optimizations based on static analysis of the specification and have implemented an interpreter and a compiler for several target languages. In our evaluation, we measure the performance of several real-world case studies. The results show that using the StreamIR framework reduces the runtime significantly compared to the existing StreamIR interpreter. We evaluate the effect of the optimizations and show that significant performance gains are possible beyond the optimizations of the target language's compiler. While our current implementation is limited to RTLola, the StreamIR is designed to accommodate other stream-based languages, enabling their interpretation and compilation into all available target languages.