&inator: Correct, Precise C-to-Rust Interface Translation

2026-04-21T18:27:06Z

Automatically translating system software from C to Rust is an appealing but challenging problem, as it requires whole-program reasoning to satisfy Rust's ownership and borrowing discipline. A key enabling step in whole-program translation is interface translation, which produces Rust declarations for the C program's top-level declarations (i.e., structs and function signatures), enabling modular and incremental code translation. This paper introduces correct, precise C-to-Rust interface translation, called &inator. &inator employs a novel constraint-based formulation of semantic equivalence and type correctness including borrow-checking rules to produce a Rust interface that is correct (i.e., the interface admits a semantics-preserving implementation in safe Rust) and precise (i.e., it uses the simplest, least costly types). Our results show &inator produces correct, precise Rust interfaces for real C programs, but support for certain C features and scaling to large programs are challenges left for future work. This work advances the state of the art by being the first correct, precise approach to C-to-Rust interface translation.

Adding Compilation Metadata To Binaries To Make Disassembly Decidable

2026-04-21T16:16:10Z

The binary executable format is the standard method for distributing and executing software. Yet, it is also as opaque a representation of software as can be. If the binary format were augmented with metadata that provides security-relevant information, such as which data is intended by the compiler to be executable instructions, or how memory regions are expected to be bounded, that would dramatically improve the safety and maintainability of software. In this paper, we propose a binary format that is a middle ground between a stripped black-box binary and open source. We provide a tool that generates metadata capturing the compiler's intent and inserts it into the binary. This metadata enables lifting to a correct and recompilable higher-level representation and makes analysis and instrumentation more reliable. Our evaluation shows that adding metadata does not affect runtime behavior or performance. Compared to DWARF, our metadata is roughly 17% of its size. We validate correctness by compiling a comprehensive set of real-world C and C++ binaries and demonstrating that they can be lifted, instrumented, and recompiled without altering their behavior.

Context: Proactive Goal-Directed Intelligence via Composable Sandboxed Programs, Declarative Wiring, and Structured Interaction

2026-04-21T14:39:40Z

We present Context, the intelligence layer of the Magarshak Architecture, which replaces reactive query-response chatbots with proactive goal-directed agents that advance shared tasks without waiting for user prompts. The architecture rests on three mutually reinforcing mechanisms. Write-time context assembly precomputes enriched typed attributes via Groker agents, assembling interaction context as a deterministic pure function of graph state; context blocks are byte-identical across turns between semantic changes, enabling near-100% KV-cache reuse. Composable sandboxed wisdom programs form a governed library of LM-generated imperative programs declaratively wired to goal types via typed stream relations, composed via phase ordering, and executed at interaction time without further LM calls. Proactive goal stream state machines drive conversations toward terminal states by inspecting graph state and emitting structured interaction content (option arrays, governance affordances, clarification prompts) without awaiting user input. We prove six formal results: the Context Stability Theorem, bounding per-turn LM cost as a function of semantic change rate; a Program Composition Correctness Theorem; a Declarative Wiring Soundness Theorem; the Proactive Dominance Theorem, proving proactive agents weakly dominate reactive agents on expected turns-to-terminal-state; Coordination Overhead Elimination and Quality Preservation, establishing Pareto improvements in multi-participant goal chats; and a Cross-Platform Vote Consistency Theorem. Implemented in the open-source Qbix / Safebox / Safebots stack.

CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark

2026-04-21T00:48:00Z

Cross-architecture GPU code transpilation is essential for unlocking low-level hardware portability, yet no scalable solution exists. We introduce CASS, the first dataset and model suite for source- and assembly-level GPU translation (CUDA <--> HIP, SASS <--> RDNA3). CASS contains 60k verified host-device code pairs, enabling learning-based translation across both ISA and runtime boundaries. We generate each sample using our automated pipeline that scrapes, translates, compiles, and aligns GPU programs across vendor stacks. Leveraging CASS, we train a suite of domain-specific translation models that achieve 88.2% accuracy on CUDA -> HIP and 69.1% on SASS -> RDNA3, outperforming commercial baselines including GPT-5.1, Claude-4.5, and Hipify by wide margins. Generated code matches native performance in 85% of cases, preserving both runtime and memory behavior. To support rigorous evaluation, we introduce CASS-Bench, a curated benchmark spanning 18 GPU domains with ground-truth execution. All data, models, and evaluation tools will be released as open source to support progress in GPU compiler tooling, binary compatibility, and LLM-guided code translation.

Event Tensor: A Unified Abstraction for Compiling Dynamic Megakernel

2026-04-21T00:31:44Z

Modern GPU workloads, especially large language model (LLM) inference, suffer from kernel launch overheads and coarse synchronization that limit inter-kernel parallelism. Recent megakernel techniques fuse multiple operators into a single persistent kernel to eliminate launch gaps and expose inter-kernel parallelism, but struggle to handle dynamic shapes and data-dependent computation in real workloads. We present Event Tensor, a unified compiler abstraction for dynamic megakernels. Event Tensor encodes dependencies between tiled tasks, and enables first-class support for both shape and data-dependent dynamism. Built atop this abstraction, our Event Tensor Compiler (ETC) applies static and dynamic scheduling transformations to generate high-performance persistent kernels. Evaluations show that ETC achieves state-of-the-art LLM serving latency while significantly reducing system warmup overhead.

Linear effects, exceptions, and resource safety: a Curry-Howard correspondence for destructors

2026-04-21T00:23:45Z

We analyse the problem of combining linearity, effects, and exceptions, in abstract models of programming languages, as the issue of providing some kind of strength for a monad $T(- \oplus E)$ in a linear setting. We consider in particular for $T$ the allocation monad, which we introduce to model and study resource-safety properties. We apply these results to a series of two linear effectful calculi for which we establish their resource-safety properties. The first calculus is a linear (optionally ordered) call-by-push-value language with two allocation effects $\mathbf{new}$ and $\mathbf{delete}$. The resource-safety properties follow from the linear and ordered character of the typing rules. We then integrate exceptions with linearity and effects by adjoining default destruction actions to types, as inspired by C++/Rust destructors. We see destructors as objects $δ: A\rightarrow TI$ in the slice category over $TI$. This construction gives rise to a second calculus, the resource call-by-push-value, featuring exceptions and destructors, and whose weakening and exchange rules perform side-effects. It is therefore affine at the level of types but ordered at the level of derivations. As in C++ and Rust, a ``move'' operation -- the side-effecting exchange rule -- is necessary for releasing resources in random order, as opposed to LIFO order.

Formally Verified Patent Analysis via Dependent Type Theory: Machine-Checkable Certificates from a Hybrid AI + Lean 4 Pipeline

2026-04-20T22:02:57Z

We present a formally verified framework for patent analysis as a hybrid AI + Lean 4 pipeline. The DAG-coverage core (Algorithm 1b) is fully machine-verified once bounded match scores are fixed. Freedom-to-operate, claim-construction sensitivity, cross-claim consistency, and doctrine-of-equivalents analyses are formalized at the specification level with kernel-checked candidate certificates. Existing patent-analysis approaches rely on manual expert analysis (slow, non-scalable) or ML/NLP methods (probabilistic, opaque, non-compositional). To our knowledge, this is the first framework that applies interactive theorem proving based on dependent type theory to intellectual property analysis. Claims are encoded as DAGs in Lean 4, match strengths as elements of a verified complete lattice, and confidence scores propagate through dependencies via proven-correct monotone functions. We formalize five IP use cases (patent-to-product mapping, freedom-to-operate, claim construction sensitivity, cross-claim consistency, doctrine of equivalents) via six algorithms. Structural lemmas, the coverage-core generator, and the closed-path identity coverage = W_cov are machine-verified in Lean 4. Higher-level theorems for the other use cases remain informal proof sketches, and their proof-generation functions are architecturally mitigated (untrusted generators whose outputs are kernel-checked and sorry-free axiom-audited). Guarantees are conditional on the ML layer: they certify mathematical correctness of computations downstream of ML scores, not the accuracy of the scores themselves. A case study on a synthetic memory-module claim demonstrates weighted coverage and construction-sensitivity analysis. Validation against adjudicated cases is future work.

Compositional security definitions for higher-order where declassification

2026-04-20T14:05:46Z

To ensure programs do not leak private data, we often want to be able to provide formal guarantees ensuring such data is handled correctly. Often, we cannot keep such data secret entirely; instead programmers specify how private data may be declassified. While security definitions for declassification exist, they mostly do not handle higher-order programs. In fact, in the higher-order setting no compositional security definition exists for intensional information-flow properties such as where declassification, which allows declassification in specific parts of a program. We use logical relations to build a model (and thus security definition) of where declassification. The key insight required for our model is that we must stop enforcing indistinguishability once a \emph{relevant declassification} has occurred. We show that the resulting security definition provides more security than the most related previous definition, which is for the lower-order setting. This paper is an extended version of the paper of the same name published at OOPSLA 2023 ([21]).

Block-encodings as programming abstractions: The Eclipse Qrisp BlockEncoding Interface

2026-04-20T13:51:06Z

Block-encoding is a foundational technique in modern quantum algorithms, enabling the implementation of non-unitary operations by embedding them into larger unitary matrices. While theoretically powerful and essential for advanced protocols like Quantum Singular Value Transformation (QSVT) and Quantum Signal Processing (QSP), the generation of compilable implementations of block-encodings poses a formidable challenge. This work presents the BlockEncoding interface within the Eclipse Qrisp framework, establishing block-encodings as a high-level programming abstraction accessible to a broad scientific audience. Serving as both a technical framework introduction and a hands-on tutorial, this paper explicitly details key underlying concepts abstracted away by the interface, such as block-encoding construction and qubitization, and their practical integration into methods like the Childs-Kothari-Somma (CKS) algorithm. We outline the interface's software architecture, encompassing constructors, core utilities, arithmetic composition, and algorithmic applications such as matrix inversion, polynomial filtering, and Hamiltonian simulation. Through code examples, we demonstrate how this interface simplifies both the practical realization of advanced quantum algorithms and their associated resource estimation.

Hyper Separation Logic (extended version)

2026-04-20T13:08:38Z

Many important functional and security properties--including non-interference, determinism, and generalized non-interference (GNI)--are hyperproperties, i.e., properties relating multiple executions of a program. Existing separation logics allow one to reason about specific classes of hyperproperties, e.g., $\forall\forall$-hyperproperties such as non-interference and $\exists\exists$-properties such as non-determinism. However, they do not support quantifier alternation, which is for instance needed to express GNI. The only existing logic that can reason about such properties is Hyper Hoare Logic, but it does not support heap-manipulating programs and, thus, is not applicable to common imperative programs. This paper introduces Hyper Separation Logic (HSL), the first program logic that supports modular reasoning about hyperproperties with arbitrary quantifier alternation over programs that manipulate the heap. HSL generalizes Hyper Hoare Logic with a novel hyper separating conjunction that lifts the standard separating conjunction to sets of states, enabling a generalized frame rule for hyperproperties. We prove HSL sound in Isabelle/HOL and demonstrate its expressiveness for hyperproperties that lie beyond the reach of existing separation logics.

Soteria: Efficient Symbolic Execution as a Functional Library

2026-04-20T12:25:19Z

Symbolic execution (SE) tools often rely on intermediate languages (ILs) to support multiple programming languages, promising reusability and efficiency. In practice, this approach introduces trade-offs between performance, accuracy, and language feature support. We argue that building SE engines \emph{directly} for each source language is both simpler and more effective. We present Soteria, a lightweight OCaml library for writing SE engines in a functional style, without compromising on performance, accuracy or feature support. Soteria enables developers to construct SE engines that operate directly over source-language semantics, offering \emph{configurability}, compositional reasoning, and ease of implementation. Using Soteria, we develop Soteria$^{\text{Rust}}$, the \emph{first} Rust SE engine supporting Tree Borrows (the intricate aliasing model of Rust), and Soteria$^{\text{C}}$, a compositional SE engine for C. Both tools are competitive with or outperform state-of-the-art tools such as Kani, Pulse, CBMC and Gillian-C in performance and the number of bugs detected. We formalise the theoretical foundations of Soteria and prove its soundness, demonstrating that sound, efficient, accurate, and expressive SE can be achieved without the compromises of ILs.

Sockeye: a language for analyzing hardware documentation

2026-04-20T11:34:41Z

The ever increasing complexity of hardware platforms poses a challenge to systems programmers. Correctly programming a multitude of components, providing functionality and security, is difficult: semantics of individual units are described in prose, underspecified, and prone to inaccuracies. Rigorous statements about platform security are often impossible. We introduce a domain-specific language to describe hardware semantics, assumptions about software behavior, and desired security properties. We then create machine-readable specifications for a diverse set of eight platforms from their reference manuals, and formally prove their (in-)security. In addition to security proofs about memory confidentiality and integrity, we discover a handful of documentation errors. Finally, our analysis also revealed a vulnerability on a real-world server chip, which was confirmed by the vendor to apply to a wide family of deployed network appliances. Our tooling offers system integrators a way of formally describing security properties for whole platforms, and the means to find counterexamples, or proving them correct.

Surreal Arithmetic, Lazily

2026-04-20T06:56:48Z

Conway's surreal numbers were aptly named by Knuth. This note examines how far one can get towards implementing surreals and the arithmetic operations on them so that they execute efficiently. Lazy evaluation and recursive data structures yield a considerable speed up.

Enabling AI ASICs for Zero Knowledge Proof

2026-04-20T05:00:47Z

Zero-knowledge proof (ZKP) provers remain costly because multi-scalar multiplication (MSM) and number-theoretic transforms (NTTs) dominate runtime as they need significant computation. AI ASICs such as TPUs provide massive matrix throughput and SotA energy efficiency. We present MORPH, the first framework that reformulates ZKP kernels to match AI-ASIC execution. We introduce Big-T complexity, a hardware-aware complexity model that exposes heterogeneous bottlenecks and layout-transformation costs ignored by Big-O. Guided by this analysis, (1) at arithmetic level, MORPH develops an MXU-centric extended-RNS lazy reduction that converts high-precision modular arithmetic into dense low-precision GEMMs, eliminating all carry chains, and (2) at dataflow level, MORPH constructs a unified-sharding layout-stationary TPU Pippenger MSM and optimized 3/5-step NTT that avoid on-TPU shuffles to minimize costly memory reorganization. Implemented in JAX, MORPH enables TPUv6e8 to achieve up-to 10x higher throughput on NTT and comparable throughput on MSM than GZKP. Our code: https://github.com/EfficientPPML/MORPH.

SDLLMFuzz: Dynamic-static LLM-assisted greybox fuzzing for structured input programs

2026-04-20T03:10:31Z

Fuzzing has become a widely adopted technique for vulnerability discovery, yet it remains ineffective for structured-input programs due to strict syntactic constraints and limited semantic awareness. Traditional greybox fuzzers rely on mutation-based strategies and coarse-grained coverage feedback, which often fail to generate valid inputs and explore deep execution paths. Recent advances in large language models (LLMs) have shown promise in improving input generation, but existing approaches primarily focus on seed generation and largely overlook the effective use of runtime feedback. In this paper, we propose SDLLMFuzz, a dynamic-static LLM-assisted greybox fuzzing framework for structured-input programs. Our approach integrates LLM-based structure-aware seed generation with static crash analysis, forming a unified feedback loop that iteratively refines test inputs. Specifically, we leverage LLMs to generate syntactically valid and semantically diverse inputs, while extracting rich semantic information from crash artifacts (e.g., core dumps and execution traces) to guide subsequent input generation. This dynamic-static feedback mechanism enables more efficient exploration of complex program behaviors. We evaluate SDLLMFuzz on the Magma benchmark across multiple structured-input programs, including libxml2, libpng, and libsndfile. Experimental results show that SDLLMFuzz significantly outperforms traditional greybox fuzzers and LLM-assisted baselines in terms of bug discovery and time-to-bug. These results demonstrate that combining semantic input generation with feedback-driven refinement is an effective direction for improving fuzzing performance on structured-input programs.