https://arxiv.org/api//uhg80FkdSL66UIY7uTFdjp/mxs2026-06-23T07:26:28Z993491515http://arxiv.org/abs/2511.22419v1On Circuit Description Languages, Indexed Monads, and Resource Analysis2025-11-27T12:57:46ZIn this paper, a monad-based denotational model is introduced and shown adequate for the Proto-Quipper family of calculi, themselves being idealized versions of the Quipper programming language. The use of a monadic approach allows us to separate the value to which a term reduces from the circuit that the term itself produces as a side effect. In turn, this enables the denotational interpretation and validation of rich type systems in which the size of the produced circuit can be controlled. Notably, the proposed semantic framework, through the novel concept of circuit algebra, suggests forms of effect typing guaranteeing quantitative properties about the resulting circuit, even in presence of optimizations.2025-11-27T12:57:46ZExtended version of a paper to be published at POPL 2026Ken SakayoriAndrea ColledanUgo Dal Lagohttp://arxiv.org/abs/2501.16607v3MCTS-SQL: Light-Weight LLMs can Master the Text-to-SQL through Monte Carlo Tree Search2025-11-27T09:37:42ZText-to-SQL is a fundamental yet challenging task in the NLP area, aiming at translating natural language questions into SQL queries. While recent advances in large language models have greatly improved performance, most existing approaches depend on models with tens of billions of parameters or costly APIs, limiting their applicability in resource-constrained environments. For real world, especially on edge devices, it is crucial for Text-to-SQL to ensure cost-effectiveness. Therefore, enabling the light-weight models for Text-to-SQL is of great practical significance. However, smaller LLMs often struggle with complicated user instruction, redundant schema linking or syntax correctness. To address these challenges, we propose MCTS-SQL, a novel framework that uses Monte Carlo Tree Search to guide SQL generation through multi-step refinement. Since the light-weight models' weak performance of single-shot prediction, we generate better results through several trials with feedback. However, directly applying MCTS-based methods inevitably leads to significant time and computational overhead. Driven by this issue, we propose a token-level prefix-cache mechanism that stores prior information during iterations, effectively improved the execution speed. Experiments results on the SPIDER and BIRD benchmarks demonstrate the effectiveness of our approach. Using a small open-source Qwen2.5-Coder-1.5B, our method outperforms ChatGPT-3.5. When leveraging a more powerful model Gemini 2.5 to explore the performance upper bound, we achieved results competitive with the SOTA. Our findings demonstrate that even small models can be effectively deployed in practical Text-to-SQL systems with the right strategy.2025-01-28T00:52:23ZAccepted by AAAI 2026Shuozhi YuanLimin ChenMiaomiao YuanZhao Jinhttp://arxiv.org/abs/2511.22075v1Expanding Specification Capabilities of a Gradual Verifier with Pure Functions2025-11-27T03:53:44ZGradual verification soundly combines static checking and dynamic checking to provide an incremental approach for software verification. With gradual verification, programs can be partially specified first, and then the full specification of a program can be achieved in incremental steps. The first and only practicable gradual verifier based on symbolic execution, Gradual C0, supports recursive heap data structures. Despite recent efforts to improve the expressivity of Gradual C0's specification language, Gradual C0's specification language is still limited in its capabilities for complex expressions. This work explores an extension to Gradual C0's design with a common construct supported by many static verification tools, pure functions, which both extend the specification capabilities of Gradual C0 and increase the ease of encoding observer methods in Gradual C0. Our approach addresses the technical challenges related to the axiomatisation of pure functions with imprecise specifications.2025-11-27T03:53:44ZSubmitted to the 53rd ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2026) Student Research CompetitionDoruk Alp Mutluhttp://arxiv.org/abs/2511.21994v1When Are Reactive Notebooks Not Reactive?2025-11-27T00:43:27ZComputational notebooks are convenient for programmers, but can easily become confusing and inconsistent due to the ability to incrementally edit a program that is running. Recent reactive notebook systems, such as Ipyflow, Marimo and Observable, strive to keep notebook state in sync with the current cell code by re-executing a minimal set of cells upon modification. However, each system defines reactivity a different way. Additionally, within any definition, we find simple notebook modifications that can break each system. Overall, these inconsistencies make it difficult for users to construct a mental model of their reactive notebook's implementation. This paper proposes Rex, a fine-grained test suite to discuss and assess reactivity capabilities within reactive notebook systems. We evaluate Rex on three existing reactive notebook systems and classify their failures with the aims of (i) helping programmers understand when reactivity fails and (ii) helping notebook implementations improve.2025-11-27T00:43:27ZMegan ZhengWill CrichtonAkshay NarayanDeepti RaghavanNikos Vasilakishttp://arxiv.org/abs/2510.15585v2Leveraging Test Driven Development with Large Language Models for Reliable and Verifiable Spreadsheet Code Generation: A Research Framework2025-11-26T17:42:12ZLarge Language Models (LLMs), such as ChatGPT, are increasingly leveraged for generating both traditional software code and spreadsheet logic. Despite their impressive generative capabilities, these models frequently exhibit critical issues such as hallucinations, subtle logical inconsistencies, and syntactic errors, risks particularly acute in high stakes domains like financial modelling and scientific computations, where accuracy and reliability are paramount. This position paper proposes a structured research framework that integrates the proven software engineering practice of Test-Driven Development (TDD) with Large Language Model (LLM) driven generation to enhance the correctness of, reliability of, and user confidence in generated outputs. We hypothesise that a "test first" methodology provides both technical constraints and cognitive scaffolding, guiding LLM outputs towards more accurate, verifiable, and comprehensible solutions. Our framework, applicable across diverse programming contexts, from spreadsheet formula generation to scripting languages such as Python and strongly typed languages like Rust, includes an explicitly outlined experimental design with clearly defined participant groups, evaluation metrics, and illustrative TDD based prompting examples. By emphasising test driven thinking, we aim to improve computational thinking, prompt engineering skills, and user engagement, particularly benefiting spreadsheet users who often lack formal programming training yet face serious consequences from logical errors. We invite collaboration to refine and empirically evaluate this approach, ultimately aiming to establish responsible and reliable LLM integration in both educational and professional development practices.2025-10-17T12:28:16Z16 pagesProceedings of the EuSpRIG 2025 Conference "Spreadsheet Productivity & Risks" ISBN : 978-1-905404-60-5Simon ThorneAdvait Sarkarhttp://arxiv.org/abs/2511.21509v1SV-LIB 1.0: A Standard Exchange Format for Software-Verification Tasks2025-11-26T15:44:54ZIn the past two decades, significant research and development effort went into the development of verification tools for individual languages, such asC, C++, and Java. Many of the used verification approaches are in fact language-agnostic and it would be beneficial for the technology transfer to allow for using the implementations also for other programming and modeling languages. To address the problem, we propose SV-LIB, an exchange format and intermediate language for software-verification tasks, including programs, specifications, and verification witnesses. SV-LIBis based on well-known concepts from imperative programming languages and uses SMT-LIB to represent expressions and sorts used in the program. This makes it easy to parse and to build into existing infrastructure, since many verification tools are based on SMT solvers already. Furthermore, SV-LIBdefines a witness format for both correct and incorrect SV-LIB programs, together with means for specifying witness-validation tasks. This makes it possible both to implement independent witness validators and to reuse some verifiers also as validators for witnesses. This paper presents version 1.0 of the SV-LIBformat, including its design goals, the syntax, and informal semantics. Formal semantics and further extensions to concurrency are planned for future versions.2025-11-26T15:44:54ZDirk BeyerGidon ErnstMartin JonášMarian Lingsch-Rosenfeldhttp://arxiv.org/abs/2507.18509v2Higher-Order Behavioural Conformances via Fibrations2025-11-26T15:43:08ZCoinduction is a widely used technique for establishing behavioural equivalence of programs in higher-order languages. In recent years, the rise of languages with quantitative (e.g.~probabilistic) features has led to extensions of coinductive methods to more refined types of behavioural conformances, most notably notions of behavioural distance. To guarantee soundness of coinductive reasoning, one needs to show that the behavioural conformance at hand forms a program congruence, i.e. it is suitably compatible with the operations of the language. This is usually achieved by a complex proof technique known as \emph{Howe's method}, which needs to be carefully adapted to both the specific language and the targeted notion of behavioural conformance. We develop a uniform categorical approach to Howe's method that features two orthogonal dimensions of abstraction: (1) the underlying higher-order language is modelled by an \emph{abstract higher-order specification} (AHOS), a novel and very general categorical account of operational semantics, and (2) notions of behavioural conformance (such as relations or metrics) are modelled via fibrations over the base category of an AHOS. Our main result is a fundamental congruence theorem at this level of generality: Under natural conditions on the categorical ingredients and the operational rules of a language modelled by an AHOS, the greatest behavioural (bi)conformance on its operational model forms a congruence. We illustrate our theory by deriving congruence of bisimilarity and behavioural pseudometrics for probabilistic higher-order languages.2025-07-24T15:28:08ZHenning Urbathttp://arxiv.org/abs/2511.21209v1Towards Computational UIP in Cubical Agda2025-11-26T09:40:01ZSome advantages of Cubical Type Theory, as implemented by Cubical Agda, over intensional Martin-Löf Type Theory include Quotient Inductive Types (QITs), which exist as instances of Higher Inductive Types, and functional extensionality, which is provable in Cubical Type Theory. However, HoTT features an infinite hierarchy of equalities that may become unwieldy in formalisations. Fortunately, QITs and functional extensionality are both preserved even if the equality levels of Cubical Type Theory are truncated to only homotopical Sets (h-Sets). In other words, removing the univalence axiom from Cubical Type Theory and instead postulating a conflicting axiom: the Uniqueness of Identity Proofs (UIP) postulate. Since univalence is proved in Cubical Type Theory from the so-called Glue Types, therefore, it is known that one can first remove the Glue Types (thus removing univalence) and then set-truncate all equalities (essentially assuming UIP), à la XTT. The result is a "h-Set Cubical Type Theory" that retains features such as functional extensionality and QITs.
However, in Cubical Agda, there are currently only two unsatisfying ways to achieve h-Set Cubical Type Theory. The first is to give up on the canonicity of the theory and simply postulate the UIP axiom, while the second way is to use a standard result stating "type formers preserve h-levels" to manually prove UIP for every defined type. The latter is, however, laborious work best suited for an automatic implementation by the proof assistant. In this project, we analyse formulations of UIP and detail their computation rules for Cubical Agda, and evaluate their suitability for implementation. We also implement a variant of Cubical Agda without Glue, which is already compatible with postulated UIP, in anticipation of a future implementation of UIP in Cubical Agda.2025-11-26T09:40:01ZYee-Jian TanAndreas NuytsDominique Devriesehttp://arxiv.org/abs/2408.16234v2Quantum Programming Without the Quantum Physics2025-11-26T08:02:34ZWe propose a quantum programming paradigm where all data are familiar classical data, and the only non-classical element is a random number generator that can return results with negative probability. Currently, the vast majority of quantum programming languages instead work with quantum data types made up of qubits. The description of their behavior relies on heavy linear algebra and many interdependent concepts and intuitions from quantum physics, which takes dedicated study to understand. We demonstrate that the proposed view of quantum programming explains its central concepts and constraints in more accessible, computationally relevant terms. This is achieved by systematically reducing everything to the existence of that negative-probability random generator, avoiding mention of advanced physics as much as possible. This makes quantum programming more accessible to programmers without a deep background in physics or linear algebra. The bulk of this paper is written with such an audience in mind. As a working vehicle, we lay out a simple quantum programming language under this paradigm, showing that not only can it express all quantum programs, it also naturally captures the semantics of measurement without ever mentioning qubits or collapse. The language is proved to be implementable and universal.2024-08-29T03:21:08Z20 pages, 5 figures. Version accepted at APLAS 2024Inoue, J. (2025). Quantum Programming Without the Quantum Physics. In: Kiselyov, O. (eds) Programming Languages and Systems. APLAS 2024. Lecture Notes in Computer Science, vol 15194. Springer, SingaporeJun Inoue10.1007/978-981-97-8943-6_8http://arxiv.org/abs/2511.20617v1Translating Large-Scale C Repositories to Idiomatic Rust2025-11-25T18:42:46ZExisting C to Rust translation techniques fail to balance quality and scalability: transpilation-based approaches scale to large projects but produce code with poor safety, idiomaticity, and readability. In contrast, LLM-based techniques are prohibitively expensive due to their reliance on frontier models (without which they cannot reliably generate compilable translations), thus limiting scalability. This paper proposes Rustine, a fully automated pipeline for effective and efficient repository-level C to idiomatic safe Rust translation. Evaluating on a diverse set of 23 C programs, ranging from 27 to 13,200 lines of code, Rustine can generate fully compilable Rust code for all and achieve 87% functional equivalence (passing 1,063,099 assertions out of 1,221,192 in test suites with average function and line coverage of 74.7% and 72.2%). Compared to six prior repository-level C to Rust translation techniques, the translations by Rustine are overall safer (fewer raw pointers, pointer arithmetic, and unsafe constructs), more idiomatic (fewer Rust linter violations), and more readable. When the translations cannot pass all tests to fulfill functional equivalence, human developers were able to complete the task in 4.5 hours, on average, using Rustine as debugging support.2025-11-25T18:42:46Z21 pages, 14 figuresSaman DehghanUniversity of Illinois at Urbana-Champaign, USATianran SunShanghai Jiao Tong University, ChinaTianxiang WuUniversity of Illinois at Urbana-Champaign, USAZihan LiUniversity of Illinois at Urbana-Champaign, USAReyhaneh JabbarvandUniversity of Illinois at Urbana-Champaign, USAhttp://arxiv.org/abs/2411.11662v3Probabilistic Concurrent Reasoning in Outcome Logic: Independence, Conditioning, and Invariants2025-11-25T18:27:23ZAlthough randomization has long been used in distributed computing, formal methods for reasoning about probabilistic concurrent programs have lagged behind. No existing program logics can express specifications about the full distributions of outcomes resulting from programs that are both probabilistic and concurrent. To address this, we introduce Probabilistic Concurrent Outcome Logic (pcOL), which incorporates ideas from concurrent and probabilistic separation logics into Outcome Logic to introduce new compositional reasoning principles. At its core, pcOL reinterprets the rules of Concurrent Separation Logic in a setting where separation models probabilistic independence, so as to compositionally describe joint distributions over variables in concurrent threads. Reasoning about outcomes also proves crucial, as case analysis is often necessary to derive precise information about threads that rely on randomized shared state. We demonstrate pcOL on a variety of examples, including to prove almost sure termination of unbounded loops.2024-11-18T15:38:36ZProceedings of the ACM on Programming Languages (PACMPL). Volume 10, Issue POPL, Article 9 (January 2026)Noam ZilbersteinAlexandra SilvaJoseph Tassarotti10.1145/3776651http://arxiv.org/abs/2509.15074v2Weighted Automata for Exact Inference in Discrete Probabilistic Programs2025-11-25T16:32:13ZIn probabilistic programming, the inference problem asks to determine a program's posterior distribution conditioned on its "observe" instructions. Inference is challenging, especially when exact rather than approximate results are required. Inspired by recent work on probability generating functions (PGFs), we propose encoding distributions on $\mathbb{N}^k$ as weighted automata over a commutative alphabet with $k$ symbols. Based on this, we map the semantics of various imperative programming statements to automata-theoretic constructions. For a rich class of programs, this results in an effective translation from prior to posterior distribution, both encoded as automata. We prove that our approach is sound with respect to a standard operational program semantics.2025-09-18T15:35:40ZDominik GeißlerTobias Winklerhttp://arxiv.org/abs/2511.10361v2Lazy Linearity for a Core Functional Language2025-11-25T15:08:22ZTraditionally, in linearly typed languages, consuming a linear resource is synonymous with its syntactic occurrence in the program. However, under the lens of non-strict evaluation, linearity can be further understood semantically, where a syntactic occurrence of a resource does not necessarily entail using that resource when the program is executed. While this distinction has been largely unexplored, it turns out to be inescapable in Haskell's optimising compiler, which heavily rewrites the source program in ways that break syntactic linearity but preserve the program's semantics. We introduce Linear Core, a novel system which accepts the lazy semantics of linearity statically and is suitable for lazy languages such as the Core intermediate language of the Glasgow Haskell Compiler. We prove that Linear Core is sound, guaranteeing linear resource usage, and that multiple optimising transformations preserve linearity in Linear Core while failing to do so in Core. We have implemented Linear Core as a compiler plugin to validate the system against linearity-heavy libraries, including linear-base.2025-11-13T14:38:41ZExtended version of POPL 2026 paperRodrigo MesquitaBernardo Toninho10.1145/3776711http://arxiv.org/abs/2506.06835v2Hadamard-Pi: Equational Quantum Programming2025-11-25T11:09:59ZQuantum computing offers advantages over classical computation, yet the precise features that set the two apart remain unclear. In the standard quantum circuit model, adding a 1-qubit basis-changing gate -- commonly chosen to be the Hadamard gate -- to a universal set of classical reversible gates yields computationally universal quantum computation. However, the computational behaviours enabled by this addition are not fully characterised. We give such a characterisation by introducing a small quantum programming language extending the universal classical reversible programming language $Π$ with a single primitive corresponding to the Hadamard gate. The language comes equipped with a sound and complete categorical semantics that is specified by a purely equational theory. Completeness is shown by means of a novel finite presentation, and a corresponding synthesis algorithm, for the groups of orthogonal matrices with entries in the ring $\mathbb{Z}[\tfrac{1}{\sqrt{2}}]$.2025-06-07T15:26:33Z116 pages; v2: Extended version of POPL 2026 publicationProc. ACM Program. Lang. 10, POPL, Article 5 (January 2026), 27 pagesWang FangChris HeunenRobin Kaarsgaard10.1145/3776647http://arxiv.org/abs/2511.11055v2Data Race Detection by Digest-Driven Abstract Interpretation (Extended Version)2025-11-25T02:50:57ZSound static analysis can prove the absence of data races by establishing that no two conflicting memory accesses can occur at the same time. We repurpose the concept of digests -- summaries of computational histories originally introduced to bring tunable concurrency-sensitivity to thread-modular value analysis by abstract interpretation, extending this idea to race detection: We use digests to capture the conditions under which conflicting accesses may not happen in parallel. To formalize this, we give a definition of data races in the thread-modular local trace semantics and show how exclusion criteria for potential conflicts can be expressed as digests. We report on our implementation of digest-driven data race detection in the static analyzer Goblint, and evaluate it on the SV-COMP benchmark suite. Combining the lockset digest with digests reasoning on thread ids and thread joins increases the number of correctly solved tasks by more than a factor of five compared to lockset reasoning alone.2025-11-14T08:11:31ZExtended of paper accepted to appear at VMCAI'26; 29 Pages, including 2 AppendicesMichael SchwarzJulian Erhard