https://arxiv.org/api/QQaea4BqGed9jqmcWEsCM5kwch02026-06-30T19:34:22Z9958199515http://arxiv.org/abs/2411.10393v2Guaranteed Bounds on Posterior Distributions of Discrete Probabilistic Programs with Loops2024-12-05T09:39:17ZWe study the problem of bounding the posterior distribution of discrete probabilistic programs with unbounded support, loops, and conditioning. Loops pose the main difficulty in this setting: even if exact Bayesian inference is possible, the state of the art requires user-provided loop invariant templates. By contrast, we aim to find guaranteed bounds, which sandwich the true distribution. They are fully automated, applicable to more programs and provide more provable guarantees than approximate sampling-based inference. Since lower bounds can be obtained by unrolling loops, the main challenge is upper bounds, and we attack it in two ways. The first is called residual mass semantics, which is a flat bound based on the residual probability mass of a loop. The approach is simple, efficient, and has provable guarantees.
The main novelty of our work is the second approach, called geometric bound semantics. It operates on a novel family of distributions, called eventually geometric distributions (EGDs), and can bound the distribution of loops with a new form of loop invariants called contraction invariants. The invariant synthesis problem reduces to a system of polynomial inequality constraints, which is a decidable problem with automated solvers. If a solution exists, it yields an exponentially decreasing bound on the whole distribution, and can therefore bound moments and tail asymptotics as well, not just probabilities as in the first approach.
Both semantics enjoy desirable theoretical properties. In particular, we prove soundness and convergence, i.e. the bounds converge to the exact posterior as loops are unrolled further. On the practical side, we describe Diabolo, a fully-automated implementation of both semantics, and evaluate them on a variety of benchmarks from the literature, demonstrating their general applicability and the utility of the resulting bounds.2024-11-15T18:01:40ZFull version of the POPL 2025 article, including proofs and other supplementary materialFabian ZaiserAndrzej S. MurawskiC. -H. Luke Ong10.1145/3704874http://arxiv.org/abs/2305.18945v3String Diagrams for $λ$-calculi and Functional Computation2024-12-04T17:17:51ZThis tutorial gives an advanced introduction to string diagrams and graph languages for higher-order computation. The subject matter develops in a principled way, starting from the two dimensional syntax of key categorical concepts such as functors, adjunctions, and strictification, and leading up to Cartesian Closed Categories, the core mathematical model of the lambda calculus and of functional programming languages. This methodology inverts the usual approach of proceeding from syntax to a categorical interpretation, by rationally reconstructing a syntax from the categorical model. The result is a graph syntax -- more precisely, a hierarchical hypergraph syntax -- which in many ways is shown to be an improvement over the conventional linear term syntax. The rest of the tutorial focuses on applications of interest to programming languages: operational semantics, general frameworks for type inference, and complex whole-program transformations such as closure conversion and automatic differentiation.2023-05-30T11:24:27ZDan GhicaFabio Zanasihttp://arxiv.org/abs/2412.03612v1Chatting with Logs: An exploratory study on Finetuning LLMs for LogQL2024-12-04T14:06:24ZLogging is a critical function in modern distributed applications, but the lack of standardization in log query languages and formats creates significant challenges. Developers currently must write ad hoc queries in platform-specific languages, requiring expertise in both the query language and application-specific log details -- an impractical expectation given the variety of platforms and volume of logs and applications. While generating these queries with large language models (LLMs) seems intuitive, we show that current LLMs struggle with log-specific query generation due to the lack of exposure to domain-specific knowledge. We propose a novel natural language (NL) interface to address these inconsistencies and aide log query generation, enabling developers to create queries in a target log query language by providing NL inputs. We further introduce ~\textbf{NL2QL}, a manually annotated, real-world dataset of natural language questions paired with corresponding LogQL queries spread across three log formats, to promote the training and evaluation of NL-to-loq query systems. Using NL2QL, we subsequently fine-tune and evaluate several state of the art LLMs, and demonstrate their improved capability to generate accurate LogQL queries. We perform further ablation studies to demonstrate the effect of additional training data, and the transferability across different log formats. In our experiments, we find up to 75\% improvement of finetuned models to generate LogQL queries compared to non finetuned models.2024-12-04T14:06:24Zdraft under submission at another venueVishwanath SeshagiriSiddharth BalyanVaastav AnandKaustubh DholeIshan SharmaAvani WildaniJosé CambroneroAndreas Züflehttp://arxiv.org/abs/2412.03310v1Grounded Language Design for Lightweight Diagramming for Formal Methods2024-12-04T13:37:59ZModel finding, as embodied by SAT solvers and similar tools, is used widely, both in embedding settings and as a tool in its own right. For instance, tools like Alloy target SAT to enable users to incrementally define, explore, verify, and diagnose sophisticated specifications for a large number of complex systems.
These tools critically include a visualizer that lets users graphically explore these generated models. As we show, however, default visualizers, which know nothing about the domain, are unhelpful and even actively violate presentational and cognitive principles. At the other extreme, full-blown visualizations require significant effort as well as knowledge a specifier might not possess; they can also exhibit bad failure modes (including silent failure). Instead, we need a language to capture essential domain information for lightweight diagramming. We ground our language design in both the cognitive science literature on diagrams and on a large number of example custom visualizations. This identifies the key elements of lightweight diagrams. We distill these into a small set of orthogonal primitives. We extend an Alloy-like tool to support these primitives. We evaluate the effectiveness of the produced diagrams, finding them good for reasoning. We then compare this against many other drawing languages and tools to show that this work defines a new niche that is lightweight, effective, and driven by sound principles.2024-12-04T13:37:59ZSiddhartha PrasadBen GreenmanTim NelsonShriram Krishnamurthihttp://arxiv.org/abs/2412.03127v1Summa Summarum: Moessner's Theorem without Dynamic Programming2024-12-04T08:44:02ZSeventy years on, Moessner's theorem and Moessner's process -- i.e., the additive computation of integral powers -- continue to fascinate. They have given rise to a variety of elegant proofs, to an implementation in hardware, to generalizations, and now even to a popular video, "The Moessner Miracle.'' The existence of this video, and even more its title, indicate that while the "what'' of Moessner's process is understood, its "how'' and even more its "why'' are still elusive. And indeed all the proofs of Moessner's theorem involve more complicated concepts than both the theorem and the process. This article identifies that Moessner's process implements an additive function with dynamic programming. A version of this implementation without dynamic programming (1) gives rise to a simpler statement of Moessner's theorem and (2) can be abstracted and then instantiated into related additive computations. The simpler statement also suggests a simpler and more efficient implementation to compute integral powers as well as simple additive functions to compute, e.g., Factorial numbers. It also reveals the source of -- to quote John Conway and Richard Guy -- Moessner's magic.2024-12-04T08:44:02ZIn Proceedings PT 2024, arXiv:2412.01856EPTCS 413, 2024, pp. 57-92Olivier DanvyNational University of Singapore10.4204/EPTCS.413.5http://arxiv.org/abs/2412.03126v1Completing the Functional Approach in Object-Oriented Languages2024-12-04T08:43:46ZOver the last two decades practically all object-oriented programming languages have introduced features that are well-known from functional programming languages. But many features that were introduced were fragmentary. In Java-TX we address the latter features and propose a completion. Java-TX (i.e. Type eXtended) is a language based on Java. The predominant new features are global type inference and real function types for lambda expressions. Global type inference means that all type annotations can be omitted, and the compiler infers them without losing the static type property. We introduce the function types in a similar fashion as in Scala but additionally integrated them into the Java target-typing as proposed in the so-called strawman approach. In this paper, we provide an integrated presentation of all Java-TX features. The focus is therby on the automatic inference of type parameters for classes and their methods, and on the heterogeneous translation of function types, which permits the preservation of the argument and return types in bytecode.2024-12-04T08:43:46ZIn Proceedings PT 2024, arXiv:2412.01856EPTCS 413, 2024, pp. 43-56Martin Pluemicke10.4204/EPTCS.413.4http://arxiv.org/abs/2412.03125v1Gradual Guarantee via Step-Indexed Logical Relations in Agda2024-12-04T08:43:30ZThe gradual guarantee is an important litmus test for gradually typed languages, that is, languages that enable a mixture of static and dynamic typing. The gradual guarantee states that changing the precision of a type annotation does not change the behavior of the program, except perhaps to trigger an error if the type annotation is incorrect. Siek et al. (2015) proved that the Gradually Typed Lambda Calculus (GTLC) satisfies the gradual guarantee using a simulation-based proof and mechanized their proof in Isabelle. In the following decade, researchers have proved the gradual guarantee for more sophisticated calculi, using step-indexed logical relations. However, given the complexity of that style of proof, there has not yet been a mechanized proof of the gradual guarantee using step-indexed logical relations. This paper reports on a mechanized proof of the gradual guarantee for the GTLC carried out in the Agda proof assistant.2024-12-04T08:43:30ZIn Proceedings PT 2024, arXiv:2412.01856EPTCS 413, 2024, pp. 27-42Jeremy G. SiekIndiana University10.4204/EPTCS.413.3http://arxiv.org/abs/2412.03124v1Explicit Weakening2024-12-04T08:43:14ZI present a novel formulation of substitution, where facts about substitution that previously required tens or hundreds of lines to justify in a proof assistant now follow immediately - they can be justified by writing the four letters "refl". The paper is an executable literate Agda script, and source of the paper is available as an artifact in the file Weaken.lagda.md.
Not all consequences of the pandemic have been awful. For the last three years, I've had the great pleasure of meeting with Peter Thiemann and Jeremy Siek for a couple of hours every week, via Zoom, exploring topics including core calculi, gradual typing, and formalisation in Agda. The work reported here arose from those discussions, and is dedicated to Peter on the occasion of his 60th birthday.2024-12-04T08:43:14ZIn Proceedings PT 2024, arXiv:2412.01856EPTCS 413, 2024, pp. 15-26Philip WadlerUniversity of Edinburgh10.4204/EPTCS.413.2http://arxiv.org/abs/2412.03122v1Inversion by Partial Evaluation: A Reversible Interpreter Experiment2024-12-04T08:42:58ZA computational limit of combining partial evaluation and program inversion is investigated. Using a reversible Turing machine interpreter, we show that the first Futamura and inversion projections can produce not only functionally but also textually equivalent programs. The construction of the interpreter in a reversible flowchart language is shown in full. Insights are provided on the practical interplay between reversible interpreters, program inverters, and partial evaluators. We conclude that both projections must be included in the program transformation toolbox.2024-12-04T08:42:58ZIn Proceedings PT 2024, arXiv:2412.01856. Dedicated to Peter Thiemann on the Occasion of his 60th BirthdayEPTCS 413, 2024, pp. 1-14Robert GlückDIKU, Dept. of Computer Science, University of CopenhagenLouis Marott NormannDIKU, Dept. of Computer Science, University of Copenhagen10.4204/EPTCS.413.1http://arxiv.org/abs/2412.02765v1Massimult: A Novel Parallel CPU Architecture Based on Combinator Reduction2024-12-03T19:07:03ZThe Massimult project aims to design and implement an innovative CPU architecture based on combinator reduction with a novel combinator base and a new abstract machine. The evaluation of programs within this architecture is inherently highly parallel and localized, allowing for faster computation, reduced energy consumption, improved scalability, enhanced reliability, and increased resistance to attacks. In this paper, we introduce the machine language LambdaM, detail its compilation into KVY assembler code, and describe the abstract machine Matrima. The best part of Matrima is its ability to exploit inherent parallelism and locality in combinator reduction, leading to significantly faster computations with lower energy consumption, scalability across multiple processors, and enhanced security against various types of attacks. Matrima can be simulated as a software virtual machine and is intended for future hardware implementation.2024-12-03T19:07:03ZJurgen Nicklisch-FrankenRuslan Feizerakhmanovhttp://arxiv.org/abs/2407.14107v2Approximate Relational Reasoning for Higher-Order Probabilistic Programs2024-12-03T14:02:57ZProperties such as provable security and correctness for randomized programs are naturally expressed relationally as approximate equivalences. As a result, a number of relational program logics have been developed to reason about such approximate equivalences of probabilistic programs. However, existing approximate relational logics are mostly restricted to first-order programs without general state.
In this paper we develop Approxis, a higher-order approximate relational separation logic for reasoning about approximate equivalence of programs written in an expressive ML-like language with discrete probabilistic sampling, higher-order functions, and higher-order state. The Approxis logic recasts the concept of error credits in the relational setting to reason about relational approximation, which allows for expressive notions of modularity and composition, a range of new approximate relational rules, and an internalization of a standard limiting argument for showing exact probabilistic equivalences by approximation. We also use Approxis to develop a logical relation model that quantifies over error credits, which can be used to prove exact contextual equivalence. We demonstrate the flexibility of our approach on a range of examples, including the PRP/PRF switching lemma, IND\$-CPA security of an encryption scheme, and a collection of rejection samplers. All of the results have been mechanized in the Coq proof assistant and the Iris separation logic framework.2024-07-19T08:22:10ZCamera-ready POPL submission including additional appendixProc. ACM Program. Lang. 9, POPL, Article 41 (January 2025)Philipp G. HaselwarterKwing Hei LiAlejandro AguirreSimon Oddershede GregersenJoseph TassarottiLars Birkedal10.1145/3704877http://arxiv.org/abs/2412.01545v1Beyond SICP -- Design and Implementation of a Notional Machine for Scheme2024-12-02T14:34:30ZComputer science education has been at the heart of Scheme from the beginning. The language was designed in the 1970s concurrently with the MIT course 6.001 and the textbook "Structure and Interpretation of Computer Programs" (SICP). To explain the scope of variables at run time in the presence of higher-order procedures, SICP introduces a mental model called the environment model, along with a pictorial representation of environments and data structures. Recently, the concept of notional machines has emerged in computer science education: a predictive set of abstractions that define the structure and behavior of a computational device. Proponents of notional machines argue that learners benefit when complex dynamic concepts such as the computational structure of Scheme are accompanied with concise notional machines. In this paper, we start with a sublanguage of Scheme sufficient for all programs in SICP that we call SICP Scheme. We extend the environment model to a full notional machine for SICP Scheme that is simple enough to serve as the central mental model in a CS1 course and demonstrate the machine with computer-generated visualizations. Moving beyond SICP Scheme, we show how the notional machine can be further extended to explain Scheme's call/cc and thus make this powerful concept accessible to beginners through a coherent mental model. The presented notional machine serves as the core of a web-based implementation of Scheme that is under development at our university.2024-12-02T14:34:30ZKyriel AbadMartin Henzhttp://arxiv.org/abs/2405.02326v2Evaluating LLMs for Hardware Design and Test2024-12-02T01:59:30ZLarge Language Models (LLMs) have demonstrated capabilities for producing code in Hardware Description Languages (HDLs). However, most of the focus remains on their abilities to write functional code, not test code. The hardware design process consists of both design and test, and so eschewing validation and verification leaves considerable potential benefit unexplored, given that a design and test framework may allow for progress towards full automation of the digital design pipeline. In this work, we perform one of the first studies exploring how a LLM can both design and test hardware modules from provided specifications. Using a suite of 8 representative benchmarks, we examined the capabilities and limitations of the state-of-the-art conversational LLMs when producing Verilog for functional and verification purposes. We taped out the benchmarks on a Skywater 130nm shuttle and received the functional chip.2024-04-23T18:55:49ZJason BlockloveSiddharth GargRamesh KarriHammond Pearce10.1109/LAD62341.2024.10691811http://arxiv.org/abs/2412.00983v1Dual-Use Commercial and Military Communications on a Single Platform using RAN Domain Specific Language2024-12-01T22:16:24ZDespite the success of the O-RAN Alliance in developing a set of interoperable interfaces, development of unique Radio Access Network (RAN) deployments remains challenging. This is especially true for military communications, where deployments are highly specialized with limited volume. The construction and maintenance of the RAN, which is a real time embedded system, is an ill-defined NP problem requiring teams of specialized system engineers, with specialized knowledge of the hardware platform. In this paper, we introduce a RAN Domain Specific Language (RDSL(TM)) to formally describe use cases, constraints, and multi-vendor hardware/software abstraction to allow automation of RAN construction. In this DSL, system requirements are declarative, and performance constraints are guaranteed by construction using an automated system solver. Using our RAN system solver platform, Gabriel(TM) we show how a system engineer can confidently modify RAN functionality without knowledge of the underlying hardware. We show benefits for specific system requirements when compared to the manually optimized, default configuration of the Intel FlexRAN(TM), and conclude that DSL/automation driven construction of the RAN can lead to significant power and latency benefits when the deployment constraints are tuned for a specific case. We give examples of how constraints and requirements can be formatted in a "Kubernetes style" YAML format which allows the use of other tools, such as Ansible, to integrate the generation of these requirements into higher level automation flows such as Service Management and Orchestration (SMO).2024-12-01T22:16:24Z6 pages, 11 figures, 19 references. Presented at the IEEE Military Communications Conference, 28 Oct - 1 Nov 2024, Washington DCAlan GathererChaitali SenguptaSudipta SenJeffery H. Reedhttp://arxiv.org/abs/2412.01856v1A Second Soul: Celebrating the Many Languages of Programming -- Festschrift in Honor of Peter Thiemann's Sixtieth Birthday2024-11-30T15:21:04ZThis Festschrift is dedicated to Peter Thiemann on the occasion of his sixtieth birthday, celebrating his significant contributions to the field of programming languages. Over the span of more than three decades, Peter has worked on a wide array of topics. This collection of five articles reflects the diversity of his work. The articles cover areas such as partial evaluation and reversible programming, proof assistants and dependent types, discrete mathematics and dynamic programming, functional and object-oriented programming.2024-11-30T15:21:04ZEPTCS 413, 2024Annette BieniusaUniversity of Kaiserslautern-LandauMarkus DegenUniversity of Applied Sciences AugsburgStefan WehrUniversity of Applied Sciences Offenburg10.4204/EPTCS.413