https://arxiv.org/api/QQaea4BqGed9jqmcWEsCM5kwch0 2026-06-30T19:34:22Z 9958 1995 15 http://arxiv.org/abs/2411.10393v2 Guaranteed Bounds on Posterior Distributions of Discrete Probabilistic Programs with Loops 2024-12-05T09:39:17Z

We study the problem of bounding the posterior distribution of discrete probabilistic programs with unbounded support, loops, and conditioning. Loops pose the main difficulty in this setting: even if exact Bayesian inference is possible, the state of the art requires user-provided loop invariant templates. By contrast, we aim to find guaranteed bounds, which sandwich the true distribution. They are fully automated, applicable to more programs and provide more provable guarantees than approximate sampling-based inference. Since lower bounds can be obtained by unrolling loops, the main challenge is upper bounds, and we attack it in two ways. The first is called residual mass semantics, which is a flat bound based on the residual probability mass of a loop. The approach is simple, efficient, and has provable guarantees. The main novelty of our work is the second approach, called geometric bound semantics. It operates on a novel family of distributions, called eventually geometric distributions (EGDs), and can bound the distribution of loops with a new form of loop invariants called contraction invariants. The invariant synthesis problem reduces to a system of polynomial inequality constraints, which is a decidable problem with automated solvers. If a solution exists, it yields an exponentially decreasing bound on the whole distribution, and can therefore bound moments and tail asymptotics as well, not just probabilities as in the first approach. Both semantics enjoy desirable theoretical properties. In particular, we prove soundness and convergence, i.e. the bounds converge to the exact posterior as loops are unrolled further. On the practical side, we describe Diabolo, a fully-automated implementation of both semantics, and evaluate them on a variety of benchmarks from the literature, demonstrating their general applicability and the utility of the resulting bounds.

2024-11-15T18:01:40Z Full version of the POPL 2025 article, including proofs and other supplementary material Fabian Zaiser Andrzej S. Murawski C. -H. Luke Ong 10.1145/3704874 http://arxiv.org/abs/2305.18945v3 String Diagrams for $λ$-calculi and Functional Computation 2024-12-04T17:17:51Z

This tutorial gives an advanced introduction to string diagrams and graph languages for higher-order computation. The subject matter develops in a principled way, starting from the two dimensional syntax of key categorical concepts such as functors, adjunctions, and strictification, and leading up to Cartesian Closed Categories, the core mathematical model of the lambda calculus and of functional programming languages. This methodology inverts the usual approach of proceeding from syntax to a categorical interpretation, by rationally reconstructing a syntax from the categorical model. The result is a graph syntax -- more precisely, a hierarchical hypergraph syntax -- which in many ways is shown to be an improvement over the conventional linear term syntax. The rest of the tutorial focuses on applications of interest to programming languages: operational semantics, general frameworks for type inference, and complex whole-program transformations such as closure conversion and automatic differentiation.

2023-05-30T11:24:27Z Dan Ghica Fabio Zanasi http://arxiv.org/abs/2412.03612v1 Chatting with Logs: An exploratory study on Finetuning LLMs for LogQL 2024-12-04T14:06:24Z

Logging is a critical function in modern distributed applications, but the lack of standardization in log query languages and formats creates significant challenges. Developers currently must write ad hoc queries in platform-specific languages, requiring expertise in both the query language and application-specific log details -- an impractical expectation given the variety of platforms and volume of logs and applications. While generating these queries with large language models (LLMs) seems intuitive, we show that current LLMs struggle with log-specific query generation due to the lack of exposure to domain-specific knowledge. We propose a novel natural language (NL) interface to address these inconsistencies and aide log query generation, enabling developers to create queries in a target log query language by providing NL inputs. We further introduce ~\textbf{NL2QL}, a manually annotated, real-world dataset of natural language questions paired with corresponding LogQL queries spread across three log formats, to promote the training and evaluation of NL-to-loq query systems. Using NL2QL, we subsequently fine-tune and evaluate several state of the art LLMs, and demonstrate their improved capability to generate accurate LogQL queries. We perform further ablation studies to demonstrate the effect of additional training data, and the transferability across different log formats. In our experiments, we find up to 75\% improvement of finetuned models to generate LogQL queries compared to non finetuned models.

2024-12-04T14:06:24Z draft under submission at another venue Vishwanath Seshagiri Siddharth Balyan Vaastav Anand Kaustubh Dhole Ishan Sharma Avani Wildani José Cambronero Andreas Züfle http://arxiv.org/abs/2412.03310v1 Grounded Language Design for Lightweight Diagramming for Formal Methods 2024-12-04T13:37:59Z

Model finding, as embodied by SAT solvers and similar tools, is used widely, both in embedding settings and as a tool in its own right. For instance, tools like Alloy target SAT to enable users to incrementally define, explore, verify, and diagnose sophisticated specifications for a large number of complex systems. These tools critically include a visualizer that lets users graphically explore these generated models. As we show, however, default visualizers, which know nothing about the domain, are unhelpful and even actively violate presentational and cognitive principles. At the other extreme, full-blown visualizations require significant effort as well as knowledge a specifier might not possess; they can also exhibit bad failure modes (including silent failure). Instead, we need a language to capture essential domain information for lightweight diagramming. We ground our language design in both the cognitive science literature on diagrams and on a large number of example custom visualizations. This identifies the key elements of lightweight diagrams. We distill these into a small set of orthogonal primitives. We extend an Alloy-like tool to support these primitives. We evaluate the effectiveness of the produced diagrams, finding them good for reasoning. We then compare this against many other drawing languages and tools to show that this work defines a new niche that is lightweight, effective, and driven by sound principles.

2024-12-04T13:37:59Z Siddhartha Prasad Ben Greenman Tim Nelson Shriram Krishnamurthi http://arxiv.org/abs/2412.03127v1 Summa Summarum: Moessner's Theorem without Dynamic Programming 2024-12-04T08:44:02Z

Seventy years on, Moessner's theorem and Moessner's process -- i.e., the additive computation of integral powers -- continue to fascinate. They have given rise to a variety of elegant proofs, to an implementation in hardware, to generalizations, and now even to a popular video, "The Moessner Miracle.'' The existence of this video, and even more its title, indicate that while the "what'' of Moessner's process is understood, its "how'' and even more its "why'' are still elusive. And indeed all the proofs of Moessner's theorem involve more complicated concepts than both the theorem and the process. This article identifies that Moessner's process implements an additive function with dynamic programming. A version of this implementation without dynamic programming (1) gives rise to a simpler statement of Moessner's theorem and (2) can be abstracted and then instantiated into related additive computations. The simpler statement also suggests a simpler and more efficient implementation to compute integral powers as well as simple additive functions to compute, e.g., Factorial numbers. It also reveals the source of -- to quote John Conway and Richard Guy -- Moessner's magic.

2024-12-04T08:44:02Z In Proceedings PT 2024, arXiv:2412.01856 EPTCS 413, 2024, pp. 57-92 Olivier Danvy National University of Singapore 10.4204/EPTCS.413.5 http://arxiv.org/abs/2412.03126v1 Completing the Functional Approach in Object-Oriented Languages 2024-12-04T08:43:46Z

Over the last two decades practically all object-oriented programming languages have introduced features that are well-known from functional programming languages. But many features that were introduced were fragmentary. In Java-TX we address the latter features and propose a completion. Java-TX (i.e. Type eXtended) is a language based on Java. The predominant new features are global type inference and real function types for lambda expressions. Global type inference means that all type annotations can be omitted, and the compiler infers them without losing the static type property. We introduce the function types in a similar fashion as in Scala but additionally integrated them into the Java target-typing as proposed in the so-called strawman approach. In this paper, we provide an integrated presentation of all Java-TX features. The focus is therby on the automatic inference of type parameters for classes and their methods, and on the heterogeneous translation of function types, which permits the preservation of the argument and return types in bytecode.

2024-12-04T08:43:46Z In Proceedings PT 2024, arXiv:2412.01856 EPTCS 413, 2024, pp. 43-56 Martin Pluemicke 10.4204/EPTCS.413.4 http://arxiv.org/abs/2412.03125v1 Gradual Guarantee via Step-Indexed Logical Relations in Agda 2024-12-04T08:43:30Z

The gradual guarantee is an important litmus test for gradually typed languages, that is, languages that enable a mixture of static and dynamic typing. The gradual guarantee states that changing the precision of a type annotation does not change the behavior of the program, except perhaps to trigger an error if the type annotation is incorrect. Siek et al. (2015) proved that the Gradually Typed Lambda Calculus (GTLC) satisfies the gradual guarantee using a simulation-based proof and mechanized their proof in Isabelle. In the following decade, researchers have proved the gradual guarantee for more sophisticated calculi, using step-indexed logical relations. However, given the complexity of that style of proof, there has not yet been a mechanized proof of the gradual guarantee using step-indexed logical relations. This paper reports on a mechanized proof of the gradual guarantee for the GTLC carried out in the Agda proof assistant.

2024-12-04T08:43:30Z In Proceedings PT 2024, arXiv:2412.01856 EPTCS 413, 2024, pp. 27-42 Jeremy G. Siek Indiana University 10.4204/EPTCS.413.3 http://arxiv.org/abs/2412.03124v1 Explicit Weakening 2024-12-04T08:43:14Z

I present a novel formulation of substitution, where facts about substitution that previously required tens or hundreds of lines to justify in a proof assistant now follow immediately - they can be justified by writing the four letters "refl". The paper is an executable literate Agda script, and source of the paper is available as an artifact in the file Weaken.lagda.md. Not all consequences of the pandemic have been awful. For the last three years, I've had the great pleasure of meeting with Peter Thiemann and Jeremy Siek for a couple of hours every week, via Zoom, exploring topics including core calculi, gradual typing, and formalisation in Agda. The work reported here arose from those discussions, and is dedicated to Peter on the occasion of his 60th birthday.

2024-12-04T08:43:14Z In Proceedings PT 2024, arXiv:2412.01856 EPTCS 413, 2024, pp. 15-26 Philip Wadler University of Edinburgh 10.4204/EPTCS.413.2 http://arxiv.org/abs/2412.03122v1 Inversion by Partial Evaluation: A Reversible Interpreter Experiment 2024-12-04T08:42:58Z

A computational limit of combining partial evaluation and program inversion is investigated. Using a reversible Turing machine interpreter, we show that the first Futamura and inversion projections can produce not only functionally but also textually equivalent programs. The construction of the interpreter in a reversible flowchart language is shown in full. Insights are provided on the practical interplay between reversible interpreters, program inverters, and partial evaluators. We conclude that both projections must be included in the program transformation toolbox.

2024-12-04T08:42:58Z In Proceedings PT 2024, arXiv:2412.01856. Dedicated to Peter Thiemann on the Occasion of his 60th Birthday EPTCS 413, 2024, pp. 1-14 Robert Glück DIKU, Dept. of Computer Science, University of Copenhagen Louis Marott Normann DIKU, Dept. of Computer Science, University of Copenhagen 10.4204/EPTCS.413.1 http://arxiv.org/abs/2412.02765v1 Massimult: A Novel Parallel CPU Architecture Based on Combinator Reduction 2024-12-03T19:07:03Z

The Massimult project aims to design and implement an innovative CPU architecture based on combinator reduction with a novel combinator base and a new abstract machine. The evaluation of programs within this architecture is inherently highly parallel and localized, allowing for faster computation, reduced energy consumption, improved scalability, enhanced reliability, and increased resistance to attacks. In this paper, we introduce the machine language LambdaM, detail its compilation into KVY assembler code, and describe the abstract machine Matrima. The best part of Matrima is its ability to exploit inherent parallelism and locality in combinator reduction, leading to significantly faster computations with lower energy consumption, scalability across multiple processors, and enhanced security against various types of attacks. Matrima can be simulated as a software virtual machine and is intended for future hardware implementation.

2024-12-03T19:07:03Z Jurgen Nicklisch-Franken Ruslan Feizerakhmanov http://arxiv.org/abs/2407.14107v2 Approximate Relational Reasoning for Higher-Order Probabilistic Programs 2024-12-03T14:02:57Z

Properties such as provable security and correctness for randomized programs are naturally expressed relationally as approximate equivalences. As a result, a number of relational program logics have been developed to reason about such approximate equivalences of probabilistic programs. However, existing approximate relational logics are mostly restricted to first-order programs without general state. In this paper we develop Approxis, a higher-order approximate relational separation logic for reasoning about approximate equivalence of programs written in an expressive ML-like language with discrete probabilistic sampling, higher-order functions, and higher-order state. The Approxis logic recasts the concept of error credits in the relational setting to reason about relational approximation, which allows for expressive notions of modularity and composition, a range of new approximate relational rules, and an internalization of a standard limiting argument for showing exact probabilistic equivalences by approximation. We also use Approxis to develop a logical relation model that quantifies over error credits, which can be used to prove exact contextual equivalence. We demonstrate the flexibility of our approach on a range of examples, including the PRP/PRF switching lemma, IND\$-CPA security of an encryption scheme, and a collection of rejection samplers. All of the results have been mechanized in the Coq proof assistant and the Iris separation logic framework.

2024-07-19T08:22:10Z Camera-ready POPL submission including additional appendix Proc. ACM Program. Lang. 9, POPL, Article 41 (January 2025) Philipp G. Haselwarter Kwing Hei Li Alejandro Aguirre Simon Oddershede Gregersen Joseph Tassarotti Lars Birkedal 10.1145/3704877 http://arxiv.org/abs/2412.01545v1 Beyond SICP -- Design and Implementation of a Notional Machine for Scheme 2024-12-02T14:34:30Z

Computer science education has been at the heart of Scheme from the beginning. The language was designed in the 1970s concurrently with the MIT course 6.001 and the textbook "Structure and Interpretation of Computer Programs" (SICP). To explain the scope of variables at run time in the presence of higher-order procedures, SICP introduces a mental model called the environment model, along with a pictorial representation of environments and data structures. Recently, the concept of notional machines has emerged in computer science education: a predictive set of abstractions that define the structure and behavior of a computational device. Proponents of notional machines argue that learners benefit when complex dynamic concepts such as the computational structure of Scheme are accompanied with concise notional machines. In this paper, we start with a sublanguage of Scheme sufficient for all programs in SICP that we call SICP Scheme. We extend the environment model to a full notional machine for SICP Scheme that is simple enough to serve as the central mental model in a CS1 course and demonstrate the machine with computer-generated visualizations. Moving beyond SICP Scheme, we show how the notional machine can be further extended to explain Scheme's call/cc and thus make this powerful concept accessible to beginners through a coherent mental model. The presented notional machine serves as the core of a web-based implementation of Scheme that is under development at our university.

2024-12-02T14:34:30Z Kyriel Abad Martin Henz http://arxiv.org/abs/2405.02326v2 Evaluating LLMs for Hardware Design and Test 2024-12-02T01:59:30Z

Large Language Models (LLMs) have demonstrated capabilities for producing code in Hardware Description Languages (HDLs). However, most of the focus remains on their abilities to write functional code, not test code. The hardware design process consists of both design and test, and so eschewing validation and verification leaves considerable potential benefit unexplored, given that a design and test framework may allow for progress towards full automation of the digital design pipeline. In this work, we perform one of the first studies exploring how a LLM can both design and test hardware modules from provided specifications. Using a suite of 8 representative benchmarks, we examined the capabilities and limitations of the state-of-the-art conversational LLMs when producing Verilog for functional and verification purposes. We taped out the benchmarks on a Skywater 130nm shuttle and received the functional chip.

2024-04-23T18:55:49Z Jason Blocklove Siddharth Garg Ramesh Karri Hammond Pearce 10.1109/LAD62341.2024.10691811 http://arxiv.org/abs/2412.00983v1 Dual-Use Commercial and Military Communications on a Single Platform using RAN Domain Specific Language 2024-12-01T22:16:24Z

Despite the success of the O-RAN Alliance in developing a set of interoperable interfaces, development of unique Radio Access Network (RAN) deployments remains challenging. This is especially true for military communications, where deployments are highly specialized with limited volume. The construction and maintenance of the RAN, which is a real time embedded system, is an ill-defined NP problem requiring teams of specialized system engineers, with specialized knowledge of the hardware platform. In this paper, we introduce a RAN Domain Specific Language (RDSL(TM)) to formally describe use cases, constraints, and multi-vendor hardware/software abstraction to allow automation of RAN construction. In this DSL, system requirements are declarative, and performance constraints are guaranteed by construction using an automated system solver. Using our RAN system solver platform, Gabriel(TM) we show how a system engineer can confidently modify RAN functionality without knowledge of the underlying hardware. We show benefits for specific system requirements when compared to the manually optimized, default configuration of the Intel FlexRAN(TM), and conclude that DSL/automation driven construction of the RAN can lead to significant power and latency benefits when the deployment constraints are tuned for a specific case. We give examples of how constraints and requirements can be formatted in a "Kubernetes style" YAML format which allows the use of other tools, such as Ansible, to integrate the generation of these requirements into higher level automation flows such as Service Management and Orchestration (SMO).

2024-12-01T22:16:24Z 6 pages, 11 figures, 19 references. Presented at the IEEE Military Communications Conference, 28 Oct - 1 Nov 2024, Washington DC Alan Gatherer Chaitali Sengupta Sudipta Sen Jeffery H. Reed http://arxiv.org/abs/2412.01856v1 A Second Soul: Celebrating the Many Languages of Programming -- Festschrift in Honor of Peter Thiemann's Sixtieth Birthday 2024-11-30T15:21:04Z

This Festschrift is dedicated to Peter Thiemann on the occasion of his sixtieth birthday, celebrating his significant contributions to the field of programming languages. Over the span of more than three decades, Peter has worked on a wide array of topics. This collection of five articles reflects the diversity of his work. The articles cover areas such as partial evaluation and reversible programming, proof assistants and dependent types, discrete mathematics and dynamic programming, functional and object-oriented programming.

2024-11-30T15:21:04Z EPTCS 413, 2024 Annette Bieniusa University of Kaiserslautern-Landau Markus Degen University of Applied Sciences Augsburg Stefan Wehr University of Applied Sciences Offenburg 10.4204/EPTCS.413