https://arxiv.org/api/z44EWZQU9qwgM2HppEWSOi6qjQE2026-06-26T20:02:11Z9951136515http://arxiv.org/abs/2301.02410v2CodePod: A Language-Agnostic Hierarchical Scoping System for Interactive Development2025-07-31T02:51:07ZInteractive development environments like Jupyter Notebooks enable incremental coding through cells with immediate feedback, but their linear structure and global namespace limit scalability for large software projects. We present CodePod, a hierarchical extension of Jupyter that introduces a novel scoped execution model with formal semantics. Our key contribution is a language-agnostic runtime system that performs source-level transformations to implement hierarchical scoping rules, enabling true incremental evaluation across nested modules without requiring language-specific kernel modifications. We formalize the scoping semantics as a mathematical framework with precise visibility relations and prove key properties including uniqueness of symbol resolution and correctness of the resolution algorithm. A qualitative user study with seven senior developers demonstrates that CodePod enables significant improvements in project scalability compared to Jupyter, with notable reductions in navigation effort. We validate the system's effectiveness on large-scale projects with thousands of lines of code, demonstrating its applicability beyond traditional notebook boundaries. Our tool is open-source and available at https://codepod.io2023-01-06T07:48:51ZHebi LiForrest Sheng BaoQi XiaoJin Tianhttp://arxiv.org/abs/2507.23151v1Abstractions of Sequences, Functions and Operators2025-07-30T22:55:51ZWe present theoretical and practical results on the order theory of lattices of functions, focusing on Galois connections that abstract (sets of) functions - a topic known as higher-order abstract interpretation.
We are motivated by the challenge of inferring closed-form bounds on functions which are defined recursively, i.e. as the fixed point of an operator or, equivalently, as the solution to a functional equation. This has multiple applications in program analysis (e.g. cost analysis, loop acceleration, declarative language analysis) and in hybrid systems governed by differential equations.
Our main contribution is a new family of constraint-based abstract domains for abstracting numerical functions, B-bound domains, which abstract a function f by a conjunction of bounds from a preselected set of boundary functions. They allow inferring highly non-linear numerical invariants, which classical numerical abstract domains struggle with. We uncover a convexity property in the constraint space that simplifies, and, in some cases, fully automates, transfer function design.
We also introduce domain abstraction, a functor that lifts arbitrary mappings in value space to Galois connections in function space. This supports abstraction from symbolic to numerical functions (i.e. size abstraction), and enables dimensionality reduction of equations.
We base our constructions of transfer functions on a simple operator language, starting with sequences, and extending to more general functions, including multivariate, piecewise, and non-discrete domains.2025-07-30T22:55:51ZUnder consideration for publication in STTTLouis RustenholzPedro Lopez-GarciaManuel V. Hermenegildohttp://arxiv.org/abs/2506.16065v2Floating-Point Neural Networks Are Provably Robust Universal Approximators2025-07-30T01:31:24ZThe classical universal approximation (UA) theorem for neural networks establishes mild conditions under which a feedforward neural network can approximate a continuous function $f$ with arbitrary accuracy. A recent result shows that neural networks also enjoy a more general interval universal approximation (IUA) theorem, in the sense that the abstract interpretation semantics of the network using the interval domain can approximate the direct image map of $f$ (i.e., the result of applying $f$ to a set of inputs) with arbitrary accuracy. These theorems, however, rest on the unrealistic assumption that the neural network computes over infinitely precise real numbers, whereas their software implementations in practice compute over finite-precision floating-point numbers. An open question is whether the IUA theorem still holds in the floating-point setting.
This paper introduces the first IUA theorem for floating-point neural networks that proves their remarkable ability to perfectly capture the direct image map of any rounded target function $f$, showing no limits exist on their expressiveness. Our IUA theorem in the floating-point setting exhibits material differences from the real-valued setting, which reflects the fundamental distinctions between these two computational models. This theorem also implies surprising corollaries, which include (i) the existence of provably robust floating-point neural networks; and (ii) the computational completeness of the class of straight-line programs that use only floating-point additions and multiplications for the class of all floating-point programs that halt.2025-06-19T06:43:04Z70 pages, 4 figures. Appeared in CAV 2025Proceedings of the 37th International Conference on Computer Aided Verification (CAV), pp. 301-326. Springer, 2026Geonho HwangWonyeol LeeYeachan ParkSejun ParkFeras Saad10.1007/978-3-031-98679-6_14http://arxiv.org/abs/2507.22048v1Composable Effect Handling for Programming LLM-integrated Scripts2025-07-29T17:53:07ZImplementing LLM-integrated scripts introduces challenges in modularity and performance, as scripts are often coupled to specific LLM implementations and fail to exploit parallelization opportunities. This paper proposes using composable effect handling to separate workflow logic from effectful operations, such as LLM calls, I/O, and concurrency, enabling modularity without sacrificing the opportunity for performance optimization. By treating these operations as abstract interfaces and discharging them via effect handlers, this paper shows that scripts can achieve significant speedups (e.g., 10$\times$ in a Tree-of-Thoughts case study) without compromising modularity. This paper aims to promote composable effect handling as a programming style for LLM scripting.2025-07-29T17:53:07ZDi Wanghttp://arxiv.org/abs/2507.21439v1Fixed-Point-Oriented Programming: A Concise and Elegant Paradigm2025-07-29T02:21:19ZFixed-Point-Oriented Programming (FPOP) is an emerging paradigm designed to streamline the implementation of problems involving self-referential computations. These include graph algorithms, static analysis, parsing, and distributed computing-domains that traditionally require complex and tricky-to-implement work-queue algorithms. Existing programming paradigms lack direct support for these inherently fixed-point computations, leading to inefficient and error-prone implementations.
This white paper explores the potential of the FPOP paradigm, which offers a high-level abstraction that enables concise and expressive problem formulations. By leveraging structured inference rules and user-directed optimizations, FPOP allows developers to write declarative specifications while the compiler ensures efficient execution. It not only reduces implementation complexity for programmers but also enhances adaptability, making it easier for programmers to explore alternative solutions and optimizations without modifying the core logic of their program.
We demonstrate how FPOP simplifies algorithm implementation, improves maintainability, and enables rapid prototyping by allowing problems to be clearly and concisely expressed. For example, the graph distance problem can be expressed in only two executable lines of code with FPOP, while it takes an order of magnitude more code in other paradigms. By bridging the gap between theoretical fixed-point formulations and practical implementations, we aim to foster further research and adoption of this paradigm.2025-07-29T02:21:19ZYong Qi FooBrian Sze-Kai CheongMichael D. Adamshttp://arxiv.org/abs/2507.21317v1One Weird Trick to Untie Landin's Knot2025-07-28T20:19:06ZIn this work, we explore Landin's Knot, which is understood as a pattern for encoding general recursion, including non-termination, that is possible after adding higher-order references to an otherwise terminating language. We observe that this isn't always true -- higher-order references, by themselves, don't lead to non-termination. The key insight is that Landin's Knot relies not primarily on references storing functions, but on unrestricted quantification over a function's environment. We show this through a closure converted language, in which the function's environment is made explicit and hides the type of the environment through impredicative quantification. Once references are added, this impredicative quantification can be exploited to encode recursion. We conjecture that by restricting the quantification over the environment, higher-order references can be safely added to terminating languages, without resorting to more complex type systems such as linearity, and without restricting references from storing functions.2025-07-28T20:19:06ZPaulette KoronkevichWilliam J. Bowmanhttp://arxiv.org/abs/2507.22086v1TypyBench: Evaluating LLM Type Inference for Untyped Python Repositories2025-07-28T14:54:00ZType inference for dynamic languages like Python is a persistent challenge in software engineering. While large language models (LLMs) have shown promise in code understanding, their type inference capabilities remain underexplored. We introduce TypyBench, a benchmark designed to evaluate LLMs' type inference across entire Python repositories. TypyBench features two novel metrics: TypeSim, which captures nuanced semantic relationships between predicted and ground truth types, and TypeCheck, which assesses type consistency across codebases. Our evaluation of various LLMs on a curated dataset of 50 high-quality Python repositories reveals that, although LLMs achieve decent TypeSim scores, they struggle with complex nested types and exhibit significant type consistency errors. These findings suggest that future research should shift focus from improving type similarity to addressing repository-level consistency. TypyBench provides a foundation for this new direction, offering insights into model performance across different type complexities and usage contexts. Our code and data are available at https://github.com/typybench/typybench.2025-07-28T14:54:00ZProceedings of the 42nd International Conference on Machine Learning, Vancouver, Canada. PMLR 267, 2025Honghua DongJiacheng YangXun DengYuhe JiangGennady PekhimenkoFan LongXujie Sihttp://arxiv.org/abs/2507.20672v1Program Analysis for High-Value Smart Contract Vulnerabilities: Techniques and Insights2025-07-28T09:53:31ZA widespread belief in the blockchain security community is that automated techniques are only good for detecting shallow bugs, typically of small value. In this paper, we present the techniques and insights that have led us to repeatable success in automatically discovering high-value smart contract vulnerabilities. Our vulnerability disclosures have yielded 10 bug bounties, for a total of over $3M, over high-profile deployed code, as well as hundreds of bugs detected in pre-deployment or under-audit code.
We argue that the elements of this surprising success are a) a very high-completeness static analysis approach that manages to maintain acceptable precision; b) domain knowledge, provided by experts or captured via statistical inference. We present novel techniques for automatically inferring domain knowledge from statistical analysis of a large corpus of deployed contracts, as well as discuss insights on the ideal precision and warning rate of a promising vulnerability detector. In contrast to academic literature in program analysis, which routinely expects false-positive rates below 50% for publishable results, we posit that a useful analysis for high-value real-world vulnerabilities will likely flag very few programs (under 1%) and will do so with a high false-positive rate (e.g., 95%, meaning that only one-of-twenty human inspections will yield an exploitable vulnerability).2025-07-28T09:53:31ZYannis SmaragdakisNeville GrechSifis LagouvardosKonstantinos TriantafyllouIlias TsatirisYannis BollanosTony Rocco Valentinehttp://arxiv.org/abs/2410.16102v2Semantics of Sets of Programs2025-07-27T20:30:09ZApplications like program synthesis sometimes require proving that a property holds for all of the infinitely many programs described by a grammar - i.e., an inductively defined set of programs. Current verification frameworks overapproximate programs' behavior when sets of programs contain loops, including two Hoare-style logics that fail to be relatively complete when loops are allowed. In this work, we prove that compositionally verifying simple properties for infinite sets of programs requires tracking distinct program behaviors over unboundedly many executions. Tracking this information is both necessary and sufficient for verification. We prove this fact in a general, reusable theory of denotational semantics that can model the expressivity and compositionality of verification techniques over infinite sets of programs. We construct the minimal compositional semantics that captures simple properties of sets of programs and use it to derive the first sound and relatively complete Hoare-style logic for infinite sets of programs. Thus, our methods can be used to design minimally complex, compositional verification techniques for sets of programs.2024-10-21T15:30:46Z47 pages, 8 FiguresJinwoo KimShaan NagyThomas RepsLoris D'Antonihttp://arxiv.org/abs/2507.20251v1The Power of Negation in Higher-Order Datalog2025-07-27T12:33:21ZWe investigate the expressive power of Higher-Order Datalog$^\neg$ under both the well-founded and the stable model semantics, establishing tight connections with complexity classes. We prove that under the well-founded semantics, for all $k\geq 1$, $(k+1)$-Order Datalog$^\neg$ captures k-EXP, a result that holds without explicit ordering of the input database. The proof of this fact can be performed either by using the powerful existential predicate variables of the language or by using partially applied relations and relation enumeration. Furthermore, we demonstrate that this expressive power is retained within a stratified fragment of the language. Under the stable model semantics, we show that $(k+1)$-Order Datalog$^\neg$ captures co-(k-NEXP) using cautious reasoning and k-NEXP using brave reasoning, again with analogous results for the stratified fragment augmented with choice rules. Our results establish a hierarchy of expressive power, highlighting an interesting trade-off between order and non-determinism in the context of higher-order logic programming: increasing the order of programs under the well-founded semantics can surpass the expressive power of lower-order programs under the stable model semantics.2025-07-27T12:33:21ZTheory and Practice of Logic Programming 25 (2025) 595-611Angelos CharalambidisBabis KostopoulosChristos NomikosPanos Rondogiannis10.1017/S1471068425100227http://arxiv.org/abs/2507.19728v1Development and Evaluation of Adaptive LearningSupport System Based on Ontology of MultipleProgramming Languages2025-07-26T01:03:59ZThis paper introduces an ontology-based approach within an adaptive learning support system for computer programming. This system (named ADVENTURE) is designed to deliver personalized programming exercises that are tailored to individual learners' skill levels. ADVENTURE utilizes an ontology, named CONTINUOUS, which encompasses common concepts across multiple programming languages. The system leverages this ontology not only to visualize programming concepts but also to provide hints during practice programming exercises and recommend subsequent programming concepts. The adaptive mechanism is driven by the Elo Rating System, applied in an educational context to dynamically estimate the most appropriate exercise difficulty for each learner. An experimental study compared two instructional modes, adaptive and random, based on six features derived from 1,186 code submissions across all the experimental groups. The results indicate significant differences in four of six analyzed features between these two modes. Notably, the adaptive mode demonstrates a significant difference over the random mode in two features, the submission of correct answers and the number of pass concepts. Therefore, these results underscore that this adaptive learning support system may support learners in practicing programming exercises.2025-07-26T01:03:59ZThis document provides corrections to the published article. Corrections include clarifying figure legends and addressing grammatical issues to enhance clarity. The authors state that the scientific conclusions are unaffectedEducation Sciences, 2025, 15(6):724Lalita Na NongkhaiJingyun WangTakahiko Mendori10.3390/educsci15060724http://arxiv.org/abs/2507.19015v1An Enumerative Embedding of the Python Type System in ACL2s2025-07-25T07:08:43ZPython is a high-level interpreted language that has become an industry standard in a wide variety of applications. In this paper, we take a first step towards using ACL2s to reason about Python code by developing an embedding of a subset of the Python type system in ACL2s. The subset of Python types we support includes many of the most commonly used type annotations as well as user-defined types comprised of supported types. We provide ACL2s definitions of these types, as well as defdata enumerators that are customized to provide code coverage and identify errors in Python programs. Using the ACL2s embedding, we can generate instances of types that can then be used as inputs to fuzz Python programs, which allows us to identify bugs in Python code that are not detected by state-of-the-art Python type checkers. We evaluate our work against four open-source repositories, extracting their type information and generating inputs for fuzzing functions with type signatures that are in the supported subset of Python types. Note that we only use the type signatures of functions to generate inputs and treat the bodies of functions as black boxes. We measure code coverage, which ranges from about 68% to more than 80%, and identify code patterns that hinder coverage such as complex branch conditions and external file system dependencies. We conclude with a discussion of the results and recommendations for future work.2025-07-25T07:08:43ZIn Proceedings ACL2 2025, arXiv:2507.18567EPTCS 423, 2025, pp. 124-144Samuel XifarasNortheastern UniversityPanagiotis ManoliosNortheastern UniversityAndrew T. WalterNortheastern UniversityWilliam RobertsonNortheastern University10.4204/EPTCS.423.11http://arxiv.org/abs/2507.19012v1A Formalization of the Yul Language and Some Verified Yul Code Transformations2025-07-25T07:08:02ZYul is an intermediate language used in the compilation of the Solidity programming language for Ethereum smart contracts. The compiler applies customizable sequences of transformations to Yul code. To help ensure the correctness of these transformations and their sequencing, we used the ACL2 theorem prover to develop a formalization of the syntax and semantics of Yul, proofs relating static and dynamic semantics, a formalization of some Yul code transformations, and correctness proofs for these transformations.
2025-07-25T07:08:02ZIn Proceedings ACL2 2025, arXiv:2507.18567EPTCS 423, 2025, pp. 65-83Alessandro CoglioKestrel InstituteEric McCarthyKestrel Institute10.4204/EPTCS.423.8http://arxiv.org/abs/2508.00016v1Extended Abstract: Mutable Objects with Several Implementations2025-07-25T07:07:52ZThis extended abstract outlines an ACL2 feature, attach-stobj, that first appeared in ACL2 Version 8.6 (October, 2024). This feature supports different executable operations for a given abstract stobj, without requiring recertification of the book that introduces that stobj or theorems about it. The paper provides background as well as a user-level overview and some implementation notes.2025-07-25T07:07:52ZIn Proceedings ACL2 2025, arXiv:2507.18567EPTCS 423, 2025, pp. 60-64Matt KaufmannYahya SohailWarren A. Hunt10.4204/EPTCS.423.7http://arxiv.org/abs/2506.15135v2Towards Bug-Free Distributed Go Programs2025-07-25T06:48:00ZProgrammers of distributed systems need to reason about concurrency to avoid races. However, reasoning about concurrency is difficult, and unexpected races show up as bugs. Data race detection in shared memory systems is well-studied (dynamic data race detection [13], behavioral types [15], dynamic race detection [31]). Similar to how a data race consists of reads and writes not related by happens-before at a shared memory location, a communication race consists of receives and sends not related by happens-before on a shared channel. Communication races are problematic: a receiver expects a specific message from a specific sender, but with a communication race, the receiver can receive a message meant for another receiver, or not receive anything at all. In this work, we describe a verification framework that can prove the absence of communication races for distributed programs that use a subset of the Go programming language, where synchronization is mainly achieved via message passing. We statically reason about how a distributed program executes, using a happens-before order, extended to buffered and unbuffered channels.2025-06-16T17:19:22ZVersion 1. B.Comp. DissertationZhengqun Koo