https://arxiv.org/api/XpYSLDnjHV2gLxM16lSHlzKsiAI2026-06-13T20:12:01Z988513515http://arxiv.org/abs/2605.20923v1Causal Past Logic for Runtime Verification of Distributed LLM Agent Workflows2026-05-20T09:09:13ZDistributed LLM agent workflows should not be monitored as if they produced a single sequential log. In an asynchronous execution, a decision can only depend on events that are causally visible to the lifeline that makes it: an event that appears earlier in some log may still be unknown locally. We extend the ZipperGen agent-workflow framework with Causal Past Logic (CPL), a small past-time temporal logic for guards in conditionals and while loops. In addition to standard past-time modalities such as previous and since, a guard can inspect the latest causally visible event of another lifeline and selected variables stored there. The formula is a source-level guard: it is evaluated online by the owner lifeline and can influence control flow at runtime. We give a vector-clock monitor with latest-value views and prove that the locally computed monitor value coincides with the denotational semantics of the guard at the current event. Thus runtime verification becomes part of the coordination language itself, rather than a post-hoc check over an execution log.2026-05-20T09:09:13Z20 pagesBenedikt Bollighttp://arxiv.org/abs/2606.02588v1Lean-GAP: A Dataset of Formalized Graduate Algebra Problems2026-05-20T08:15:43ZWe present Lean-GAP (Lean-Graduate Agebra Problems), 430 formalized graduate-level algebra problems from the textbook Abstract Algebra by Dummit and Foote. We develop a scalable pipeline consisting of PDF-to-LaTeX preprocessing, autoformalization into Lean 4, and verification of informal-formal correspondence. While the preprocessing and autoformalization stages can be largely automated, we find that verification remains the most subtle and labor-intensive component, requiring careful human oversight. Our contributions include (i) the construction of a structured dataset of formalized exercises, (ii) a systematic methodology for formalizing textbook mathematics, and (iii) an analysis of recurring challenges in the formalization process. We also compare the performance of different autoformalization models and highlight key bottlenecks in translating informal statements into formal language.2026-05-20T08:15:43ZSeewoo LeeByung-Hak HwangHyojae LimJihoon HyunIlkyoo ChoiYeachan ParkJineon BaekHyukpyo HongKeewoo LeeJaeseong HeoHyungryul BaikChul-hee LeeKyu-Hwan Leehttp://arxiv.org/abs/2605.17164v2Charon: A Unified and Fine-Grained Simulator for Large-Scale LLM Training and Inference2026-05-19T23:51:14ZDeploying large-scale LLM training and inference with optimal performance is exceptionally challenging due to a complex design space of parallelism strategies, system optimizations, and hardware configurations. Accurate and rapid performance simulation is critical for guiding optimization efforts and system studies by validating "what-if" Hooker Figure hypotheses. To address this, we introduce Charon, a unified, modular, and fine-grained simulator for accurately predicting LLM performance. Experiments show Charon achieves high accuracy across different models and configurations, with an overall prediction error consistently under 5.35%, and even under 3.74% for training with a large-scale GPU cluster. In a practical inference deployment case, Charon discovered a configuration that improved system throughput over an engineering-tuned baseline, demonstrating its significant real-world value.2026-05-16T21:28:22ZAccepted by MLSys 2026Mengtian YangZhekun ZhangMingheng WuJianwen YanHanshi SunLi-wen Changhttp://arxiv.org/abs/2602.23520v9Zero-Error Recovery under Deterministic Partial Views: Matroid Bounds and Verifiable Realizability2026-05-19T22:00:17ZZero-error recovery under deterministic partial views is graph recovery for the induced confusability relation. A finite family of coordinate-subset observations determines a graph on latent states; $T$-ary exact recovery is graph $T$-colorability, block composition is strong powering, and asymptotic recoverability is Shannon capacity.
Coordinate structure gives tractable certificates inside the graph semantics. For affine realized state families with explicit linear presentations, restricted coordinate ranks form a representable matroid certificate giving polynomial-time upper bounds on one-shot confusability and asymptotic capacity, with rank additivity matching direct-sum block composition. In the full tuple-space coordinate model, the realizable confusability relations are exactly the upward-closed coordinate-agreement families. Transitive confusability is equivalent to intersection closure of the generated agreement family, yielding a cluster graph whose capacity is determined by connected components.
Host-level realizability determines when the latent state family is canonical. Verifiable rate-$1$ realizability for structural facts holds if and only if the host provides zero-delay synchronization and structural side-information; eleven representative host architectures instantiate the criterion. The same clique-size bit-budget bound governs both the graph-level and host-level layers. All cited results are mechanized in Lean 4 against a shared formalization library.2026-02-26T21:47:11ZMain PDF: 52 pages, 1 figure, 2 tables. Supplementary: 11 pages, 2 tables. Lean 4 artifact available at https://doi.org/10.5281/zenodo.18141365Tristan Simashttp://arxiv.org/abs/2605.21532v1Contract Based Verification of Non-functional Requirements for Embedded Automotive C Code2026-05-19T21:12:45ZCode contracts provide a robust way to specify functional requirements of safety-critical software in embedded systems. For example, the ANSI/ISO C Specification Language (ACSL) can be used to specify the functional behavior of C code that is then formally verified by the Frama-C framework's Wp plugin. However, non-functional requirements, such as restrictions on control flow and data flow, are also important for embedded systems safety. Untrusted code developed by subcontractors, junior developers, or generated by large language models, can be verified by Wp but may nevertheless call unsafe functions or use uninitialized program variables. To address this problem, we constructed a set of general rules concerning non-functional requirements of C code in safety-critical embedded systems. Our rules are orthogonal to popular C rulesets such as MISRA-C and center on modules and their interaction through interfaces. To enable checking our rules, we propose an interface specification contract language for C modules. We implemented a checker for our rules as a Frama-C plugin, which takes as input a C module and its contract and checks control flow and data flow properties, ensuring, e.g., that only permitted functions are called by the module. We integrated our checker in a toolchain to enable specification and verification of module contracts and ACSL contracts for untrusted code. We report on two case studies on safety-critical C code using software in Scania trucks, where we defined module contracts and ACSL function contracts based on informal system requirements and verified them using our toolchain.2026-05-19T21:12:45ZJesper AmilonMerlijn SevenhuijsenMattias NybergKarl Palmskoghttp://arxiv.org/abs/2605.20370v1Clove: Object-Level CXL Memory Management in Managed Runtimes2026-05-19T18:21:07ZObject-level management of tiered memory has been studied to address the inefficiencies in page-based systems. However, object-level management for CXL-tiered memory remains underexplored due to CXL's tight performance budget and load/store interface. As a result, existing approaches remain limited in scope, primarily targeting unmanaged-language applications with bespoke runtimes or compiler support.
This paper identifies and explores a new design point for object-level CXL management: managed languages and their runtimes. The key observation is that existing managed runtimes already provide highly optimized mechanisms for problems closely related to object-level management, including object relocation and dynamic code generation. However, they still lack the features needed for tiered memory management, such as hotness tracking and relocation policies, and thus must be carefully extended to fully realize this direction.
We present Clove, a system that extends existing managed runtimes to support object-level CXL management for managed-language applications. Clove combines profile-guided object hotness tracking with object relocation techniques and policies. Our JVM prototype demonstrates that this extension enables high utilization of fast-tier memory while bounding runtime overhead, reducing application slowdown by 22-84% compared to page-based systems.2026-05-19T18:21:07Z12 pages (15 pages including references), 13 figuresSam SonZhihong LuoWen ZhangSylvia RatnasamyScott Shenkerhttp://arxiv.org/abs/2507.08759v4Dependent Multiplicities in Dependent Linear Type Theory2026-05-19T17:17:07ZWe present a novel dependent linear type theory in which the multiplicity of some variable-i.e., the number of times the variable can be used in a program-can depend on other variables. This allows us to give precise resource annotations to programs involving branching and recursion that cannot be adequately typed in other theories. Our type system is obtained by embedding linear logic into dependent type theory and specifying how the embedded logic interacts with the host theory. We can then use the natural numbers of the dependent type theory to derive a quantitative typing system with dependent multiplicities. Our theory supports W-types, thereby giving a principled resource-aware treatment of a large class of inductive types. We characterise the semantics as Categories with Families indexed in symmetric monoidal categories, thereby generalising Quantitative Categories with Families. Existing dependently typed languages can easily be extended with our system, which we demonstrate with an implementation in Agda.2025-07-11T17:12:11ZMaximilian Doréhttp://arxiv.org/abs/2603.13671v3Grassroots Bonds as a Foundation for Market Liquidity2026-05-19T14:57:55ZGlobal cryptocurrencies are unbacked and have high transaction cost incurred by global consensus. In contrast, grassroots cryptocurrencies are backed by the goods and services of their issuers -- any person, natural or legal -- and have no transaction cost beyond operating a smartphone. Liquidity in grassroots cryptocurrencies arises from mutual credit via coin exchange among issuers. However, as grassroots coins are redeemable 1-for-1 against any other grassroots coin, the credit-forming exchange must also be 1-for-1, lest prompt redemption after exchange would leave the parties with undue profit or loss. Thus, grassroots coins are incongruent with liquidity through interest-bearing credit.
Here we introduce grassroots bonds, which extend grassroots coins with a maturity date, reframing grassroots coins -- cash -- as mature grassroots bonds. Bond redemption generalises coin redemption, allowing the lending of liquid coins in exchange for interest-bearing future-maturity bonds. We show that digital social contracts -- voluntary agreements among persons, specified, fulfilled, and enforced digitally -- can express the full gamut of financial instruments as the voluntary swap of grassroots bonds, including loans, sale of debt, forward contracts, options, and escrow-based instruments, and that classical liquidity ratios are applicable just as well to grassroots bonds. Grassroots bonds may thus allow local digital economies to form and grow without initial capital or external credit, harnessing mutual trust within communities into liquidity.
The formal specification presented here was implemented in GLP, a concurrent logic programming language running on Dart for smartphone deployment. The implementation is illustrated by a running multiagent village market scenario in GLP.2026-03-14T00:44:25ZEhud Shapirohttp://arxiv.org/abs/2501.09144v5Rule-Based Graph Programs Matching the Time Complexity of Imperative Algorithms2026-05-19T13:38:47ZWe report on recent advances in rule-based graph programming, which allow us to match the time complexity of some fundamental imperative graph algorithms. In general, achieving the time complexity of graph algorithms implemented in conventional languages using a rule-based graph-transformation language is challenging due to the cost of graph matching. Previous work demonstrated that with rooted rules, certain algorithms can be implemented in the graph programming language GP 2 such that their runtime matches the time complexity of imperative implementations. However, this required input graphs to have a bounded node degree and (for some algorithms) to be connected. In this paper, we overcome these limitations by enhancing the graph data structure generated by the GP 2 compiler and exploiting the new structure in programs. We present three case studies: the first program checks whether input graphs are connected, the second program checks whether input graphs are acyclic, and the third program solves the single-source shortest-paths problem for graphs with integer edge-weights. The first two programs run in linear time on (possibly disconnected) input graphs with arbitrary node degrees. The third program runs in time $O(nm)$ on arbitrary input graphs, matching the time complexity of imperative implementations of the Bellman-Ford algorithm. For each program, we formally prove its correctness and time complexity, and provide runtime experiments on various graph classes.2025-01-15T20:52:37ZLogical Methods in Computer Science, Volume 22, Issue 2 (May 20, 2026) lmcs:15098Ziad Ismaili AlaouiDetlef Plump10.46298/lmcs-22(2:20)2026http://arxiv.org/abs/2603.25710v2Stone Duality for Monads2026-05-19T10:03:51ZWe introduce a contravariant idempotent adjunction between (i) the category of ranked monads on $\mathsf{Set}$; and (ii) the category of internal categories and internal retrofunctors in the category of locales. The left adjoint takes a monad $T$-viewed as a notion of computation, following Moggi-to its localic behaviour category $\mathsf{LB}T$. This behaviour category is understood as "the universal transition system" for interacting with $T$: its "objects" are states and the "morphisms" are transitions. On the other hand, the right adjoint takes a localic category $\mathsf{LC}$-similarly understood as a transition system-to the monad $Γ\mathsf{LC}$ where $(Γ\mathsf{LC})A$ is the set of $A$-indexed families of local sections to the source map which jointly partition the locale of objects. The fixed points of this adjunction consist of (i) hyperaffine-unary monads, i.e., those monads where term $t$ admits a read-only operation $\bar{t}$ predicting the output of $t$; and (ii) ample localic categories, i.e., whose source maps are local homeomorphisms and whose locale of objects are strongly zero-dimensional. The hyperaffine-unary monads arise in earlier works by Johnstone and Garner as a syntactic characterization of those monads with Cartesian closed Eilenberg-Moore categories. This equivalence is the Stone duality for monads; so-called because it further restricts to the classical Stone duality by viewing a Boolean algebra $B$ as a monad of $B$-partitions and the corresponding Stone space as a localic category with only identity morphisms.2026-03-26T17:53:12Z18 pages without appendix, 34 pages total, to appear in pre-proceedings of MFPS2026Richard GarnerAlyssa RenataNicolas Wuhttp://arxiv.org/abs/2605.19412v1DRReduce: Enhancing Syntax-Guided Program Reduction with Dependency Reconstruction2026-05-19T06:06:37ZProgram reduction is a technique for simplifying large, failure-inducing programs into minimal reproducible test cases. Language-specific tools such as CReduce achieve strong performance by leveraging deep semantic knowledge of C/C++, but are tightly coupled to a single language family. Language-agnostic reducers such as Perses address this by applying syntax-guided search across any grammar, yet share a fundamental limitation: deleting a node or subtree in isolation often breaks semantic coherence causing the property checker to reject the deletion and forcing the reducer to backtrack, limiting overall reduction effectiveness and efficiency. In this paper, we propose DRReduce, a framework that bridges this gap by augmenting language-agnostic syntactic reduction with a lightweight semantic layer: dependency reconstruction, which repairs program dependencies broken by a deletion in order to preserve the semantic validity of intermediate programs and increase the acceptance rate of the property checker. DRReduce constructs a semantic dependency graph from the input program, performs semantically coherent deletions with dependency reconstruction, and delegates further minimization to a syntax-guided reducer. We implement DRReduce for C and Java and evaluate it on real-world bug-triggering programs. Compared to SOTA syntax-guided reducers, DRReduce achieves average size reductions of 51.9%, 14.9%, and 19.8% over Perses, WDD, and CDD respectively, while completing reduction faster on the majority of programs. Compared to language-specific tools, DRReduce achieves results comparable to CReduce and Latra without any language-specific transformation rules, at 3.3x and 1.2x higher efficiency than CReduce and Latra on average, respectively. An ablation study confirms that dependency reconstruction reduces query invocations by 80.2%, reduction time by 58.7%, and final token count by over 55.1%.2026-05-19T06:06:37Z22 pages, 6 images, 5 tables, Manuscript submitted to a Journal (2026)Qiong FengXiaotian MaYongqiang TianWei SongPeng Lianghttp://arxiv.org/abs/2605.19261v1When Web Apps Heal Themselves: A MAPE-K Based Approach to Fault Tolerance and Adaptive Recovery2026-05-19T02:17:24ZEnsuring the reliability and resilience of modern web applications remains a critical challenge due to increasing system complexity and dynamic runtime environments. This study proposes a modular self-healing framework based on the monitor-analyze-plan-execute over a shared knowledge base (MAPE-K) model, integrated with an AutoFix-inspired mechanism for adaptive fault recovery. Using a design and development research (DDR) approach, the system was implemented and evaluated through controlled fault injection experiments across twenty runtime failure scenarios, including service crashes, memory leaks, and database disconnections. Experimental results demonstrate that the proposed framework achieved a mean fault detection F1-score of 90.7% and a recovery success rate of 93.2%. The AutoFix module reduced the average time-to-recovery (TTR) by 56.2%, achieving an average recovery time of 3.92 seconds. System throughput was maintained between 88% and 95% during fault conditions, with only a 3.1% increase in response time. Additionally, iterative feedback mechanisms improved recovery efficiency by 18.6% over multiple cycles. These findings indicate that the proposed framework provides a practical and extensible approach to enhancing fault tolerance in web applications through feedback-driven adaptation. While the current implementation relies on predefined recovery strategies, the integration of learning-oriented feedback establishes a foundation for future development of more autonomous self-healing systems.2026-05-19T02:17:24Z12 pages, 3 figures, 2 tablesAribe, Sales G. & Oracion, R. J. G. (2026). When web apps heal themselves- A MAPE-K based approach to fault tolerance and adaptive recovery. International Journal of Informatics and Communication Technology, 15(2), 729-740Sales AribeRov Japheth Oracion10.11591/ijict.v15i2.pp729-740http://arxiv.org/abs/2605.19112v1Ordered Adjoint Logic (Extended Version)2026-05-18T20:58:25ZOrdered logics and type systems have been used in a variety of applications
including computational linguistics, memory allocation, stream processing,
logical frameworks, parametricity, and enforcing security protocols. In most
formulations, ordered types are also linear, requiring each resource to be
used exactly once. Prior work by Kanovich et al. has investigated calculi
that relax this constraint through subexponentials within a linear ordered
logic. We generalize their work by using adjoint modalities to combine logics
with varying fine-grained structural properties, including weakening, left
contraction, right contraction, left mobility, and right mobility. We show
that the resulting sequent calculus admits cut elimination.
We further provide a natural deduction formulation in which structural rules
are implicit, and show that proof checking for this system is decidable. This
makes it a suitable foundation for an expressive adjoint programming language
or logical framework.2026-05-18T20:58:25ZAn extended version of Ordered Adjoint Logic to appear at IJCAR 2026Sophia RoshalFrank Pfenninghttp://arxiv.org/abs/2306.12935v4Special Delivery: Programming with Mailbox Types (Extended Version)2026-05-18T20:01:16ZThe asynchronous and unidirectional communication model supported by mailboxes is a key reason for the success of actor languages like Erlang and Elixir for implementing reliable and scalable distributed systems. Although actors eliminate many of the issues stemming from shared memory concurrency, they remain vulnerable to communication errors such as protocol violations and deadlocks. Behavioural types make it possible to detect communication errors early in the development process, but most work has addressed channel-based languages rather than actor languages.
Mailbox types are a novel behavioural type system for actors first introduced for a process calculus, and capture the contents of a mailbox as a commutative regular expression. Due to aliasing and nested evaluation contexts, moving from a process calculus to a programming language is challenging. This paper presents Pat, the first programming language incorporating mailbox types, and describes an algorithmic type system. Pat is a higher-order functional language with sums, products, and lists, along with interfaces that allow finer-grained reasoning about mailbox contents. Typechecking in Pat detects four classes of behavioural error: protocol violation, unexpected message, forgotten reply and self-deadlock, as well as the usual data type errors.
The Pat type system makes essential use of quasi-linear typing to tame some of the complexity introduced by aliasing. We make use of a co-contextual algorithmic type system, achieved through a novel use of backwards bidirectional typing, and we prove it sound and complete with respect to our declarative type system. We implement a type checker, and use it to demonstrate the expressiveness of Pat on a factory automation case study and a series of examples from the Savina actor benchmark suite. This establishes a foundation for applying mailbox typing to practical actor languages.2023-06-22T14:48:48ZRevised and extended version of paper accepted to ICFP'23Simon FowlerDuncan Paul AttardDanielle MarshallSimon J. GayPhil Trinderhttp://arxiv.org/abs/2605.18697v1PopPy: Opportunistically Exploiting Parallelism in Python Compound AI Applications2026-05-18T17:33:50ZCompound AI applications, which compose calls to ML models using a general-purpose programming language like Python, are widely used for a variety of user-facing tasks, from software engineering to enterprise automation, making their end-to-end latency a critical bottleneck. In contrast to traditional applications, execution time is dominated by the external components, which cannot be handled by traditional language optimization systems, like optimizing compilers.
To address this problem, we develop PopPy, a system that can uncover parallelization opportunities in Python applications that invoke these heavy external components, including those used in compound AI applications. PopPy supports a very expressive fragment of Python and requires minimal developer input to uncover parallelism. It combines an ahead-of-time compiler with a runtime, addressing three key challenges in extracting parallelism from Python applications: language complexity, dynamic dispatch, and variable mutation. On a set of real-world compound AI applications, PopPy achieves up to $6.4\times$ speedups in end-to-end execution time compared to standard Python execution while preserving the sequential program semantics.2026-05-18T17:33:50ZStephen MellDavid MellKonstantinos KallasSteve ZdancewicOsbert Bastani