https://arxiv.org/api/Zdb1/mZpOhxnNzpdV7d6Tr+VRfc 2026-03-22T10:16:18Z 5866 15 15 http://arxiv.org/abs/2603.10733v2 The complexity of finite smooth words over binary alphabets 2026-03-12T16:03:43Z

Smooth words over an alphabet of non-negative integers $\{a,b\}$ are infinite words that are infinitely derivable, the most famous example being the Oldenburger-Kolakoski word over $\{1,2\}$. The main way to study their language is to consider a finite version of smooth words that we call f-smooth words. In this paper we prove that the f-smooth words are exactly the factors of smooth words, and we make progress towards the conjecture of Sing that the complexity of f-smooth words over $\{a,b\}$ grows like $Θ\left(n^{\log(a+b)/\log((a+b)/2)}\right)$: we prove it over even alphabets, we prove the lower bound over any binary alphabet and we improve the known upper bound over odd alphabets.

2026-03-11T13:06:44Z Julien Cassaigne Raphaël Henry http://arxiv.org/abs/2603.11648v1 Visibly Recursive Automata 2026-03-12T08:18:09Z

As an alternative to visibly pushdown automata, we introduce visibly recursive automata (VRAs), composed of a set of classical automata that can call each other. VRAs are a strict extension of so-called systems of procedural automata, a model proposed by Frohme and Steffen. We study the complexity of standard language-theoretic operations and classical decision problems for VRAs. Since the class of deterministic VRAs forms a strict subclass in terms of expressiveness, we propose a (weaker) notion that does not restrict expressive power and which we call codeterminism. Codeterminism comes with many desirable algorithmic properties that we demonstrate by using it, e.g., as a stepping stone towards implementing complementation of VRAs.

2026-03-12T08:18:09Z Kévin Dubrulle Véronique Bruyère Guillermo A. Pérez Gaëtan Staquet http://arxiv.org/abs/2402.05854v4 Slightly Non-Linear Higher-Order Tree Transducers 2026-03-11T23:24:20Z

We investigate the tree-to-tree functions computed by "affine $λ$-transducers": tree automata whose memory consists of an affine $λ$-term instead of a finite state. They can be seen as variations on Gallot, Lemay and Salvati's Linear High-Order Deterministic Tree Transducers. When the memory is almost purely affine ($\textit{à la}$ Kanazawa), we show that these machines can be translated to tree-walking transducers (and with a purely affine memory, we get a reversible tree-walking transducer). This leads to a proof of an inexpressivity conjecture of Nguyên and Pradic on "implicit automata" in an affine $λ$-calculus. We also prove that a more powerful variant, extended with preprocessing by an MSO relabeling and allowing a limited amount of non-linearity, is equivalent in expressive power to Engelfriet, Hoogeboom and Samwel's invisible pebble tree transducers. The key technical tool in our proofs is the Interaction Abstract Machine (IAM), an operational avatar of Girard's geometry of interaction, a semantics of linear logic. We work with ad-hoc specializations to $λ$-terms of low exponential depth of a tree-generating version of the IAM.

2024-02-08T17:38:29Z Lê Thành Dũng Nguyên Gabriele Vanoni http://arxiv.org/abs/2509.16150v2 New properties of the $\varphi$-representation of integers 2026-03-11T17:56:15Z

We prove a few new properties of the $\varphi$-representation of integers, where $\varphi = (1+\sqrt{5})/2$. In particular, we prove a 2012 conjecture of Kimberling. As software assistants, we used the Walnut theorem-prover, and in one proof, ChatGPT 5.

2025-09-19T16:57:20Z Jeffrey Shallit Ingrid Vukusic http://arxiv.org/abs/2603.11104v1 Type-safe Monitoring of Parameterized Streams 2026-03-11T10:03:57Z

Stream-based monitoring is a real-time safety assurance mechanism for complex cyber-physical systems such as unmanned aerial vehicles. The monitor aggregates streams of input data from sensors and other sources to give real-time statistics and assessments of the system's health. Since the monitor is a safety-critical component, it is mandatory to ensure the absence of runtime errors in the monitor. Providing such guarantees is particularly challenging when the monitor must handle unbounded data domains, like an unlimited number of airspace participants, requiring the use of dynamic data structures. This paper provides a type-safe integration of parameterized streams into the stream-based monitoring framework RTLola. Parameterized streams generalize individual streams to sets of an unbounded number of stream instances and provide a systematic mechanism for memory management. We show that the absence of runtime errors is, in general, undecidable but can be effectively ensured with a refinement type system that guarantees all memory references are either successful or backed by a default value. We report on the performance of the type analysis on example specifications from a range of benchmarks, including specifications from the monitoring of autonomous aircraft.

2026-03-11T10:03:57Z Jan Baumeister Bernd Finkbeiner Florian Kohn http://arxiv.org/abs/2603.11083v1 Probabilistic Disjunctive Normal Forms in Temporal Logic and Automata Theory 2026-03-11T00:12:28Z

This article introduces probabilistic disjunctive normal forms (PDNFs) as a framework for representing and reasoning about uncertainty in logical systems. Unlike classical DNFs, PDNFs assign real-valued weights to variables, encoding probabilistic information about their presence, absence, or negation. Then we construct a vector space of PDNFs that allows algebraic evidence combination. PDNFs are interpreted as probability distributions over venjunctions (temporal logic constructs) and as integrable functions over partitioned intervals, where the integrals determine variable probabilities. This dual perspective allows for a Banach space structure and the application of functional analysis. We demonstrate that, under exponential parametrisation, PDNF addition aligns with Bayesian evidence fusion and derive bounds for outcome identification from random samples. The formalism thus bridges logic, numerical methods, and continuous probability.

2026-03-11T00:12:28Z Alexander Kuznetsov http://arxiv.org/abs/2603.10139v1 The Generation-Recognition Asymmetry: Six Dimensions of a Fundamental Divide in Formal Language Theory 2026-03-10T18:21:53Z

Every formal grammar defines a language and can in principle be used in three ways: to generate strings (production), to recognize them (parsing), or -- given only examples -- to infer the grammar itself (grammar induction). Generation and recognition are extensionally equivalent -- they characterize the same set -- but operationally asymmetric in multiple independent ways. Inference is a qualitatively harder problem: it does not have access to a known grammar. Despite the centrality of this triad to compiler design, natural language processing, and formal language theory, no survey has treated it as a unified, multidimensional phenomenon. We identify six dimensions along which generation and recognition diverge: computational complexity, ambiguity, directionality, information availability, grammar inference, and temporality. We show that the common characterization "generation is easy, parsing is hard" is misleading: unconstrained generation is trivial, but generation under constraints can be NP-hard. The real asymmetry is that parsing is always constrained (the input is given) while generation need not be. Two of these dimensions -- directionality and temporality -- have not previously been identified as dimensions of the generation-recognition asymmetry. We connect the temporal dimension to the surprisal framework of Hale (2001) and Levy (2008), arguing that surprisal formalizes the temporal asymmetry between a generator (surprisal = 0) and a parser that predicts under uncertainty (surprisal > 0). We review bidirectional systems in NLP and observe that bidirectionality has been available for fifty years yet has not transferred to most domain-specific applications. We conclude with a discussion of large language models, which architecturally unify generation and recognition while operationally preserving the asymmetry.

2026-03-10T18:21:53Z Submitted to Information and Computation. 32 pages, 6 figures, 4 tables Romain Peyrichou http://arxiv.org/abs/2603.08624v1 Context-Free Trees 2026-03-09T17:04:50Z

Muller and Schupp introduced the concept of context-free graphs (originating from Cayley graphs of context-free groups). These graphs are always tree-like (i.e. quasi-isometric to a tree) and in this paper we investigate the subclass of bona fide context-free trees. We show that they have a finite-state description using multi-edge NFAs and that this specializes to certain partial DFAs in the case of deterministic graphs. We investigate this form of encoding algorithmically and show that the isomorphism problem for deterministic context-free trees is NL-complete in the rooted and the non-rooted case.

2026-03-09T17:04:50Z Jan Philipp Wächter http://arxiv.org/abs/2310.04764v9 Characterizations of Monadic Second Order Definable Context-Free Sets of Graphs 2026-03-09T14:47:17Z

We give a characterization of the sets of graphs that are both definable in Counting Monadic Second Order Logic (CMSO) and context-free, i.e., least solutions of Hyperedge-Replacement (HR) grammars introduced by Courcelle and Engelfriet. We prove the equivalence of these sets with: (a) recognizable sets (in the algebra of graphs with HR-operations) of bounded tree-width; we refine this condition further and show equivalence with recognizability in a finitely generated subalgebra of the HR-algebra of graphs; (b) parsable sets, for which there is a definable transduction from graphs to a set of derivation trees labelled by HR operations, such that the set of graphs is the image of the set of derivation trees under the canonical evaluation of the HR operations; (c) images of recognizable unranked sets of trees under a definable transduction, whose inverse is also definable. We rely on a novel connection between two seminal results, a logical characterization of context-free graph languages in terms of tree-to-graph definable transductions, by Courcelle and Engelfriet and a proof that an optimal-width tree decomposition of a graph can be built by an definable transduction, by Bojanczyk and Pilipczuk.

2023-10-07T09:53:52Z Logical Methods in Computer Science, Volume 22, Issue 1 (March 10, 2026) lmcs:13735 Radu Iosif Florian Zuleger 10.46298/lmcs-22(1:22)2026 http://arxiv.org/abs/2603.08331v1 Turn Complexity of Context-free Languages, Pushdown Automata and One-Counter Automata 2026-03-09T12:48:15Z

A turn in a computation of a pushdown automaton is a switch from a phase in which the height of the pushdown store increases to a phase in which it decreases. Given a pushdown or one-counter automaton, we consider, for each string in its language, the minimum number of turns made in accepting computations. We prove that it cannot be decided if this number is bounded by any constants. Furthermore, we obtain a non-recursive trade-off between pushdown and one-counter automata accepting in a finite number of turns and finite-turn pushdown automata, that are defined requiring that the constant bound is satisfied by each accepting computation. We prove that there are languages accepted in a sublinear but not constant number of turns, with respect to the input length. Furthermore, there exists an infinite proper hierarchy of complexity classes, with the number of turns bounded by different sublinear functions. In addition, there is a language requiring a number of turns which is not constant but grows slower than each of the functions defining the above hierarchy.

2026-03-09T12:48:15Z Extended version of a paper presented at the conference DLT 2025. In order to make the presentation easier, the witness languages in Section 5 and 6 are different from those presented in the conference version. However, the arguments and the techniques are similar Giovanni Pighizzini http://arxiv.org/abs/2603.08134v1 Forgetting Event Order in Higher-Dimensional Automata 2026-03-09T09:13:51Z

Higher dimensional automata (HDAs) provide a geometric model of true concurrency, yet their standard formulation encodes an artificial total order on events. This representational artifact causes a fundamental mismatch between the combinatorial structure of HDAs and their observable behavior, leading to logical asymmetries and complicating the application of categorical tools. In this paper, we resolve this tension by developing a semantics for HDAs that is independent of event order, based on interval ipomsets (partially ordered multisets with interfaces) that preserve only precedence and concurrency. We prove that for any HDA, the traditional ST trace of an execution path corresponds precisely to its associated interval ipomset. On the structural side, we show that the presheaf theoretic presentation with an unordered base and the combinatorial presentation of symmetric HDAs are categorically isomorphic. Finally, by characterizing ST and hereditary history preserving (hhp) bisimulation via ipomset isomorphism, we provide a unified, order free foundation for HDA semantics. Our results resolve several critical ambiguities in the literature: they provide the necessary path category structure to canonically apply the Open Maps framework, eliminate representational artifacts in temporal and modal logics, and bridge systematic mismatches between HDAs and other models of concurrency such as Petri nets.

2026-03-09T09:13:51Z Safa Zouari http://arxiv.org/abs/2511.02872v4 FATE: A Formal Benchmark Series for Frontier Algebra of Multiple Difficulty Levels 2026-03-08T09:01:49Z

Recent advances in large language models (LLMs) have demonstrated impressive capabilities in formal theorem proving, particularly on contest-based mathematical benchmarks like the IMO. However, these contests do not reflect the depth, breadth, and abstraction of modern mathematical research. To bridge this gap, we introduce FATE (Formal Algebra Theorem Evaluation), a new benchmark series in formal algebra designed to chart a course toward advanced mathematical reasoning. We present two new components, FATE-H and FATE-X, each with 100 problems in abstract and commutative algebra. The FATE series spans a difficulty spectrum from undergraduate exercises to problems exceeding PhD qualifying exams. Notably, FATE-X is the first formal benchmark to surpass both PhD-level exam difficulty and the coverage of the Mathlib library. Our evaluations of state-of-the-art LLM provers on this new benchmark reveal a stark performance gap compared to contest math: the best model achieves only 3% (pass@64) accuracy on FATE-H and 0% on FATE-X. Our two-stage evaluation reveals that models' natural-language reasoning is notably more accurate than their ability to formalize this reasoning. We systematically classify the common errors that arise during this formalization process. Furthermore, a comparative study shows that a specialized prover can exhibit less effective reflection than general-purpose models, reducing its accuracy at the natural-language stage. We believe FATE provides a robust and challenging benchmark that establishes essential checkpoints on the path toward research-level formal mathematical reasoning.

2025-11-04T03:25:17Z Jiedong Jiang Wanyi He Yuefeng Wang Guoxiong Gao Yongle Hu Jingting Wang Nailin Guan Peihao Wu Chunbo Dai Liang Xiao Bin Dong http://arxiv.org/abs/2508.00749v3 Dynamic Symbolic Execution for Semantic Difference Analysis of Component and Connector Architectures 2026-03-07T18:27:00Z

In the context of model-driven development, ensuring the correctness and consistency of evolving models is paramount. This paper investigates the application of Dynamic Symbolic Execution (DSE) for semantic difference analysis of component-and-connector architectures, specifically utilizing MontiArc models. We have enhanced the existing MontiArc-to-Java generator to gather both symbolic and concrete execution data at runtime, encompassing transition conditions, visited states, and internal variables of automata. This data facilitates the identification of significant execution traces that provide critical insights into system behavior. We evaluate various execution strategies based on the criteria of runtime efficiency, minimality, and completeness, establishing a framework for assessing the applicability of DSE in semantic difference analysis. Our findings indicate that while DSE shows promise for analyzing component and connector architectures, scalability remains a primary limitation, suggesting further research is needed to enhance its practical utility in larger systems.

2025-08-01T16:24:58Z Johanna Grahl Bernhard Rumpe Max Stachon Sebastian Stüber http://arxiv.org/abs/2512.10017v3 Complexity of Linear Subsequences of $k$-Automatic Sequences 2026-03-07T17:10:07Z

We construct automata with input(s) in base $k$ recognizing some basic relations and study their number of states. We also consider some basic operations on $k$-automatic sequences $(h(i))_{i \geq 0}$ and discuss their state complexity. We find a relationship between subword complexity of the interior sequence $(h'(i))_{i \geq 0}$ and state complexity of the linear subsequence $(h(ni+c))_{i \geq 0}$. We resolve a recent question of Zantema and Bosma about linear subsequences of $k$-automatic sequences with input in most-significant-digit-first format. We also discuss the state complexity and runtime complexity of using a reasonable interpretation of Büchi arithmetic to actually construct some of the studied automata recognizing relations or carrying out operations on automatic sequences.

2025-12-10T19:10:26Z Theorem 21 added in this version Delaram Moradi Narad Rampersad Jeffrey Shallit http://arxiv.org/abs/2603.07094v1 Randomise Alone, Reach as a Team 2026-03-07T08:04:08Z

We study concurrent graph games where n players cooperate against an opponent to reach a set of target states. Unlike traditional settings, we study distributed randomisation: team players do not share a source of randomness, and their private random sources are hidden from the opponent and from each other. We show that memoryless strategies are sufficient for the threshold problem (deciding whether there is a strategy for the team that ensures winning with probability that exceeds a threshold), a result that not only places the problem in the Existential Theory of the Reals (\exists\mathbb{R}) but also enables the construction of value iteration algorithms. We additionally show that the threshold problem is NP-hard. For the almost-sure reachability problem, we prove NP-completeness. We introduce Individually Randomised Alternating-time Temporal Logic (IRATL). This logic extends the standard ATL framework to reason about probability thresholds, with semantics explicitly designed for coalitions that lack a shared source of randomness. On the practical side, we implement and evaluate a solver for the threshold and almost-sure problem based on the algorithms that we develop.

2026-03-07T08:04:08Z 50 pages, 7 figures Léonard Brice Thomas A. Henzinger Alipasha Montaseri Ali Shafiee K. S. Thejaswini