https://arxiv.org/api/Zdb1/mZpOhxnNzpdV7d6Tr+VRfc2026-03-22T10:16:18Z58661515http://arxiv.org/abs/2603.10733v2The complexity of finite smooth words over binary alphabets2026-03-12T16:03:43ZSmooth words over an alphabet of non-negative integers $\{a,b\}$ are infinite words that are infinitely derivable, the most famous example being the Oldenburger-Kolakoski word over $\{1,2\}$. The main way to study their language is to consider a finite version of smooth words that we call f-smooth words. In this paper we prove that the f-smooth words are exactly the factors of smooth words, and we make progress towards the conjecture of Sing that the complexity of f-smooth words over $\{a,b\}$ grows like $Θ\left(n^{\log(a+b)/\log((a+b)/2)}\right)$: we prove it over even alphabets, we prove the lower bound over any binary alphabet and we improve the known upper bound over odd alphabets.2026-03-11T13:06:44ZJulien CassaigneRaphaël Henryhttp://arxiv.org/abs/2603.11648v1Visibly Recursive Automata2026-03-12T08:18:09ZAs an alternative to visibly pushdown automata, we introduce visibly recursive automata (VRAs), composed of a set of classical automata that can call each other. VRAs are a strict extension of so-called systems of procedural automata, a model proposed by Frohme and Steffen. We study the complexity of standard language-theoretic operations and classical decision problems for VRAs. Since the class of deterministic VRAs forms a strict subclass in terms of expressiveness, we propose a (weaker) notion that does not restrict expressive power and which we call codeterminism. Codeterminism comes with many desirable algorithmic properties that we demonstrate by using it, e.g., as a stepping stone towards implementing complementation of VRAs.2026-03-12T08:18:09ZKévin DubrulleVéronique BruyèreGuillermo A. PérezGaëtan Staquethttp://arxiv.org/abs/2402.05854v4Slightly Non-Linear Higher-Order Tree Transducers2026-03-11T23:24:20ZWe investigate the tree-to-tree functions computed by "affine $λ$-transducers": tree automata whose memory consists of an affine $λ$-term instead of a finite state. They can be seen as variations on Gallot, Lemay and Salvati's Linear High-Order Deterministic Tree Transducers.
When the memory is almost purely affine ($\textit{à la}$ Kanazawa), we show that these machines can be translated to tree-walking transducers (and with a purely affine memory, we get a reversible tree-walking transducer). This leads to a proof of an inexpressivity conjecture of Nguyên and Pradic on "implicit automata" in an affine $λ$-calculus. We also prove that a more powerful variant, extended with preprocessing by an MSO relabeling and allowing a limited amount of non-linearity, is equivalent in expressive power to Engelfriet, Hoogeboom and Samwel's invisible pebble tree transducers.
The key technical tool in our proofs is the Interaction Abstract Machine (IAM), an operational avatar of Girard's geometry of interaction, a semantics of linear logic. We work with ad-hoc specializations to $λ$-terms of low exponential depth of a tree-generating version of the IAM.2024-02-08T17:38:29ZLê Thành Dũng NguyênGabriele Vanonihttp://arxiv.org/abs/2509.16150v2New properties of the $\varphi$-representation of integers2026-03-11T17:56:15ZWe prove a few new properties of the $\varphi$-representation of integers, where $\varphi = (1+\sqrt{5})/2$. In particular, we prove a 2012 conjecture of Kimberling. As software assistants, we used the Walnut theorem-prover, and in one proof, ChatGPT 5.2025-09-19T16:57:20ZJeffrey ShallitIngrid Vukusichttp://arxiv.org/abs/2603.11104v1Type-safe Monitoring of Parameterized Streams2026-03-11T10:03:57ZStream-based monitoring is a real-time safety assurance mechanism for complex cyber-physical systems such as unmanned aerial vehicles. The monitor aggregates streams of input data from sensors and other sources to give real-time statistics and assessments of the system's health. Since the monitor is a safety-critical component, it is mandatory to ensure the absence of runtime errors in the monitor. Providing such guarantees is particularly challenging when the monitor must handle unbounded data domains, like an unlimited number of airspace participants, requiring the use of dynamic data structures. This paper provides a type-safe integration of parameterized streams into the stream-based monitoring framework RTLola. Parameterized streams generalize individual streams to sets of an unbounded number of stream instances and provide a systematic mechanism for memory management. We show that the absence of runtime errors is, in general, undecidable but can be effectively ensured with a refinement type system that guarantees all memory references are either successful or backed by a default value. We report on the performance of the type analysis on example specifications from a range of benchmarks, including specifications from the monitoring of autonomous aircraft.2026-03-11T10:03:57ZJan BaumeisterBernd FinkbeinerFlorian Kohnhttp://arxiv.org/abs/2603.11083v1Probabilistic Disjunctive Normal Forms in Temporal Logic and Automata Theory2026-03-11T00:12:28ZThis article introduces probabilistic disjunctive normal forms (PDNFs) as a framework for representing and reasoning about uncertainty in logical systems. Unlike classical DNFs, PDNFs assign real-valued weights to variables, encoding probabilistic information about their presence, absence, or negation. Then we construct a vector space of PDNFs that allows algebraic evidence combination. PDNFs are interpreted as probability distributions over venjunctions (temporal logic constructs) and as integrable functions over partitioned intervals, where the integrals determine variable probabilities. This dual perspective allows for a Banach space structure and the application of functional analysis. We demonstrate that, under exponential parametrisation, PDNF addition aligns with Bayesian evidence fusion and derive bounds for outcome identification from random samples. The formalism thus bridges logic, numerical methods, and continuous probability.2026-03-11T00:12:28ZAlexander Kuznetsovhttp://arxiv.org/abs/2603.10139v1The Generation-Recognition Asymmetry: Six Dimensions of a Fundamental Divide in Formal Language Theory2026-03-10T18:21:53ZEvery formal grammar defines a language and can in principle be used in three ways: to generate strings (production), to recognize them (parsing), or -- given only examples -- to infer the grammar itself (grammar induction). Generation and recognition are extensionally equivalent -- they characterize the same set -- but operationally asymmetric in multiple independent ways. Inference is a qualitatively harder problem: it does not have access to a known grammar. Despite the centrality of this triad to compiler design, natural language processing, and formal language theory, no survey has treated it as a unified, multidimensional phenomenon. We identify six dimensions along which generation and recognition diverge: computational complexity, ambiguity, directionality, information availability, grammar inference, and temporality. We show that the common characterization "generation is easy, parsing is hard" is misleading: unconstrained generation is trivial, but generation under constraints can be NP-hard. The real asymmetry is that parsing is always constrained (the input is given) while generation need not be. Two of these dimensions -- directionality and temporality -- have not previously been identified as dimensions of the generation-recognition asymmetry. We connect the temporal dimension to the surprisal framework of Hale (2001) and Levy (2008), arguing that surprisal formalizes the temporal asymmetry between a generator (surprisal = 0) and a parser that predicts under uncertainty (surprisal > 0). We review bidirectional systems in NLP and observe that bidirectionality has been available for fifty years yet has not transferred to most domain-specific applications. We conclude with a discussion of large language models, which architecturally unify generation and recognition while operationally preserving the asymmetry.2026-03-10T18:21:53ZSubmitted to Information and Computation. 32 pages, 6 figures, 4 tablesRomain Peyrichouhttp://arxiv.org/abs/2603.08624v1Context-Free Trees2026-03-09T17:04:50ZMuller and Schupp introduced the concept of context-free graphs (originating from Cayley graphs of context-free groups). These graphs are always tree-like (i.e. quasi-isometric to a tree) and in this paper we investigate the subclass of bona fide context-free trees. We show that they have a finite-state description using multi-edge NFAs and that this specializes to certain partial DFAs in the case of deterministic graphs. We investigate this form of encoding algorithmically and show that the isomorphism problem for deterministic context-free trees is NL-complete in the rooted and the non-rooted case.2026-03-09T17:04:50ZJan Philipp Wächterhttp://arxiv.org/abs/2310.04764v9Characterizations of Monadic Second Order Definable Context-Free Sets of Graphs2026-03-09T14:47:17ZWe give a characterization of the sets of graphs that are both definable in Counting Monadic Second Order Logic (CMSO) and context-free, i.e., least solutions of Hyperedge-Replacement (HR) grammars introduced by Courcelle and Engelfriet. We prove the equivalence of these sets with: (a) recognizable sets (in the algebra of graphs with HR-operations) of bounded tree-width; we refine this condition further and show equivalence with recognizability in a finitely generated subalgebra of the HR-algebra of graphs; (b) parsable sets, for which there is a definable transduction from graphs to a set of derivation trees labelled by HR operations, such that the set of graphs is the image of the set of derivation trees under the canonical evaluation of the HR operations; (c) images of recognizable unranked sets of trees under a definable transduction, whose inverse is also definable. We rely on a novel connection between two seminal results, a logical characterization of context-free graph languages in terms of tree-to-graph definable transductions, by Courcelle and Engelfriet and a proof that an optimal-width tree decomposition of a graph can be built by an definable transduction, by Bojanczyk and Pilipczuk.2023-10-07T09:53:52ZLogical Methods in Computer Science, Volume 22, Issue 1 (March 10, 2026) lmcs:13735Radu IosifFlorian Zuleger10.46298/lmcs-22(1:22)2026http://arxiv.org/abs/2603.08331v1Turn Complexity of Context-free Languages, Pushdown Automata and One-Counter Automata2026-03-09T12:48:15ZA turn in a computation of a pushdown automaton is a switch from a phase in which the height of the pushdown store increases to a phase in which it decreases. Given a pushdown or one-counter automaton, we consider, for each string in its language, the minimum number of turns made in accepting computations. We prove that it cannot be decided if this number is bounded by any constants. Furthermore, we obtain a non-recursive trade-off between pushdown and one-counter automata accepting in a finite number of turns and finite-turn pushdown automata, that are defined requiring that the constant bound is satisfied by each accepting computation. We prove that there are languages accepted in a sublinear but not constant number of turns, with respect to the input length. Furthermore, there exists an infinite proper hierarchy of complexity classes, with the number of turns bounded by different sublinear functions. In addition, there is a language requiring a number of turns which is not constant but grows slower than each of the functions defining the above hierarchy.2026-03-09T12:48:15ZExtended version of a paper presented at the conference DLT 2025. In order to make the presentation easier, the witness languages in Section 5 and 6 are different from those presented in the conference version. However, the arguments and the techniques are similarGiovanni Pighizzinihttp://arxiv.org/abs/2603.08134v1Forgetting Event Order in Higher-Dimensional Automata2026-03-09T09:13:51ZHigher dimensional automata (HDAs) provide a geometric model of true concurrency, yet their standard formulation encodes an artificial total order on events. This representational artifact causes a fundamental mismatch between the combinatorial structure of HDAs and their observable behavior, leading to logical asymmetries and complicating the application of categorical tools. In this paper, we resolve this tension by developing a semantics for HDAs that is independent of event order, based on interval ipomsets (partially ordered multisets with interfaces) that preserve only precedence and concurrency. We prove that for any HDA, the traditional ST trace of an execution path corresponds precisely to its associated interval ipomset. On the structural side, we show that the presheaf theoretic presentation with an unordered base and the combinatorial presentation of symmetric HDAs are categorically isomorphic. Finally, by characterizing ST and hereditary history preserving (hhp) bisimulation via ipomset isomorphism, we provide a unified, order free foundation for HDA semantics. Our results resolve several critical ambiguities in the literature: they provide the necessary path category structure to canonically apply the Open Maps framework, eliminate representational artifacts in temporal and modal logics, and bridge systematic mismatches between HDAs and other models of concurrency such as Petri nets.2026-03-09T09:13:51ZSafa Zouarihttp://arxiv.org/abs/2511.02872v4FATE: A Formal Benchmark Series for Frontier Algebra of Multiple Difficulty Levels2026-03-08T09:01:49ZRecent advances in large language models (LLMs) have demonstrated impressive capabilities in formal theorem proving, particularly on contest-based mathematical benchmarks like the IMO. However, these contests do not reflect the depth, breadth, and abstraction of modern mathematical research. To bridge this gap, we introduce FATE (Formal Algebra Theorem Evaluation), a new benchmark series in formal algebra designed to chart a course toward advanced mathematical reasoning. We present two new components, FATE-H and FATE-X, each with 100 problems in abstract and commutative algebra. The FATE series spans a difficulty spectrum from undergraduate exercises to problems exceeding PhD qualifying exams. Notably, FATE-X is the first formal benchmark to surpass both PhD-level exam difficulty and the coverage of the Mathlib library. Our evaluations of state-of-the-art LLM provers on this new benchmark reveal a stark performance gap compared to contest math: the best model achieves only 3% (pass@64) accuracy on FATE-H and 0% on FATE-X. Our two-stage evaluation reveals that models' natural-language reasoning is notably more accurate than their ability to formalize this reasoning. We systematically classify the common errors that arise during this formalization process. Furthermore, a comparative study shows that a specialized prover can exhibit less effective reflection than general-purpose models, reducing its accuracy at the natural-language stage. We believe FATE provides a robust and challenging benchmark that establishes essential checkpoints on the path toward research-level formal mathematical reasoning.2025-11-04T03:25:17ZJiedong JiangWanyi HeYuefeng WangGuoxiong GaoYongle HuJingting WangNailin GuanPeihao WuChunbo DaiLiang XiaoBin Donghttp://arxiv.org/abs/2508.00749v3Dynamic Symbolic Execution for Semantic Difference Analysis of Component and Connector Architectures2026-03-07T18:27:00ZIn the context of model-driven development, ensuring the correctness and consistency of evolving models is paramount. This paper investigates the application of Dynamic Symbolic Execution (DSE) for semantic difference analysis of component-and-connector architectures, specifically utilizing MontiArc models. We have enhanced the existing MontiArc-to-Java generator to gather both symbolic and concrete execution data at runtime, encompassing transition conditions, visited states, and internal variables of automata. This data facilitates the identification of significant execution traces that provide critical insights into system behavior. We evaluate various execution strategies based on the criteria of runtime efficiency, minimality, and completeness, establishing a framework for assessing the applicability of DSE in semantic difference analysis. Our findings indicate that while DSE shows promise for analyzing component and connector architectures, scalability remains a primary limitation, suggesting further research is needed to enhance its practical utility in larger systems.2025-08-01T16:24:58ZJohanna GrahlBernhard RumpeMax StachonSebastian Stüberhttp://arxiv.org/abs/2512.10017v3Complexity of Linear Subsequences of $k$-Automatic Sequences2026-03-07T17:10:07ZWe construct automata with input(s) in base $k$ recognizing some basic relations and study their number of states. We also consider some basic operations on $k$-automatic sequences $(h(i))_{i \geq 0}$ and discuss their state complexity. We find a relationship between subword complexity of the interior sequence $(h'(i))_{i \geq 0}$ and state complexity of the linear subsequence $(h(ni+c))_{i \geq 0}$. We resolve a recent question of Zantema and Bosma about linear subsequences of $k$-automatic sequences with input in most-significant-digit-first format. We also discuss the state complexity and runtime complexity of using a reasonable interpretation of Büchi arithmetic to actually construct some of the studied automata recognizing relations or carrying out operations on automatic sequences.2025-12-10T19:10:26ZTheorem 21 added in this versionDelaram MoradiNarad RampersadJeffrey Shallithttp://arxiv.org/abs/2603.07094v1Randomise Alone, Reach as a Team2026-03-07T08:04:08ZWe study concurrent graph games where n players cooperate against an opponent to reach a set of target states. Unlike traditional settings, we study distributed randomisation: team players do not share a source of randomness, and their private random sources are hidden from the opponent and from each other.
We show that memoryless strategies are sufficient for the threshold problem (deciding whether there is a strategy for the team that ensures winning with probability that exceeds a threshold), a result that not only places the problem in the Existential Theory of the Reals (\exists\mathbb{R}) but also enables the construction of value iteration algorithms. We additionally show that the threshold problem is NP-hard. For the almost-sure reachability problem, we prove NP-completeness.
We introduce Individually Randomised Alternating-time Temporal Logic (IRATL). This logic extends the standard ATL framework to reason about probability thresholds, with semantics explicitly designed for coalitions that lack a shared source of randomness. On the practical side, we implement and evaluate a solver for the threshold and almost-sure problem based on the algorithms that we develop.2026-03-07T08:04:08Z50 pages, 7 figuresLéonard BriceThomas A. HenzingerAlipasha MontaseriAli ShafieeK. S. Thejaswini