Positional $ω$-regular languages

2026-02-23T10:29:06Z

In the context of two-player games over graphs, a language $L$ is called positional if, in all games using $L$ as winning objective, the protagonist can play optimally using positional strategies, that is, strategies that do not depend on the history of the play. In this work, we describe the class of parity automata recognising positional languages, providing a complete characterisation of positionality for $ω$-regular languages. As corollaries, we establish decidability of positionality in polynomial time, finite-to-infinite and 1-to-2-players lifts, and show the closure under union of prefix-independent positional objectives, answering a conjecture by Kopczyński in the $ω$-regular case.

Equality of cycle lengths in one- and two-dimensional $σ$ automata

2026-02-22T01:06:10Z

When the game Lights Out is played according to an algorithm specifying the player's sequence of moves, it can be modeled using deterministic cellular automata. One such model reduces to the $σ$ automaton, which evolves according to the 2-dimensional analog of Rule 90. We consider how the cycle lengths of multi-dimensional $σ$ automata depend on their dimension. We find that the cycle lengths of 1-dimensional $σ$ automata and 2-dimensional $σ$ automata (of the same size) are equal, and we prove this by relating the eigenvalues and Jordan blocks of their respective adjacency matrices. We also discover that cycle lengths of higher-dimensional $σ$ automata are bounded (despite the number of lattice sites increasing with dimension) and eventually saturate the upper bound.

On A. V. Anisimov's problem for finding a polynomial algorithm checking inclusion of context-free languages in group languages

2026-02-20T16:01:19Z

The work investigates the problem of whether a context-free language is a subset of a group language. A.~V. Anisimov has shown that the problem of determining the unambiguity of finite automata is a special case of this problem. Then the question of finding polynomial algorithm verifying the inclusion of context-free languages in group languages naturally arises. The article focuses on this open problem. For the purpose, the paper describes an unconventional method of description of context-free languages, namely a representation with the help of a finite digraph whose arcs are labelled with a specially defined monoid $\mathcal{U}$. Also, we define a semiring $\mathcal{S}_\mathcal{U}$ whose elements are the set $2^\mathcal{U}$ of all subsets of $\mathcal{U}$ and with operations - product and union of the elements of $2^\mathcal{U}$. The described algorithm executes no more than $O(n^3)$ operations in $\mathcal{S}_\mathcal{U}$.

A Dichotomy Theorem for Automatic Structures

2026-02-20T14:24:54Z

The field of constraint satisfaction problems (CSPs) studies homomorphism problems between relational structures where the target structure is fixed. Classifying the complexity of these problems has been a central quest of the field, notably when both sides are finite structures. In this paper, we study the generalization where the input is an automatic structure -- potentially infinite, but describable by finite automata. We prove a striking dichotomy: homomorphism problems over automatic structures are either decidable in non-deterministic logarithmic space (NL), or undecidable. We show that structures for which the problem is decidable are exactly those with finite duality, which is a classical property of target structures asserting that the existence of a homomorphism into them can be characterized by the absence of a finite set of obstructions in the source structure. Notably, this property precisely characterizes target structures whose homomorphism problem is definable in first-order logic, which is well-known to be decidable over automatic structures. We also consider a natural variant of the problem. While automatic structures are finitely describable, homomorphisms from them into finite structures need not be. This leads to the notion of regular homomorphism, where the homomorphism itself must be describable by finite automata. Remarkably, we prove that this variant exhibits the same dichotomy, with the same characterization for decidability.

Backward Responsibility in Transition Systems Beyond Safety

2026-02-20T13:58:36Z

As the complexity of software systems rises, methods for explaining their behaviour are becoming ever-more important. When a system fails, it is critical to determine which of its components are responsible for this failure. Within the verification community, one approach uses graph games and Shapley values to ascribe a responsibility value to every state of a transition system. As this is done with respect to a specific failure, it is called backward responsibility. This paper provides tight complexity bounds for the computation of backward responsibility values for reachability, Büchi and parity objectives. For Büchi objectives, a polynomial algorithm is given to determine the set of responsible states, i.e. states with positive responsibility value. To analyse systems that are too large for standard methods, the paper presents a novel refinement algorithm that iteratively finds the set of responsible states. Several heuristics are proposed to guide the refinement algorithm. Its utility is demonstrated with a tool that implements refinement in addition to several other responsibility computation techniques.

History-Constrained Systems

2026-02-20T11:05:03Z

We study verification problems for history-constrained systems (HCS), a model of guarded computation that uses nested systems. An outer system describes the process architecture in which a sequence of actions represents the communication between sub-systems through a global bus. Actions are either permitted or blocked locally by guards; these guards read and decide based on the sequence of actions so far in the global bus. When HCS have both the outer systems and the local guard controllers modelled by finite automata, we show they have the same expressive power as regular languages and finite automata, but they are exponentially more succinct. We also analyse games on this model, representing the interaction between environment and controller, and show that solving such games is EXPTIME-complete, where the lower bound already holds for reachability/safety games and the upper bound holds for any $ω$-regular winning condition. Finally, we consider HCS with guards of greater expressive power, Vector Addition Systems with States (VASS). We show that with deterministic coverability-VASS guards the reachability problem is EXPSPACE-complete, while with reachability-VASS the problem is undecidable.

Automating the Analysis and Improvement of Dynamic Programming Algorithms with Applications to Natural Language Processing

2026-02-20T02:01:27Z

This thesis develops a system for automatically analyzing and improving dynamic programs, such as those that have driven progress in natural language processing and computer science, more generally, for decades. Finding a correct program with the optimal asymptotic runtime can be unintuitive, time-consuming, and error-prone. This thesis aims to automate this laborious process. To this end, we develop an approach based on 1. a high-level, domain-specific language called Dyna for concisely specifying dynamic programs 2. a general-purpose solver to efficiently execute these programs 3. a static analysis system that provides type analysis and worst-case time/space complexity analyses 4. a rich collection of meaning-preserving transformations to programs, which systematizes the repeated insights of numerous authors when speeding up algorithms in the literature 5. a search algorithm for identifying a good sequence of transformations that reduce the runtime complexity, given an initial, correct program We show that, in practice, automated search -- like the mental search performed by human programmers -- can find substantial improvements to the initial program. Empirically, we show that many speed-ups described in the NLP literature could have been discovered automatically by our system. We provide a freely available prototype system at https://github.com/timvieira/dyna-pi.

Some Remarks on Marginal Code Languages

2026-02-19T12:19:43Z

A prefix code L satisfies the condition that no word of L is a proper prefix of another word of L. Recently, Ko, Han and Salomaa relaxed this condition by allowing a word of L to be a proper prefix of at most k words of L, for some `margin' k, introducing thus the class of k-prefix-free languages, as well as the similar classes of k-suffix-free and k-infix-free languages. Here we unify the definitions of these three classes of languages into one uniform definition in two ways: via the method of partial orders and via the method of transducers. Thus, for any known class of code-related languages definable via the transducer method, one gets a marginal version of that class. Building on the techniques of Ko, Han and Salomaa, we discuss the \emph{uniform} satisfaction and maximality problems for marginal classes of languages.

Disjunction Composition of BDD Transition Systems for Model-Based Testing

2026-02-19T10:33:50Z

We introduce a compositional approach to model-based test generation in Behavior-Driven Development (BDD). BDD is an agile methodology in which system behavior is specified through textual scenarios that, in our approach, are translated into transition systems used for model-based testing. This paper formally defines disjunction composition, to combine BDD transition systems that represent alternative system behaviors. Disjunction composition allows for modeling and testing the integrated behavior while ensuring that the testing power of the original set of scenarios is preserved. This is proved using a symbolic semantics for BDD transition systems, with the property that the symbolic equivalence of two BDD transition systems guarantees that they fail the same test cases. Also, we demonstrate the potential of disjunction composition by applying the composition in an industrial case study.

Synthesis and Verification of Transformer Programs

2026-02-18T14:04:02Z

C-RASP is a simple programming language that was recently shown to capture concepts expressible by transformers. In this paper, we develop new algorithmic techniques for automatically verifying C-RASPs. To this end, we establish a connection to the verification of synchronous dataflow programs in Lustre, which enables us to exploit state-of-the-art model checkers utilizing highly optimized SMT-solvers. Our second contribution addresses learning a C-RASP program in the first place. To this end, we provide a new algorithm for learning a C-RASP from examples using local search. We demonstrate efficacy of our implementation for benchmarks of C-RASPs in the literature, in particular in connection to the following applications: (1) transformer program optimization, and (2) constrained learning of transformer programs (based on a partial specification).

PVASS Reachability is Decidable

2026-02-18T13:54:04Z

Reachability in pushdown vector addition systems with states (PVASS) is among the longest standing open problems in Theoretical Computer Science. We show that the problem is decidable in full generality. Our decision procedure is similar in spirit to the KLMST algorithm for VASS reachability, but works over objects that support an elaborate form of procedure summarization as known from pushdown reachability.

Formalized Run-Time Analysis of Active Learning -- Coalgebraically in Agda

2026-02-18T12:58:51Z

The objective of automata learning is to reconstruct the implementation of a hidden automaton, to which only a teacher has access. The learner can ask certain kinds of queries to the teacher to gain more knowledge about the hidden automaton. The run-time of such a learning algorithm is then measured in the number of queries it takes until the hidden automaton is successfully reconstructed, which is usually parametric in the number of states of that hidden automaton. How can we prove such a run-time complexity of learning algorithms in a proof assistant if we do not have the hidden automaton and the number of states available? In the present paper, we solve this by considering learning algorithms themselves as generalized automata, more specifically as coalgebras. We introduce formal and yet compact definitions of what a learner and a teacher is, which make it easy to prove upper and lower bounds of different kinds of learning games in the proof assistant Agda. As a running example, we discuss the common number guessing game where a teacher thinks of a natural number and answers guesses by the learner with `correct', `too high', or `too low'. To demonstrate our framework, we formally prove in Agda that both the lower and upper bound on number of guesses by the learner is $\mathcal{O}(\log n)$, where $n$ is the teacher's secret number.

Hardness of Regular Expression Matching with Extensions

2026-02-18T08:53:40Z

The regular expression matching problem asks whether a given regular expression of length $m$ matches a given string of length $n$. As is well known, the problem can be solved in $O(nm)$ time using Thompson's algorithm. Moreover, recent studies have shown that the matching problem for regular expressions extended with a practical extension called lookaround can be solved in the same time complexity. In this work, we consider four well-known extensions to regular expressions called backreference, squaring, intersection and complement, and we show that, unlike in the case of lookaround, the matching problem for regular expressions extended with any of the four (for backreference, even when restricted to one capturing group) cannot be solved in $O(n^{2-\varepsilon} \mathrm{poly}(m))$ time for any constant $\varepsilon > 0$ under the Orthogonal Vectors Conjecture. Moreover, we study the matching problem for regular expressions extended with complement in more detail, which is also known as extended regular expression (ERE) matching. We show that there is no ERE matching algorithm that runs in $O(n^{ω-\varepsilon} \mathrm{poly}(m))$ time ($2 \le ω< 2.3714$ is the exponent of square matrix multiplication) for any constant $\varepsilon > 0$ under the $k$-Clique Hypothesis, and there is no combinatorial ERE matching algorithm that runs in $O(n^{3-\varepsilon} \mathrm{poly}(m))$ time for any constant $\varepsilon > 0$ under the Combinatorial $k$-Clique Hypothesis. This shows that the $O(n^3 m)$-time algorithm introduced by Hopcroft and Ullman in 1979 and recently improved by Bille et al. to run in $O(n^ωm)$ time using fast matrix multiplication was already optimal in a sense, and sheds light on why the theoretical computer science community has struggled to improve the time complexity of ERE matching with respect to $n$ and $m$ for more than 45 years.

The Smallest String Attractors of Fibonacci and Period-Doubling Words

2026-02-18T02:55:38Z

A string attractor of a string $T[1..|T|]$ is a set of positions $Γ$ of $T$ such that any substring $w$ of $T$ has an occurrence that crosses a position in $Γ$, i.e., there is a position $i$ such that $w = T[i..i+|w|-1]$ and the intersection $[i,i+|w|-1]\cap Γ$ is nonempty. The size of the smallest string attractor of Fibonacci words is known to be $2$. We completely characterize the set of all smallest string attractors of Fibonacci words, and show a recursive formula describing the $2^{n-4} + 2^{\lceil n/2 \rceil - 2}$ distinct position pairs that are the smallest string attractors of the $n$th Fibonacci word for $n \geq 7$. Similarly, the size of the smallest string attractor of period-doubling words is known to be $2$. We also completely characterize the set of all smallest string attractors of period-doubling words, and show a formula describing the two distinct position pairs that are the smallest string attractors of the $n$th period-doubling word for $n\geq 2$. Our results show that strings with the same smallest attractor size can have a drastically different number of distinct smallest attractors.

Exploring VASS Parameterised by Geometric Dimension

2026-02-17T10:43:57Z

The geometric dimension $g$ of a Vector Addition System with States (VASS) is the dimension of the vector space generated by cycles in the VASS; this parameter refines the standard dimension $d$, the number of counters. Recently, it was discovered that the fastest-known algorithm for solving the reachability problem for VASS has the same complexity in terms of $g$ as in terms of $d$. This suggests that the geometric dimension may in fact be a more adequate parameter for measuring the complexity of VASS reachability problems. We initiate a more systematic study of the geometric dimension. We discuss differences between two parameters: the geometric dimension and the SCC dimension. Our main technical result states that classical results about the coverability and boundedness problems can be improved from dimension $d$ to geometric dimension $g$. Namely, coverability is witnessed by runs of length $n^{2^{\mathcal{O}(g)}}$ instead of $n^{2^{\mathcal{O}(d)}}$, and unboundedness can be witnessed by runs of length $n^{2^{\mathcal{O}(g\log g)}}$ instead of $n^{2^{\mathcal{O}(d\log d )}}$, where $n$ is the size of the instance. We also study integer reachability and simultaneous unboundedness in VASS parameterised by the geometric dimension.