https://arxiv.org/api/QxGYYoBSPfVgW7VpoFFk+Bmn0us
2026-03-30T11:58:22Z
282807515

http://arxiv.org/abs/2603.21272v1
The Library Theorem: How External Organization Governs Agentic Reasoning Capacity
2026-03-22T15:02:56Z
Externalized reasoning is already exploited by transformer-based agents through chain-of-thought, but structured retrieval -- indexing over one's own reasoning state -- remains underexplored. We formalize the transformer context window as an I/O page and prove that tool-augmented agents with indexed external memory achieve exponentially lower retrieval cost than agents restricted to sequential scanning: $O(\log_b N)$ versus $Ω(N)$ page reads per query, and $O(T \log_b T)$ versus $Θ(T^2)$ cumulative cost over $T$ reasoning steps -- a gap that widens as deliberation deepens. We test these predictions on a controlled lookup benchmark across three content types -- random hashes, ordered integers, and encyclopedia entries -- varying store size from 50 to 5,000 items, and replicate key conditions across two model generations (GPT-4o-mini and GPT-5.4). On abstract content, the indexed agent achieves median 1 page read regardless of store size, confirming the $O(1)$ prediction. Sorted pages without an index fail to close the gap: the weaker model cannot sustain binary search at scale, and the stronger model achieves near-optimal $\log_2 N$ search but still loses to the index by $5\times$. On familiar content (encyclopedia entries), a competing failure mode emerges: the model recognizes the domain, bypasses the retrieval protocol, and generates answers from parametric memory, producing catastrophic token expenditure even when the index is sound. This parametric memory competition dissociates the two cognitive operations that indexing combines: understanding content (where language models excel) and following navigational protocols (where they fail when understanding tempts them to shortcut).
The result argues for a separation of concerns: use language models for index construction, where semantic understanding helps, and deterministic algorithms for index traversal, where it hurts.
2026-03-22T15:02:56Z
19 pages, 6 figures
Zachary F. Mainen

http://arxiv.org/abs/2602.15341v2
Testing Monotonicity of Real-Valued Functions on DAGs
2026-03-22T10:40:28Z
We study monotonicity testing of real-valued functions on directed acyclic graphs (DAGs) with $n$ vertices. For every constant $δ>0$, we prove an $Ω(n^{1/2-δ}/\sqrt{\varepsilon})$ lower bound against non-adaptive two-sided testers on DAGs, nearly matching the classical $O(\sqrt{n/\varepsilon})$-query upper bound. For constant $\varepsilon$, we also prove an $Ω(\sqrt n)$ lower bound for randomized adaptive one-sided testers on explicit bipartite DAGs, whereas previously only an $Ω(\log n)$ lower bound was known. A key technical ingredient in both lower bounds is positive-matching Ruzsa--Szemerédi families. On the algorithmic side, we give simple non-adaptive one-sided testers with query complexity $O(\sqrt{m\,\ell}/(\varepsilon n))$ and $O(m^{1/3}/\varepsilon^{2/3})$, where $m$ is the number of edges in the transitive reduction and $\ell$ is the number of edges in the transitive closure. For constant $\varepsilon>0$, these improve over the previous $O(\sqrt{n/\varepsilon})$ bound when $m\ell=o(n^3)$ and $m=o(n^{3/2})$, respectively.
2026-02-17T04:14:41Z
Yuichi Yoshida

http://arxiv.org/abs/2408.15181v2
On the parameterized complexity of computing good edge-labelings
2026-03-22T10:38:51Z
A good edge-labeling (gel for short) of a graph $G$ is a function $λ: E(G) \to \mathbb{R}$ such that, for any ordered pair of vertices $(x, y)$ of $G$, there do not exist two distinct increasing paths from $x$ to $y$, where ``increasing'' means that the sequence of labels is non-decreasing. This notion was introduced by Bermond et al. [Theor. Comput. Sci.
2013] motivated by practical applications arising from routing and wavelength assignment problems in optical networks. Prompted by the lack of algorithmic results about the problem of deciding whether an input graph admits a gel, called GEL, we initiate its study from the viewpoint of parameterized complexity. We first introduce the natural version of GEL where one wants to use at most $c$ distinct labels, which we call $c$-GEL, and we prove that it is NP-complete for every $c \geq 2$ on very restricted instances. We then provide several positive results, starting with simple polynomial kernels for GEL and $c$-GEL parameterized by neighborhood diversity or vertex cover. As one of our main technical contributions, we present an FPT algorithm for GEL parameterized by the size of a modulator to a forest of stars, based on a novel approach via a 2-SAT formulation which we believe to be of independent interest. We also present FPT algorithms based on dynamic programming for $c$-GEL parameterized by treewidth and $c$, and for GEL parameterized by treewidth and the maximum degree. Finally, we answer positively a question of Bermond et al. [Theor. Comput. Sci. 2013] by proving the NP-completeness of a problem strongly related to GEL, namely that of deciding whether an input graph admits a so-called UPP-orientation.
2024-08-27T16:33:37Z
47 pages, 16 figures
Davi de Andrade, Júlio Araújo, Laure Morelle, Ignasi Sau, Ana Silva

http://arxiv.org/abs/2603.21148v1
Fast Nearest Neighbor Search for $\ell_p$ Metrics
2026-03-22T09:49:43Z
The Nearest Neighbor Search (NNS) problem asks to design a data structure that preprocesses an $n$-point dataset $X$ lying in a metric space $\mathcal{M}$, so that given a query point $q \in \mathcal{M}$, one can quickly return a point of $X$ minimizing the distance to $q$. The efficiency of such a data structure is evaluated primarily by the amount of space it uses and the time required to answer a query.
We focus on the fast query-time regime, which is crucial for modern large-scale applications, where datasets are massive and queries must be processed online, and is often modeled by query time $\text{poly}(d \log n)$. Our main result is such a randomized data structure for NNS in $\ell_p$ spaces, $p>2$, that achieves $p^{O(1) + \log\log p}$ approximation with fast query time and $\text{poly}(dn)$ space. Our data structure improves, or is incomparable to, the state-of-the-art for the fast query-time regime from [Bartal and Gottlieb, TCS 2019] and [Krauthgamer, Petruschka and Sapir, FOCS 2025].
2026-03-22T09:49:43Z
9 pages
Robert Krauthgamer, Nir Petruschka

http://arxiv.org/abs/2603.14190v2
Sublime: Sublinear Error & Space for Unbounded Skewed Streams
2026-03-22T04:48:03Z
Modern stream processing systems often need to track the frequency of distinct keys in a data stream in real-time. Since maintaining exact counts can require a prohibitive amount of memory, many applications rely on compact, probabilistic data structures known as frequency estimation sketches to approximate them. However, mainstream frequency estimation sketches fall short in two critical aspects. First, they are memory-inefficient under skewed workloads because they use uniformly-sized counters to count the keys, thus wasting memory on storing the leading zeros of many small counts. Second, their estimation error deteriorates at least linearly with the length of the stream--which may grow indefinitely--because they rely on a fixed number of counters.
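For orientation, the kind of baseline structure this paragraph describes can be sketched in a few lines. The block below is a minimal textbook Count-Min Sketch, not Sublime and not the authors' code. The class name, the `build_demo` helper, the table dimensions, and the use of salted BLAKE2b as the hash family are all assumptions made for illustration. Every row has the same fixed number of counters, and in a real implementation each counter would occupy the same fixed number of bits (Python integers hide that detail), which are the two limitations named above.

```python
import hashlib
from collections import Counter


class CountMinSketch:
    """Textbook Count-Min Sketch: depth rows, each with width counters."""

    def __init__(self, width: int, depth: int) -> None:
        self.width = width
        self.depth = depth
        # The number of counters is fixed at construction time.
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key: str, row: int) -> int:
        # One salted hash per row (illustrative choice of hash family).
        salt = row.to_bytes(8, "little")
        digest = hashlib.blake2b(
            key.encode("utf-8"), digest_size=8, salt=salt
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key: str, count: int = 1) -> None:
        # Increment one counter in every row.
        for row in range(self.depth):
            self.rows[row][self._index(key, row)] += count

    def estimate(self, key: str) -> int:
        # Collisions only inflate counters, so the minimum over rows
        # is an upper bound on the true frequency.
        return min(
            self.rows[row][self._index(key, row)] for row in range(self.depth)
        )


def build_demo():
    """Feed a small skewed stream into a sketch and return it with exact counts."""
    stream = ["hot"] * 1000 + ["warm"] * 10 + [f"cold-{i}" for i in range(50)]
    sketch = CountMinSketch(width=64, depth=4)
    for key in stream:
        sketch.add(key)
    return sketch, Counter(stream)
```

In the skewed demo stream most keys have a count of 1, yet each would still be stored in a full-width counter in a fixed-width layout, and the overestimation grows with stream length because the table never grows.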
We present Sublime, a framework that generalizes frequency estimation sketches to address these challenges. To reduce memory footprint under skew, Sublime begins with short counters and dynamically elongates them as they overflow, storing their extensions within the same cache line. It employs efficient bit manipulation routines to quickly locate and access a counter's extensions. To maintain accuracy as the stream grows, Sublime also expands its number of counters at a configurable rate, exposing a new spectrum of accuracy-memory tradeoffs that applications can tune to their needs. We apply Sublime to both Count-Min Sketch and Count Sketch. Through theoretical analysis and empirical evaluation, we show that Sublime significantly improves accuracy and memory over the state of the art while maintaining competitive or superior performance.
2026-03-15T02:57:17Z
27 pages. 16 figures. 3 tables. Accepted to SIGMOD 2026
Navid Eslami, Ioana O. Bercea, Rasmus Pagh, Niv Dayan

http://arxiv.org/abs/2511.06171v2
Halfspaces are hard to test with relative error
2026-03-21T21:45:02Z
Several recent works [DHLNSY25, CPPS25a, CPPS25b] have studied a model of property testing of Boolean functions under a \emph{relative-error} criterion. In this model, the distance from a target function $f: \{0,1\}^n \to \{0,1\}$ that is being tested to a function $g$ is defined relative to the number of inputs $x$ for which $f(x)=1$; moreover, testing algorithms in this model have access both to a black-box oracle for $f$ and to independent uniform satisfying assignments of $f$. The motivation for this model is that it provides a natural framework for testing \emph{sparse} Boolean functions that have few satisfying assignments, analogous to well-studied models for property testing of sparse graphs.
The main result of this paper is a lower bound for testing \emph{halfspaces} (i.e., linear threshold functions) in the relative error model: we show that $\tilde{Ω}(\log n)$ oracle calls are required for any relative-error halfspace testing algorithm over the Boolean hypercube $\{0,1\}^n$. This stands in sharp contrast both with the constant-query testability (independent of $n$) of halfspaces in the standard model [MORS10], and with the positive results for relative-error testing of many other classes given in [DHLNSY25, CPPS25a, CPPS25b]. Our lower bound for halfspaces gives the first example of a well-studied class of functions for which relative-error testing is provably more difficult than standard-model testing.
2025-11-09T00:51:16Z
26 pages, appeared in SODA 2026. v2 correct minor typos
Xi Chen, Anindya De, Yizhi Huang, Shivam Nadimpalli, Rocco A. Servedio, Tianqi Yang

http://arxiv.org/abs/2409.18634v2
Split-or-decompose: Improved FPT branching algorithms for maximum agreement forests
2026-03-21T15:58:41Z
Phylogenetic trees are leaf-labelled trees used to model the evolution of species. In practice it is not uncommon to obtain two topologically distinct trees for the same set of species, and this motivates the use of distance measures to quantify dissimilarity. A well-known measure is the maximum agreement forest (MAF): a minimum-size partition of the leaf labels which splits both trees into the same set of disjoint, leaf-labelled subtrees (up to isomorphism after suppressing degree-2 vertices). Computing such a MAF is NP-hard and so considerable effort has been invested in finding FPT algorithms, parameterised by $k$, the number of components of a MAF. The state of the art has been unchanged since 2015, with running times of $O^*(3^k)$ for unrooted trees and $O^*(2.3431^k)$ for rooted trees. In this work we present improved algorithms for both the unrooted and rooted cases, with runtimes $O^*(2.846^k)$ and $O^*(2.3391^k)$ respectively.
The key to our improvement is a novel branching strategy in which we show that any overlapping components obtained on the way to a MAF can be `split' by a branching rule with favourable branching factor, and then the problem can be decomposed into disjoint subproblems to be solved separately. We expect that this technique may be more widely applicable to other problems in algorithmic phylogenetics.
2024-09-27T11:07:06Z
Accepted for journal publication. Compared to first arxiv version contains extra figures and clarifying paragraphs
David Mestel, Steven Chaplick, Steven Kelk, Ruben Meuwese

http://arxiv.org/abs/2603.07280v3
Complexity Lower Bounds of Small Matrix Multiplication over Finite Fields via Backtracking and Substitution
2026-03-21T13:21:53Z
We introduce a new method for proving bilinear complexity lower bounds for matrix multiplication over finite fields. The approach combines the substitution method with a systematic backtracking search over linear restrictions on the first matrix $A$ in the product $AB = C^T$. We enumerate restriction classes up to symmetry; for each class we either obtain a rank lower bound by classical arguments or branch further via the substitution method. The search is organized by dynamic programming on the restricted matrix $A$. As an application we prove that the bilinear complexity of multiplying two $3 \times 3$ matrices over $\mathbb{F}_2$ is at least $20$, improving the longstanding lower bound of $19$ (Bläser 2003). The proof is found automatically within 1.5 hours on a laptop and verified in seconds.
2026-03-07T16:57:11Z
Chengu Wang

http://arxiv.org/abs/2603.20790v1
(Sets of ) Complement Scattered Factors
2026-03-21T12:30:39Z
Starting in the 1970s with the fundamental work of Imre Simon, \emph{scattered factors} (also known as subsequences or scattered subwords) have remained a consistently and heavily studied object.
The majority of work on scattered factors can be split into two broad classes of problems: given a word, which information, in the form of scattered factors, is contained in it, and which is not. In this paper, we consider an intermediary problem, introducing the notion of \emph{complement scattered factors}. Given a word $w$ and a scattered factor $u$ of $w$, the complement scattered factors of $w$ with regard to $u$, $C(w, u)$, is the set of scattered factors in $w$ that can be formed by removing any embedding of $u$ from $w$. This is closely related to the \emph{shuffle} operation in which two words are intertwined, i.e., we extend previous work relating to the shuffle operator, using knowledge about scattered factors. Alongside introducing these sets, we provide combinatorial results on the size of the set $C(w, u)$, an algorithm to compute the set $C(w, u)$ from $w$ and $u$ in $O(\vert w \vert \cdot \vert u \vert \binom{w}{u})$ time, where $\binom{w}{u}$ denotes the number of embeddings of $u$ into $w$, an algorithm to construct $u$ from $w$ and $C(w, u)$ in $O(\vert w \vert^2 \binom{\vert w \vert}{\vert w \vert - \vert u \vert})$ time, and an algorithm to construct $w$ from $u$ and $C(w, u)$ in $O(\vert u \vert \cdot \vert w \vert^{\vert u \vert + 1})$ time.
2026-03-21T12:30:39Z
Duncan Adamson, Pamela Fleischmann, Annika Huch

http://arxiv.org/abs/2409.19437v5
Strongly-polynomial time and validation analysis of policy gradient methods
2026-03-21T02:13:10Z
This paper proposes a novel termination criterion, termed the advantage gap function, for finite state and action Markov decision processes (MDP) and reinforcement learning (RL). By incorporating this advantage gap function into the design of step size rules and deriving a new linear rate of convergence that is independent of the stationary state distribution of the optimal policy, we demonstrate that policy gradient methods can solve MDPs in strongly-polynomial time.
To the best of our knowledge, this is the first time that such strong convergence properties have been established for policy gradient methods. Moreover, in the stochastic setting, where only stochastic estimates of policy gradients are available, we show that the advantage gap function provides close approximations of the optimality gap for each individual state and exhibits a sublinear rate of convergence at every state. The advantage gap function can be easily estimated in the stochastic case, and when coupled with easily computable upper bounds on policy values, it provides a convenient way to validate the solutions generated by policy gradient methods. Therefore, our developments offer a principled and computable measure of optimality for RL, whereas current practice tends to rely on algorithm-to-algorithm or baseline comparisons with no certificate of optimality.
2024-09-28T18:56:48Z
Updated manuscript with new experiments
Caleb Ju, Guanghui Lan

http://arxiv.org/abs/2405.01425v4
In-and-Out: Algorithmic Diffusion for Sampling Convex Bodies
2026-03-20T17:14:00Z
We present a new random walk for uniformly sampling high-dimensional convex bodies. It achieves state-of-the-art runtime complexity with stronger guarantees on the output than previously known, namely in Rényi divergence (which implies TV, $\mathcal{W}_2$, KL, $χ^2$). The proof departs from known approaches for polytime algorithms for the problem -- we utilize a stochastic diffusion perspective to show contraction to the target distribution with the rate of convergence determined by functional isoperimetric constants of the target distribution.
2024-05-02T16:15:46Z
To appear in Random Structures & Algorithms; conference version appeared in NeurIPS 2024 (spotlight)
Yunbum Kook, Santosh S. Vempala, Matthew S. Zhang

http://arxiv.org/abs/2603.20060v1
Power laws and power-of-two-choices
2026-03-20T15:40:32Z
This paper analyzes a variation on the well-known "power of two choices" allocation algorithms.
Classically, the smallest of $d$ randomly-chosen options is selected. We investigate what happens when the largest of $d$ randomly-chosen options is selected. This process generates a power-law-like distribution: the $i^{th}$-smallest value scales with $i^{d-1}$, where $d$ is the number of randomly-chosen options, with high probability. We give a formula for the expectation and show the distribution is concentrated around the expectation.
2026-03-20T15:40:32Z
Amanda Redlich

http://arxiv.org/abs/2509.01597v2
Statistics-Friendly Confidentiality Protection for Establishment Data, with Applications to the QCEW
2026-03-20T14:30:40Z
Confidentiality for business data is an understudied area of disclosure avoidance, where legacy methods struggle to provide acceptable results. Standard formal privacy techniques for person-level data, like differential privacy, are designed to protect against membership inference and hence do not provide suitable confidentiality/utility trade-offs due to the highly skewed nature of business data and because extreme outlier records are often important contributors to query answers. Prior proposals, therefore, took a personalized differential privacy approach that allowed privacy parameters to degrade for the outlying records -- larger establishments get weaker membership inference guarantees. However, providing guarantees to some entities that are strictly weaker than guarantees for others is problematic from a policy standpoint. In this paper, we propose a novel confidentiality framework for business data with a focus on interpretability for policy makers. Instead of protecting against membership inference, which is often not a concern in business data, we protect against attribute inferences that are too precise.
In our framework, data curators specify a neighbor function that is used to define uncertainty interval bands around an establishment's attribute values and the privacy parameters govern the strength of indistinguishability between values within the same uncertainty interval. We propose two query-answering mechanisms under this framework and evaluate them on: (1) a confidential Quarterly Census of Employment and Wages (QCEW) dataset produced by the U.S. Bureau of Labor Statistics (this was done through a cooperative agreement), and (2) a substitute dataset that we created from public sources (and will publicly release).
2025-09-01T16:29:54Z
42 pages (13 main text, 2 references, and 27 appendix pages), 13 figures (4 in main text)
Kaitlyn Webb, Prottay Protivash, John Durrell, Daniell Toth, Aleksandra Slavković, Daniel Kifer

http://arxiv.org/abs/2603.19965v1
Computational Complexity Analysis of Interval Methods in Solving Uncertain Nonlinear Systems
2026-03-20T14:06:03Z
This paper analyses the computational complexity of validated interval methods for uncertain nonlinear systems. Interval analysis produces guaranteed enclosures that account for uncertainty and round-off, but its adoption is often limited by computational cost in high dimensions. We develop an algorithm-level worst-case framework that makes the dependence on the initial search volume $\mathrm{Vol}(X_0)$, the target tolerance $\varepsilon$, and the costs of validated primitives explicit (inclusion-function evaluation, Jacobian evaluation, and interval linear algebra). Within this framework, we derive worst-case time and space bounds for interval bisection, subdivision$+$filter, interval constraint propagation, interval Newton, and interval Krawczyk. The bounds quantify the scaling with $\mathrm{Vol}(X_0)$ and $\varepsilon$ for validated steady-state enclosure and highlight dominant cost drivers.
We also show that determinant and inverse computation for interval matrices via naive Laplace expansion is factorial in the matrix dimension, motivating specialised interval linear algebra. Finally, interval Newton and interval Krawczyk have comparable leading-order costs; Krawczyk is typically cheaper in practice because it inverts a real midpoint matrix rather than an interval matrix. These results support the practical design of solvers for validated steady-state analysis in applications such as biochemical reaction network modelling, robust parameter estimation, and other uncertainty-aware computations in systems and synthetic biology.
2026-03-20T14:06:03Z
20 pages, 2 figures
Rudra Prakash, S. Janardhanan, Shaunak Sen

http://arxiv.org/abs/2405.12876v3
Approximating Traveling Salesman Problems Using a Bridge Lemma
2026-03-20T13:13:41Z
We give improved approximations for two metric Traveling Salesman Problem (TSP) variants. In Ordered TSP (OTSP) we are given a linear ordering on a subset of nodes $o_1, \ldots, o_k$. The TSP solution must have that $o_{i+1}$ is visited at some point after $o_i$ for each $1 \leq i < k$. This is the special case of Precedence-Constrained TSP ($PTSP$) in which the precedence constraints are given by a single chain on a subset of nodes. In $k$-Person TSP Path (k-TSPP), we are given pairs of nodes $(s_1,t_1), \ldots, (s_k,t_k)$. The goal is to find an $s_i$-$t_i$ path with minimum total cost such that every node is visited by at least one path.
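The OTSP ordering constraint defined above can be made concrete with a small feasibility check. This is an illustrative sketch only and not part of the paper's algorithm. It assumes the tour is given as a list in visiting order from its starting node, with every node appearing exactly once. The function name `respects_order` and the example node labels are hypothetical.

```python
from typing import Hashable, Sequence


def respects_order(tour: Sequence[Hashable], ordered: Sequence[Hashable]) -> bool:
    """Return True if o_1, ..., o_k appear in the tour in that relative order.

    Assumes each node occurs exactly once in `tour` and that every node of
    `ordered` is present in it (illustrative preconditions).
    """
    # Map every node to the step at which the tour visits it.
    position = {node: step for step, node in enumerate(tour)}
    steps = [position[node] for node in ordered]
    # o_{i+1} must be visited strictly after o_i for each consecutive pair.
    return all(earlier < later for earlier, later in zip(steps, steps[1:]))
```

For example, `respects_order(["o1", "x", "o2", "y", "o3"], ["o1", "o2", "o3"])` holds, while a tour that visits `o2` before `o1` is rejected. Unordered nodes such as `x` and `y` may be interleaved freely.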
We obtain a $3/2 + e^{-1} < 1.878$ approximation for OTSP, the first improvement over a trivial $α+1$ approximation where $α$ is the current best TSP approximation. We also obtain a $1 + 2 \cdot e^{-1/2} < 2.214$ approximation for k-TSPP, the first improvement over a trivial $3$-approximation.
These algorithms both use an adaptation of the Bridge Lemma that was initially used to obtain improved Steiner Tree approximations [Byrka et al., 2013]. Roughly speaking, our variant states that the cost of a cheapest forest rooted at a given set of terminal nodes will decrease by a substantial amount if we randomly sample a set of non-terminal nodes to also become terminals, provided that each non-terminal has a constant probability of being sampled. We believe this view of the Bridge Lemma will find further use for improved vehicle routing approximations beyond this paper.
2024-05-21T15:46:13Z
v3: No textual changes since v2, only a license change
Martin Böhm, Zachary Friggstad, Tobias Mömke, Joachim Spoerhase