http://arxiv.org/abs/2606.02752v1 Online K-d tree for approximate neighborhood search in data streams 2026-06-01T18:18:29Z

The k-Nearest Neighbors (kNN) algorithm has long been widely used in Machine Learning (ML) applications. However, the main concern when using it is the computational cost required for neighborhood search, which can make it infeasible for large-scale applications. Optimization algorithms, such as the K-d tree, become an option in such scenarios. Under data streams, it can be challenging to maintain the properties of the K-d tree, as it requires inserting and deleting nodes on the fly. These operations can make maintaining the tree's balance and invariants difficult. Additionally, traditional K-d trees were initially designed for Minkowski-based distance functions. In this work, we describe an Online K-d tree and its adaptation to the Canberra distance that supports dynamic updates over data streams while preserving the structural invariants required for efficient traversal. Experimental analysis demonstrates that the Online K-d tree algorithm achieves faster processing time under data streams, and that adapting to the Canberra distance enabled effective subtree pruning, as evidenced by a minor loss in average accuracy and a substantial gain in instances processed per second. Our implementation can be found in our GitHub repository.
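For orientation, the following is a minimal sketch of the brute-force baseline that a K-d tree is meant to speed up: a linear scan for the k nearest neighbors under the Canberra distance. It is not the paper's Online K-d tree; the function names `canberra` and `knn_brute_force` are hypothetical, and the convention that a 0/0 coordinate term contributes zero is an assumption.

```python
from heapq import nsmallest


def canberra(x, y):
    """Canberra distance: sum over coordinates of |x_i - y_i| / (|x_i| + |y_i|).

    Assumption: a coordinate where both values are zero contributes 0.
    """
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def knn_brute_force(points, query, k):
    """Return the k points closest to `query` by scanning every point."""
    return nsmallest(k, points, key=lambda p: canberra(p, query))


if __name__ == "__main__":
    stream = [(1, 1), (2, 2), (10, 10)]
    print(knn_brute_force(stream, (1, 1), 2))
```

Each query costs time proportional to the number of stored points, which is the cost the abstract says a tree-based index tries to avoid.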

2026-06-01T18:18:29Z Paper accepted to the ICPRAI 2026 Eduardo V. L. Barboza Robert Sabourin Rafael M. O. Cruz http://arxiv.org/abs/2606.02492v1 $O(n +f(k))$: Truly Linear FPT 2026-06-01T17:01:31Z

Parameterized complexity has always been concerned with practical computing: by confining combinatorial explosion to a secondary parameter $k$, one can uncover why and how many NP-hard problems are effectively tackled in practice. Today, however, the scale of data has changed: scientists study Big Data, which is so large that even quadratic dependence in the total input size $n$ is unaffordable. Therefore, what constitutes a practical algorithm has also changed. Classically, parameterized complexity is blind to the difference between defining fixed parameter tractability multiplicatively (i.e. $f(k) \cdot n^c$) or additively (i.e. $f(k) + n^c$). But what if the constant $c$ is one and we require true linearity, is this distinction still inconsequential? Here, we define and explore Truly Linear FPT (TLFPT) -- that is $O(n)+f(k)$ -- and show that it is a strict subset of Linear FPT (LFPT) -- that is $O(n) \cdot f(k)$ -- via diagonalization. Populating TLFPT requires careful consideration of linear-time algorithmics and data structures. We meet many inhabitants of TLFPT: SAT, Vertex Cover, Min-Max Matching, $(n-k)$-Coloring, Diverse Pair of Matchings, $k$-Path, and $H$-Coloring. Our parameterizations are equally varied. Beyond classical parameters like solution size, we leverage two parameters, treedepth and BFS-width, which are particularly well-suited to the TLFPT regime. We do so by developing techniques based on depth- and breadth-first search. For parameterized complexity to be of service to the scientific community, we need to contend with Big Data. For sufficiently large inputs, FPT beyond linear may not suffice. Thus, there is a practical and theoretical need for more ambitious goals. TLFPT is a first step forward.
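To make the additive shape $O(n)+f(k)$ concrete, here is a minimal sketch for Vertex Cover using a classical one-pass, Buss-style reduction followed by a bounded search on the reduced instance. This is a textbook technique and not necessarily the algorithm of the paper; the names `has_vertex_cover` and `_branch` are hypothetical, and the input is assumed to be a simple graph (no self-loops, no repeated edges) given as an edge list.

```python
from collections import Counter


def has_vertex_cover(edges, k):
    """Decide whether a simple graph has a vertex cover of size at most k."""
    # Linear part: one pass to compute degrees (expected linear time with hashing).
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # A vertex of degree > k must be in every cover of size <= k.
    forced = {v for v, d in degree.items() if d > k}
    if len(forced) > k:
        return False
    budget = k - len(forced)
    kernel = [(u, v) for u, v in edges if u not in forced and v not in forced]
    # Remaining degrees are <= k, so `budget` vertices cover at most k * budget edges.
    if len(kernel) > k * budget:
        return False
    # f(k) part: the kernel has at most k^2 edges, independent of the input size.
    return _branch(kernel, budget)


def _branch(edges, budget):
    """Bounded search tree: one endpoint of any uncovered edge must be chosen."""
    if not edges:
        return True
    if budget == 0:
        return False
    u, v = edges[0]
    for pick in (u, v):
        rest = [e for e in edges if pick not in e]
        if _branch(rest, budget - 1):
            return True
    return False


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(has_vertex_cover(triangle, 1), has_vertex_cover(triangle, 2))
```

The input is touched only in the first phase; everything after the kernel is built depends on $k$ alone, which is the separation between the $O(n)$ term and the $f(k)$ term.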

2026-06-01T17:01:31Z 42 pages, 5 figures Benjamin Merlin Bumpus Rod Downey Tala Eagling-Vose Jessica Enright Michael R. Fellows David C. Kutner Laura Larios-Jones Barnaby Martin Frances Rosamond Ella Yates http://arxiv.org/abs/2606.02325v1 Terminal Steiner tree problem : Complexity and Algorithms 2026-06-01T14:37:12Z

Given a connected graph $G$ and a terminal set $R \subseteq V(G)$, the Steiner tree problem (ST) asks for a tree that spans all of $R$ with at most $r$ vertices from $V(G)\backslash R$, for some integer $r\geq 0$. It is known from (Garey et al., 1977) that ST is NP-complete. A Steiner tree in which all terminal vertices are constrained to be leaves is called a terminal Steiner tree. Our study addresses the existence of a terminal Steiner tree, its complexity across various graph classes, black-box applications of the ST, and a fixed-parameter tractable (FPT) algorithm with respect to the number of terminals.
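The definition above can be turned into a small certificate checker. The sketch below verifies that a candidate edge set is a tree in $G$, spans all terminals, keeps every terminal as a leaf, and uses at most $r$ non-terminal vertices. The function name `is_terminal_steiner_tree` and the edge-list input format are hypothetical, and at least two terminals are assumed.

```python
def is_terminal_steiner_tree(graph_edges, tree_edges, terminals, r):
    """Check a candidate terminal Steiner tree against the definition."""
    allowed = {frozenset(e) for e in graph_edges}
    tree = {frozenset(e) for e in tree_edges}
    # Every tree edge must be a proper edge of G.
    if any(len(e) != 2 or e not in allowed for e in tree):
        return False
    adj = {}
    for e in tree:
        u, v = tuple(e)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    vertices = set(adj)
    terminal_set = set(terminals)
    if not terminal_set <= vertices:
        return False
    # A connected graph with |V| - 1 edges is a tree.
    if len(tree) != len(vertices) - 1:
        return False
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    if seen != vertices:
        return False
    # Terminal condition: every terminal is a leaf of the tree.
    if any(len(adj[t]) != 1 for t in terminal_set):
        return False
    # Budget on Steiner (non-terminal) vertices.
    return len(vertices - terminal_set) <= r


if __name__ == "__main__":
    path = [("a", "b"), ("b", "c")]
    print(is_terminal_steiner_tree(path, path, ["a", "c"], 1))
```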

2026-06-01T14:37:12Z Jyothish S Sadagopan Narasimhan http://arxiv.org/abs/2606.02263v1 Exact Sampling of Permutations with a Fixed Longest Increasing Subsequence 2026-06-01T13:50:04Z

We study exact uniform sampling of permutations of length $n$ whose longest increasing subsequence (LIS) has prescribed length $k$. For $k \in \Theta(n)$, we give a direct rejection sampler whose expected running time is $O(n\log\log n)$ in the word-RAM model. The sampler uses an expanded proposal space consisting of permutations together with a specified increasing subsequence, and accepts exactly those proposals whose specified subsequence is the leftmost LIS. For arbitrary $1\le k\le n$, we give an exact sampler based on the Robinson--Schensted correspondence. The algorithm samples the corresponding Plancherel-conditioned shape by computing exact completion counts via determinant identities, and then samples two uniform tableaux of that shape. The direct implementation runs in $\tilde O(n^4k^5)$ expected time. We then show that the same sampler can be implemented in expected $\tilde O(n^3k^4)$ time by evaluating a determinant oracle through Hankel moment matrices.
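As a point of comparison, the naive exact sampler draws uniform permutations and rejects until the LIS has length exactly $k$. The sketch below is that baseline, not the paper's expanded-proposal sampler or its Robinson--Schensted sampler; its expected number of attempts can be very large when permutations with LIS $k$ are rare. The names `lis_length`, `sample_with_lis`, and the `max_tries` cap are hypothetical.

```python
import random
from bisect import bisect_left


def lis_length(perm):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails = []  # tails[i] = smallest last element of an increasing run of length i + 1
    for x in perm:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)


def sample_with_lis(n, k, rng, max_tries=100_000):
    """Uniform permutation of 1..n conditioned on LIS == k, by plain rejection."""
    for _ in range(max_tries):
        perm = list(range(1, n + 1))
        rng.shuffle(perm)
        if lis_length(perm) == k:
            return perm
    raise RuntimeError("no accepted sample within max_tries")


if __name__ == "__main__":
    print(sample_with_lis(6, 3, random.Random(0)))
```

Because every accepted permutation was drawn uniformly and the acceptance test depends only on the permutation itself, the output is exactly uniform over the target set.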

2026-06-01T13:50:04Z Peter Clifford Raphaël Clifford http://arxiv.org/abs/2606.02183v1 Efficiently Listing Projected Trees, and Equivalence of Listing and Enumeration 2026-06-01T12:38:26Z

The subgraph isomorphism problem and its generalizations such as conjunctive queries, where some nodes are projected, are among the most fundamental problems in graph algorithms and database theory. In this paper, we study the listing and enumeration variants of these problems and present two main results. (1) We present the first algorithms for enumerating projected trees with polynomial preprocessing time ($\widetilde{O}(n^{17.42})$) and polylogarithmic delay ($\mathrm{polylog}(n)$). Prior to this work, all algorithms in the literature required time $\Omega(n^{\Omega(k)} + t)$ or $t \cdot n^{\Omega(1)}$ to list all copies of a $k$-node tree with projections, where $t$ is the number of solutions. Our result generalizes to arbitrary projected hypergraphs, achieving enumeration in preprocessing time $\widetilde{O}(m^{17.42 \cdot \mathrm{subw}(H)})$ and polylogarithmic delay, where $\mathrm{subw}(H)$ is the submodular width of the pattern hypergraph $H$. We heavily rely on fast (rectangular and output-sensitive) matrix multiplication, which we complement by fine-grained lower bounds indicating that any algorithm beating time $\Omega(n^{\Omega(k)} + t)$ must rely on fast matrix multiplication. (2) As our second main result, we present a generic enumeration-to-listing reduction, establishing that listing and enumeration are equivalent under natural assumptions. For (colored) subgraph isomorphism, our reduction transforms any listing algorithm running in time $O(f(n,m) + t \cdot g(n,m))$ into an enumeration algorithm with preprocessing time $O((f(n,m)+g(n,m)+m) \log^2 n)$ and delay $O(g(n,m))$. We utilize this equivalence as a tool for proving our first main result, and we expect that our generic reduction will find many future applications.

2026-06-01T12:38:26Z Karl Bringmann Nick Fischer Yanheng Wang http://arxiv.org/abs/2606.02029v1 The Completion-Threshold Framework for Obligatory-Test Scheduling on Multiple Machines 2026-06-01T10:17:47Z

We study online scheduling with obligatory testing on $m$ identical machines with the objective of minimizing the sum of completion times. In this model, every job must undergo a test before its actual processing time is revealed. Consequently, the central algorithmic challenge is no longer whether to acquire information, but how to optimally balance machine capacity between revealing unknown jobs and processing currently known ones. While this tradeoff becomes structurally richer in the multiple-machine setting, the only prior explicit deterministic lower bound for this objective was $\sqrt{2}$, established strictly for a single machine in 2024 by Dogeas et al. [ESA 2024: 48:1-48:14]. Our core conceptual contribution is demonstrating that completion-threshold quantities, denoted $T_X$, serve as the fundamental analytical metric for this setting. Because every completed job must first pass through the testing phase, delayed revelation inherently forces delayed completion. By bounding these $T_X$ thresholds, we systematically derive strong lower bounds on the total completion time. Utilizing this framework, we establish the first substantial deterministic lower bounds for multiple machines, including a three-type bound of $1.4811$ and a multi-type dyadic construction that asymptotically approaches $3/2$. Finally, we complement these theoretical limits with a deterministic $2$-competitive list-scheduling algorithm for arbitrary test times.
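To illustrate the objective, the sketch below evaluates the sum of completion times for one simple policy: jobs are taken in list order, each goes to the earliest-free machine, and each is tested and then processed back to back on that machine. This policy is an assumption chosen for illustration and is not claimed to be the paper's $2$-competitive algorithm; the function name `sum_completion_times` and the `(test_time, processing_time)` job format are hypothetical.

```python
import heapq


def sum_completion_times(jobs, m):
    """Sum of completion times for list-order, test-then-run scheduling on m machines.

    `jobs` is a list of (test_time, processing_time) pairs; the processing time
    is only "revealed" after the test, which this simple policy does not exploit.
    """
    free_at = [0] * m  # min-heap of times at which each machine becomes idle
    heapq.heapify(free_at)
    total = 0
    for test_time, processing_time in jobs:
        start = heapq.heappop(free_at)
        finish = start + test_time + processing_time
        total += finish
        heapq.heappush(free_at, finish)
    return total


if __name__ == "__main__":
    jobs = [(1, 2), (1, 1), (2, 2)]
    print(sum_completion_times(jobs, 2), sum_completion_times(jobs, 1))
```

Since no job can complete before its test finishes, delaying tests delays completions, which is the effect the completion-threshold quantities $T_X$ in the abstract are designed to measure.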

2026-06-01T10:17:47Z Kao-Chuan Liang Ya-Chun Liang http://arxiv.org/abs/2411.12438v2 Dimension Reduction via Sum-of-Squares and Improved Clustering Algorithms for Non-Spherical Mixtures 2026-06-01T08:02:06Z

We develop a new approach for clustering non-spherical (i.e., arbitrary component covariances) Gaussian mixture models via a subroutine, based on the sum-of-squares method, that finds a low-dimensional separation-preserving projection of the input data. Our method gives a non-spherical analog of the classical dimension reduction, based on singular value decomposition, that, among several other applications, forms a key component of the celebrated spherical clustering algorithm of Vempala and Wang [VW04]. As applications, we obtain an algorithm to (1) cluster an arbitrary total-variation separated mixture of $k$ centered (i.e., zero-mean) Gaussians with $n\geq \operatorname{poly}(d) f(w_{\min}^{-1})$ samples and $\operatorname{poly}(n)$ time, and (2) cluster an arbitrary total-variation separated mixture of $k$ Gaussians with identical but arbitrary unknown covariance with $n \geq d^{O(\log w_{\min}^{-1})} f(w_{\min}^{-1})$ samples and $n^{O(\log w_{\min}^{-1})}$ time. Here, $w_{\min}$ is the minimum mixing weight of the input mixture, and $f$ does not depend on the dimension $d$. Our algorithms naturally extend to tolerating a dimension-independent fraction of arbitrary outliers. Before this work, the techniques in the state-of-the-art non-spherical clustering algorithms needed $d^{O(k)} f(w_{\min}^{-1})$ samples and time for clustering such mixtures. Our results may come as a surprise in the context of the $d^{\Omega(k)}$ statistical query and sum-of-squares lower bounds [DKS17, DKPP24] for clustering non-spherical Gaussian mixtures. While these results are usually thought to rule out $d^{o(k)}$ cost algorithms for the problem, our results show that the lower bounds can in fact be circumvented for a remarkably general class of Gaussian mixtures.

2024-11-19T11:58:51Z 67 pages, updated to match camera-ready version at COLT 2026 Prashanti Anderson Mitali Bafna Rares-Darius Buhai Pravesh K. Kothari David Steurer http://arxiv.org/abs/2606.01809v1 A Near-Optimal Offline Algorithm for Dynamic All-Pairs Shortest Paths in Planar Digraphs 2026-06-01T07:27:10Z

In the planar, dynamic All-Pairs Shortest Paths (APSP) problem, a planar, weighted digraph $G$ undergoes a sequence of edge weight updates and the goal is to maintain a data structure on $G$ that can quickly answer distance queries between any two vertices $x,y \in V(G)$. The currently best algorithms for this problem require $\tilde{O}(n^{2/3})$ worst-case update and query time, while conditional lower bounds show that either update or query time $n^{0.5-\delta}$ is needed for any constant $\delta > 0$. In this article, we present the first algorithm with near-optimal $\tilde{O}(\sqrt{n})$ worst-case update and query time for the offline setting, where the update sequence is given initially. This result is obtained by giving the first offline dynamic algorithm for maintaining dense distance graphs (DDGs) faster than recomputing from scratch after each update. Further, we also present an online algorithm for the incremental APSP problem with $\tilde{O}(\sqrt{n})$ worst-case update/query time. This allows us to reduce the online dynamic APSP problem to the online decremental APSP problem, which constitutes partial progress even for the online version of this notorious problem.

2026-06-01T07:27:10Z Appeared in SODA'2022 Debarati Das Maximilian Probst Gutenberg Christian Wulff-Nilsen http://arxiv.org/abs/2606.01693v1 Scalable Concurrent Queues for GPU 2026-06-01T04:57:32Z

Concurrent queues can significantly impact supercomputing performance by being critical bottlenecks for task distribution, load balancing, and resource utilization. As HPC systems move beyond 10-million processor cores, the ability to rapidly move items between producer and consumer threads without excessive locking is essential for efficient queues, preventing idle cores, maximizing utilization, and achieving high parallel speedup. While concurrent queues are well studied on CPUs, they remain largely unexplored on modern GPUs, where SIMT execution, massive parallelism, and atomic contention reshape the design space. We present three linearizable GPU concurrent queues spanning from lock-free to wait-free guarantees: (1) G-WFQ-YMC, an adaptation of Yang and Mellor-Crummey's wait-free queue using preallocated segments; (2) G-LFQ, a bounded lock-free queue that uses wave-batched fast paths to maximize throughput; and (3) G-WFQ, a bounded wait-free queue that packs shared state into 64-bit compare-and-swap operations while preserving linearizability and bounded memory.

2026-06-01T04:57:32Z 10 pages, 5 figures Pratheek Prakash Shetty Thomas R. W. Scogland Wu-chun Feng http://arxiv.org/abs/2601.21237v2 Characterizing the Effect of Noise in Language Generation in the Limit 2026-06-01T03:47:51Z

Kleinberg and Mullainathan recently proposed a formal framework for studying the phenomenon of language generation, called language generation in the limit. In this model, an adversary gives an enumeration of example strings from an unknown target language, and the algorithm is tasked with correctly generating unseen strings from the target language within finite time. Refined notions of non-uniform and uniform generation were later introduced by Li, Raman, and Tewari (2025), and a noisy model was introduced by Raman and Raman (2025), which allows the adversary to insert extraneous strings. A natural question in the noisy model is to quantify the effect of noise, by studying the impact of each additional extraneous string. We show two complementary results in this setting. We first show that for both uniform and non-uniform generation, a single noisy string strictly reduces the set of collections that can be generated, thus answering an open question in Raman and Raman (2025). Then, we show for both uniform and non-uniform generation that generation with a single noisy string is equivalent to generation with any finite amount of noise, sharply contrasting with the strict hierarchy for noisy generation in the limit shown by Bai, Panigrahi, and Zhang (2026). Finally, we leverage our previous results to provide the first known characterization for non-uniform noise-dependent generatability.

2026-01-29T03:58:40Z ICML 2026 Aaron Li Ian Zhang http://arxiv.org/abs/2601.18115v2 Robust Learning of a Group DRO Neuron 2026-05-31T19:44:09Z

We study the problem of learning a single neuron under standard squared loss in the presence of arbitrary label noise and group-level distributional shifts, for a broad family of covariate distributions. Our goal is to identify a "best-fit" neuron parameterized by $\mathbf{w}_*$ that performs well under the most challenging reweighting of the groups. Specifically, we address a Group Distributionally Robust Optimization problem: given sample access to $K$ distinct distributions $\mathcal p_{[1]},\dots,\mathcal p_{[K]}$, we seek to approximate $\mathbf{w}_*$ that minimizes the worst-case objective over convex combinations of group distributions $\boldsymbol{\lambda} \in \Delta_K$, where the objective is $\sum_{i \in [K]}\lambda_{[i]}\,\mathbb E_{(\mathbf x,y)\sim\mathcal p_{[i]}}(\sigma(\mathbf w\cdot\mathbf x)-y)^2 - \nu d_f(\boldsymbol{\lambda},\frac{1}{K}\mathbf{1})$ and $d_f$ is an $f$-divergence that imposes (optional) penalty on deviations from uniform group weights, scaled by a parameter $\nu \geq 0$. We develop a computationally efficient primal-dual algorithm that outputs a vector $\widehat{\mathbf w}$ that is constant-factor competitive with $\mathbf{w}_*$ under the worst-case group weighting. Our analytical framework directly confronts the inherent nonconvexity of the loss function, providing robust learning guarantees in the face of arbitrary label corruptions and group-specific distributional shifts. The implementation of the dual extrapolation update motivated by our algorithmic framework shows promise on LLM pre-training benchmarks.
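In the unpenalized case (penalty parameter equal to zero) the objective is linear in the group weights, so its maximum over the simplex is attained at a single group and the worst-case value is simply the largest per-group loss. The sketch below evaluates that empirical quantity on samples. It is not the paper's primal-dual algorithm; the ReLU activation is an assumption (the abstract does not fix the activation), and the names `relu`, `group_losses`, and `worst_case_loss` are hypothetical.

```python
def relu(z):
    """Assumed activation; the abstract leaves the activation unspecified."""
    return max(0.0, z)


def group_losses(w, groups, activation=relu):
    """Empirical mean squared loss of the neuron on each group.

    `groups` is a list of sample lists; each sample is (x, y) with x a tuple.
    """
    losses = []
    for samples in groups:
        squared = [
            (activation(sum(wi * xi for wi, xi in zip(w, x))) - y) ** 2
            for x, y in samples
        ]
        losses.append(sum(squared) / len(squared))
    return losses


def worst_case_loss(w, groups, activation=relu):
    """Worst case over convex group weights when the divergence penalty is zero."""
    return max(group_losses(w, groups, activation))


if __name__ == "__main__":
    groups = [
        [((1.0,), 1.0), ((2.0,), 2.0)],
        [((1.0,), 0.0), ((-1.0,), 0.0)],
    ]
    print(group_losses((1.0,), groups), worst_case_loss((1.0,), groups))
```

With a positive penalty the maximizing weights are pulled toward uniform, so the worst-case value no longer reduces to a plain maximum.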

2026-01-26T04:00:53Z Guyang Cao Shuyao Li Sushrut Karmalkar Jelena Diakonikolas http://arxiv.org/abs/2411.15363v4 The Polymatroid Representation of a Greedoid, and Associated Galois Connections 2026-05-31T19:37:15Z

A greedoid is a generalization of a matroid allowing for more flexible analyses and modeling of combinatorial optimization problems. However, these structures decimate many matroid properties contributing to their pervasive nature. A polymatroid greedoid [KL85] presents an interesting middle ground, so we further develop this class. First we prove every local poset greedoid for which the greedy algorithm correctly solves linear optimizations over its basic words must have a polymatroid representation. For this, we use relationships between the lattices of greedoid flats and closed sets of a polymatroid to generalize concepts in [KL85]. Then, we show our generalization is defined by a Galois connection between the greedoid flats and closed sets of a representation. Finally, we apply this duality to identify a subclass of polymatroid greedoids with favorable properties, which we call strong polymatroid greedoids. As technical tools for our analyses, we introduce optimism and the Forking Lemma for interval greedoids. Both are pervasive in our work, and are of independent interest.

2024-11-22T22:12:35Z 38 pages, 8 figures, 4 appendices. In v1 there is an error in the proof of the main claim of an alternative description of polymatroid greedoids. This was noted in v2, and corrected in v3 by changing the main results to remove errors. In v4, some proofs were simplified by removing technical lemmas, and other minor improvements were made Robert P. Streit Vijay K. Garg http://arxiv.org/abs/2606.01333v1 Adversarial Configurations for the ReCom Transition Function 2026-05-31T16:35:51Z

ReCom is a leading Markov Chain Monte Carlo algorithm for sampling balanced graph partitions in computational redistricting. At each step, its transition function proposes a new partition by merging two adjacent districts and, if possible, re-splitting the conjoined region. The transition function is efficient in practice; however, it is unknown whether it is guaranteed to run in polynomial time. In this report we exhibit an explicit family of 3-partitions on planar square grid graphs from which ReCom requires an exponentially large expected number of steps to re-split the graph (even if we admit approximately balanced splits), showing that in the worst case ReCom does not run in polynomial time. Notably, this result implies that ReCom is not technically rapidly mixing (if started from an adversarial configuration, ReCom requires exponentially many steps to reach the stationary distribution).

2026-05-31T16:35:51Z Micah Gold http://arxiv.org/abs/2606.01330v1 On Thin Perfect Matchings up to Polylogarithmic Factors 2026-05-31T16:31:52Z

We resolve the thin matching problem proposed by Anari, Charikar and Ramakrishnan [ACR23] up to polylogarithmic factors. Given a fractional perfect matching $x$, we say a perfect matching $M$ is $\alpha$-thin w.r.t. $x$ if for any cut $(S,\overline{S})$, we have $$ |M \cap E(S,\overline{S})| \leq \alpha \cdot x(S,\overline{S}).$$ [ACR23] conjectured that for any fractional perfect matching $x$, there exists a perfect matching $M$ which is $O(1)$-thin w.r.t. $x$. First, we show that if $M$ is restricted to be in the support of $x$, then $\alpha \geq \Omega(n)$ and we complement this by designing an efficient algorithm that outputs an $O(n\log n)$-thin perfect matching where $n$ is the number of vertices. Then, we relax this constraint and show that for any fractional perfect matching $x$, there is a perfect matching $M$ (which is not necessarily in the support of $x$) such that $M$ is $\text{polylog}(n)$-thin w.r.t. $x$. All results work for both bipartite and non-bipartite graphs. We also discuss applications to the metric distortion problem.
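The thinness definition can be checked directly on tiny instances by enumerating every cut. The sketch below computes the smallest thinness factor for which a given matching is thin with respect to a given fractional perfect matching; it takes exponential time and is only meant to make the definition tangible, not to reflect the paper's algorithms. The function name `thinness`, the vertex labelling `0..n-1`, and the dictionary format for `x` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def thinness(n, x, matching):
    """Smallest factor a with |M ∩ cut| <= a * x(cut) for every cut.

    `x` maps edges (u, v) to weights; `matching` is a list of edges (u, v).
    """
    worst = Fraction(0)
    for size in range(1, n):
        for subset in combinations(range(n), size):
            side = set(subset)

            def crosses(edge):
                return (edge[0] in side) != (edge[1] in side)

            m_cut = sum(1 for e in matching if crosses(e))
            if m_cut == 0:
                continue
            x_cut = sum(w for e, w in x.items() if crosses(e))
            if x_cut == 0:
                return float("inf")  # the matching crosses a cut that x does not
            worst = max(worst, Fraction(m_cut) / x_cut)
    return worst


if __name__ == "__main__":
    half = Fraction(1, 2)
    # Fractional perfect matching on the 4-cycle 0-1-2-3-0: weight 1/2 per edge.
    x = {(0, 1): half, (1, 2): half, (2, 3): half, (3, 0): half}
    print(thinness(4, x, [(0, 1), (2, 3)]))
```

On the 4-cycle example, the cut separating vertices 0 and 3 from vertices 1 and 2 is crossed by both matching edges while carrying total fractional weight 1, which gives the factor 2.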

2026-05-31T16:31:52Z Alireza Haqi Shayan Oveis Gharan http://arxiv.org/abs/2606.01309v1 Multiagent Matroid Upgrading: Greedy is Fair and Efficient 2026-05-31T15:57:38Z

This paper introduces a general multiagent matroid upgrading problem that models a broad class of real-world resource allocation tasks. In this setting, there are multiple agents and a ground set of elements, where each element is assigned to a specific agent and has two associated costs: a default cost and a reduced (upgraded) cost. Upgrading an element lowers its cost to the upgraded value, while non-upgraded elements retain their default costs. Each agent is associated with its own matroid, with the goal of finding a minimum-cost basis. The central task is to select at most k elements to upgrade so as to minimize a non-decreasing convex function over the agents' minimum basis costs, capturing both efficiency and fairness objectives in multiagent systems.

2026-05-31T15:57:38Z Appeared in AAMAS 2026 Qingwen Ma Chao Peng Changfeng Xu Chenyang Xu Ruilong Zhang