https://arxiv.org/api/bXQdzN3ij3v2i5AeDuZFfRfjmpQ2026-06-09T20:34:03Z28933015http://arxiv.org/abs/2606.09729v1Bayesian Probing on Graphs2026-06-08T16:51:30ZWe introduce a stochastic probing problem with correlated items. In our model, which we call Bayesian Probing, the correlations are modeled by an underlying graph $G$. Each vertex is independently active with a known probability. Each item corresponds to an edge in the graph. Probing an edge has some cost, gives some reward if both endpoints are active, and also reveals the state of its endpoints. Hence a probe induces a Bayesian update on the remaining edges. The goal is to adaptively probe items/edges subject to a knapsack constraint to maximize the expected total reward obtained from the probed edges.
Bayesian Probing generalizes stochastic knapsack and stochastic probing by allowing correlations between items. Moreover, it gives a tractable model for the Bayesian Active Search problem, a popular problem considered in the machine learning community. In Bayesian Active Search, the goal is to find items in a particular class by adaptively probing at most, say $k$, items. Given a prior distribution over items, we want to compute a Bayesian policy to maximize the number of such items found. For this general problem with arbitrary priors, there are strong lower bounds on efficiently computing good policies.
In this paper, we design efficient approximation algorithms for Bayesian Probing. These results give the first efficient approximation algorithms for Bayesian Active Search, for a class of practically-relevant prior distributions.2026-06-08T16:51:30ZAnupam GuptaBenjamin MoseleyRudy Zhouhttp://arxiv.org/abs/2606.09728v1Quantum Cut Sparsifiers2026-06-08T16:51:08ZIn this paper, we continue a line of research initiated by Basu, Brakensiek, and Putterman [2026] studying the sparsifiability of Hamiltonians. We focus particularly on the sparsifiability of the widely-studied Quantum Cut (QC) Hamiltonians. Our main result is that in an $n$-qubit system, any $n$-qubit QC Hamiltonian can be sparsified to $\widetilde{O}(n /\varepsilon^2)$ many terms while preserving the energy of every state up to a factor of $1 \pm \varepsilon$. Our result can be interpreted as giving an importance sampling scheme for the edges of an arbitrary graph $G$ such that the \emph{Kikuchi} graph at level $\ell$ of the sampled graph is a spectral approximation to the Kikuchi graph of $G$. Importantly, the \emph{same} sampling scheme works simultaneously for all $\ell$.
The natural approach of leverage score sampling, analyzed via matrix concentration inequalities, yields a polynomially worse bound in our setting because the underlying matrices have dimension $\sim 2^n$. Instead, our approach relies on decomposing the action of these matrices into invariant subspaces. Then, by using an operator-valued inequality of Alon and Kozma [Ann. Henri Poincaré, 2020], itself building on an \emph{octopus inequality} of Caputo, Liggett, and Richthammer [J. AMS, 2010], we extend our sparsification technique to all expander graphs. We then invoke expander decomposition to extend our sparsifier to all graphs.2026-06-08T16:51:08ZArpon BasuJoshua BrakensiekPravesh K. KothariAaron Puttermanhttp://arxiv.org/abs/2505.05847v4Smaller and More Flexible Cuckoo Filters2026-06-08T13:40:57ZCuckoo filters are space-efficient approximate set membership data structures with a controllable false positive rate (FPR) and zero false negatives, similar to Bloom filters. In contrast to Bloom filters, Cuckoo filters store multi-bit fingerprints of keys in a hash table using variants of Cuckoo hashing, allowing each fingerprint to be stored at a small number of possible locations. Existing Cuckoo filters use fingerprints of $(k+3)$ bits per key and an additional space overhead factor of at least $1.05$ to achieve an FPR of $2^{-k}$. For $k=10$, this amounts to $1.365\, kn$ bits to store $n$ keys, which is better than $1.443\, kn$ bits for Bloom filters. The $+3$ for the fingerprint size is required to balance out the multiplied FPR caused by looking for the fingerprint at several locations. In the original Cuckoo filter, the number of hash table buckets is restricted to a power of 2, which may lead to much larger space overheads, up to $2.1\, (1+3/k)\, kn$ bits.
We present two improvements of Cuckoo filters. First, we remove the restriction that the number of buckets must be a power of 2 by using a different placement strategy. Second, we reduce the space overhead factor of Cuckoo filters to $1.06 \, (1+2/k)$ by using overlapping windows instead of disjoint buckets to maintain the load threshold of the hash table, while reducing the number of alternative slots where any fingerprint may be found.
A detailed evaluation demonstrates that the alternative memory layout based on overlapping windows decreases the size of Cuckoo filters not only in theory, but also in practice. A comparison with other state-of-the art filter types, Prefix filters and Vector Quotient filters (VQFs), shows that the reduced space overhead makes windowed Cuckoo filters the smallest filters supporting online insertions, with similarly fast queries, but longer insertion times.2025-05-09T07:37:38ZJohanna Elena SchmitzJens ZentgrafSven Rahmannhttp://arxiv.org/abs/2511.20397v2Model-Based Learning of Whittle indices2026-06-08T13:11:50ZWe present BLINQ, a new model-based algorithm that learns the Whittle indices of an indexable, communicating and unichain Markov Decision Process (MDP). Our approach relies on building an empirical estimate of the MDP and then computing its Whittle indices using an extended version of a state-of-the-art existing algorithm. We provide a proof of convergence to the Whittle indices we want to learn as well as a bound on the time needed to learn them with arbitrary precision. Moreover, we investigate its computational complexity. Our numerical experiments suggest that BLINQ significantly outperforms existing Q-learning approaches in terms of the number of samples needed to get an accurate approximation. In addition, it has a total computational cost even lower than Q-learning for any reasonably high number of samples. These observations persist even when the Q-learning algorithms are speeded up using neural networks to predict Q-values.2025-11-25T15:21:00Z30 pages, 7 figures, submitted to TOMPECSJoël Charles-RebufféNicolas GastBruno Gaujalhttp://arxiv.org/abs/2606.09318v1Engineering Scalable Distributed List Ranking2026-06-08T10:28:44ZThe list ranking problem is one of the classical problems of parallel computing, with nontrivial algorithms and many applications as a subroutine for solving other problems. While it has been intensively studied in the early days of parallel computing, few things happened in the last 20 years. In particular, there is little work on scaling list ranking to large machines and input sizes. We reconsider list ranking starting from the ground-breaking results of Sibeyn a quarter century ago. We employ algorithm and performance engineering to improve his sparse ruling-set algorithm, making it capable of scaling to many processors, and provide a more detailed analysis of the impact of the algorithm's parameters, further guiding our practical implementation.
We perform an extensive experimental study across a variety of input instances with different structural properties. We demonstrate that indirect communication, exploiting input locality, and message coalescing allows scaling to billions of elements on up to 24,576 cores.2026-06-08T10:28:44Z14 pages, 4 figuresPeter SandersMatthias SchimekTim Niklas UhlThomas Weidmannhttp://arxiv.org/abs/2606.09133v1Multiversion Concurrency Control for Multiversion B-Trees2026-06-08T07:31:22ZMultiversion concurrency control (MVCC) enables scans to read data from a committed snapshot (version), reducing conflicts with write operations compared to traditional concurrency approaches. Currently, versioned records are often managed in a B$^+$-tree using version chains. However, version chains introduce overhead during scans and can still lead to conflicts between scans and writers. The multiversion B-tree (MVBT) was designed for optimal range scan performance on arbitrary versions, but has been considered impractical due to its structural complexity and, until recently, the lack of effective concurrency control. In this paper, we present the concurrent MVBT (cMVBT), a redesign of the MVBT featuring a novel concurrency control protocol that uses optimistic latches for write operations and requires no latches for range scans, while preserving all the optimality guarantees of the original MVBT. Additionally, cMVBT supports continuous garbage collection without activity spikes, seamlessly integrating free-space management. Experiments with mixed workloads derived from a standard benchmark show that the cMVBT achieves low overhead, high write throughput, and excellent range scan performance, outperforming state-of-the-art methods based on version chains.2026-06-08T07:31:22ZAmir TontaBernhard SeegerEljas Soisalon-Soininenhttp://arxiv.org/abs/2606.01342v3Towards Optimal Robustness in Learning-Augmented Paging2026-06-08T06:33:53ZLearning-augmented paging has been extensively studied in recent years. A key advantage over naive ML-based approaches is \emph{bounded robustness}, which guarantees worst-case performance even when predictions are inaccurate, making these algorithms valuable for real-world systems. Prior work achieves robustness bounds of $2H_k + O(1)$ in the randomized setting, leaving a gap to the optimal competitive ratio $H_k$.
In this paper, we study how to close this gap. We begin by reviewing online optimality and proving a new property of the latest $H_k$-competitive algorithm, which facilitates our analysis in the learning-augmented setting. Then, we review existing learning-augmented paging algorithms and introduce a unifying primitive, the \emph{relative prediction budget}, which captures the essence of establishing robustness and reveals that prior algorithms either overuse or underutilize predictions. Guided by the above analysis, we develop a new framework that achieves the best-possible robustness up to an additive constant for learning-augmented paging: $H_k + O(1)$. Experiments further demonstrate strong practical performance.2026-05-31T16:49:36ZICML 2026Peng ChenHailiang ZhaoXueyan TangYixuan WangShuiguang Denghttp://arxiv.org/abs/2507.03980v4Functional design of efficient and parallelizable combinatorial generators using convolution2026-06-08T04:17:36ZThe application of program transformation and algebraic methods to the development of efficient combinatorial optimization (CO) algorithms relies on an exhaustive combinatorial generator for the problem specification, followed by the fusion of thinning or filtering processes into this specification. However, the effectiveness of such fusion transformations critically depends on the structural compatibility between the objective function and the generator, which is highly problem dependent. In practice, when the majority of candidate solutions remain unfiltered or are not eliminated-as is the case for most intractable CO problems-the overall efficiency of the resulting fused program is largely determined by the intrinsic efficiency of the combinatorial generator. Consequently, if the specification itself exhibits suboptimal performance, the fused program will inherit a correspondingly inferior level of efficiency.
We argue that a genuine designed process should also account for hardware compatibility and parallelizability-particularly the ability to support efficient parallel execution on modern hardware architectures, including multi-level cache hierarchies and GPUs. However, does achieving formal correctness necessarily conflict with designing algebraically elegant algorithms that support fusion? Can we obtain both simultaneously?
In this paper, we show that techniques from functional programming, provide powerful formal tools for the systematic construction of such hardware-compatible and parallelizable combinatorial generators. This paper investigates generators for two of the most fundamental combinatorial structures-combinations and permutations-together with their natural extension to nested generators (e.g., combinations/permutations of combinations/permutations).2025-07-05T10:11:37ZXi HeZhenjiang HuMax. A. Littlehttp://arxiv.org/abs/2606.08977v1Online Learning with Recency: Algorithms for Sliding-window Streaming Multi-armed Bandits2026-06-08T03:21:54ZMotivated by the recency effect in online learning, we study algorithms for single-pass *sliding-window streaming multi-armed bandits (MABs)* in this paper. In this setting, we are given $n$ arms with unknown sub-Gaussian reward distributions and a parameter $W$. The arms arrive in a single-pass stream, and only the most recent $W$ arms are considered valid. The algorithm is required to perform pure exploration and regret minimization with limited memory, defined as the number of stored arms. The model is a natural extension of the streaming multi-armed bandits model (without the sliding window) that has been extensively studied in recent years. We provide a comprehensive analysis of both the pure exploration and regret minimization problems with the model. For pure exploration, we prove that finding the best arm is hard with sublinear memory while finding an approximate best arm admits an efficient algorithm. For regret minimization, we explore a new notion of regret and give sharp memory-regret trade-offs for any single-pass algorithm. We complement our theoretical results with experiments, demonstrating the trade-offs between sample, regret, and memory.2026-06-08T03:21:54ZICML 2026Vladimir BravermanChen WangLiudeng WangSamson Zhouhttp://arxiv.org/abs/2508.11874v2Discovering Expert-Level Nash Equilibrium Algorithms with Large Language Models2026-06-08T00:24:06ZDesigning polynomial-time algorithms for approximate Nash equilibria (ANE) with provable worst-case guarantees is a fundamental open problem in algorithmic game theory. While large language models (LLMs) can generate candidate algorithms at scale, certifying worst-case guarantees requires formal analysis over all game instances -- a task for which no automated system previously existed. Here, we present LegoNE, a framework encoding expert proof strategies into a symbolic language that automatically compiles any candidate algorithm into a finite optimization problem certifying its worst-case guarantee. Integrating LegoNE with a reasoning LLM, we rediscovered an algorithm matching the best polynomial-time guarantee for two-player games, and discovered a three-player algorithm improving the best guarantee from $0.6+δ$ to $0.5+δ$ -- provably beyond the reach of the extension technique, the only previously known multi-player ANE design paradigm. These results show that encoding domain-specific proof strategies into a machine-tractable language can support LLM-driven discovery of algorithms outside known human design paradigms.2025-08-16T02:18:43Zaccepted by Nature CommunicationsHanyu LiDongchen LiXiaotie Deng10.1038/s41467-026-74003-1http://arxiv.org/abs/2603.04689v3Generalizing Fair Top-$k$ Selection: An Integrative Approach2026-06-07T17:37:24ZFair top-$k$ selection, which ensures appropriate proportional representation of members from minority or historically disadvantaged groups among the top-$k$ selected candidates, has drawn significant attention. We study the problem of finding a fair (linear) scoring function with multiple protected groups while also minimizing the disparity from a reference scoring function. This generalizes the prior setup, which was restricted to the single-group setting without disparity minimization. Previous studies imply that the number of protected groups may have a limited impact on the runtime efficiency. However, driven by the need for experimental exploration, we find that this implication overlooks a critical issue that may affect the fairness of the outcome. Once this issue is properly considered, our hardness analysis shows that the problem may become computationally intractable even for a two-dimensional dataset and small values of $k$. However, our analysis also reveals a gap in the hardness barrier, enabling us to recover the efficiency for the case of small $k$ when the number of protected groups is sufficiently small. Furthermore, beyond measuring disparity as the "distance" between the fair and the reference scoring functions, we introduce an alternative disparity measure$\unicode{x2014}$utility loss$\unicode{x2014}$that may yield a more stable scoring function under small weight perturbations. Through careful engineering trade-offs that balance implementation complexity, robustness, and performance, our augmented two-pronged solution demonstrates strong empirical performance on real-world datasets, with experimental observations also informing algorithm design and implementation decisions.2026-03-05T00:06:47ZGuangya Caihttp://arxiv.org/abs/2508.12536v3jXBW: A Compressed Index for Structure-Aware JSONL Retrieval in Structured RAG2026-06-07T16:31:47ZProviding \textit{structured} information to large language models (LLMs) improves multi-step reasoning and factual grounding, and recent retrieval-augmented generation (RAG) systems therefore reconstruct structure from retrieved text on every query. When the corpus is \emph{already} structured -- as in JSON Lines (JSONL), a popular format for LLM prompts, chemical compounds, and geospatial records -- this per-query rebuilding can be replaced by direct \emph{structural retrieval}. The core primitive is \textit{substructure search}: finding all JSON objects in a collection that contain a given query pattern. Existing approaches index each document separately, so both index space and query time grow with the total collection size; XML-based engines add conversion overhead and semantic mismatches. We propose \textbf{jXBW}, a compressed index for fast substructure search over JSONL, combining three innovations: (i) a merged tree representation that consolidates repeated structures across objects, (ii) a succinct tree index based on the eXtended Burrows--Wheeler Transform (XBW), and (iii) a newly developed three-phase substructure search algorithm that runs on this index. Together they achieve \textbf{query-dependent complexity}: the cost is determined by query characteristics rather than collection size, in compressed space. Experiments on seven real-world datasets, including PubChem ($10^6$ compounds) and OpenStreetMap ($6.6 \times 10^6$ objects), show that jXBW outperforms the strongest tree-based baseline by $\mathbf{16\times}$ on the smallest dataset and by up to $\mathbf{2{,}800\times}$ on the largest, and is more than $\mathbf{2 \times 10^6\times}$ faster than the XQuery engine Saxon. jXBW thus brings structural retrieval over million-record JSONL collections into the sub-millisecond range.2025-08-18T00:14:24ZYasuo Tabeihttp://arxiv.org/abs/2606.08713v1The price of incrementality in k-center clustering2026-06-07T16:09:04ZThe $k$-center problem is one of the best-studied and most intuitive clustering formulations. It asks, given a set of $n$ points in a metric space, for $k$ of the points to be designated as cluster centers, so that the maximum distance of an input point to its nearest center is minimized. Gonzalez's greedy algorithm from 1985 is a simple and efficient way to find a $2$-approximate solution. The algorithm has the attractive feature of \emph{incrementality}: it outputs the centers one by one, with a guaranteed $2$-approximation for every prefix of the obtained sequence of centers.
Incrementality imposes a geometric constraint on how solutions can be built, and it is natural to ask whether this comes at a price in the quality of the solution. It is known that in polynomial time, the approximation ratio of $2$ is best possible, assuming $P \neq NP$. In this paper we show that even with \emph{unlimited} computational power, the factor $2$ cannot be improved, if the solution is required to be built incrementally. The lower bound construction imposes a tradeoff between all $n$ levels of the clustering simultaneously; it was obtained with the help of ChatGPT, an aspect we discuss in Section 3 of the paper.2026-06-07T16:09:04ZLászló Kozmahttp://arxiv.org/abs/2606.08698v1Quotient Admission Algorithms for Witness-Supported Graph Windows2026-06-07T15:53:21ZWe formulate the quotient admission problem for finite graph-window rows. The input is a finite row set, an admissible evidence map, semantic labels, witness-support hypergraphs, and atom-level admissibility predicates. The output is a quotient decision on evidence atoms, with possible decisions certificate, residual, low-confidence, or blocked. The problem asks for the maximal guard-respecting atom-level decision map that uses no refinement beyond the admissible evidence partition. We prove an atom-union characterization of identifiable classes, give a witness-support hypergraph guard for certificate admission, characterize projected-label conflicts as blocked atoms, and present quotient admission algorithms with correctness, maximality, and complexity guarantees. With explicit evidence vectors and hyperedges, the algorithms run in expected O(B + I + n) time and space by hashing and deterministic O(B + I + n log n) time by sorting under a key-linear comparison model, where n is the number of rows, B is the total evidence encoding length, and I is the total hyperedge incidence size. We also prove a magnitude-only indistinguishability lower bound: any evaluator that observes only residual magnitudes fails on instances whose evidence atoms require different residual decisions after the magnitudes collapse them.2026-06-07T15:53:21Z11 pagesYushan Lihttp://arxiv.org/abs/2606.08662v1Uncertainty Principles for the Number Theoretic Transform2026-06-07T15:05:53ZMotivated by polynomial identity testing with exponentials (Li and Wu, ITCS'26), we study uncertainty principles for the number-theoretic transform (NTT). We show that the NTT satisfies strong sparsity tradeoffs: For every fixed prime $q$ and for all but finitely many primes $p \equiv 1 \pmod q$ every nonzero $f\in \mathbb F_p^{\mathbb Z_q}$ and its number-theoretic transform $\hat f$ satisfy \[ |\mathrm{Supp}(f)| + |\mathrm{Supp}(\hat f)| \ge q+1. \] Thus, a $k$-sparse function has transform support at least $q-k+1$. As our main technical contribution, we prove a probabilistic version of the above uncertainty principle, averaged over primes $p$, in the regime $p=q^{O(1)}$.
As an application, we obtain a black-box identity test for $k$-sparse exponential polynomials of degree at most $d$ with vanishing soundness error, for $q$ moderately larger than $k$.2026-06-07T15:05:53ZGiulio MalavoltaAlon Rosen