Planar Perfect Matching Counting is as Hard as Determinants

2026-06-02T17:56:22Z

In the 1960s, Fisher, Kasteleyn and Temperley designed an ingenious algorithm for computing the partition function of the dimer model, or equivalently, for counting perfect matchings in edge-weighted planar graphs (Philos. Mag. 1961; J. Mathematical Phys. 1963). This FKT algorithm later became the foundation for Valiant's holographic algorithms (FOCS 2004; SIAM J. Comput. 2008), which motivated the study of counting problems under the Holant framework. Combined with an algorithm by Yuster (FOCS 2008), the FKT algorithm allows us to count edge-weighted perfect matchings in planar $n$-vertex graphs with $\tilde{O}(n^{ω/2})$ arithmetic operations, where $ω<2.372$ is the matrix multiplication exponent. We prove a corresponding lower bound: Over algebraic circuits and other sufficiently strong computational models, perfect matchings in edge-weighted $n$-vertex planar graphs $G$ cannot be counted in $O(n^{ω/2-ε})$ arithmetic operations. This confirms the optimality of Yuster's algorithm. Our bound holds even when $G$ is an edge-weighted square grid.

Ranked MSO-enumeration over compressed words

2026-06-02T17:37:07Z

It is shown that the ranked query enumeration problem for a fixed MSO-query on strings can be solved with linear preprocessing and constant delay in the grammar-compressed setting, where the input string is given by a so-called straight-line program, i.e., a context-free grammar that produces exactly one string. Moreover, `ranked' means that the output tuples of the MSO-query are printed in a specific order that has to be MSO-definable. This is the first result for ranked query enumeration on compressed data. A corollary of this result is that for a fixed polyregular function $f$ and a word $w$ that is given by a straight-line program of size $n$, one can list after preprocessing time $\mathcal{O}(n)$ the symbols in $f(w)$ from left to right with constant delay, which generalizes a result of Bojanczyk for the case where $w$ is uncompressed. The proofs for these results are based on factorization trees, which are made accessible to the grammar-compressed setting (a contribution of independent interest).

Revisiting $O(n \log \log n)$ chaining for anchored edit distance

2026-06-02T17:18:53Z

Colinear chaining is a classical heuristic for sequence alignment: it enables scalable genome comparison and is a main component of many state-of-the-art read mappers based on seed-chain-extend. The earliest $O(n \log \log n)$ time algorithms by Eppstein et al. (J. ACM, 1992) chained $n$ fragments between two sequences $T$ and $Q$ while minimizing a gap cost based on the diagonal distance $Δ_{\text{diag}}$ between consecutive fragments. They also forbid fragment overlaps, which are essential in current chaining formulations: in long-read mapping, overlaps improve sensitivity and avoid restrictions on the fragment class considered. Jain, Gibney, and Thankachan (J. Comput. Biol. 2022) recently combined a $Δ_{\text{diag}} = |Δ_T -Δ_Q|$ overlap cost with the classic $L_\infty = \max(Δ_T , Δ_Q)$ gap cost that takes the maximum between the horizontal and vertical gap between the fragments and they proved that chaining under this cost model is equivalent to the anchored edit distance. We improve the existing $O(n \log^3 n)$-time algorithm for anchored edit distance to $O(n \log \log n)$ time in $O(n)$ space, by combining the gap-cost computation of Chao and Miller (Algorithmica, 1995) with the overlap-cost computation of Baker and Giancarlo (ESA, 1998). By developing llchain, a simpler $O(n \log n)$-time implementation of our method, we show how chaining algorithms that might have been recently overlooked by the bioinformatics community scale competitively to millions of fragments and large genomes. On average, llchain is $10\times$ faster than other methods on instances with $3\,000\,000$ anchors, and over $3\times$ faster on MEMs between HiFi reads and a reference human genome.

Learning DNF through Generalized Fourier Representations

2026-06-02T16:38:41Z

The Boolean Fourier representation has been widely used in learning theory, particularly for learning Disjunctive Normal Form (DNF) under uniform and product distributions. Extending these results to non-product distributions has remained a longstanding open problem. We address this challenge by introducing a generalized Fourier representation that enables learning under a broad class of non-product distributions. Our approach represents any distribution $D$ as a Bayesian network (BN) and derives a corresponding Fourier expansion. We show that standard Fourier-based learning techniques using membership queries to identify heavy coefficients can be adapted to this generalized representation with minor modifications. We prove that the $L_1$ spectral norm of conjunctions remains bounded under this expansion for difference-bounded tree BNs, significantly generalizing the known result for uniform distributions; matching lower bounds demonstrate the necessity of these constraints. Using these results, we establish the learnability of DNF and the agnostic learnability of decision trees under such distributions. Finally, we present an algorithm for learning difference-bounded tree BN distributions, extending our results to settings where the distribution is unknown.

ETH-Tight Complexity of Optimal Morse Matching on Bounded-Treewidth Complexes

2026-06-02T14:22:20Z

The Optimal Morse Matching (OMM) problem asks for a discrete gradient vector field on a simplicial complex that minimizes the number of critical simplices. It is NP-hard and has been studied extensively in heuristic, approximation, and parameterized complexity settings. Parameterized by treewidth $k$, OMM has long been known to be solvable on triangulations of $3$-manifolds in $2^{O(k^2)} n^{O(1)}$ time and in FPT time for triangulations of arbitrary manifolds, but the exact dependence on $k$ has remained an open question. We resolve this by giving a new $2^{O(k \log k)} n$-time algorithm for any finite regular CW complex, and show that no $2^{o(k \log k)} n^{O(1)}$-time algorithm exists unless the Exponential Time Hypothesis (ETH) fails.

Stepsize Hedging: an Alternative Mechanism for Accelerating Gradient Descent

2026-06-02T14:12:25Z

Can gradient descent be accelerated by just choosing better stepsizes? Surprisingly, the answer is yes. This short expository article provides an accessible introduction to this phenomenon of stepsize hedging.

Deterministic Distance Approximation in MPC via Improved Hitting Sets

2026-06-02T13:59:26Z

In this paper, we provide the first deterministic algorithms with sublogarithmic round complexity for spanners and approximate shortest paths in various MPC models. Moreover, we significantly improve upon the state of the art in the deterministic Congested Clique. In particular, we obtain the following four results on undirected graphs: 1. In both linear MPC and Congested Clique, we obtain an $O(k)$ stretch-spanner of a weighted graph of size $O(n^{1+1/k})$ in $O(1)$ rounds, for some parameter $k\ge 0$. For $k=O(\log{n})$, this leads to an $O(\log n)$ approximation of APSP in constant rounds in both models. 2. In sublinear MPC, we obtain an $O(k^{1+\varepsilon})$-stretch spanner of a weighted graph of size $O(n^{1+1/k})$ in $O(\log k)$ rounds, for any fixed constant $\varepsilon>0$. 3. In Congested Clique, we obtain $O(1)$-approximate APSP for weighted graphs in $O(\log \log \log n)$ rounds. 4. In near-linear MPC, we obtain $(1+\varepsilon)$-approximate single-source shortest paths and $O(1)$-approximate all-pairs shortest paths for unweighted graphs in $\textsf{poly}\log \log n$ rounds. Our algorithm only requires a single near-linear memory machine, where the rest can have sublinear memory. Our deterministic algorithms obtain similar guarantees to the state of the art randomized algorithms without incurring additional factors in the round complexity. To obtain these results, we inspect the randomized algorithms and isolate a randomized sampling routine. Then we derandomize these sampling routines by using a deterministic hitting set. Hereto, we develop a versatile deterministic hitting set algorithm, which we hope will have further derandomization applications.

Algorithmically Fair Maximization of Multiple Submodular Objective Functions and Implications to Constrained Fair Division

2026-06-02T13:33:00Z

Constrained maximization of submodular functions is a central problem in combinatorial optimization. In many realistic scenarios, multiple agents each need to maximize their own submodular objective over a common ground set, subject to individual constraints, with the requirement that their solutions be disjoint. We study this setting through the lens of algorithmic fairness and constrained fair division. Inspired by the fair division literature, we propose and analyze a simple Round-Robin protocol in which agents take turns building their solutions one item at a time; each agent is free to use any internal algorithm, and the protocol itself performs no computation. We show that agents following simple greedy policies enjoy solid guarantees for both monotone and non-monotone objectives subject to constraints as general as $p$-systems. For monotone objectives, a greedy agent $i$ with a $p_i$-system constraint achieves a $1/(n+p_i)$ fraction of the best value available when they first get to choose. On instances that are robust to competition -- where no agent's optimal value is greatly affected by losing some items to others -- these guarantees improve to a $1/Θ(p_i)$ approximation of the unconstrained optimum, which is asymptotically best-possible in polynomial time. We further establish novel fairness guarantees: greedy agents produce approximately feasible-envy-free-up-to-one-item (FEF1) and approximately feasible-envy-free-towards-unallocated-items (FEFu) allocations for monotone and non-monotone objectives. Via a simple augmented protocol and a self-contained polynomial-time proxy algorithm, we also obtain the first $Θ(1/p_i)$-approximate feasible maximin share (FMMS) guarantees for submodular agents with combinatorial constraints. Finally, although greedy policies may not be individually optimal, consistently improving upon them is NP-hard even in the simplest settings.

The anti-lexicographic SUS-anchor: a near-optimal k=1 sampling scheme

2026-06-02T07:13:57Z

In recent years, there has been a renewed interest in the search for low density minimizer schemes. These schemes take a window of $w$ consecutive $k$-mers, and sample one of them: the smallest under some specific order. Schemes such as the mod-minimizer provide a low density (fraction of sampled $k$-mers) when $k \gg w$, while schemes such as the greedy minimizer work well for explicit small parameters roughly in the regime $k \leq 2w$, for $k$ and $w$ up to $15$ or so. When $k < \log_σw$ is very small, minimizer schemes cannot do well, and more general sampling schemes are needed that can be richer than just comparing $k$-mers. Bidirectional-string anchors (bd-anchors) form one such scheme. Inspired by bd-anchors, we introduce the smallest unique substring or SUS-anchor: Given a window, this considers all suffixes that do not occur as a substring elsewhere in the window. It then samples the start position of the smallest suffix according to the new anti-lexicographic order that minimizes the first character and maximizes the remaining characters. We give a linear-time and $O(w)$ space streaming algorithm to compute all SUS-anchors of a string. For alphabet size $σ=4$ and $k=1$, the anti-lexicographic SUS-anchor empirically has density $<1\%$ away from the density lower bound, significantly improving over bd-anchors that are often $>15\%$ above it. For alphabet size $σ=2$, the density is at most $10\%$ above the lower bound, which again improves over the $>50\%$ overhead of bd-anchors.

HRNN: A Hybrid Graph Index for Approximate Reverse k-Nearest Neighbor Search on High-Dimensional Vectors

2026-06-02T06:35:54Z

Reverse k-nearest neighbor (RkNN) search returns all data points that regard a query vector as one of their k-nearest neighbors (kNNs). Existing RkNN methods typically follow a filter-and-verification framework: vectors near the query vector are first collected as candidates and then verified against their kNN-radius (i.e., the distance to their k-th nearest neighbor). However, existing methods face two key limitations in high-dimensional spaces. First, nearby vectors often do not belong to the query's true RkNN set, resulting in excessive candidate expansion overhead. Second, existing methods compute kNN-radius online during verification, incurring substantial query-processing cost. To address these limitations, we propose HRNN, a hybrid graph index for approximate RkNN search. (1) Rather than directly treating nearby vectors as RkNN candidates, HRNN uses them as proxy points based on the assumption that a query's RkNN results can often be discovered through the RkNN results of its nearby vectors. (2) To reduce verification cost, HRNN materializes high-fidelity kNN-radius offline, eliminating expensive online reconstruction while preserving accuracy. HRNN combines a navigation graph, a ranked KNN graph, and reverse-neighbor lists into a hybrid index that supports efficient proxy retrieval, candidate generation, and kNN-radius access. We also develop efficient index construction and append-only maintenance algorithms. Extensive experiments show that HRNN consistently outperforms existing methods, achieving up to one order of magnitude higher throughput. Moreover, HRNN scales to datasets containing up to 10 million high-dimensional vectors while supporting efficient dynamic index maintenance.

BBC: Improving Large-k Approximate Nearest Neighbor Search with a Bucket-based Result Collector

2026-06-02T03:45:45Z

Although Approximate Nearest Neighbor (ANN) search has been extensively studied, large-k ANN queries that aim to retrieve a large number of nearest neighbors remain underexplored, despite their numerous real-world applications. Existing ANN methods face significant performance degradation for such queries. In this work, we first investigate the reasons for the performance degradation of quantization-based ANN indexes: (1) the inefficiency of existing top-k collectors, which incurs significant overhead in candidate maintenance, and (2) the reduced pruning effectiveness of quantization methods, which leads to a costly re-ranking process. To address this, we propose a novel bucket-based result collector (BBC) to enhance the efficiency of existing quantization-based ANN indexes for large-k ANN queries. BBC introduces two key components: (1) a bucket-based result buffer that organizes candidates into buckets by their distances to the query. This design reduces ranking costs and improves cache efficiency, enabling high performance maintenance of a candidate superset and a lightweight final selection of top-k results. (2) two re-ranking algorithms tailored for different types of quantization methods, which accelerate their re-ranking process by reducing either the number of candidate objects to be re-ranked or cache misses. Extensive experiments on real-world datasets demonstrate that BBC accelerates existing quantization-based ANN methods by up to 3.8x at recall@k = 0.95 for large-k ANN queries.

Graphical and algebraic methods for Boolean factoring

2026-06-02T02:57:27Z

The problem of factoring Boolean polynomials has significant applications in both classical and quantum computing technology. In this paper we have developed novel algorithms for factoring both ESOP and SOP expressions. Our aim is to optimize the AND-count. The AND-count plays a key role in determining the number of AND and Toffoli gates required to implement a reversible function with classical and quantum circuits, respectively. The first type of algorithms that we develop, are graphical. We reduce the problem of Boolean factoring to covering a bipartite graph with bicliques, and so optimizing the number of bicliques required to cover the bipartite graph, leads to reduced number of factors, and hence AND-count. The second type of algorithm is algebraic, and is derived from multivariate Horner method. We have compared the performances of our algorithms with existing popular methods like EXORCISM-4 and EPOEM2, on random functions of up to 12 variables. We have observed that our multivariate Horner method is substantially faster, while our biclique-based method achieves the maximum AND-count reduction. In fact, compared to EXORCISM-4 our biclique based method achieves up to 5 times reduction in AND-count.

From Non-Convex to Strongly Convex: Curvature-Adaptive FTPL for Online Optimization

2026-06-01T23:01:36Z

Curvature adaptivity is a classical theme in online optimization: for convex Lipschitz losses, adaptive methods interpolate between the optimal $O(\sqrt{T})$ regret for general convex losses and $O(\log T)$ regret under strong convexity. Recent work has shown that Follow-the-Perturbed-Leader (FTPL) achieves optimal $O(\sqrt{T})$ regret even for online non-convex Lipschitz losses, assuming access to an approximate offline-optimization oracle, but these guarantees do not exploit curvature. We show that FTPL can be made curvature-adaptive in the non-convex setting, without knowing in advance how curvature will accumulate over time. Our algorithm replaces the fixed perturbation scale of standard FTPL with a time-varying scale chosen using only past information. We give a simple follow-the-leader tuning rule for this scale and show that it competes, up to constants, with the best choice in hindsight. The resulting method achieves $O(\sqrt{T})$ regret for arbitrary non-convex Lipschitz losses and improves as cumulative curvature grows; with sufficiently accurate oracle calls, it achieves $O(\log T)$ regret when cumulative curvature grows linearly, which includes the classical strongly convex regime. We complement these upper bounds with matching lower bounds for prescribed cumulative-curvature sequences, already for one-dimensional convex losses, showing that the tradeoff between worst-case non-convex regret and curvature-driven fast rates is intrinsic.

SplineSketch: Even More Accurate Quantiles with Error Guarantees

2026-06-01T20:24:47Z

Space-efficient streaming estimation of quantiles in massive datasets is a fundamental problem with numerous applications in data monitoring and analysis. While theoretical research led to optimal algorithms, such as the Greenwald-Khanna algorithm or the KLL sketch, practitioners often use other sketches that perform significantly better in practice but lack theoretical guarantees. Most notably, the widely used $t$-digest has unbounded worst-case error. In this paper, we seek to get the best of both worlds. We present a new quantile summary, SplineSketch, for numeric data, offering near-optimal theoretical guarantees, namely uniformly bounded rank error, and outperforming $t$-digest by a factor of 2-20 on a range of synthetic and real-world datasets. To achieve such performance, we develop a novel approach that maintains a dynamic subdivision of the input range into buckets while fitting the input distribution using monotone cubic spline interpolation. The core challenge is implementing this method in a space-efficient manner while ensuring strong worst-case guarantees.

On the gap of quiver representations

2026-06-01T19:24:49Z

The nullcone membership problem, deciding whether an orbit closure contains the origin, is fundamental in computational invariant theory. For self-adjoint groups, Bürgisser, Franks, Garg, Oliveira, Walter and Wigderson gave a geodesic optimization algorithm whose complexity is controlled by the gap, a condition number of the representation. We study the gap for quiver representations under the action of the special linear group. We prove that the inverse gap is polynomially bounded in the number of vertices and the maximum dimension for type A and $\hat{A}$, as well as tree quivers with uniform dimension vectors. Consequently, the algorithm of Bürgisser et al. solves the nullcone membership problem in polynomial time for these families. In contrast, we construct families of quivers and dimension vectors where the gap is exponentially small in the number of leaves, furthermore, for every connected quiver we exhibit dimension vectors such that the weight margin (a related condition number) is exponentially small in the number of vertices. We also extend our results to $σ$-semistability, thereby giving a new proof of a recent result of Iwamasa, Oki, and Soma.