The Power of Graph Doubling: Computing Ultrabubbles in a Bidirected Graph by Reducing to Weak Superbubbles

2026-05-13T06:47:46Z

Bidirected graphs are a common generalisation of directed graphs where arcs can also be incoming to both their incident nodes, or outgoing from both their incident nodes. Such arcs allow a walk to change direction. Some algorithms can easily be adapted from directed graphs to bidirected graphs, such as shortest path algorithms. These adaptations are already used in practice, and implicitly use the graph doubling technique to apply an algorithm for directed graphs to bidirected graphs. In other cases, the applicability of graph doubling is less obvious, for example for superbubbles and their generalisation to bidirected graphs, ultrabubbles. Ultrabubbles are a common structure in bidirected biological graphs that carries biological meaning, but also functions as a nested clustering method, since an ultrabubble is separated by only two nodes from the rest of the graph. There is an existing method that enumerates a structure similar to ultrabubbles by enumerating (weak) superbubbles in the doubled graph. However, the literature does not make any direct connection between superbubbles and ultrabubbles except that a superbubble is an ultrabubble in a directed graph. Only a partial result connecting superbubbles and ultrabubbles exists, by Harviainen et al. (2026). Graph doubling, on the other hand, maintains connectivity and allows us to draw a direct connection between ultrabubbles and weak superbubbles. This results in the first linear-time reduction-based algorithm for computing ultrabubbles on any bidirected graph. Together with the fact that graph doubling is already used implicitly in simple cases, our result suggests that graph doubling is a powerful yet simple technique for applying algorithms for directed graphs to bidirected graphs.
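
A minimal sketch of graph doubling, assuming the bidirected graph is given as oriented links in the style of GFA link lines (this input format is an assumption; the abstract does not fix one). Each node gets a forward copy and a reverse copy, and each link contributes one directed arc plus its mirror image. The names `flip` and `double_graph` are illustrative only.

```python
def flip(sign):
    """Return the opposite orientation of a node copy."""
    return "-" if sign == "+" else "+"


def double_graph(links):
    """Build the arc set of the doubled directed graph.

    links: iterable of (u, su, v, sv), read as "leave u in orientation su
    and enter v in orientation sv" (assumed GFA-like convention).
    Each node x is represented by the two copies (x, "+") and (x, "-").
    """
    arcs = set()
    for u, su, v, sv in links:
        # The link as written.
        arcs.add(((u, su), (v, sv)))
        # Its mirror image: traverse the same link in the opposite direction.
        arcs.add(((v, flip(sv)), (u, flip(su))))
    return arcs
```

Any directed-graph algorithm can then be run on the returned arc set; mapping its output back to the bidirected graph is the part that needs a problem-specific argument.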

A Faster Generalized Two-Stage Approximate Top-K

2026-05-13T03:11:51Z

We consider the Top-$K$ selection problem, which aims to identify the largest $K$ elements in an array. Top-$K$ selection arises in many machine learning algorithms and often becomes a bottleneck on accelerators, which are optimized for dense matrix multiplications. To address this problem, Chern et al. (2022) proposed a fast two-stage approximate Top-$K$ algorithm that: (i) partitions the input array into equal-sized chunks and selects the top-$1$ element from each partition; and (ii) sorts the resulting smaller subset and returns the top $K$ elements. In this paper, we generalize the first stage so that each partition selects the top $K'$ elements (for $1 \leq K' \leq K$). Our contributions include: (i) an expression for the expected recall of this generalized algorithm under random partitioning, and a demonstration that choosing $K' > 1$ with fewer partitions in the first stage more effectively reduces the input size to the second stage while maintaining the same expected recall as the original algorithm; (ii) a bound on the expected recall of the original algorithm as a function of the algorithm parameters that is provably tighter by a factor of $2$ than the bound reported by Chern et al. (2022); and (iii) an implementation of our algorithm on Cloud TPUv5e that achieves approximately an order of magnitude speedup over the original algorithm without sacrificing recall.
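
The two stages described above can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' TPU implementation: the function name, the strided partitioning after a random shuffle, and the use of `heapq.nlargest` are all assumptions made for the example.

```python
import heapq
import random


def two_stage_top_k(values, k, num_partitions, k_prime=1, seed=None):
    """Approximate top-k selection in two stages.

    Stage 1: randomly partition the input into num_partitions chunks of
             (nearly) equal size and keep the top k_prime of each chunk.
    Stage 2: select the top k among the survivors.
    Returns the selected values in descending order.
    """
    rng = random.Random(seed)
    indices = list(range(len(values)))
    rng.shuffle(indices)  # random partitioning (assumption of this sketch)

    survivors = []
    for p in range(num_partitions):
        chunk = indices[p::num_partitions]  # strided split of shuffled indices
        survivors.extend(heapq.nlargest(k_prime, chunk, key=values.__getitem__))

    top = heapq.nlargest(k, survivors, key=values.__getitem__)
    return [values[i] for i in top]
```

With `k_prime = 1` this corresponds to the original scheme; with `k_prime = k` every true top-$K$ element survives stage 1, so the result is exact. Values in between trade the size of the stage-2 input against recall.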

On the Advantage of Adaptivity for Sampling with Cell Probes

2026-05-13T01:36:59Z

We construct an explicit distribution $\mathbf{D}$ over $\{0,1\}^N$ that exhibits an essentially optimal separation between adaptive and non-adaptive cell-probe sampling. The distribution can be sampled exactly when each output bit is allowed two adaptive probes to an arbitrarily long sequence of independent uniform symbols from $[N]$. In contrast, any non-adaptive sampler requires $\widetilde{Ω}(N)$ non-adaptive cell probes to generate a distribution with total variation distance less than $1-o(1)$ from $\mathbf{D}$. This provides a $2$-vs-$\widetilde{Ω}(N)$ separation for sampling with adaptive versus non-adaptive cell probes, improving upon the $2$-vs-$\widetilde{Ω}(\log N)$ separation of Yu and Zhan (ITCS '24) and the $(\log N)^{O(1)}$-vs-$N^{Ω(1)}$ separation of Alekseev, Göös, Myasnikov, Riazanov, and Sokolov (STOC '26).

Time and Supply Fairness in Electricity Distribution using $k$-times bin packing

2026-05-12T23:13:44Z

Given items of different sizes and a fixed bin capacity, the bin-packing problem is to pack these items into the minimum number of bins such that the sum of the item sizes in each bin does not exceed the capacity. We define a new variant, k-times bin-packing (kBP), in which the goal is to pack the items so that each item appears exactly k times in k different bins. We generalize existing approximation algorithms for bin-packing to solve kBP and analyze their performance ratios. The fair electricity division problem motivates the study of kBP. The goal is to allocate the available supply among households using some fairness criteria, such as the egalitarian principle. We prove that every electricity division problem can be solved by k-times bin-packing for some finite k, which depends only on the number of households. We implement generalizations of the First-Fit and First-Fit Decreasing bin-packing algorithms to solve kBP and apply them to real electricity demand data. We show that our generalizations outperform existing heuristic solutions to the same problem in terms of the egalitarian allocation of connection time. We study another variant of the egalitarian allocation problem, in which the goal is to maximize the minimum number of watts allocated to a household. For this variant, we prove an impossibility result: there does not exist such a k that depends only on the number of agents. This impossibility result motivates us to develop four different heuristic algorithms to solve the egalitarian allocation of watts problem. We evaluate the heuristics by summing the minimum watts allocated to any household in each hour, yielding a fairness metric that reflects the lowest watt allocation across all hours. A higher total minimum of watts indicates a more equitable distribution. Thus, we establish new benchmarks for fair allocation of watts.
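
One natural way to extend First-Fit to the k-times setting is sketched below. This is an assumption-laden illustration, not necessarily the generalization analyzed by the authors: it processes the items in k rounds and places each copy into the first bin that has room and does not already hold that item. The function name and the round-by-round order are hypothetical choices.

```python
def first_fit_k_times(sizes, capacity, k):
    """Pack every item exactly k times, each copy in a different bin.

    sizes: list of item sizes; items are identified by their index.
    Returns a list of bins, each given as a sorted list of item indices.
    """
    if any(size > capacity for size in sizes):
        raise ValueError("an item is larger than the bin capacity")

    bins = []  # each bin: {"load": total size, "items": set of item indices}
    for _ in range(k):  # one round per copy
        for item, size in enumerate(sizes):
            for b in bins:
                # First bin with enough room that does not already hold the item.
                if item not in b["items"] and b["load"] + size <= capacity:
                    b["items"].add(item)
                    b["load"] += size
                    break
            else:
                # No existing bin fits: open a new one.
                bins.append({"load": size, "items": {item}})
    return [sorted(b["items"]) for b in bins]
```

Sorting the items by decreasing size before calling the function gives the corresponding First-Fit Decreasing variant, under the same assumptions.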

Thin Trees for Near Minimum Cuts

2026-05-12T19:19:06Z

The strong thin tree conjecture states that every $k$-edge-connected graph $G$ contains an $O(1/k)$-thin spanning tree, meaning a spanning tree which contains at most an $O(1/k)$ fraction of the edges across each cut in $G$. This conjecture is still open despite significant effort; the best current result by Anari and Oveis Gharan shows the existence of an $O(\text{polyloglog}(n)/k)$-thin tree. In this work, we demonstrate that the conjecture is true if one only requires thinness for the set of $η$-near minimum cuts of the graph for $η= 1/40$, in other words, for the set of cuts with fewer than $(1+1/40)k$ edges. Our approach constructs such a tree in polynomial time. To show this, we utilize the structure of near minimum cuts, and in particular the polygon representation of Benczúr and Goemans, to reduce to the previously solved problem of finding a spanning tree that is $O(1/k)$-thin for all sets in a laminar family.

Touring a Sequence of Orthogonal Polygons

2026-05-12T19:07:38Z

We study the problem of computing a shortest tour that visits a sequence of $k$ polygons $P_1,\dots, P_k$ with a total number of $n$ vertices. A tour is an oriented curve such that there exist points $p_i\in P_i$ for all $i$ where $p_i$ appears no later than $p_{i+1}$ along the curve. In a seminal paper, Dror, Efrat, Lubiw and Mitchell (STOC 2003) considered the problem under $L_2$ distance, and gave $\widetilde O(nk)$ and $\widetilde O(nk^2)$ algorithms for disjoint and intersecting convex polygons, respectively. In this paper, we consider the orthogonal setting (with orthogonal polygons and Manhattan distance) and obtain the following results:
- a truly subquadratic $\widetilde O(n^{2-\frac{1}{48}})$ algorithm when consecutive polygons in the sequence are disjoint;
- an $\widetilde O(n)$ algorithm for ortho-convex polygons when consecutive polygons are disjoint;
- an $O(n)$ algorithm for axis-aligned rectangles;
- $\widetilde O(n^2)$ and $\widetilde O(n^{1.5}k^2)$ algorithms without restrictions.
Our algorithms build on a wide range of techniques, including additively weighted Voronoi diagrams, rectangle decompositions, persistent data structures, and dynamic distance oracles for weighted planar graphs.

Optimal Bounds, Barriers, and Extensions for Non-Hermitian Bivariate Quantum Signal Processing

2026-05-12T19:03:36Z

Multivariate quantum signal processing (M-QSP) has recently been shown to be applicable to non-Hermitian Hamiltonian simulation, opening several problems regarding the optimization landscape, angle-finding, and constant-factor analysis. We resolve several of these problems here. We find the anti-Hermitian query complexity $d_I = Θ(\beta_I T + \log(1/\varepsilon)/\log\log(1/\varepsilon))$ to be tight, established via Chebyshev coefficient bounds, modified Bessel function asymptotics, and Lambert $W$ inversion. Fast-forwarding to $d_I = \mathcal{O}(\sqrt{\beta_I T})$ is impossible in the bivariate polynomial model, though a linear state-dependent improvement to $d_I = \mathcal{O}(β_{\mathrm{eff}} T + \log(1/\varepsilon)/\log\log(1/\varepsilon))$ is achievable. The optimization landscape of M-QSP admits spurious local minima, but a warm-start basin guarantee ensures the two-stage algorithm converges. CRC-exploiting block peeling reduces angle-finding from $\mathcal{O}(d^3)$ to $\mathcal{O}(d^2)$ classical operations, and optimized error allocation yields a leading constant of approximately $2$ relative to the information-theoretic lower bound. A constant-ratio condition extends to non-identical signal operators, enabling time-dependent non-Hermitian simulation with query complexity $\mathcal{O}(\int_0^T(\alpha_R(s) + \beta_I(s))\,ds + \log(1/\varepsilon)/\log\log(1/\varepsilon))$. Block-encoding overhead $e^{-2\beta_I T}$ holds across all function classes within the walk-operator oracle model, and dilational methods (Schrödingerization) achieve the walk-operator barrier. A precisely characterized direct-access construction achieves the intrinsic barrier $e^{-2ωT}$ (with $ω < \beta_I$ for non-commuting Hamiltonians) on a restricted domain, though extension to the full bitorus remains open.

Unique Decoding of Reed-Solomon and Related Codes for Semi-Adversarial Errors

2026-05-12T18:28:59Z

Motivated by recent developments in coding theory, particularly in list decoding, we introduce a new error model which we call semi-adversarial errors. This error model bridges between fully random errors and fully adversarial errors by allowing some symbols of a message to be corrupted by an adversary while others are replaced with uniformly random symbols. As our main quest, we seek to understand optimal efficient unique decoding algorithms in the semi-adversarial model. For interleaved Reed--Solomon (IRS), folded Reed--Solomon (FRS) and univariate multiplicity codes, we design decoding algorithms running in near-linear time for most mixtures of random and adversarial errors. Our analysis matches the information-theoretic optimum for semi-adversarial errors. Our algorithm for interleaved Reed--Solomon codes is an improved implementation of the decoding algorithm by Bleichenbacher--Kiayias--Yung (BKY) for fully random errors. We use a novel monomial-tracking technique to analyze its performance under the new semi-adversarial error model. Inspired by the BKY algorithm, we use novel interpolations to extend our approach to the settings of folded Reed--Solomon and multiplicity codes, resulting in fast algorithms for unique decoding against semi-adversarial errors. Our new decoders for FRS and multiplicity codes replace the sophisticated root-finding step in traditional algorithms, such as the Guruswami--Wang algorithm, with a straightforward polynomial long division. Analysis of these algorithms requires more robust monomial-tracking arguments than for IRS codes.

Smoothed Analysis of Learning from Positive Samples

2026-05-12T17:57:04Z

Binary classification from positive-only samples is a variant of PAC learning where the learner receives i.i.d. positive samples and aims to learn a classifier with low error. Previous work by Natarajan, Gereb-Graus, and Shvaytser characterized learnability and revealed a largely negative picture: almost no interesting classes, including two-dimensional halfspaces, are learnable. This poses a challenge for applications from bioinformatics to ecology, where practitioners rely on heuristics. In this work, we initiate a smoothed analysis of positive-only learning. We assume samples from a reference distribution $D$ such that the true distribution $D^*$ is smooth with respect to it. In stark contrast to the worst-case setting, we show that all VC classes become learnable in the smoothed model, requiring $O(VC/ε^2)$ positive samples for $ε$ classification error. We also give an efficient algorithm for any class admitting $\mathrm{poly}(ε)$-approximation by degree-$k$ polynomials whose range is lower-bounded by a constant with respect to $D$ in L1-norm. It runs in time $\mathrm{poly}(d^k/ε)$, qualitatively matching L1-regression. Our results also imply faster or more general algorithms for: (1) estimation with unknown truncation, giving the first polynomial-time algorithm for estimating exponential-family parameters from samples truncated to an unknown set approximable by non-negative polynomials in L1 norm, improving on [KTZ FOCS19; LMZ FOCS24], who required strong L2-approximation; (2) truncation detection for broad classes, including non-product distributions, improving on [DLNS STOC24], who required product distributions; and (3) learning from a list of reference distributions, where samples come from $O(1)$ distributions, one of which witnesses smoothness of $D^*$, as arises when list-decoding algorithms learn samplers for $D^*$ from corrupted data.

A proximal gradient algorithm for composite log-concave sampling

2026-05-12T17:48:09Z

We propose an algorithm to sample from composite log-concave distributions over $\mathbb{R}^d$, i.e., densities of the form $π\propto e^{-f-g}$, assuming access to gradient evaluations of $f$ and a restricted Gaussian oracle (RGO) for $g$. The latter requirement means that we can easily sample from the density $\text{RGO}_{g,h,y}(x) \propto \exp(-g(x) -\frac{1}{2h}||y-x||^2)$, which is the sampling analogue of the proximal operator for $g$. If $f + g$ is $α$-strongly convex and $f$ is $β$-smooth, our sampler achieves $\varepsilon$ error in total variation distance in $\widetilde{\mathcal O}(κ\sqrt d \log^4(1/\varepsilon))$ iterations where $κ:= β/α$, which matches prior state-of-the-art results for the case $g=0$. We further extend our results to cases where (1) $π$ is non-log-concave but satisfies a Poincaré or log-Sobolev inequality, and (2) $f$ is non-smooth but Lipschitz.
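
To make the analogy between the RGO and the proximal operator concrete, the following block spells it out under the assumption that $g$ is convex, so that the minimizer below is unique. The notation $\operatorname{prox}_{hg}$ and $I_d$ is introduced here for illustration and does not appear in the abstract.

```latex
\operatorname{prox}_{hg}(y)
  = \operatorname*{arg\,min}_{x \in \mathbb{R}^d}
      \Big\{ g(x) + \tfrac{1}{2h}\,\|y - x\|^2 \Big\}
  = \operatorname*{arg\,max}_{x \in \mathbb{R}^d} \mathrm{RGO}_{g,h,y}(x),
\qquad
g = 0 \;\Longrightarrow\; \mathrm{RGO}_{0,h,y} = \mathcal{N}(y,\, h I_d).
```

In words, the proximal operator returns the mode of the RGO density, whereas the RGO returns a random draw from it; for $g = 0$ the draw is simply Gaussian noise of variance $h$ around $y$.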

Layer-Based Width for PAFP

2026-05-12T17:45:40Z

The Path Avoiding Forbidden Pairs problem (PAFP) asks whether, in a directed graph $G$ with terminals $s,t$ and a set $\mathcal{F}$ of forbidden vertex pairs, there is an $s$-$t$ path that contains at most one endpoint from each forbidden pair. We initiate the study of PAFP through a layer-based width measure. Our first focus is the union digraph $G\cup\mathcal{F}$, obtained by adding to $G$ one arc per forbidden pair, oriented according to a fixed reachability-compatible order. Let the BFS layer $L_d$ be the set of all vertices at directed shortest-path distance $d$ from $s$, and let the BFS-width from $s$ be $\max_d |L_d|$. We show that if $G\cup\mathcal{F}$ has BFS-width $b$ from $s$ and only $β$ arcs going from a later BFS layer to an earlier one, then PAFP is FPT parameterized by $b+β$. The backward-arc hypothesis is essential: we show that PAFP remains NP-complete when the union digraph is a DAG with BFS-width $2$. We also show that if the input DAG has BFS-width at most $2$ and only $k$ backward input arcs, then PAFP can be decided in $2^k |I|^{O(1)}$ time, with unrestricted forbidden pairs. This width-$2$ result is tight: inspection of a classical reduction shows NP-completeness on input DAGs of BFS-width $3$ with no backward input arcs. Moreover, we study exact-length layers in the input graph, where the $d$-th layer consists of the vertices reachable from $s$ by a directed path of length exactly $d$. For DAGs of exact-length width at most $2$, we show that PAFP is polynomial-time decidable by a 2-SAT encoding of fixed-length paths. This bound is tight: the same classical reduction yields NP-completeness on DAGs of exact-length width $3$. Unlike previously known polynomial-time regimes for PAFP, which restrict the forbidden-pair set in order to obtain tractability, our two input-graph tractability results allow unrestricted forbidden pairs and input graphs with exponentially many $s$-$t$ paths.
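
The two parameters used above, BFS-width and the number of backward arcs, can be computed with a single breadth-first search. The sketch below assumes the digraph is given as a dictionary of adjacency lists and only counts arcs between vertices reachable from $s$; the function names are illustrative and not taken from the paper.

```python
from collections import deque


def bfs_distances(adj, s):
    """Directed shortest-path distances (in arcs) from s to reachable vertices."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def bfs_width(adj, s):
    """Maximum size of a BFS layer L_d over all distances d."""
    layer_sizes = {}
    for d in bfs_distances(adj, s).values():
        layer_sizes[d] = layer_sizes.get(d, 0) + 1
    return max(layer_sizes.values())


def count_backward_arcs(adj, s):
    """Number of arcs going from a later BFS layer to an earlier one."""
    dist = bfs_distances(adj, s)
    return sum(
        1
        for u in dist
        for v in adj.get(u, ())
        if v in dist and dist[u] > dist[v]
    )
```

For the digraph `{"s": ["a", "b"], "a": ["t"], "b": ["t"], "t": ["s"]}` the layers from `"s"` are `{s}`, `{a, b}`, `{t}`, so the BFS-width is 2 and the arc from `t` back to `s` is the only backward arc.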

A Linear-Time 1.5-Approximation for Broadcasting in k-Cycle Graphs

2026-05-12T17:39:34Z

Broadcasting is an information dissemination primitive where a message originates at a node (called the originator) and is passed to all other nodes in the network. Broadcasting research is motivated by efficient network design and determining the broadcast times of standard network topologies. Verifying the broadcast time of a node $v$ in an arbitrary network $G$ is known to be NP-hard. Additionally, recent findings show that the broadcast time problem is NP-hard in several highly restricted subfamilies of cactus graphs. The most restrictive of these families is known as \emph{$k$-cycle graphs} or \emph{flower graphs} and is the focus of this paper. We present a simple $(1.5-ε)$-approximation algorithm for determining the broadcast time of networks modeled using $k$-cycle graphs, where $ε> 0$ depends on the structure of the graph.

Online Monotone Metric Embeddings

2026-05-12T14:38:15Z

Metric embeddings into structured spaces, particularly hierarchically well-separated trees (HSTs), are a fundamental tool in the design of online algorithms. In the classical online embedding setting, points arrive sequentially and must be embedded irrevocably upon arrival, resulting in strong distortion lower bounds of $Ω(\min(n, \log n\log Δ))$, where $n$ is the number of points and $Δ$ their aspect ratio. We propose a novel relaxation, online monotone metric embeddings, which allows distances between embedded points in the target space to decrease monotonically over time. Such relaxed embeddings remain compatible with many online algorithms. Moreover, this relaxation breaks existing lower bound barriers, enabling embeddings into HSTs with distortion $O(\log^2 n)$. We also study a dynamic variant, where points may both arrive and depart, seeking distortion guarantees in terms of the maximum number $l$ of simultaneously present points. For traditional embeddings, such bounds are impossible, and this limitation persists even for deterministic monotone embeddings. Surprisingly, probabilistic monotone embeddings allow for $O(l \log l)$ distortion, which is nearly optimal given an $Ω(l)$ lower bound.

Risk-Sensitive Online Selection with Bounded Adaptivity

2026-05-12T14:21:48Z

Designing randomized online algorithms that perform reliably not only in expectation but also under unfavorable realizations of randomness is a fundamental challenge in online decision-making. In this paper, we study this challenge in online adversarial selection, where a decision maker allocates $k$ units of a resource to sequentially arriving buyers through posted prices. We focus on two intertwined considerations that are often overlooked simultaneously: tail-risk sensitivity and bounded adaptivity, where tail risk is measured using conditional value-at-risk (CVaR) and bounded adaptivity limits the number of allowable policy updates over time. Our main contribution is a correlated posted-price mechanism that uses a single random seed to coordinate pricing decisions across time. This correlation induces a monotonic ordering of pricing profiles across sample paths, improving lower-tail performance while respecting the adaptivity constraint. More broadly, our results highlight correlation as a mechanism for controlling tail risk in randomized online algorithms. Using this framework, we derive competitive guarantees for several regimes of the problem under both static and dynamic pricing. Our analysis develops a risk-sensitive randomized online primal-dual framework tailored to CVaR objectives and reveals a systematic trade-off between allowable adaptivity, risk sensitivity, and competitive performance. Experiments on real airline pricing data further illustrate the empirical impact of correlated pricing on welfare concentration and tail behavior.
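
As a small illustration of the tail-risk measure, the sketch below estimates a lower-tail CVaR from sampled rewards by averaging the worst $\lceil αn \rceil$ outcomes. This particular estimator, the convention that lower rewards are worse, and the function name are assumptions made for the example; the paper's exact definition may differ.

```python
import math


def lower_tail_cvar(rewards, alpha):
    """Empirical lower-tail CVaR: mean of the worst ceil(alpha * n) rewards.

    alpha = 1 recovers the plain mean; a small alpha focuses on the
    most unfavorable realizations.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    m = math.ceil(alpha * len(rewards))  # number of tail samples, at least 1
    worst = sorted(rewards)[:m]
    return sum(worst) / m
```

Comparing this quantity across pricing policies on the same set of simulated sample paths gives a rough picture of how each policy behaves in its worst realizations rather than on average.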

Interval Graphs are Reconstructible

2026-05-12T14:13:21Z

A graph is reconstructible if it is determined up to isomorphism by the multiset of its proper induced subgraphs. The reconstruction conjecture postulates that every graph of order at least 3 is reconstructible. We show that interval graphs with at least three vertices are reconstructible. For this purpose, we develop a technique to handle separations in the context of reconstruction. This resolves a major roadblock to using graph structure theory in the context of reconstruction. To apply our novel technique, we also develop a resilient combinatorial structure theory for interval graphs. A consequence of our result is that interval graphs can be reconstructed in polynomial time.