https://arxiv.org/api/7RYZ/fKPsdoG/yb7g/28iAO+594 2026-06-18T18:34:14Z 29013 330 15 http://arxiv.org/abs/2605.22653v1 The Secretary Problem with a Stochastic Precursor 2026-05-21T15:56:09Z

In learning-augmented online algorithms, predictions are usually valued for what they say: a value estimate, a solution, or an algorithmic recommendation. This paper shows that predictions can also be valuable solely due to their arrival time. We study the fundamental secretary problem augmented with a stochastic precursor: a content-free signal that is guaranteed to arrive no later than the best item, but is otherwise stochastically timed. The signal does not carry any additional information; nevertheless, its timing alone changes the structure of optimal stopping. We characterize optimal policies in the random-order and adversarial-order models. In random order, a single uniformly timed precursor already gives success probability at least $\frac12$, improving on the classic $\frac1e$ benchmark. With increasingly late precursors, the success probability approaches $1$. In adversarial order, for which traditional models do not admit strong guarantees, sufficiently concentrated precursors recover constant success guarantees. Our results show that such novel forms of asynchronous temporal information are a distinct and powerful form of advice in online decision making and may also be effective for other problems.

2026-05-21T15:56:09Z Franziska Eberle Alexander Lindermayr http://arxiv.org/abs/2605.22149v1 A Coalgebraic Dijkstra Algorithm 2026-05-21T08:20:49Z

The Dijkstra algorithm is a classical method for solving the shortest path problem on weighted graphs. There are several variations of the Dijkstra algorithm, including algorithms for the widest path problem and for two-player games. In this paper, we introduce the coalgebraic shortest path problem (CSPP), a unifying framework for a broad class of optimization problems on state-transition systems. This framework encompasses not only the aforementioned problems but also new ones such as the shortest binary tree problem. We further present a coalgebraic Dijkstra algorithm for solving the CSPP efficiently under a suitable condition. Our condition is necessary and sufficient for the algorithm to return correct solutions, thereby providing a precise criterion for when Dijkstra-style acceleration is possible. We also show that the proposed algorithm achieves asymptotic complexity comparable to that of the classical Dijkstra algorithm.

2026-05-21T08:20:49Z Takahiro Sanada Yoàv Montacute Kittiphon Phalakarn Ichiro Hasuo http://arxiv.org/abs/2602.11791v2 Gray Codes With Constant Delay and Constant Auxiliary Space 2026-05-21T06:52:57Z

We give the first two algorithms to enumerate all binary words of $\{0,1\}^\ell$ (like Gray codes) while ensuring that the delay and the auxiliary space is independent from $\ell$, i.e., constant time for each word, and constant memory in addition to the $\ell$ bits storing the current word. Our algorithms are given in two new computational models: tape machines and deque machines. We also study more restricted models, queue machines and stack machines, and show that they cannot enumerate all binary words with constant auxiliary space, even with unrestricted delay. A tape machine is a Turing machine that stores the current binary word on a single working tape of length $\ell$ (which never increases), using no other tape. The machine has a single head and must edit its tape to reach all possible words of $\{0,1\}^\ell$, and output them (in unit time, by entering special output states), with no duplicates. Hence a tape machine uses constant auxiliary space by definition (up to the head position). We construct a tape machine that achieves this task with constant delay between consecutive outputs, so that the machine implements a so-called skew-tolerant quasi-Gray code. We then construct a more involved tape machine that implements a Gray code. A deque machine stores the current binary word on a double-ended queue of length $\ell$, and stores a constant-size internal state. It works as a tape machine, except that it modifies the content of the deque by performing push and pop operations on the endpoints. Hence again a deque machine uses constant auxiliary space by definition. We construct deque machines that enumerate all words of $\{0,1\}^\ell$ with constant-delay. The main technical challenge in this model is to correctly detect when enumeration has finished.

2026-02-12T10:17:09Z 30 pages. This is the authors' version (will full proofs) of the corresponding ICALP'2026 article Antoine Amarilli Claire David Nadime Francis Victor Marsault Mikaël Monet Yann Strozecki http://arxiv.org/abs/2502.09105v2 Incremental Approximate Maximum Flow via Residual Graph Sparsification 2026-05-21T06:28:28Z

We give an algorithm that, with high probability, maintains a $(1-ε)$-approximate $s$-$t$ maximum flow in undirected, uncapacitated $n$-vertex graphs undergoing $m$ edge insertions in $\tilde{O}(m+ n F^*/ε)$ total update time, where $F^{*}$ is the maximum flow on the final graph. This is the first algorithm to achieve polylogarithmic amortized update time for dense graphs ($m = Ω(n^2)$), and more generally, for graphs where $F^*= \tilde{O}(m/n)$. At the heart of our incremental algorithm is the residual graph sparsification technique of Karger and Levine [STOC '02, SICOMP '15], originally designed for computing exact maximum flows in the static setting. Our main contributions are (i) showing how to maintain such sparsifiers for approximate maximum flows in the incremental setting and (ii) generalizing the cut sparsification framework of Fung et al. [STOC '11, SICOMP '19] from undirected graphs to balanced directed graphs.

2025-02-13T09:27:10Z v2: Our main result now holds against output-adaptive adversaries. v1 appeared at ICALP '25, and v2 accepted to TALG Gramoz Goranci Monika Henzinger Harald Räcke A. R. Sricharan http://arxiv.org/abs/2605.21946v1 Optimal $e^{(γ+o(1))n}$-Approximation of the Permanent of Positive Semidefinite Matrices 2026-05-21T03:25:29Z

We determine, up to lower-order terms in the exponent, the best possible deterministic polynomial-time approximation ratio for the permanent of a Hermitian positive semidefinite matrix. If $A\succeq 0$ has no zero diagonal entry, $d=\operatorname{rank}(A)$, $A=VV^\dagger$ with $V\in\mathbb{C}^{n\times d}$ full column rank, and $v_1,\ldots,v_n$ are the rows of $V$, define \[ Φ(V)=\max_{X\succ 0} \left\{\sum_{i=1}^n \log(v_i^\dagger Xv_i)+\log\det X-\operatorname{tr} X+d\right\}, \qquad \widehat P(A)=e^{Φ(V)}. \] We prove the exact sandwich \[ e^{-γn}\widehat P(A)\le \operatorname{per}(A)\le \widehat P(A). \] Here $γ$ is the Euler--Mascheroni constant. Since the maximization is concave, this gives a deterministic polynomial-time $e^{(γ+\varepsilon)n}$-approximation for every $\varepsilon>0$. Combined with the previous $e^{(γ-\varepsilon)n}$-hardness of approximation for positive semidefinite permanents, this resolves the optimal exponential approximation ratio for deterministic polynomial-time algorithms as $e^{(γ+o(1))n}$, assuming $\mathrm{P}\ne\mathrm{NP}$. The proof is an entropy argument applied to the standard Wick integral formula for $\operatorname{per}(A)$; the loss is exactly $γ$ per factor because $\mathbb{E}[\log T]=-γ$ for $T\sim\operatorname{Exp}(1)$. The result was obtained through interactions with GPT 5.5 Pro Extended: the first author's interaction was one-shot, while the second author's was a separate multi-turn interaction with high-level guidance. Both authors verified the theorem and proof. Codex was used to assemble and typeset the manuscript.

2026-05-21T03:25:29Z Nima Anari Farzam Ebrahimnejad http://arxiv.org/abs/2601.05157v2 Learning Mixture Models via Efficient High-dimensional Sparse Fourier Transforms 2026-05-21T03:15:53Z

In this work, we give a ${\rm poly}(d,k)$ time and sample algorithm for efficiently learning the parameters of a mixture of $k$ spherical distributions in $d$ dimensions. Unlike all previous methods, our techniques apply to heavy-tailed distributions and include examples that do not even have finite covariances. Our method succeeds whenever the cluster distributions have a characteristic function with sufficiently heavy tails. Such distributions include the Laplace distribution but crucially exclude Gaussians. All previous methods for learning mixture models relied implicitly or explicitly on the low-degree moments. Even for the case of Laplace distributions, we prove that any such algorithm must use super-polynomially many samples. Our method thus adds to the short list of techniques that bypass the limitations of the method of moments. Somewhat surprisingly, our algorithm does not require any minimum separation between the cluster means. This is in stark contrast to spherical Gaussian mixtures where a minimum $\ell_2$-separation is provably necessary even information-theoretically [Regev and Vijayaraghavan '17]. Our methods compose well with existing techniques and allow obtaining ''best of both worlds" guarantees for mixtures where every component either has a heavy-tailed characteristic function or has a sub-Gaussian tail with a light-tailed characteristic function. Our algorithm is based on a new approach to learning mixture models via efficient high-dimensional sparse Fourier transforms. We believe that this method will find more applications to statistical estimation. As an example, we give an algorithm for consistent robust mean estimation against noise-oblivious adversaries, a model practically motivated by the literature on multiple hypothesis testing. It was formally proposed in a recent Master's thesis by one of the authors, and has already inspired follow-up works.

2026-01-08T17:47:58Z Alkis Kalavasis Pravesh K. Kothari Shuchen Li Manolis Zampetakis http://arxiv.org/abs/2605.21920v1 Minimum Sum Set Cover: Structures and Algorithm 2026-05-21T02:45:18Z

A set cover of a hypergraph $H$ is a set of vertices intersecting every hyperedge. In the minimum sum set cover problem, vertices are selected one by one; each edge pays the position of the first vertex that hits it, and the objective is to minimize the total cost. When $H$ is a graph, this is the minimum sum vertex cover problem. A solution is specified by a set cover $S$ together with an ordering of its vertices. While the classical set cover problem seeks to minimize $|S|$, the minimum sum variant favors covering many edges early and may prefer larger covers. This motivates a natural question: how large can the gap between~$\overrightarrowτ$ and $τ$ be? We prove an upper bound $\overrightarrowτ \le τ\log_{2} \lvert E(H)\rvert$, and show that for any positive~$n$, there exists a hypergraph $H$ on $n + 3$ vertices with $τ=3$ and $\overrightarrowτ=n$. For graphs, we obtain stronger bounds: we prove~$\overrightarrowτ \le 2τ\log_{2} τ$, improving the bound of Liu et al.\ [Theor. Comput. Sci., 2025], and we construct graphs with~$\overrightarrowτ = Ω\left( \frac{τ\log τ}{\log\log τ}\right)$, nearly matching this upper bound. On the algorithmic side, we show that minimum sum set cover is fixed-parameter tractable on bounded-rank hypergraphs, parameterized by~$\overrightarrowτ$, extending the algorithm of Liu et al.\ for graphs (i.e., rank-two hypergraphs).

2026-05-21T02:45:18Z Zhongyi Zhang Yixin Cao http://arxiv.org/abs/2605.21860v1 Robust Statistical Estimators with Bounded Empirical Sensitivity 2026-05-21T01:13:52Z

We introduce a new measure of robustness for statistical estimators, which we call \emph{empirical sensitivity}. An estimator $\hat θ$ has bounded empirical sensitivity if, with high probability over a dataset $X = (X_1, \dots, X_n) \sim \mathcal{D}^{\otimes n}$, for any dataset $Y$ obtained by modifying at most $ηn$ points in $X$, we have that $\hat θ(Y)$ is close to $\hat θ(X)$. We study bounds on this quantity for the prototypical problem of Gaussian mean estimation. We prove new lower bounds, showing that for any estimator $\hat μ$ which achieves an optimal $\ell_2$-error bound of $O\left(\sqrt{d/n}\right)$, the empirical sensitivity is at least $Ω\left(η+ \sqrt{ηd/n}\right)$. The two terms arise due to obstructions on the mean and variance (via an Efron-Stein argument) of such an estimator. We show that this bound is tight up to logarithmic factors, by employing recent results for robust empirical mean estimation.

2026-05-21T01:13:52Z Valentio Iverson Gautam Kamath Argyris Mouzakis Adam Smith http://arxiv.org/abs/2605.21829v1 An $Ω(n \log n)$ Randomized Lower Bound for Cutting a Cake into Proportionally Fair Pieces 2026-05-20T23:51:30Z

We consider the classic cake cutting problem in the Robertson-Webb model, with the objective of proportional fairness. We show that any randomized algorithm must use $Ω(n \log n)$ queries.

2026-05-20T23:51:30Z Stephen Arndt Carnegie Mellon University Kirk Pruhs university of Pittsburgh Trung Tran University of Pittsburgh http://arxiv.org/abs/2605.21753v1 Finding a Solution to the Erdős-Ginzburg-Ziv Theorem in Linear Time 2026-05-20T21:26:09Z

The Erdős-Ginzburg-Ziv theorem states that every sequence of 2n - 1 integers contains a subsequence of length n whose sum is divisible by n. Choi, Kang, and Lim gave a simple deterministic O(n log n) algorithm for finding such a subsequence, and Leung recently improved this to O(n log log log n). We give a deterministic linear-time algorithm. The core is a linear-time algorithm for the following prime target subset-sum problem: given p - 1 nonzero residues in Z_p and a target residue, find a subset with the prescribed sum. Our algorithm maintains a compact arithmetic-progression representation of reachable sums. When two progressions intersect, a bounded Frobenius interval in their sum allows them to be merged into one longer progression, with enough growth to pay for the update. When the representation either contains a full progression or covers all nonzero residues, the target residue is recovered constructively. The standard multiplicative reduction then extends the prime algorithm to arbitrary moduli.

2026-05-20T21:26:09Z 10 pages Sunghyeon Jo http://arxiv.org/abs/2605.21738v1 Asymptotic Rank Speedup Theorems, Revisited 2026-05-20T21:03:34Z

Motivated by fast matrix multiplication and recent connections between asymptotic tensor rank and fine-grained complexity, we revisit classical tools from the matrix multiplication literature and develop a framework for obtaining improved asymptotic rank upper bounds for tensors beyond matrix multiplication. In the 1980s, Coppersmith-Winograd and Strassen discovered a series of speedup theorems for asymptotic rank: in certain regimes, one can extract additional terms from a border rank upper bound on a tensor $T$, and then use these terms to obtain an improved asymptotic rank of $T$. We establish general speedup theorems that subsume these results and enable quantitative improvements. Two representative applications are: (1) The asymptotic rank of the small Coppersmith-Winograd tensor $\mathrm{cw}_q$ is less than its border rank. For instance, we prove the asymptotic rank of $\mathrm{cw}_2$ is smaller than $3.931$, improving on $\underline{\mathrm{R}}(\mathrm{cw}_2)=4$. It is known that if the asymptotic rank of $\mathrm{cw}_2$ equals $3$, this would imply $ω=2$. (2) A general improvement over Strassen's bound: we obtain an upper bound below $d^{2ω/3}$ on the asymptotic rank of any $d\times d\times d$ tensor. To make full use of speedups, we analyze degenerations in which both sides are nontrivial direct sums, a setting where the optimal quantitative bound one can achieve was previously unclear. We do so via an approach we call Strassen calculus: a systematic method for converting such degeneration data into explicit asymptotic rank bounds using Strassen's theory of the asymptotic spectrum.

2026-05-20T21:03:34Z 43 pages, to appear in CCC 2026 Josh Alman Baitian Li http://arxiv.org/abs/2605.21601v1 Near-Optimal Generalized Private Testing 2026-05-20T18:06:50Z

In differential privacy (DP), the generalized private testing problem was introduced by Liu and Talwar (STOC 2019). Given a dataset $X \in \mathcal{X}$ and a sequence of black-box $\varepsilon_t$-DP mechanisms $M_t:\mathcal{X}\to\{+1,-1\}$, the analyst must accept the first mechanism whose success probability $p_t=\Pr[M_t(X)=+1]$ exceeds a given threshold $p^*\in(0,1)$, while achieving DP. Accuracy is measured by the gap between $p^*$ and a rejection threshold $\bar{p}$, such that with probability $1-β$ for all $t\geq1$, if $p_t\leq\bar{p}$, then $M_t$ is rejected, and if $p_t\geq p^*$, then it is accepted. This generalizes the standard private testing problem, whose solution, the Sparse Vector Technique, is ubiquitous in DP. We introduce the Generalized Thresholding Mechanism (GTM) for generalized private testing. For $\varepsilon>0$ and any sequence of $(\varepsilon_t,δ_t)$-DP mechanisms $M_t$, the GTM is pure $\varepsilon$-DP. For $θ>0$, $γ\in(1,2]$, and $β\in(0,1)$, $\bar{p}_t=\max(p^*/γΛ_t, 1 - γΛ_t(1-p^*))-δ_t/\varepsilon_t$ for $Λ_t=(5t\ln^3(t+2))^{(2+θ)\varepsilon_t/\varepsilon}(4/β)^{(3+θ+2/θ)\varepsilon_t/\varepsilon}$. With probability $1-β$, the number of evaluations of $M_t$ is at most $O((\ln(t/β)/(γ-1)^2)\max(Λ_t/p^*,(1-p^*)^{-1}))$ for all $t\geq 1$. Our lower bounds prove near-optimality of our accuracy and sample complexity guarantees. Via the GTM, we give a black-box reduction for DP optimization from the continual observation (CO) setting to the batch setting. This gives us the first DP-CO algorithms for many maximization problems. Further, the GTM permits an adaptive choice of acceptance thresholds $(p^*_t)_{t\geq1}$, addressing a challenge mentioned in prior work on using generalized private testing for hyperparameter optimization (Papernot and Steinke (ICLR 2022)).

2026-05-20T18:06:50Z 67 pages, 3 tables Anamay Chaturvedi Monika Henzinger Jalaj Upadhyay http://arxiv.org/abs/2605.21428v1 Polynomial-Time Robust Multiclass Linear Classification under Gaussian Marginals 2026-05-20T17:19:30Z

We study the task of agnostic learning of multiclass linear classifiers under the Gaussian distribution. Given labeled examples $(x, y)$ from a distribution over $\mathbb{R}^d \times [k]$, with Gaussian $x$-marginal, the goal is to output a hypothesis whose error is comparable to that of the best $k$-class linear classifier. While the binary case $k=2$ has a well-developed algorithmic theory, much less is known for $k \ge 3$. Even for $k=3$, prior robust algorithms incur exponential dependence on the inverse of the desired accuracy in both complexity and representation size. In this work, we develop new structural results for multiclass linear classifiers and use them to design fully polynomial-time robust learners with dimension-independent error guarantees. Our first result shows that the standard multiclass perceptron algorithm requires super-polynomially many samples and updates, even with clean labels and Gaussian marginals, revealing a basic obstruction absent in the binary case. Our main positive result is a pairwise improper-learning framework which yields an efficient learner with error $\widetilde O(k^{3/2}\sqrt{\mathrm{opt}})+ε$ for general $k$. Additionally, we develop a sharper localization-based framework which leads to error $O(\mathrm{opt})+ε$ for $k=3$, and error $\mathrm{poly}(k)\mathrm{opt}+ε$ for geometrically regular $k$-class linear classifiers.

2026-05-20T17:19:30Z Ilias Diakonikolas Giannis Iakovidis Mingchen Ma http://arxiv.org/abs/2408.02360v3 Potential Hessian Ascent: The Sherrington-Kirkpatrick Model 2026-05-20T15:45:30Z

We present the first iterative spectral algorithm to find near-optimal solutions for a random quadratic objective over the discrete hypercube, resolving a conjecture of Subag [Subag, Communications on Pure and Applied Mathematics, 74(5), 2021]. The algorithm is a randomized Hessian ascent in the solid cube, with the objective modified by subtracting an instance-independent potential function [Chen et al., Communications on Pure and Applied Mathematics, 76(7), 2023]. Using tools from free probability theory, we construct an approximate projector into the top eigenspaces of the Hessian, which serves as the covariance matrix for the random increments. With high probability, the iterates' empirical distribution approximates the solution to the primal version of the Auffinger-Chen SDE [Auffinger et al., Communications in Mathematical Physics, 335, 2015]. The per-iterate change in the modified objective is bounded via a Taylor expansion, where the derivatives are controlled through Gaussian concentration bounds and smoothness properties of a semiconcave regularization of the Fenchel-Legendre dual to the Parisi PDE. These results lay the groundwork for (possibly) demonstrating low-degree sum-of-squares certificates over high-entropy step distributions for a relaxed version of the Parisi formula [Open Question 1.8, arXiv:2401.14383].

2024-08-05T10:09:32Z 87 pages, 0 figures. Accepted for publication at CMP David Jekel Juspreet Singh Sandhu Jonathan Shi http://arxiv.org/abs/2605.21248v1 Distributed Stochastic Graph Algorithms 2026-05-20T14:38:11Z

We study stochastic graph optimization problems in a novel distributed setting. As in the standard centralized setting, a random subgraph $G^*$ of a known base graph $G$ is realized by including each edge $e$ independently with a known probability $p_e$, and we must solve an optimization problem on $G^*$ despite uncertainty about its edges. In the standard setting, to cope with this uncertainty, the algorithm can query any edge of $G$ to learn if the edge exists in $G^*$, and its complexity is the number of queried edges. The distributed setting incorporates uncertainty in a natural manner, by having each vertex know only about its own edges in $G^*$ (and only communicate over them), and the complexity is measured by the number of synchronous communication rounds. We establish that distributed stochastic algorithms can be drastically faster than their non-stochastic counterparts and overcome known lower bounds, by showing fast distributed approximation algorithms for maximum matching, minimum vertex cover, and minimum dominating set.

2026-05-20T14:38:11Z To appear in PODC 2026 Keren Censor-Hillel Aditi Dudeja George Giakkoupis