https://arxiv.org/api/kjgLyXigBK0qaGXgrrGjrEgAfSY 2026-06-18T14:31:21Z 29013 270 15 http://arxiv.org/abs/2602.04463v2 Simple Algorithms for Bad Triangle Transversals with Applications to Correlation Clustering 2026-05-27T07:18:56Z The Bad Triangle Transversal (BTT) problem asks for the smallest set of edges that need to be removed from a given signed graph, so that the resulting graph does not have a bad triangle. Here, a bad triangle is a triangle with exactly one negative edge. Several 2-approximations for BTT are proposed in this paper. On the hardness side, we show that BTT is NP-hard to approximate within a factor better than $\frac{2137}{2136}$ on complete graphs. Our reduction also works for Correlation Clustering (CC), the Cluster Deletion problem (CD) and the Minimum Strong Triadic Closure problem (MinSTC). Lastly, we show that the BTT and CC optima are within a factor of 3/2 in complete graphs, by describing a pivot procedure that transforms transversals into clusters. 2026-02-04T11:48:41Z Accepted to ICML 2026 (Spotlight) Florian Adriaens Nikolaj Tatti http://arxiv.org/abs/2605.27998v1 Efficient Algorithms for Interdicting Facilities in Trees and Bounded Treewidth Graphs 2026-05-27T05:41:29Z Given a graph $G$ of $n$ nodes partitioned into facilities and customers, the $r$-edge interdiction covering problem (REIC) is to remove up to $r$ edges so as to maximize the total weight of customers disconnected from all facilities, which is called the covering objective function. While REIC is known to be NP-complete for general graphs, Fröhlich and Ruzika show that the problem can be solved in polynomial time when $G$ is a tree, providing an $O(n^7 r)$-time algorithm. We give an efficient $O(nr^2)$-time dynamic programming algorithm for REIC on trees that is fixed-parameter linear in $n$. 
Evaluating our solution on a benchmark of randomly generated tree networks with baselines of the Fröhlich and Ruzika algorithm and the Gurobi integer program solver, we demonstrate that in practice, our algorithm is both significantly faster and less sensitive to network topology and size. We extend our algorithm for REIC to graphs of bounded treewidth, a well-studied family of sparse graphs that generalizes trees, and obtain a matching runtime of $O(nr^2)$. We also consider the $r$-facility interdiction covering problem (RFIC), a novel variant of this network interdiction problem where the goal is to remove up to $r$ facilities to maximize the covering objective function over disconnected customers. We show that RFIC is NP-complete by observing it generalizes the small set bipartite vertex expansion problem (SSBVE), also known as the minimum $p$-union problem. We give an $O(nr^2)$-time algorithm for RFIC on trees, which also gives an $O(n^3)$-time algorithm for SSBVE on trees. 2026-05-27T05:41:29Z Ali Abbasi Eli Friedman Leana Golubchik Samir Khuller Marco Paolieri http://arxiv.org/abs/2605.27942v1 Quantum principal component analysis without eigenvector recovery 2026-05-27T04:27:47Z Principal component analysis (PCA) is traditionally implemented through a covariance or kernel matrix, leading-eigenvector extraction, and hard rank-$k$ projection. These steps can be computationally costly in high-dimensional and quantum-data settings, sensitive to small eigengaps, and unnecessary when downstream tasks only require principal-subspace scores. Such score-based objectives are important in applications such as anomaly detection, spectral-energy profiling, and other postselection tasks. To address these needs, we introduce a measurement-based soft PCA framework replacing the hard top-$k$ projector with an entropy-regularized Fermi--Dirac filter. 
This filter is the unique optimizer of an entropy-regularized variational formulation of PCA and converges to the classical PCA projector in the zero-temperature limit. This filter has a direct interpretation as a quantum measurement, which naturally suggests a quantum approach. For centered covariance operators represented by quantum feature states, a single fixed circuit, together with threshold calibration, accesses all optimal filters for different rank budgets or retained-variance levels without rank-dependent circuit updates or eigenvector recovery. For new inputs, the same calibrated quantum circuit yields soft principal subspace scores, spectral energy profiles, and postselected filtered states. The required centering of both training and test data is performed coherently inside the quantum protocol, which is particularly important for quantum data where no classical feature vectors or centered Gram matrix are directly available. By reframing PCA as a calibrated measurement task, this framework bypasses the need for iterative eigenvector extraction and achieves a dimension-independent sample complexity $O(η^{-2})$ for normalized fractional-rank or retained variance scoring at additive accuracy $η$. 2026-05-27T04:27:47Z Yewei Yuan Michele Minervini Mark M. Wilde Nana Liu http://arxiv.org/abs/2605.27769v1 Smoothed Score Queries and the Complexity of Sampling 2026-05-26T23:38:20Z We study the query complexity of sampling from high-dimensional Gaussian distributions using gradient information. In the standard oracle model, exact gradients expose only matrix-vector products with the precision matrix, leading to polynomial approximation barriers and a characteristic \(\sqrt{κ}\) dependence on the condition number. We show that this barrier disappears when the sampler is allowed to query \emph{smoothed scores}, namely gradients of the logarithms of the Gaussian-convolved densities. 
For a Gaussian target with precision matrix \(Λ\), a smoothed-score query at noise level \(τ\) gives access to the resolvent \((Λ+τ^{-1}I)^{-1}\). Combining geometrically spaced noise levels with sinc-quadrature rational approximation, we obtain a sampler with $q=O\!\left(\bigl(\log κ+\log(e\sqrt d/δ_{\rm TV})\bigr)\log(e\sqrt d/δ_{\rm TV})\right)$ smoothed-score queries for total variation error \(δ_{\rm TV}\), improving the condition-number dependence from \(\sqrt{κ}\) to logarithmic. We also study finite-bit gradient oracles. Using coordinatewise quantization of the transformed smoothed-score answers and a final dithering step, we obtain a sampling scheme whose total communicated gradient information is polylogarithmic in \(κ\); in particular, for fixed dimension and accuracy, the bit complexity is \(O(\log^2 κ)\). To complement these upper bounds, we introduce a channel-synthesis, or reverse-Shannon, converse technique for sampling lower bounds. This converts total-variation simulation guarantees into communication requirements and yields an \(Ω(\log κ)\) lower bound on the required gradient information. Together, these results identify smoothed scores as a provably more informative oracle for sampling and give nearly matching upper and lower bounds for its finite-bit complexity. 2026-05-26T23:38:20Z Jingbo Liu http://arxiv.org/abs/2404.11591v3 The EDGE Language: Extended General Einsums for Graph Algorithms 2026-05-26T22:47:36Z In this work, we propose a unified abstraction for graph algorithms: the Extended General Einsums language, or EDGE. The EDGE language expresses graph algorithms in the language of tensor algebra, providing a rigorous, succinct, and expressive mathematical framework. EDGE leverages two ideas: (1) the well-known foundations provided by the graph-matrix duality, where a graph is simply a 2D tensor, and (2) the power and expressivity of Einsum notation in the tensor algebra world. 
In this work, we describe our design goals for EDGE and walk through the extensions we add to Einsums to support more complex operations common in graph algorithms. Additionally, we provide a few examples of how to express graph algorithms in our proposed notation. We hope that a single, mathematical notation for graph algorithms will (1) allow researchers to more easily compare different algorithms and different implementations of a graph algorithm; (2) enable developers to factor complexity by separating the concerns of what to compute (described with the extended Einsum notation) from the lower level details of how to compute; and (3) enable the discovery of different algorithmic variants of a problem through algebraic manipulations and transformations on a given EDGE expression. 2024-04-17T17:42:48Z 116 pages, 15 figures Revision with updated semantics section and cleaner proofs Toluwanimi O. Odemuyiwa Serban D. Porumbescu Nandeeka Nayak Michael Pellauer Joel S. Emer John D. Owens http://arxiv.org/abs/2511.00254v3 Uncrossed Multiflows and Applications to Disjoint Paths 2026-05-26T22:46:14Z A multiflow in a planar graph is uncrossed if its support paths do not cross. Recently such flows have played a role in approximation algorithms for maximum disjoint paths in "fully-planar" instances, where the combined supply-demand graph is planar, as well as low-congestion unsplittable flows for fully-planar and single-source instances. We investigate the utility of uncrossed flow more generally and ask three key questions. First, are there other interesting planar multiflow instances that admit uncrossed flows? We answer affirmatively, demonstrating a new family of "pairwise-planar" instances whose flows can be uncrossed. This family subsumes fully-planar but includes substantially more, such as fully-compliant series-parallel instances and some instances that have large clique demand graphs. 
Second, can we always round a fractional uncrossed flow to a "good" integral flow? We again answer positively. For maximization problems, we obtain integral flows with a constant fraction of the original value. For congestion problems (where we fully route all given demands), we obtain integral flows with edge congestion 2. Consequently, we obtain constant-factor approximation algorithms for maximum disjoint paths and minimum congestion integer multiflow for pairwise-planar instances, and show such instances have a constant integral flow-multicut gap. Finally, given a planar multiflow instance, can we determine if there exists a congestion-1 uncrossed fractional flow (congestion) or find the maximum value uncrossed fractional flow (maximization)? For congestion, we show this problem is NP-hard, but finding uncrossed edge-disjoint paths is polytime solvable if the demands span a bounded number of faces. For maximization, we present a strong inapproximability result. 2025-10-31T20:52:56Z Proof sketch added for Lemma 3.6, added integral flow-multicut gap corollary, improved figures for and clarified strongly uncrossed flows Chandra Chekuri Guyslain Naves Joseph Poremba F. Bruce Shepherd http://arxiv.org/abs/2605.14112v2 Fast Leaf-to-Ancestor Minimum Query in the Oracle Model 2026-05-26T19:12:27Z We study leaf-to-ancestor path-minimum queries on a rooted, weighted tree in the oracle model, where the only allowed value operation is a comparison oracle on edge (or node) weights. We give a static data structure that, after $O(n \log h)$ preprocessing time, space, and oracle calls (where $n$ is the number of nodes and $h$ is the tree height), answers any leaf-to-ancestor query in $O(1)$ worst-case time with zero oracle calls at query time. 
The method combines (I) an edge-to-node weight conversion with a deterministic tie-break to obtain a total order; (II) ladder (longest-path) decomposition; (III) binary lifting; and (IV) sparse-table RMQ built over ladder arrays, storing indices selected via the oracle during preprocessing. We also show that the preprocessing oracle-comparison bound is tight in the deterministic comparison model. 2026-05-13T20:56:02Z Aleksey Upirvitskiy Aleksandr Levin http://arxiv.org/abs/2605.27594v1 Proper Agnostic Learning of Functions of Halfspaces under Gaussian Marginals 2026-05-26T19:07:06Z We study the problem of computationally efficient proper agnostic learning of multidimensional concept classes under the Gaussian distribution. In this setting, given i.i.d. labeled samples from an unknown distribution over $\mathbb{R}^d \times \{\pm 1\}$ whose marginal on $\mathbb{R}^d$ is Gaussian, the goal is to output a hypothesis from a target class $\mathcal{F}$ whose 0-1 loss is within $ε$ of that of the best classifier in $\mathcal{F}$. We give the first efficient proper agnostic learning algorithm for arbitrary Boolean functions of $K$ halfspaces under Gaussian marginals. Our algorithm runs in time $d^{O(K^2 \log(1/ε)/ε^2)} + (K/ε)^{O(K^3/ε^{2.5})}$. Prior to our work, the only known algorithm for $K \geq 2$ was brute-force search, with run-time exponential in $d$. Moreover, the dependence of our run-time on the dimension $d$ matches that of the best known improper learning algorithm, namely $d^{\widetilde{O}(K^2/ε^2)}$. For the special case of a single halfspace ($K=1$), the best previous run-time was $d^{O(1/ε^4)} + (1/ε)^{O(1/ε^6)}$. Our algorithm improves this to $d^{O(1/ε^2)} + (1/ε)^{O(1/ε^{2.5})}$. Once again, the dependence on $d$ matches that of the best known improper algorithm, namely $d^{O(1/ε^2)}$. Furthermore, the dependence of our run-time on the dimension $d$ is essentially optimal in the statistical query model. 
2026-05-26T19:07:06Z Sergei Tikhonov Arsen Vasilyan http://arxiv.org/abs/2605.27490v1 Tree Search With Predictions 2026-05-26T16:19:48Z ``Algorithms with predictions'', or ``learning-augmented algorithms'', has proved to be an extremely useful paradigm for combining machine learning with traditional algorithms. One of the textbook settings for this is searching a sorted array. Without a prediction, classical binary search takes $O(\log n)$ queries, while with a prediction we can use ``doubling binary search'' to find the target key using $O(\log η)$ queries, where $η$ is the error of the prediction measured as the absolute value of the difference between the true location and the predicted location. Since an array is just a path graph, in this paper we ask whether similar bounds can be achieved for search on even slightly more general graphs: trees. We show first that the high-level answer is ``no'': there is no search algorithm that uses $O(\log η)$ queries, where $η$ is now the graph distance between the predicted location and the true location. However, as our main result, we show that such bounds can be achieved on trees which are ``path-like'' in that they have low \emph{pathwidth}. In particular, we prove that there is a search algorithm which uses at most $O(k \log η)$ queries, where $k$ is the pathwidth of the tree. We also prove a lower bound showing that our algorithm has existentially optimal query complexity. Finally, we show experimentally, on real-life inputs, that our algorithm has query complexity which is notably better than the simple non-prediction-based algorithm. 2026-05-26T16:19:48Z Michael Dinitz Bob Dong http://arxiv.org/abs/2605.27147v1 Virtual-Memory Powersort 2026-05-26T15:10:06Z We give a more space-efficient implementation of adaptive mergesort: Virtual-Memory Powersort. 
Using internal buffering techniques, we significantly reduce the memory consumption of the algorithm; specifically, for sorting $n$ objects the required buffer area is reduced from space for $n/2$ objects to $O(\sqrt{n \log n})$ objects. While this space-efficiency can be achieved (indeed reduced to $O(1)$) conceptually very easily with known inplace merging algorithms, using these as a drop-in replacement for the standard merge algorithm incurs a substantial slow-down. Virtual-Memory Powersort, by contrast, uses the same number of moves and comparisons as previous Powersort implementations up to an additive $O(n)$ term. We report on an empirical running-time study comparing our implementation against other Powersort variants and state-of-the-art stable sorting methods, demonstrating that almost in-place stable sorting can be achieved with negligible overhead in many scenarios. 2026-05-26T15:10:06Z Finn Moltmann Tamio-Vesa Nakajima Sebastian Wild http://arxiv.org/abs/2605.27098v1 Improved Hardness Results for Nash Social Welfare, Budgeted Allocation and GAP via the Unique Games Conjecture 2026-05-26T14:39:44Z We consider the problem of dividing a set of indivisible goods among agents with additive valuations. This problem has been studied under various objectives in both the computer science and the operations research literature. Our main contribution is a novel dictator test using this problem, which can separate a dictator from any function sufficiently far from a dictator. We use this test to prove the following hardness results (assuming the unique games conjecture is true): (1) We show that it is NP-hard to approximate the max Nash welfare by a factor better than $\sqrt[3]{\frac{81}{65}} - \varepsilon \approx 1.0761$. This improves on the previous best known inapproximability factor of $\sqrt{\frac87} - \varepsilon \approx 1.069$. 
(2) We show that it is NP-hard to approximate the maximum budgeted allocation by a factor better than $\frac{243}{227} - \varepsilon \approx 1.07$. This improves on the previous best known inapproximability factor of $\frac{16}{15} - \varepsilon \approx 1.067$. (3) We show that it is NP-hard to approximate the max generalized assignment problem (GAP) by a factor better than $\frac{145}{129} - \varepsilon \approx 1.124$. This improves on the previous best known inapproximability factor of $\frac{11}{10} - \varepsilon \approx 1.10$. 2026-05-26T14:39:44Z To Appear at EC 2026 Vignesh Viswanathan http://arxiv.org/abs/2305.03697v2 Fault-Tolerant ST-Diameter Oracles 2026-05-26T12:16:27Z Given two vertex sets $S$ and $T$ in a graph, the $ST$-diameter is the maximum $s$-$t$-distance between vertices $s \in S$ and $t \in T$. We study the problem of estimating the $ST$-diameter of graphs that are subject to a small number of transient edge failures. An $f$-edge fault-tolerant $ST$-diameter oracle ($f$-FDO-$ST$) is a data structure that preprocesses a graph $G$, sets $S$, $T$, and a positive integer $f$. When queried with a set $F$ of at most $f$ failing edges, the oracle returns an estimate $\widehat{D}$ of the $ST$-diameter in $G-F$. The oracle is said to have stretch $σ\geq 1$ if $\operatorname{diam}(G{-}F,S,T) \leq \widehat{D} \leq σ\cdot \operatorname{diam}(G{-}F,S,T)$. We design new $f$-FDO-$ST$s by reducing their construction to that of all-pairs and single-source distance sensitivity oracles ($f$-DSOs). These are data structures that estimate the pairwise graph distances, or respectively the distances from a distinguished source, under up to $f$ failures. We obtain several new trade-offs between the size of the $ST$-diameter oracles, their stretch guarantees, query and preprocessing times by combining our black-box reductions with $f$-DSO results from the literature. We further provide a lower bound on the space requirement of approximate $ST$-diameter oracles. 
We prove that there exists a family of graphs for which any $f$-FDO-$ST$ with sensitivity $f \ge 2$ and stretch better than $5/3$ requires $Ω(n^{3/2})$ bits of space, regardless of the query time. 2023-05-05T17:20:00Z ICALP 2023, Algorithmica 2026 Davide Bilò Keerti Choudhary Sarel Cohen Tobias Friedrich Simon Krogmann Martin Schirneck http://arxiv.org/abs/2605.26908v1 On the Detection of Commutative Factors in Factor Graphs: Necessary and Sufficient Conditions 2026-05-26T12:05:53Z Exploiting the indistinguishability of objects in a probabilistic graphical model such as a factor graph is key to lifted probabilistic inference algorithms and allows for tractable probabilistic inference problems with respect to domain sizes. A central building block for the exploitation of indistinguishable objects in factor graphs is the identification of commutative factors, i.e., factors whose output values are invariant under permutations of input values assigned to a subset of their arguments. In this paper, we revisit the theoretical foundations underlying the state-of-the-art algorithm to detect commutative factors. Specifically, we show that in its current form, the state-of-the-art algorithm relies on a central theorem that is mistakenly regarded as a sufficient condition to identify commutative factors, while it actually only implies a necessary condition. Consequently, the state of the art might, as we show in this paper, deliver incorrect results. To fix the flaws currently present in the state of the art, we prove a slightly modified version of the aforementioned theorem, which serves as a necessary condition to identify commutative factors. Moreover, we present a corrected version of the state-of-the-art algorithm, which keeps its efficiency while ensuring correctness, and introduce a complementary algorithm with tighter worst-case bounds. 
2026-05-26T12:05:53Z Malte Luttermann Ralf Möller Marcel Gehrke http://arxiv.org/abs/2605.26886v1 Parsimonious Learning-Augmented Online Metric Matching 2026-05-26T11:47:58Z Learning-augmented algorithms have received significant attention in recent years, particularly in the context of online optimization. Motivated by the high computational cost of generating predictions, a growing line of work studies the tradeoff between performance guarantees and the number of predictions used in learning-augmented algorithms for problems such as caching and metrical task systems. In this paper, we extend this line of research to online metric matching by developing parsimonious learning-augmented algorithms and establishing lower bounds on their performance. Our approach extends the Follow-the-Prediction framework to the parsimonious setting by filling in a virtual prediction in the absence of an actual prediction, using an online metric matching algorithm that maintains good intermediate matchings throughout its execution. We complement our theoretical results with an empirical evaluation, demonstrating the practical effectiveness of our approach. 2026-05-26T11:47:58Z To appear in ICML 2026 Yongho Shin Phanu Vajanopath http://arxiv.org/abs/2605.26816v1 Where to Split and When to Charge: Optimal Route Construction from Customer Permutations in Electric Vehicle Routing 2026-05-26T10:32:36Z Permutation-based metaheuristics are widely used for electric vehicle routing, where candidate solutions are represented as ordered sequences of customers. Such sequences, however, do not directly define feasible vehicle routes: they must be decoded by choosing where to split the permutation into routes and where to insert charging-station visits, subject to cargo capacity and battery constraints. These decisions are inherently interdependent, since each return to the depot both separates consecutive routes and restores the vehicle battery. 
This paper formalizes the task as the Fixed-Permutation Splitting and Charging Problem and proposes an exact forward labeling algorithm that constructs a minimum-distance feasible decoding of a fixed customer permutation using dynamic programming with dominance pruning. We further derive restricted variants representing increasingly simplified decoding strategies: first separating route splitting from charging-station insertion, and then additionally limiting each inter-customer segment to at most one charging-station visit. Computational experiments on benchmark and randomly generated instances, including comparisons with heuristic decoders from the literature, confirm that the exact decoder remains tractable in practice and reveal a clear hierarchy among decoding strategies. The most restrictive variant achieves runtimes close to those of heuristic decoders while delivering substantially higher decoding success rates and better solution quality. Less restrictive variants further improve quality and robustness at the cost of additional runtime. The exact joint decoder provides the optimal reference for each fixed permutation, clarifying the trade-offs introduced by common decoding simplifications. 2026-05-26T10:32:36Z 28 pages, 6 figures Leon Stjepan Uroić Marko Đurasević