Feed: https://arxiv.org/api/kjgLyXigBK0qaGXgrrGjrEgAfSY
Feed updated: 2026-06-18T14:31:21Z

http://arxiv.org/abs/2602.04463v2
Simple Algorithms for Bad Triangle Transversals with Applications to Correlation Clustering
Updated: 2026-05-27T07:18:56Z
The Bad Triangle Transversal (BTT) problem asks for the smallest set of edges that need to be removed from a given signed graph, so that the resulting graph does not have a bad triangle. Here, a bad triangle is a triangle with exactly one negative edge. Several 2-approximations for BTT are proposed in this paper. On the hardness side, we show that BTT is NP-hard to approximate with factor better than $\frac{2137}{2136}$ on complete graphs. Our reduction also works for Correlation Clustering (CC), the Cluster Deletion problem (CD) and the Minimum Strong Triadic Closure problem (MinSTC). Lastly, we show that the BTT and CC optima are within a factor of 3/2 in complete graphs, by describing a pivot procedure that transforms transversals into clusters.
Published: 2026-02-04T11:48:41Z
Comment: Accepted to ICML 2026 (Spotlight)
Authors: Florian Adriaens, Nikolaj Tatti

http://arxiv.org/abs/2605.27998v1
Efficient Algorithms for Interdicting Facilities in Trees and Bounded Treewidth Graphs
Updated: 2026-05-27T05:41:29Z
Given a graph $G$ of $n$ nodes partitioned into facilities and customers, the $r$-edge interdiction covering problem (REIC) is to remove up to $r$ edges so as to maximize the total weight of customers disconnected from all facilities, which is called the covering objective function. While REIC is known to be NP-complete for general graphs, Fröhlich and Ruzika show that the problem can be solved in polynomial time when $G$ is a tree, providing an $O(n^7 r)$-time algorithm. We give an efficient $O(nr^2)$-time dynamic programming algorithm for REIC on trees that is fixed-parameter linear in $n$.
Evaluating our solution on a benchmark of randomly generated tree networks with baselines of the Fröhlich and Ruzika algorithm and the Gurobi integer program solver, we demonstrate that in practice, our algorithm is both significantly faster and less sensitive to network topology and size.
We extend our algorithm for REIC to graphs of bounded treewidth, a well-studied family of sparse graphs that generalizes trees, and obtain a matching runtime of $O(nr^2)$. We also consider the $r$-facility interdiction covering problem (RFIC), a novel variant of this network interdiction problem where the goal is to remove up to $r$ facilities to maximize the covering objective function over disconnected customers. We show that RFIC is NP-complete by observing it generalizes the small set bipartite vertex expansion problem (SSBVE), also known as the minimum $p$-union problem. We give an $O(nr^2)$-time algorithm for RFIC on trees, which also gives an $O(n^3)$-time algorithm for SSBVE on trees.
Published: 2026-05-27T05:41:29Z
Authors: Ali Abbasi, Eli Friedman, Leana Golubchik, Samir Khuller, Marco Paolieri

http://arxiv.org/abs/2605.27942v1
Quantum principal component analysis without eigenvector recovery
Updated: 2026-05-27T04:27:47Z
Principal component analysis (PCA) is traditionally implemented through a covariance or kernel matrix, leading-eigenvector extraction, and hard rank-$k$ projection. These steps can be computationally costly in high-dimensional and quantum-data settings, sensitive to small eigengaps, and unnecessary when downstream tasks only require principal-subspace scores. Such score-based objectives are important in applications such as anomaly detection, spectral-energy profiling, and other postselection tasks. To address these needs, we introduce a measurement-based soft PCA framework replacing the hard top-$k$ projector with an entropy-regularized Fermi--Dirac filter. This filter is the unique optimizer of an entropy-regularized variational formulation of PCA and converges to the classical PCA projector in the zero-temperature limit.
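The zero-temperature limit mentioned above can be illustrated numerically. The sketch below assumes the filter acts on covariance eigenvalues $λ$ as $f(λ) = 1/(1 + e^{(μ - λ)/T})$ with a threshold $μ$ and temperature $T$; this parameterization and the names `fermi_dirac_weights`, `mu`, and `temperature` are illustrative assumptions, not taken from the paper, and the code is purely classical.

```python
import math

def fermi_dirac_weights(eigenvalues, mu, temperature):
    """Soft spectral weights in [0, 1]; eigenvalues above mu get weight near 1.

    Assumed form (not from the paper): f(lam) = 1 / (1 + exp((mu - lam) / T)).
    """
    weights = []
    for lam in eigenvalues:
        z = (mu - lam) / temperature
        # Numerically stable evaluation of 1 / (1 + exp(z)).
        if z >= 0:
            e = math.exp(-z)
            weights.append(e / (1.0 + e))
        else:
            weights.append(1.0 / (1.0 + math.exp(z)))
    return weights

def hard_projector_weights(eigenvalues, mu):
    """Classical PCA keeps an eigen-direction entirely or not at all."""
    return [1.0 if lam > mu else 0.0 for lam in eigenvalues]

spectrum = [5.0, 3.0, 1.0, 0.5]
warm = fermi_dirac_weights(spectrum, mu=2.0, temperature=1.0)   # smooth weights
cold = fermi_dirac_weights(spectrum, mu=2.0, temperature=1e-3)  # nearly 0/1 weights
```

As `temperature` shrinks, `cold` approaches `hard_projector_weights(spectrum, 2.0)`, which is the sense in which a soft filter of this kind recovers a hard top-$k$ projection.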
This filter has a direct interpretation as a quantum measurement, which naturally suggests a quantum approach. For centered covariance operators represented by quantum feature states, a single fixed circuit, together with threshold calibration, accesses all optimal filters for different rank budgets or retained-variance levels without rank-dependent circuit updates or eigenvector recovery. For new inputs, the same calibrated quantum circuit yields soft principal subspace scores, spectral energy profiles, and postselected filtered states. The required centering of both training and test data is performed coherently inside the quantum protocol, which is particularly important for quantum data where no classical feature vectors or centered Gram matrix are directly available. By reframing PCA as a calibrated measurement task, this framework bypasses the need for iterative eigenvector extraction and achieves a dimension-independent sample complexity $O(η^{-2})$ for normalized fractional-rank or retained-variance scoring at additive accuracy $η$.
Published: 2026-05-27T04:27:47Z
Authors: Yewei Yuan, Michele Minervini, Mark M. Wilde, Nana Liu

http://arxiv.org/abs/2605.27769v1
Smoothed Score Queries and the Complexity of Sampling
Updated: 2026-05-26T23:38:20Z
We study the query complexity of sampling from high-dimensional Gaussian distributions using gradient information. In the standard oracle model, exact gradients expose only matrix-vector products with the precision matrix, leading to polynomial approximation barriers and a characteristic \(\sqrt{κ}\) dependence on the condition number. We show that this barrier disappears when the sampler is allowed to query \emph{smoothed scores}, namely gradients of the logarithms of the Gaussian-convolved densities. For a Gaussian target with precision matrix \(Λ\), a smoothed-score query at noise level \(τ\) gives access to the resolvent \((Λ+τ^{-1}I)^{-1}\).
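A short worked identity shows why the resolvent appears. It assumes a zero-mean target \(\mathcal{N}(0, Λ^{-1})\) and smoothing by convolution with \(\mathcal{N}(0, τI)\), so that \(τ\) is read as a noise variance; both conventions are assumptions made here for illustration and may differ from the paper's normalization.

\[
p_τ = \mathcal{N}\bigl(0,\; Λ^{-1} + τI\bigr),
\qquad
\nabla \log p_τ(x) = -\bigl(Λ^{-1} + τI\bigr)^{-1} x
= -τ^{-1}\Bigl(I - τ^{-1}\bigl(Λ + τ^{-1}I\bigr)^{-1}\Bigr)x .
\]

The last step uses \((Λ^{-1} + τI)^{-1} = Λ(I + τΛ)^{-1} = τ^{-1}\bigl(I - (I + τΛ)^{-1}\bigr)\) together with \((I + τΛ)^{-1} = τ^{-1}(Λ + τ^{-1}I)^{-1}\). Under these assumptions, one smoothed-score evaluation at \(x\) therefore determines \((Λ + τ^{-1}I)^{-1}x\).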
Combining geometrically spaced noise levels with sinc-quadrature rational approximation, we obtain a sampler with $q=O\!\left(\bigl(\log κ+\log(e\sqrt d/δ_{\rm TV})\bigr)\log(e\sqrt d/δ_{\rm TV})\right)$ smoothed-score queries for total variation error \(δ_{\rm TV}\), improving the condition-number dependence from \(\sqrt{κ}\) to logarithmic. We also study finite-bit gradient oracles. Using coordinatewise quantization of the transformed smoothed-score answers and a final dithering step, we obtain a sampling scheme whose total communicated gradient information is polylogarithmic in \(κ\); in particular, for fixed dimension and accuracy, the bit complexity is \(O(\log^2 κ)\). To complement these upper bounds, we introduce a channel-synthesis, or reverse-Shannon, converse technique for sampling lower bounds. This converts total-variation simulation guarantees into communication requirements and yields an \(Ω(\log κ)\) lower bound on the required gradient information. Together, these results identify smoothed scores as a provably more informative oracle for sampling and give nearly matching upper and lower bounds for its finite-bit complexity.
Published: 2026-05-26T23:38:20Z
Authors: Jingbo Liu

http://arxiv.org/abs/2404.11591v3
The EDGE Language: Extended General Einsums for Graph Algorithms
Updated: 2026-05-26T22:47:36Z
In this work, we propose a unified abstraction for graph algorithms: the Extended General Einsums language, or EDGE. The EDGE language expresses graph algorithms in the language of tensor algebra, providing a rigorous, succinct, and expressive mathematical framework. EDGE leverages two ideas: (1) the well-known foundations provided by the graph-matrix duality, where a graph is simply a 2D tensor, and (2) the power and expressivity of Einsum notation in the tensor algebra world. In this work, we describe our design goals for EDGE and walk through the extensions we add to Einsums to support more complex operations common in graph algorithms.
Additionally, we provide a few examples of how to express graph algorithms in our proposed notation. We hope that a single, mathematical notation for graph algorithms will (1) allow researchers to more easily compare different algorithms and different implementations of a graph algorithm; (2) enable developers to factor complexity by separating the concerns of what to compute (described with the extended Einsum notation) from the lower level details of how to compute; and (3) enable the discovery of different algorithmic variants of a problem through algebraic manipulations and transformations on a given EDGE expression.
Published: 2024-04-17T17:42:48Z
Comment: 116 pages, 15 figures. Revision with updated semantics section and cleaner proofs
Authors: Toluwanimi O. Odemuyiwa, Serban D. Porumbescu, Nandeeka Nayak, Michael Pellauer, Joel S. Emer, John D. Owens

http://arxiv.org/abs/2511.00254v3
Uncrossed Multiflows and Applications to Disjoint Paths
Updated: 2026-05-26T22:46:14Z
A multiflow in a planar graph is uncrossed if its support paths do not cross. Recently such flows have played a role in approximation algorithms for maximum disjoint paths in "fully-planar" instances, where the combined supply-demand graph is planar, as well as low-congestion unsplittable flows for fully-planar and single-source instances.
We investigate the utility of uncrossed flow more generally and ask three key questions. First, are there other interesting planar multiflow instances that admit uncrossed flows? We answer affirmatively, demonstrating a new family of "pairwise-planar" instances whose flows can be uncrossed. This family subsumes fully-planar but includes substantially more, such as fully-compliant series-parallel instances and some instances that have large clique demand graphs. Second, can we always round a fractional uncrossed flow to a "good" integral flow? We again answer positively. For maximization problems, we obtain integral flows with a constant fraction of the original value. For congestion problems (where we fully route all given demands), we obtain integral flows with edge congestion 2. Consequently, we obtain constant-factor approximation algorithms for maximum disjoint paths and minimum congestion integer multiflow for pairwise-planar instances, and show such instances have a constant integral flow-multicut gap. Finally, given a planar multiflow instance, can we determine if there exists a congestion-1 uncrossed fractional flow (congestion) or find the maximum value uncrossed fractional flow (maximization)? For congestion, we show this problem is NP-hard, but finding uncrossed edge-disjoint paths is polytime solvable if the demands span a bounded number of faces. For maximization, we present a strong inapproximability result.
Published: 2025-10-31T20:52:56Z
Comment: Proof sketch added for Lemma 3.6, added integral flow-multicut gap corollary, improved figures for and clarified strongly uncrossed flows
Authors: Chandra Chekuri, Guyslain Naves, Joseph Poremba, F. Bruce Shepherd

http://arxiv.org/abs/2605.14112v2
Fast Leaf-to-Ancestor Minimum Query in the Oracle Model
Updated: 2026-05-26T19:12:27Z
We study leaf-to-ancestor path-minimum queries on a rooted, weighted tree in the oracle model, where the only allowed value operation is a comparison oracle on edge (or node) weights.
We give a static data structure that, after $O(n \log h)$ preprocessing time, space, and oracle calls (where $n$ is the number of nodes and $h$ is the tree height), answers any leaf-to-ancestor query in $O(1)$ worst-case time with zero oracle calls at query time. The method combines (I) an edge-to-node weight conversion with a deterministic tie-break to obtain a total order; (II) ladder (longest-path) decomposition; (III) binary lifting; and (IV) sparse-table RMQ built over ladder arrays, storing indices selected via the oracle during preprocessing. We also show that the preprocessing oracle-comparison bound is tight in the deterministic comparison model.
Published: 2026-05-13T20:56:02Z
Authors: Aleksey Upirvitskiy, Aleksandr Levin

http://arxiv.org/abs/2605.27594v1
Proper Agnostic Learning of Functions of Halfspaces under Gaussian Marginals
Updated: 2026-05-26T19:07:06Z
We study the problem of computationally efficient proper agnostic learning of multidimensional concept classes under the Gaussian distribution. In this setting, given i.i.d. labeled samples from an unknown distribution over $\mathbb{R}^d \times \{\pm 1\}$ whose marginal on $\mathbb{R}^d$ is Gaussian, the goal is to output a hypothesis from a target class $\mathcal{F}$ whose 0-1 loss is within $ε$ of that of the best classifier in $\mathcal{F}$.
We give the first efficient proper agnostic learning algorithm for arbitrary Boolean functions of $K$ halfspaces under Gaussian marginals. Our algorithm runs in time $d^{O(K^2 \log(1/ε)/ε^2)} + (K/ε)^{O(K^3/ε^{2.5})}$. Prior to our work, the only known algorithm for $K \geq 2$ was brute-force search, with run-time exponential in $d$. Moreover, the dependence of our run-time on the dimension $d$ matches that of the best known improper learning algorithm, namely $d^{\widetilde{O}(K^2/ε^2)}$.
For the special case of a single halfspace ($K=1$), the best previous run-time was $d^{O(1/ε^4)} + (1/ε)^{O(1/ε^6)}$. Our algorithm improves this to $d^{O(1/ε^2)} + (1/ε)^{O(1/ε^{2.5})}$. Once again, the dependence on $d$ matches that of the best known improper algorithm, namely $d^{O(1/ε^2)}$. Furthermore, the dependence of our run-time on the dimension $d$ is essentially optimal in the statistical query model.
Published: 2026-05-26T19:07:06Z
Authors: Sergei Tikhonov, Arsen Vasilyan

http://arxiv.org/abs/2605.27490v1
Tree Search With Predictions
Updated: 2026-05-26T16:19:48Z
``Algorithms with predictions'', or ``learning-augmented algorithms'', has proved to be an extremely useful paradigm for combining machine learning with traditional algorithms. One of the textbook settings for this is searching a sorted array. Without a prediction, classical binary search takes $O(\log n)$ queries, while with a prediction we can use ``doubling binary search'' to find the target key using $O(\log η)$ queries, where $η$ is the error of the prediction measured as the absolute value of the difference between the true location and the predicted location. Since an array is just a path graph, in this paper we ask whether similar bounds can be achieved for search on even slightly more general graphs: trees. We show first that the high-level answer is ``no'': there is no search algorithm that uses $O(\log η)$ queries, where $η$ is now the graph distance between the predicted location and the true location. However, as our main result, we show that such bounds can be achieved on trees which are ``path-like'' in that they have low \emph{pathwidth}. In particular, we prove that there is a search algorithm which uses at most $O(k \log η)$ queries, where $k$ is the pathwidth of the tree. We also prove a lower bound showing that our algorithm has existentially optimal query complexity.
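The textbook array setting described above can be sketched as follows. This is a minimal illustration of doubling binary search on a sorted list, not the tree algorithm of the paper; the function name `doubling_search` and the convention of counting one query per array probe are assumptions made for the example.

```python
def doubling_search(arr, target, predicted):
    """Search sorted `arr` for `target`, starting from a predicted index.

    Returns (index, queries); index is -1 if the target is absent.
    Each array probe counts as one query.
    """
    n = len(arr)
    queries = 0

    def probe(i):
        nonlocal queries
        queries += 1
        return arr[i]

    if n == 0:
        return -1, queries
    p = min(max(predicted, 0), n - 1)  # clamp the prediction into range
    v = probe(p)
    if v == target:
        return p, queries

    # Gallop away from the prediction with steps 1, 2, 4, ... until the
    # target is bracketed inside the inclusive index range [lo, hi].
    step = 1
    if v < target:
        lo, hi = p + 1, n - 1
        while lo <= hi:
            j = min(p + step, n - 1)
            if probe(j) >= target:
                hi = j
                break
            lo = j + 1
            step *= 2
    else:
        lo, hi = 0, p - 1
        while lo <= hi:
            j = max(p - step, 0)
            if probe(j) <= target:
                lo = j
                break
            hi = j - 1
            step *= 2

    # Lower-bound binary search inside the bracket.
    left, right = lo, hi + 1
    while left < right:
        mid = (left + right) // 2
        if probe(mid) < target:
            left = mid + 1
        else:
            right = mid
    if left <= hi and probe(left) == target:
        return left, queries
    return -1, queries
```

The galloping phase stops after roughly $\log_2 η$ probes and leaves a bracket whose length is proportional to $η$, so the binary search adds another $O(\log η)$ probes. A prediction close to the true index therefore costs only a handful of queries regardless of $n$.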
Finally, we show experimentally, on real-life inputs, that our algorithm has query complexity which is notably better than the simple non-prediction-based algorithm.
Published: 2026-05-26T16:19:48Z
Authors: Michael Dinitz, Bob Dong

http://arxiv.org/abs/2605.27147v1
Virtual-Memory Powersort
Updated: 2026-05-26T15:10:06Z
We give a more space-efficient implementation of adaptive mergesort: Virtual-Memory Powersort. Using internal buffering techniques, we significantly reduce the memory consumption of the algorithm; specifically, for sorting $n$ objects the required buffer area is reduced from space for $n/2$ objects to $O(\sqrt{n \log n})$ objects. While this space-efficiency can be achieved (indeed reduced to $O(1)$) conceptually very easily with known in-place merging algorithms, using these as a drop-in replacement for the standard merge algorithm incurs a substantial slow-down. Virtual-Memory Powersort, by contrast, uses the same number of moves and comparisons as previous Powersort implementations up to an additive $O(n)$ term. We report on an empirical running-time study comparing our implementation against other Powersort variants and state-of-the-art stable sorting methods, demonstrating that almost in-place stable sorting can be achieved with negligible overhead in many scenarios.
Published: 2026-05-26T15:10:06Z
Authors: Finn Moltmann, Tamio-Vesa Nakajima, Sebastian Wild

http://arxiv.org/abs/2605.27098v1
Improved Hardness Results for Nash Social Welfare, Budgeted Allocation and GAP via the Unique Games Conjecture
Updated: 2026-05-26T14:39:44Z
We consider the problem of dividing a set of indivisible goods among agents with additive valuations. This problem has been studied under various objectives in both the computer science and the operations research literature. Our main contribution is a novel dictator test using this problem, which can separate a dictator from any function sufficiently far from a dictator. We use this test to prove the following hardness results (assuming the unique games conjecture is true):
(1) We show that it is NP-hard to approximate the max Nash welfare by a factor better than $\sqrt[3]{\frac{81}{65}} - \varepsilon \approx 1.0761$. This improves on the previous best known inapproximability factor of $\sqrt{\frac87} - \varepsilon \approx 1.069$.
(2) We show that it is NP-hard to approximate the maximum budgeted allocation by a factor better than $\frac{243}{227} - \varepsilon \approx 1.07$. This improves on the previous best known inapproximability factor of $\frac{16}{15} - \varepsilon \approx 1.067$.
(3) We show that it is NP-hard to approximate the max generalized assignment problem (GAP) by a factor better than $\frac{145}{129} - \varepsilon \approx 1.124$. This improves on the previous best known inapproximability factor of $\frac{11}{10} - \varepsilon \approx 1.10$.
Published: 2026-05-26T14:39:44Z
Comment: To Appear at EC 2026
Authors: Vignesh Viswanathan

http://arxiv.org/abs/2305.03697v2
Fault-Tolerant ST-Diameter Oracles
Updated: 2026-05-26T12:16:27Z
Given two vertex sets $S$ and $T$ in a graph, the $ST$-diameter is the maximum $s$-$t$-distance between vertices $s \in S$ and $t \in T$. We study the problem of estimating the $ST$-diameter of graphs that are subject to a small number of transient edge failures. An $f$-edge fault-tolerant $ST$-diameter oracle ($f$-FDO-$ST$) is a data structure that preprocesses a graph $G$, sets $S$, $T$, and a positive integer $f$. When queried with a set $F$ of at most $f$ failing edges, the oracle returns an estimate $\widehat{D}$ of the $ST$-diameter in $G-F$. The oracle is said to have stretch $σ \geq 1$ if $\operatorname{diam}(G{-}F,S,T) \leq \widehat{D} \leq σ \cdot \operatorname{diam}(G{-}F,S,T)$. We design new $f$-FDO-$ST$s by reducing their construction to that of all-pairs and single-source distance sensitivity oracles ($f$-DSOs). These are data structures that estimate the pairwise graph distances, or respectively the distances from a distinguished source, under up to $f$ failures. We obtain several new trade-offs between the size of the $ST$-diameter oracles, their stretch guarantees, query and preprocessing times by combining our black-box reductions with $f$-DSO results from the literature. We further provide a lower bound on the space requirement of approximate $ST$-diameter oracles.
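To make the definitions above concrete, here is a brute-force reference computation of $\operatorname{diam}(G{-}F,S,T)$ and of the stretch condition. It is only a baseline that recomputes everything per query, which is precisely what an oracle is meant to avoid; it assumes an unweighted, undirected graph given as an adjacency dictionary, and the names `st_diameter` and `has_stretch` are hypothetical.

```python
from collections import deque

def st_diameter(adj, S, T, failed=()):
    """Exact ST-diameter of G - F by one BFS per source in S.

    adj: dict mapping each vertex to a list of neighbours (undirected, unweighted).
    failed: iterable of edges (u, v) that are removed.
    Returns float('inf') if some t in T is unreachable from some s in S.
    """
    removed = {frozenset(e) for e in failed}
    best = 0
    for s in S:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist and frozenset((u, v)) not in removed:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for t in T:
            best = max(best, dist.get(t, float("inf")))
    return best

def has_stretch(estimate, true_diameter, sigma):
    """Check diam <= estimate <= sigma * diam, the stretch condition."""
    return true_diameter <= estimate <= sigma * true_diameter

# A 4-cycle 0-1-2-3-0.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
d_before = st_diameter(cycle, S=[0], T=[1, 2])                    # 2
d_after = st_diameter(cycle, S=[0], T=[1, 2], failed=[(0, 1)])    # 3
```

Removing the edge $(0,1)$ forces the path $0$-$3$-$2$-$1$, so the $ST$-diameter grows from 2 to 3 in this small instance.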
We prove that there exists a family of graphs for which any $f$-FDO-$ST$ with sensitivity $f \ge 2$ and stretch better than $5/3$ requires $Ω(n^{3/2})$ bits of space, regardless of the query time.
Published: 2023-05-05T17:20:00Z
Comment: ICALP 2023, Algorithmica 2026
Authors: Davide Bilò, Keerti Choudhary, Sarel Cohen, Tobias Friedrich, Simon Krogmann, Martin Schirneck

http://arxiv.org/abs/2605.26908v1
On the Detection of Commutative Factors in Factor Graphs: Necessary and Sufficient Conditions
Updated: 2026-05-26T12:05:53Z
Exploiting the indistinguishability of objects in a probabilistic graphical model such as a factor graph is key to lifted probabilistic inference algorithms and allows for tractable probabilistic inference problems with respect to domain sizes. A central building block for the exploitation of indistinguishable objects in factor graphs is the identification of commutative factors, i.e., factors whose output values are invariant under permutations of input values assigned to a subset of their arguments. In this paper, we revisit the theoretical foundations underlying the state-of-the-art algorithm to detect commutative factors. Specifically, we show that in its current form, the state-of-the-art algorithm relies on a central theorem that is mistakenly regarded as a sufficient condition to identify commutative factors, while it actually only implies a necessary condition. Consequently, the state of the art might, as we show in this paper, deliver incorrect results. To fix the flaws currently present in the state of the art, we prove a slightly modified version of the aforementioned theorem, which serves as a necessary condition to identify commutative factors.
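The definition of a commutative factor given above can be checked directly by exhaustive enumeration. The sketch below is a naive definition check with exponential cost, not the detection algorithm discussed in the paper; it assumes a factor stored as a dictionary from argument tuples to values, and the name `is_commutative` is hypothetical.

```python
from itertools import permutations, product

def is_commutative(factor, domain, arity, positions):
    """Return True if `factor` is invariant under every permutation of the
    values assigned to the argument indices in `positions`.

    factor: dict mapping each tuple in domain**arity to an output value.
    """
    for assignment in product(domain, repeat=arity):
        values = [assignment[i] for i in positions]
        for perm in permutations(values):
            permuted = list(assignment)
            for i, val in zip(positions, perm):
                permuted[i] = val
            if factor[tuple(permuted)] != factor[assignment]:
                return False
    return True

booleans = (0, 1)
# Depends only on how many arguments are 1, so it is commutative in all three.
counting = {a: sum(a) for a in product(booleans, repeat=3)}
# The middle argument is weighted differently from the outer two.
weighted = {a: a[0] + 2 * a[1] + a[2] for a in product(booleans, repeat=3)}
```

Here `counting` passes the check for `positions=(0, 1, 2)`, while `weighted` passes only for `positions=(0, 2)`. The enumeration visits every assignment and every permutation, which illustrates why efficient and provably correct detection criteria matter.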
Moreover, we present a corrected version of the state-of-the-art algorithm, which keeps its efficiency while ensuring correctness, and introduce a complementary algorithm with tighter worst-case bounds.
Published: 2026-05-26T12:05:53Z
Authors: Malte Luttermann, Ralf Möller, Marcel Gehrke

http://arxiv.org/abs/2605.26886v1
Parsimonious Learning-Augmented Online Metric Matching
Updated: 2026-05-26T11:47:58Z
Learning-augmented algorithms have received significant attention in recent years, particularly in the context of online optimization. Motivated by the high computational cost of generating predictions, a growing line of work studies the tradeoff between performance guarantees and the number of predictions used in learning-augmented algorithms for problems such as caching and metrical task systems. In this paper, we extend this line of research to online metric matching by developing parsimonious learning-augmented algorithms and establishing lower bounds on their performance. Our approach extends the Follow-the-Prediction framework to the parsimonious setting by filling in a virtual prediction in the absence of an actual prediction, using an online metric matching algorithm that maintains good intermediate matchings throughout its execution. We complement our theoretical results with an empirical evaluation, demonstrating the practical effectiveness of our approach.
Published: 2026-05-26T11:47:58Z
Comment: To appear in ICML 2026
Authors: Yongho Shin, Phanu Vajanopath

http://arxiv.org/abs/2605.26816v1
Where to Split and When to Charge: Optimal Route Construction from Customer Permutations in Electric Vehicle Routing
Updated: 2026-05-26T10:32:36Z
Permutation-based metaheuristics are widely used for electric vehicle routing, where candidate solutions are represented as ordered sequences of customers. Such sequences, however, do not directly define feasible vehicle routes: they must be decoded by choosing where to split the permutation into routes and where to insert charging-station visits, subject to cargo capacity and battery constraints.
These decisions are inherently interdependent, since each return to the depot both separates consecutive routes and restores the vehicle battery. This paper formalizes the task as the Fixed-Permutation Splitting and Charging Problem and proposes an exact forward labeling algorithm that constructs a minimum-distance feasible decoding of a fixed customer permutation using dynamic programming with dominance pruning. We further derive restricted variants representing increasingly simplified decoding strategies: first separating route splitting from charging-station insertion, and then additionally limiting each inter-customer segment to at most one charging-station visit. Computational experiments on benchmark and randomly generated instances, including comparisons with heuristic decoders from the literature, confirm that the exact decoder remains tractable in practice and reveal a clear hierarchy among decoding strategies. The most restrictive variant achieves runtimes close to those of heuristic decoders while delivering substantially higher decoding success rates and better solution quality. Less restrictive variants further improve quality and robustness at the cost of additional runtime. The exact joint decoder provides the optimal reference for each fixed permutation, clarifying the trade-offs introduced by common decoding simplifications.
Published: 2026-05-26T10:32:36Z
Comment: 28 pages, 6 figures
Authors: Leon Stjepan Uroić, Marko Đurasević
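For orientation, the splitting half of the decoding task in the last entry can be sketched with a classical capacity-only split dynamic program. This is a deliberately simplified stand-in that ignores batteries and charging stations entirely, so it is not the exact labeling algorithm of the paper; the function name `split_permutation` and the input layout (a distance matrix indexed by node id, with the depot as node 0) are assumptions for illustration.

```python
def split_permutation(perm, demand, dist, capacity, depot=0):
    """Split a fixed customer order into capacity-feasible routes of minimum
    total distance. Charging is NOT modelled (simplifying assumption).

    best[j] is the cheapest cost of serving the first j customers of perm.
    Returns (total_distance, routes); (inf, []) if no feasible split exists.
    """
    n = len(perm)
    inf = float("inf")
    best = [inf] * (n + 1)
    pred = [-1] * (n + 1)
    best[0] = 0.0
    for i in range(n):
        if best[i] == inf:
            continue
        load = 0
        length = 0.0
        # Try the route that serves perm[i..j] and then returns to the depot.
        for j in range(i, n):
            c = perm[j]
            load += demand[c]
            if load > capacity:
                break
            length += dist[depot][c] if j == i else dist[perm[j - 1]][c]
            cost = best[i] + length + dist[c][depot]
            if cost < best[j + 1]:
                best[j + 1] = cost
                pred[j + 1] = i
    if best[n] == inf:
        return inf, []
    routes = []
    j = n
    while j > 0:
        i = pred[j]
        routes.append(perm[i:j])
        j = i
    routes.reverse()
    return best[n], routes

# Depot at position 0 and customers 1, 2, 3 at positions 1, 2, 10 on a line.
pos = [0, 1, 2, 10]
dist = [[abs(a - b) for b in pos] for a in pos]
cost, routes = split_permutation([1, 2, 3], demand=[0, 1, 1, 1], dist=dist, capacity=2)
```

In this toy instance the cheapest split serves customer 1 alone and customers 2 and 3 together. A joint decoder of the kind the abstract describes would additionally track battery state along each route and decide where to insert charging-station visits.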