https://arxiv.org/api/8ty+PFOGqOZDv7WjKJk+iCuWQEg2026-06-21T17:40:08Z2901957015http://arxiv.org/abs/2605.06319v1Designing Capacitated Subnetworks for Shortest Path Routing2026-05-07T14:19:08ZIn pursuit of higher energy efficiency in computer networks, one subfield of green traffic engineering aims at reducing the size of a network during times of low traffic, while still guaranteeing the ability to route all occurring demands. In this setting, we have to simultaneously solve a network design problem (choosing connections to deactivate) and a routing problem (routing paths in the active subnetwork, adhering to some routing protocol). Interestingly, there seems to be no available method to tackle the problem as a whole for the simplest (and still most commonly used) routing paradigm: shortest path routing. State-of-the-art methods either do not consider capacities, or assume that the routing paths should not change when deactivating network connections, or separate the problem into its two constituents, first solving the network design problem (using some estimators in lieu of the precise routing protocol) and only then the actual routing problem. In this paper, we present an algorithm to tackle the full combined problem exactly via a novel integer linear program, modeling dynamically changing shortest paths. To solve it, we need to devise a special-purpose column generation method. To speed up the solution process, we further propose additional provably strengthening constraints. Now having the means to yield true optimal solutions for (small) practical instances, we can for the first time give an in-depth experimental evaluation that includes the absolute quality intrinsic to the above simplifying algorithms. It turns out that the arguably simplest method--first computing a routing, fixing it, and turning off all superfluous connections--yields solutions surprisingly close to the true optimum in practice. When considering multiple different traffic demands, a recent traffic-oblivious approach (TOCA) performs best, while being comparatively straightforward to implement.2026-05-07T14:19:08ZMarkus ChimaniMax Ilsenhttp://arxiv.org/abs/2605.06262v1Bilateral Treewidth for QBF: Where Strategies and Resolution Meet2026-05-07T13:37:29ZTreewidth is a well-studied decompositional parameter to measure the tree-likeness of a graph. While the propositional satisfiability problem (SAT) is known to be tractable when parameterized by the treewidth of the underlying primal graph, the evaluation of quantified Boolean formulas (QBFs) remains PSPACE-complete even on formulas of constant treewidth. Intuitively, this is because ordinary treewidth does not take into account the prefix of the QBF: it neither distinguishes between existential and universal variables, nor accounts for the order in which they are quantified. In the past, several weaker variants of treewidth have been devised to incorporate prefix-sensitive information. To establish tractability for QBFs under these notions, prior work has employed either strategy- or resolution-based techniques, thereby dividing the parameterized complexity landscape of QBF into two regimes that are incomparable in strength. We establish fixed-parameter tractability with respect to bilateral treewidth, a novel and strictly more powerful decompositional parameter that combines these rivaling approaches by simultaneously allowing for branching on strategies and performing Q-resolution. As in previous works in this direction, our algorithm assumes that a suitable tree decomposition is provided on the input.2026-05-07T13:37:29ZRobert GanianMarlene Gründelhttp://arxiv.org/abs/2605.06211v1Contrastive Identification and Generation in the Limit2026-05-07T13:14:51ZIn the classical identification in the limit model of Gold [1967], a stream of positive examples is presented round by round, and the learner must eventually recover the target hypothesis. Recently, Kleinberg and Mullainathan [2024] introduced generation in the limit, where the learner instead must eventually output novel elements of the target's support. Both lines of work focus on positive-only or fully labeled data. Yet many natural supervision signals are inherently relational rather than singleton, which encode relationships between examples rather than labels of individual ones. We initiate the study of contrastive identification and generation in the limit, where the learner observes a contrastive presentation of data: a stream of unordered pairs $\{x,y\}$ satisfying $h(x)\ne h(y)$ for an unknown target binary hypothesis $h$, but which element is positive is hidden from the learner. We first present three results in the noiseless setting: an exact characterization of contrastive identifiable classes (a one-line geometric refinement of Angluin [1980]'s tell-tale condition), a combinatorial dimension called contrastive closure dimension (a contrasitive analogue of the closure dimension in Raman et al. [2025]) and exactly characterizing uniform contrastive generation with tight sample complexity, and a strict hierarchy in which contrastive generation and text identification are mutually incomparable. We then prove a sharp reversal under finite adversarial corruption: there exist classes identifiable from contrastive pairs under any finite corruption budget by a single budget-independent algorithm, yet not identifiable from positive examples under even one corrupted observation. The unifying technical object is the common crossing graph, which encodes pairwise ambiguity, family-level generation obstructions, and corruption defects in a single coverage-and-incidence language.2026-05-07T13:14:51ZXiaoyu LiAndi HanJiaojiao JiangJunbin Gaohttp://arxiv.org/abs/2605.06014v1Quantizing With Randomized Hadamard Transforms: Efficient Heuristic Now Proven2026-05-07T11:11:54ZUniform random rotations (URRs) are a common preprocessing step in modern quantization approaches used for gradient compression, inference acceleration, KV-cache compression, model weight quantization, and approximate nearest-neighbor search in vector databases. In practice, URRs are often replaced by randomized Hadamard transforms (RHTs), which preserve orthogonality while admitting fast implementations. The remaining issue is the performance for worst-case inputs. With a URR, each coordinate is individually distributed as a shifted beta distribution, which converges to a Gaussian distribution in high dimensions. Generally, one RHT is not suitable in the worst case, as individual coordinates can be far from these distributions. We show that after composing two RHTs on any $d$-sized input vector, the marginal distribution of every fixed coordinate of the normalized rotated vector is within $O(d^{-1/2})$ of a standard Gaussian both in Kolmogorov distance and in $1$-Wasserstein distance. We then plug these bounds into the analyses of modern compression schemes, namely DRIVE and QUIC-FL, and show that two RHTs achieve performance that asymptotically matches URRs.
However, we show that two RHTs may not be sufficient for Vector Quantization (VQ), which often requires weak correlation across fixed-size blocks of coordinates (as opposed to only marginal distribution convergence for single coordinates). We prove that a composition of three RHTs leads to decaying coordinate covariance. This ensures that any fixed, bounded, multi-dimensional VQ codebook optimized for URRs has the same expected error when using three RHTs, up to an additive term that vanishes with the dimension.
Finally, because practical inputs are rarely adversarial, we propose a linear-time ${O}(d)$ check on the input's moments to dynamically adapt the number of RHTs used at runtime to improve performance.2026-05-07T11:11:54ZRan Ben-BasatWilliam KuszmaulMichael MitzenmacherAmit PortnoyShay Vargaftikhttp://arxiv.org/abs/2605.05954v1Label Correcting Algorithms for the Multiobjective Temporal Shortest Path Problem2026-05-07T10:03:18ZGiven a directed, discrete-time temporal graph $G=(V,R)$, a start node $s\in V$, and $p\geq1$ objectives, the single-source multiobjective temporal shortest path problem asks, for each $v\in V$, for the set of nondominated images of temporal $s$-$v$-paths together with a corresponding efficient path for each image. A recent general label setting algorithm for this problem relies on two properties of the objectives - monotonicity and isotonicity. Monotonicity generalizes the nonnegativity assumption required by label setting methods for the classical additive single-objective shortest path problem on static graphs, while isotonicity ensures that the order of the objective values of two paths is preserved when both are extended by the same arc.
In this paper, we study the problem without assuming monotonicity and/or isotonicity. A key difficulty in this setting is that zero-duration temporal cycles may need to be traversed an arbitrary finite number of times to generate all nondominated images. This motivates the study of a restricted problem variant in which a maximum admissible path length $K$ is imposed, and only paths containing at most $K$ arcs are considered. We develop general label correcting algorithms for this setting and establish several sufficient conditions under which such a bound is not required, implying that the algorithms compute all nondominated images.2026-05-07T10:03:18ZEdina MaricaClemens ThielenAlina Wittmannhttp://arxiv.org/abs/2605.05877v1Discrete Optimal Transport: Rapid Convergence of Simulated Annealing Algorithms2026-05-07T08:44:17ZWe develop a discrete optimal transport framework for analyzing simulated annealing algorithms on finite state spaces. Building on the discrete Wasserstein metric introduced by Maas (J. Funct. Anal., 2011), we define a generalized discrete Wasserstein-2 distance and the associated notion of \emph{discrete action} for paths of probability measures on graphs. Using these tools, we establish non-asymptotic convergence guarantees for simulated annealing: the KL divergence between the algorithm's output and the target distribution is controlled by the discrete action of the annealing path. This can be viewed as the discrete counterpart of the action-based analysis of annealed Langevin dynamics in continuous spaces by Guo, Tao, and Chen (ICLR 2025).
As applications, we analyze simulated annealing for two fundamental models in statistical physics. For the \emph{mean-field Ising model}, we show that annealed single-site Glauber dynamics achieves $\varepsilon$ error in KL divergence in $O(n^5β^2/\varepsilon)$ steps at \emph{any} inverse temperature $β\ge 0$. For the \emph{mean-field $q$-state Potts model}, we show that annealed $(q-1)$-block Glauber dynamics achieves $\varepsilon$ error in $\mathrm{poly}(n, β, 1/\varepsilon)$ steps for all $β\ge β_{\mathsf{s}}=q/2$, the regime where the disordered phase has completely lost stability. In both cases, the key technical contribution is a polynomial upper bound on the discrete action, obtained by exploiting the symmetry of the model to reduce the analysis to a low-dimensional projected chain.2026-05-07T08:44:17Z67 pages, 1 figureYuchen HeTianhui JiangSihan WangChihao Zhanghttp://arxiv.org/abs/2604.24237v2Computational Complexity of the Interval Ordering Problem2026-05-07T08:21:30ZWe study an interval ordering problem introduced by Dürr et al. [Discrete Appl. Math. 2012] which is motivated by applications in bioinformatics. The task is to order a given set of n intervals with the goal of minimizing a certain objective which is defined via a given cost function $f$ which assigns a cost to the exposed part of each interval (that is, the pieces not covered by previous intervals). We develop a dynamic programming approach which solves the problem with $O(2^n\text{poly}(n))$ oracle calls to $f$ and arithmetic operations. Moreover, our approach yields polynomial-time algorithms for all cost functions $f$ such that $f-f(0)$ is subadditive or superadditive. This answers an open question for the function $f(x)=2^x$. We contrast these results by proving a running time lower bound of $2^{n-1}$ for any algorithm that solves the problem for every function $f$ (with oracle access only) and further proving NP-hardness for some classes of simple functions. Thus, we significantly narrow the gap regarding the computational complexity of the problem.2026-04-27T09:46:02ZSimeon PawlowskiVincent Froesehttp://arxiv.org/abs/2605.05837v1An Additive Approximation Scheme for Generating Dyadic Codings for the Outputs of an LLM2026-05-07T08:11:37ZWe study the problem of approximating a discrete probability distribution, such as the next-token distribution of a large language model, by a dyadic distribution induced by a binary tree under encoding rate constraints. The objective is to partition the support of the distribution and assign dyadic probabilities to minimize total variation distance while achieving a prescribed rate. We formulate this task as a tree-based partitioning problem and develop a polynomial-time additive approximation scheme for the rate-constrained setting in the constant-rate regime. Our results provide provable guarantees for near-optimal dyadic approximations and, as an application, yield a principled framework for LLM-based steganography, where the rate maps to bits of hidden information embedded per token and the total variation bound controls statistical detectability.2026-05-07T08:11:37ZDaniella Bar-LevFarzad FarnoudRyan Gabryshttp://arxiv.org/abs/2603.24880v2The Four Color Theorem with Linearly Many Reducible Configurations and Near-Linear Time Coloring2026-05-07T08:01:39ZWe give a near-linear time 4-coloring algorithm for planar graphs, improving on the previous quadratic time algorithm by Robertson et al. from 1996. Such an algorithm cannot be achieved by the known proofs of the Four Color Theorem (4CT). Technically speaking, we show the following significant generalization of the 4CT: every planar triangulation contains linearly many pairwise non-touching reducible configurations or pairwise non-crossing obstructing cycles of length at most 5 (which all allow for making effective 4-coloring reductions).
The known proofs of the 4CT only show the existence of a single reducible configuration or obstructing cycle in the above statement. The existence is proved using the discharging method based on combinatorial curvature. It identifies reducible configurations in parts where the local neighborhood has positive combinatorial curvature. Our result significantly strengthens the known proofs of 4CT, showing that we can also find reductions in large ``flat" parts where the curvature is zero, and moreover, we can make reductions almost anywhere in a given planar graph. This also opens possibilities for extensions to higher surfaces since we can find such flat parts in any large-width triangulation of any fixed surface.
From a computational perspective, the old proofs allowed us to apply induction on a problem that is smaller by some additive constant. The inductive step took linear time, resulting in a quadratic total time. With our linear number of reducible configurations or obstructing cycles, we can reduce the problem size by a constant factor. Our inductive step takes $O(n\log n)$ time, yielding a 4-coloring in $O(n\log n)$ total time. To efficiently handle a linear number of reducible configurations, we need them to be sufficiently robust to be useful in other applications. All our reducible configurations are what is known as D-reducible.2026-03-25T23:51:40ZSource files are available at Github: https://github.com/near-linear-4ct/Yuta InoueKen-ichi KawarabayashiAtsuyuki MiyashitaBojan MoharCarsten ThomassenMikkel Thoruphttp://arxiv.org/abs/2605.05618v1Algorithmic Phase Transition for Large Independent Sets in Dense Hypergraphs2026-05-07T03:17:33ZWe study the algorithmic tractability of finding large independent sets in dense random hypergraphs. In the sparse regime, much of the natural algorithms can be formulated within either the local or the low-degree polynomial (LDP) framework, and a rich literature has subsequently identified nearly sharp algorithmic thresholds within these classes by exploiting their stability. In the dense setting, however, the algorithmic paradigms are fundamentally different: they are online and thus need not be stable. Perhaps more crucially, even for the classical Erdős-Rényi random graph $G(n,p)$, LDPs are conjectured to fail in the 'easy' regime accessible to online algorithms, thereby challenging their viability for dense models.
Our focus is on two models: (i) finding large independent sets in dense $r$-uniform Erdős-Rényi hypergraphs, and (ii) the more challenging problem of finding large $γ$-balanced independent sets in dense $r$-uniform $r$-partite hypergraphs, where the $i$-th coordinate of $γ\in\mathbb{Q}^r$ specifies the proportion of vertices from $V_i$ in the independent set. For both models, we pinpoint the size of the largest independent set and design online algorithms that achieve a multiplicative approximation factor of $r^{1/(r-1)}$ in the uniform and $(\max_i γ_i)^{-1/(r-1)}$ in the $r$-partite model. Furthermore, we establish matching algorithmic lower bounds, showing that these computational gaps are sharp: no online algorithms can breach these gaps.2026-05-07T03:17:33Z38 pages plus references; abstract shortened due to arxiv restrictionsAbhishek DhawanNhi U. DinhEren C. KızıldağNeeladri MaitraBayram A. Şahinhttp://arxiv.org/abs/2508.11130v2Sampling Tree-Weighted Partitions Without Sampling Trees2026-05-07T03:08:28ZThis paper gives a new algorithm for sampling tree-weighted partitions of a large class of planar graphs. Formally, the tree-weighted distribution on $k$-partitions of a graph weights $k$-partitions proportional to the product of the number of spanning trees of each partition class. Recent work on computational redistricting analysis has driven special interest in the conditional distribution where all partition classes have the same size (balanced partitions). One class of Markov chains in wide use aims to sample from balanced tree-weighted $k$-partitions using a sampler for balanced tree-weighted 2-partitions. Previous implementations of this 2-partition sampler would draw a random spanning tree and check whether it contains an edge whose removal produces a balanced 2-component forest, rejecting if not. In practice, this is a significant computational bottleneck.
We show that in fact it is possible to sample from the balanced tree-weighted 2-partition distribution directly, without first sampling a spanning tree; the acceptance and rejection rates are the same as in previous samplers. We prove that on a wide class of planar graphs encompassing network structures typically arising from the geographic data used in computational redistricting, our algorithm takes expected linear time $O(n)$. Notably, this is asymptotically faster than the best known method to generate random trees, which is $O(n \log^2 n)$ for approximate sampling and $O(n^{1 + \log \log \log n / \log \log n})$ for exact sampling. Additionally, we show that a variant of our algorithm also gives a speedup to $O(n \log n)$ for exact sampling of uniformly random trees on these families of graphs, improving the bounds for both exact and approximate sampling.
We implement our algorithm and benchmark it on grid graphs, finding that it outperforms the standard bipartitioning method in the widely-used GerryChain library.2025-08-15T00:36:22ZSarah CannonTopher PankowWesley PegdenJamie Tucker-Foltzhttp://arxiv.org/abs/2605.05602v1Nearly Optimal Attention Coresets2026-05-07T02:37:57ZWe consider the problem of estimating the Attention mechanism in small space, and prove the existence of coresets for it of nearly optimal size. Specifically, we show that for any set of unit-norm keys and values $(K,V)$ in $\mathbb{R}^d$, there exists a subset $(K',V')$ of size at most $O({\sqrt{d} e^{ρ+o(ρ)}/\varepsilon})$ such that \[ \left\| \operatorname{Attn}(q,K,V)- \operatorname{Attn}(q,K',V') \right\| \le \varepsilon \] simultaneously for all queries whose norm is bounded by $ρ$. This outperforms the best known results for this problem. We also offer an improved lower bound showing that $\varepsilon$-coresets must have size $Ω({\sqrt{d} e^ρ/ε})$.2026-05-07T02:37:57ZEdo LibertyAlexandr AndoniEldar Kleinerhttp://arxiv.org/abs/2403.08929v6Two-sided Assortment Optimization: Adaptivity Gaps and Approximation Algorithms2026-05-06T20:17:47ZTo address efficiency and design challenges in choice-based matching platforms, we introduce a two-sided assortment optimization framework under general choice preferences. The goal in this problem is to maximize the expected number of matches by deciding which assortments are displayed to the agents and the order in which they are shown. In this context, we identify several classes of policies that platforms can use in their design. Our goals are: (1) to measure the value that one class of policies has over another one, and (2) to approximately solve the optimization problem itself for a given class. For (1), we define the adaptivity gap as the worst-case ratio between the optimal values of two different policy classes. First, we show that the gap between the class of policies that statically show assortments to one-side first and the class of policies that adaptively show assortments to one-side first is exactly $e/(e-1)$. Second, we show that the gap between the latter class of policies and the fully adaptive class of policies that show assortments to agents one by one is exactly $2$. We also note that the worst policies are those who simultaneously show assortments to all the agents. For (2), we first design a polynomial time algorithm that achieves a $1/4$ approximation factor within the class of policies that adaptively show assortments to agents one by one. Furthermore, when agents' preferences are governed by multinomial-logit models, we show that a 0.067 approximation factor can be obtained within the class of policies that show assortments to all agents at once. We further generalize our results to constrained assortment settings, where we impose an upper bound on the size of the displayed assortments. Finally, we present a computational study to evaluate the empirical performance of our theoretical guarantees.2024-03-13T19:37:04ZOmar El HousniUlysse HennebelleAlfredo Torricohttp://arxiv.org/abs/2601.05166v2Inapproximability of Counting Permutation Patterns2026-05-06T14:57:04ZDetecting and counting copies of permutation patterns are fundamental algorithmic problems, with applications in the analysis of rankings, nonparametric statistics, and property testing tasks such as independence and quasirandomness testing. From an algorithmic perspective, there is a sharp difference in complexity between detecting and counting the copies of a given length-$k$ pattern in a length-$n$ permutation. The former admits a $2^{\mathcal{O}(k^2)} \cdot n$ time algorithm (Guillemot and Marx, 2014) while the latter cannot be solved in time $f(k)\cdot n^{o(k/\log k)}$ unless the Exponential Time Hypothesis (ETH) fails (Berendsohn, Kozma, and Marx, 2021). In fact already for patterns of length 4, exact counting is unlikely to admit near-linear time algorithms under standard fine-grained complexity assumptions (Dudek and Gawrychowski, 2020).
Recently, Ben-Eliezer, Mitrović and Sristava (2026) showed that for patterns of length up to 5, a $(1+\varepsilon)$-approximation of the pattern count can be computed in near-linear time, yielding a separation between exact and approximate counting for small patterns, and conjectured that approximate counting is asymptotically easier than exact counting in general. We strongly refute their conjecture by showing that, under ETH, no algorithm running in time $f(k)\cdot n^{o(k/\log k)}$ can approximate the number of copies of a length-$k$ pattern within a multiplicative factor $n^{(1/2-\varepsilon)k}$. The lower bound on runtime matches the conditional lower bound for exact pattern counting, and the obtained bound on the multiplicative error factor is essentially tight, as an $n^{k/2}$-approximation can be computed in $2^{\mathcal{O}(k^2)}\cdot n$ time using an algorithm for pattern detection.2026-01-08T17:55:57ZMichal Oplerhttp://arxiv.org/abs/2604.27757v2Temporal Routing in Static Networks: The Schedule Completion Problem2026-05-06T12:50:34ZWe introduce the TemporallyEdgeDisjointScheduleCompletion (TEDSC) problem in which we need to cover a set of temporal edge demands $D$ by routing $k$ temporal walks through a directed static graph while remaining temporally edge disjoint. This problem combines the temporal aspects of train routing and passenger demands with the static nature of real-world rail networks. We present a polynomial time algorithm for TEDSC. Motivated by real world constraints, we next investigate two restricted variants of TEDSC in which each walk can only travel for some bounded distance or time $h$. We show that both are tractable when parameterized by $k + h$, but hard for $h$ and $k + |D|$. If we fix the underlying network, the two problems exhibit distinct complexities: The distance variant remains $W[1]$-hard parameterized by $k$ even on a path of three vertices whereas the time variant admits an FPT algorithm on any fixed star. Finally, we show how to approximate the number of required walks up to a factor of $(2-h^{-1})$.2026-04-30T11:45:04ZMichelle DöringNiklas MohrinGeorge Skretas