Branch-width of represented matroids in matrix multiplication time

2026-05-14T06:19:53Z

For an $n$-element matroid $M$ given by an $n \times n$ matrix representation over a finite field $\mathbb F$ and an integer $k$, we present an $(O_{k,\mathbb F}(n^2)+O(n^ω))$-time algorithm that either finds a branch-decomposition of $M$ of width at most $k$, or confirms that the branch-width of $M$ is more than $k$, where $ω< 2.3714$ is the matrix multiplication exponent, and the $O_{k,\mathbb F}(\cdot)$-notation hides factors that depend on $k$ and $\mathbb F$ in a computable manner. All previous algorithms including Hliněný and Oum [SIAM J. Comput. (2008)] and Jeong, Kim, and Oum [SIAM J. Discrete Math. (2021)] run in at least $Ω(n^3)$ time. Moreover, if the input matrix representation is given by a standard form, our algorithm runs in $O_{k,\mathbb F}(n^2)$-time, since $O(n^ω)$-time is only needed for finding a standard form of the input matrix. When $M$ is given by an $m \times n$ matrix, the overhead for finding a standard form is $O(mn \min(m,n)^{ω-2})$. As corollaries, we obtain faster algorithms for rank-width of directed graphs and path-width of matroids represented over a fixed finite field. Furthermore, we also present an approximation algorithm for finding branch-width that works on infinite fields provided that the input matrix is of a standard form and contains a bounded number of distinct values of entries. To suggest that our algorithm is optimal, we observe that for every field $\mathbb F$, deciding whether the branch-width of a matroid represented over $\mathbb F$ is $0$ is as hard as deciding whether a square matrix over $\mathbb F$ is singular. Under the assumption that singularity testing requires $Ω(n^ω)$-time, this implies that the overhead of $O(n^ω)$ is unavoidable. We also show strengthenings of this observation to rule out some approximations under this assumption.
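Branch-width is defined through the connectivity function of the matroid, which for a represented matroid reduces to rank computations on column subsets. The sketch below is only an illustration of that function over GF(2), not the algorithm of the paper. It assumes the convention $λ(X) = r(X) + r(E \setminus X) - r(E)$ (some papers add $1$), and the names `gf2_rank` and `connectivity` are hypothetical.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of vectors encoded as non-negative ints (bitmasks)."""
    pivots = {}  # leading-bit position -> basis vector with that leading bit
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v  # v is independent of the current basis
                break
            v ^= pivots[top]  # eliminate the leading bit and keep reducing
    return len(pivots)


def connectivity(columns, subset):
    """lambda(X) = r(X) + r(E - X) - r(E) for the matroid on the columns.

    columns: list of column vectors as bitmasks (assumed GF(2) representation).
    subset:  set of column indices forming X.
    """
    inside = [c for i, c in enumerate(columns) if i in subset]
    outside = [c for i, c in enumerate(columns) if i not in subset]
    return gf2_rank(inside) + gf2_rank(outside) - gf2_rank(columns)
```

For the columns `[0b01, 0b10, 0b11]`, separating the first element from the other two gives connectivity 1, while for `[0b01, 0b01, 0b10]` the separation `{0, 1}` has connectivity 0 because the two sides span independent subspaces.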

zSort: Stable Distribution Sort using Z-Score Partitioning

2026-05-14T06:06:15Z

Sorting is a foundational primitive in modern data processing, influencing the execution speed of high-performance data pipelines. However, the algorithmic landscape is currently bifurcated by a pervasive "Stability Tax": practitioners must sacrifice either order preservation for high throughput or execution speed for stability. To address these limitations, this paper introduces zSort, an adaptive z-score-based distribution sorting algorithm that guarantees stability while avoiding pass complexity that scales with key width. The performance of the proposed technique is evaluated using microarchitectural analysis and experimental results. Microarchitectural analysis shows that zSort achieves a lower bad-speculation overhead (19.7%) than both stable baselines and several high-performance unstable algorithms, and sustains a competitive IPC of 1.44. Empirical evaluation across diverse input distributions and datasets of up to $10^7$ elements (64-bit) demonstrates that zSort consistently outperforms widely used comparison-based stable sorting algorithms, achieving speedups of up to 3x-4.5x, and performs relatively better than LSD Radix, with larger gains on duplicate-heavy and partially ordered inputs. Despite providing stability, zSort achieves throughput comparable to that of high-performance unstable algorithms such as Skasort. It also maintains this performance on adaptive workloads where methods like Pdqsort typically excel, and it does not exhibit any extreme worst case. These results indicate that zSort substantially narrows the traditional performance gap between stable and unstable sorting and provides an efficient, stable sorting alternative.
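The abstract does not describe how zSort partitions its input, so the following is only a rough sketch of what a stable distribution sort driven by z-scores could look like, not the authors' algorithm. The function name `zscore_bucket_sort`, the clamp to three standard deviations, and the bucket count are all assumptions made for illustration. Stability follows because keys map monotonically to buckets, items are appended in input order, and each bucket is finished with Python's stable `sorted`.

```python
import math


def zscore_bucket_sort(items, key=lambda x: x, buckets=64):
    """Stable sort: scatter items into buckets by clamped z-score, then sort each bucket."""
    n = len(items)
    if n < 2:
        return list(items)
    keys = [key(x) for x in items]
    mean = sum(keys) / n
    std = math.sqrt(sum((k - mean) ** 2 for k in keys) / n)
    if std == 0:
        return list(items)  # all keys equal: input order is already the stable order
    parts = [[] for _ in range(buckets)]
    for item, k in zip(items, keys):
        # Clamp the z-score to [-3, 3] (an assumed cutoff) and scale to a bucket index.
        z = max(-3.0, min(3.0, (k - mean) / std))
        idx = min(buckets - 1, int((z + 3.0) / 6.0 * buckets))
        parts[idx].append(item)  # appending preserves input order within a bucket
    out = []
    for part in parts:
        out.extend(sorted(part, key=key))  # sorted() is stable
    return out
```

Because the bucket index is a non-decreasing function of the key, concatenating the sorted buckets yields a fully sorted, stable result regardless of how well the z-scores spread the data.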

Fast Gossip-based Rumor Spreading using Small Messages

2026-05-14T04:55:05Z

We study gossip algorithms for the fundamental rumor spreading problem, where the goal is to disseminate a rumor from a given source node to all nodes in an arbitrary (and unknown) graph. Gossip algorithms allow each node to call only one neighbor per round and are therefore highly message-efficient, with low per-node communication overhead per round. State-of-the-art gossip algorithms are fast; however, they typically rely on large messages. This undermines the lightweight communication advantage of gossip, since even though only one neighbor is contacted per round, the message size can be linear in $n$, the network size. Hence, a fundamental question is whether one can perform fast gossip using small messages. The main contribution of this paper is to answer the above question in the affirmative and present two gossip algorithms that achieve fast rumor spreading using messages of $\mathrm{polylog}(n)$ size. Specifically, we present the following algorithms: 1. An algorithm that runs in $O(c \log n / Φ_c)$ rounds for every $c \geq 1$, where $Φ_c$ is the weak conductance. Our bound in terms of weak conductance is essentially optimal. 2. An algorithm that depends on the network diameter (and is independent of the graph's conductance), which runs in $\tilde{O}(D+\sqrt{n})$ rounds with high probability. Our algorithm can be modified to output a minimum spanning tree (MST) in the same number of rounds, which is essentially round-optimal (even for non-gossip algorithms). Our gossip algorithms use graph sketches [Ahn, Guha, McGregor, SODA 2012] in a novel way to overcome communication bottlenecks and achieve small communication overhead with small message sizes.
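To make the communication model concrete, here is a minimal simulation of the basic push-pull gossip protocol, in which every node calls one uniformly random neighbor per round and the rumor crosses the call in either direction. This is the textbook baseline, not either of the algorithms announced in the abstract, and the function name `push_pull_rounds` and the `max_rounds` guard are assumptions for illustration.

```python
import random


def push_pull_rounds(adj, source, rng, max_rounds=10_000):
    """Simulate synchronous push-pull gossip; return rounds until all nodes know the rumor.

    adj: dict mapping each node to a non-empty list of neighbors (connected graph assumed).
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    informed = {source}
    rounds = 0
    while len(informed) < len(adj):
        if rounds >= max_rounds:
            raise RuntimeError("rumor did not spread; is the graph connected?")
        newly = set()
        for u in adj:
            v = rng.choice(adj[u])  # each node calls exactly one neighbor
            # The rumor crosses the call if exactly one endpoint knew it
            # at the start of the round.
            if (u in informed) != (v in informed):
                newly.update((u, v))
        informed |= newly
        rounds += 1
    return rounds
```

On a cycle with 8 nodes, the informed region can grow by at most one node per side per round, so any run needs at least 4 rounds to reach the node opposite the source.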

Semi-Streaming Algorithms for Submodular Maximization under Random Arrival Order

2026-05-14T02:56:24Z

We study random order semi-streaming algorithms for submodular maximization under a wide range of combinatorial constraint classes, including matroids, matroid $p$-parity, $p$-exchange systems and $p$-systems. For most of these classes of constraints, our results are the first improvement over what is known to be achievable for adversarial order. For matroids, matching and $p$-matchoids, previous random order results were known, and we improve over some of these as well. In the case of matroids, our improved results show a separation between adversarial and random order semi-streaming algorithms, and exponentially improve the number of passes necessary for getting $1 - 1/e - \varepsilon$ approximation for maximizing a monotone submodular function subject to a matroid constraint. We also prove a new hardness result showing a similar separation for $p$-systems. Our results are based on two new technical tools. One tool provides a general way to translate offline algorithms for many classes of constraints into random order semi-streaming algorithms. The other tool is a semi-streaming variant of a recently proposed offline algorithm for matroid constraints.

Almost succinct representation of maximal palindromes

2026-05-14T02:31:16Z

Palindromes are strings that read the same forward and backward. The computation of palindromic structures within strings is a fundamental problem in string algorithms, being motivated by potential applications in formal language theory and bioinformatics. Although the number of palindromic factors in a string of length $n$ can be quadratic, they can be implicitly represented in $O(n \log n)$ bits of space by storing the lengths of all maximal palindromes in an integer array, which can be computed in $O(n)$ time [Manacher, 1975]. In this paper, for any positive constant $ε< 1$, we propose a novel $(3(1+ε)n + o(n))$-bit representation of all maximal palindromes in a string, which enables $O(1)$-time retrieval of the length of the maximal palindrome centered at any given position. The data structure can be constructed in $O(n)$ time from the input string of length $n$. Since Manacher's algorithm and the notion of maximal palindromes are widely utilized for solving numerous problems involving palindromic structures, our compact representation will accelerate the development of more space-efficient solutions to such problems. Indeed, as the first application of our compact representation of maximal palindromes, we present a data structure of size $O(n)$ bits that can compute the longest palindrome appearing in any given factor of a string of length $n$ in $O(\log n)$ time.
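For reference, the integer array of maximal palindrome lengths mentioned above can be computed with Manacher's algorithm. The sketch below is a standard formulation of that algorithm and does not reproduce the compact representation proposed in the paper; the function name `maximal_palindromes` and the layout of the output array are assumptions.

```python
def maximal_palindromes(s):
    """Lengths of the maximal palindromes at all 2n-1 centers of s, in O(n) time.

    Output index 2*i is the center on character i; index 2*i+1 is the
    center between characters i and i+1.
    """
    # Interleave a sentinel (None never equals a character) so that every
    # center, odd or even, becomes a position in t.
    t = [None]
    for ch in s:
        t.append(ch)
        t.append(None)
    m = len(t)
    radius = [0] * m  # radius[i] in t equals the palindrome length in s
    center = right = 0  # rightmost palindrome found so far: t[center-r .. right]
    for i in range(m):
        if i < right:
            # Reuse the mirrored center's radius, capped at the known boundary.
            radius[i] = min(right - i, radius[2 * center - i])
        while (i - radius[i] - 1 >= 0 and i + radius[i] + 1 < m
               and t[i - radius[i] - 1] == t[i + radius[i] + 1]):
            radius[i] += 1
        if i + radius[i] > right:
            center, right = i, i + radius[i]
    return radius[1:m - 1]  # drop the two boundary sentinels
```

For example, `maximal_palindromes("abba")` returns `[1, 0, 1, 4, 1, 0, 1]`. Stored naively, each entry takes $O(\log n)$ bits, which is the $O(n \log n)$-bit baseline the abstract improves on.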

A Nonparametric Framework for Online Stochastic Matching with Correlated Arrivals

2026-05-14T02:18:46Z

The design of online algorithms for matching markets and revenue management settings is usually bound by the assumption that the demand process is formed by a fixed-length sequence of queries with unknown types, each drawn independently. This notion of serial independence implies that the demand of each type, i.e., the number of queries of a given type, has low variance and is approximately Poisson-distributed. This paper proposes a nonparametric framework for modeling arrival sequences in online stochastic matching that departs from the serial independence assumption. We propose two models, Indep and Correl, that capture different forms of serial correlations by combining a nonparametric distribution for the demand with standard assumptions on the arrival patterns -- adversarial or random order. The Indep model can capture arbitrary serial correlations within each customer type but assumes cross-sectional independence across types, whereas the Correl model captures common shocks across customer types. We demonstrate that fluid relaxations, which rely solely on demand expectations, have arbitrarily bad performance guarantees. In contrast, we develop new algorithms that achieve optimal (constant-factor) performance guarantees in each model. Our mathematical analysis includes tighter linear programming (LP) relaxations that leverage distribution knowledge, and a new lossless randomized LP rounding scheme for Indep. We test our new LP relaxations and rounding scheme in simulations on real and synthetic data, and find that they consistently outperform well-established matching algorithms, especially on real data sequences that exhibit greater demand variance.

The Pareto Frontier of Randomized Learning-Augmented Online Bidding

2026-05-14T01:19:54Z

Online bidding is a classical problem in online decision-making, with applications in resource allocation, hierarchical clustering, and the analysis of approximation algorithms. We study its randomized learning-augmented variant, where an online algorithm generates a sequence of random bids while leveraging predictions from an oracle. We provide analytical upper and lower bounds on the optimal consistency $C$ as a function of the robustness $R$, which match when $R \geq 2.885$, effectively closing the gap left by previous work. The key technical ingredient is the notion of a bidding function, a novel abstraction that provides a unified framework for the design and analysis of randomized bidding strategies. We complement our theoretical results with an experimental application of randomized bidding to the incremental median problem, demonstrating the applicability of our algorithm in practical clustering settings.
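As background for the problem statement, the sketch below implements the classical deterministic doubling strategy for online bidding, where the player pays the sum of all bids made until one reaches the hidden target. It is a baseline only and is unrelated to the randomized, prediction-based strategies of the paper; the function name `doubling_cost` and the assumption that targets are at least 1 are illustrative choices.

```python
def doubling_cost(target):
    """Total cost of bidding 1, 2, 4, ... until a bid is at least `target`.

    Assumes target >= 1. The cost is the sum of all bids submitted,
    including the final successful one.
    """
    bid, total = 1, 0
    while True:
        total += bid
        if bid >= target:
            return total
        bid *= 2
```

If the first successful bid is $2^k$ with $k \geq 1$, then the target exceeds $2^{k-1}$ and the total cost is $2^{k+1} - 1 < 4 \cdot 2^{k-1}$, so the cost is always less than four times the target. For example, a target of 5 costs $1 + 2 + 4 + 8 = 15$.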

A Counterexample to EFX $n \ge 3$ Agents, $m \ge n + 5$ Items, Submodular Valuations via SAT-Solving

2026-05-14T00:39:30Z

The existence of EFX allocations is a central open problem in discrete fair division. An allocation is EFX (envy-free up to any good) if no agent envies another agent after the removal of any single good from the other agent's bundle. We resolve this longstanding question by providing the \textbf{first-ever counterexample} to the existence of EFX allocations for agents with monotone valuations, which in turn immediately implies a counterexample for submodular valuations. Specifically, we show that EFX allocations need not exist for instances with $n \ge 3$ agents and $m \ge n+5$ goods. In contrast, we prove that every instance with three agents and seven goods admits an EFX allocation. Both results are obtained via SAT solving. We encode the negation of EFX existence as a SAT instance: satisfiability yields a counterexample, while unsatisfiability establishes universal existence. The correctness of the encoding is formally verified in Lean. Finally, we establish positive guarantees for fair allocations with three agents and an arbitrary number of goods. Although EFX allocations may fail to exist, we prove that every instance with three agents and monotone valuations admits at least one of two natural relaxations of EFX: tEFX, or EF1 and EEFX.
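The EFX condition defined above is easy to check directly for a given allocation, and for very small instances an EFX allocation can be found by exhaustive search. The sketch below illustrates the definition only; it is not the SAT encoding used in the paper, which the abstract does not spell out. The names `is_efx` and `find_efx`, and the representation of valuations as Python functions on frozensets, are assumptions.

```python
from itertools import product


def is_efx(allocation, valuations):
    """Check envy-freeness up to any good.

    allocation: list of frozensets, allocation[i] is the bundle of agent i.
    valuations: list of functions mapping a frozenset of goods to a number.
    """
    n = len(allocation)
    for i in range(n):
        own = valuations[i](allocation[i])
        for j in range(n):
            if i == j:
                continue
            for g in allocation[j]:
                # Agent i must not envy j's bundle after removing any single good g.
                if own < valuations[i](allocation[j] - {g}):
                    return False
    return True


def find_efx(num_agents, goods, valuations):
    """Brute-force search over all assignments of goods to agents (tiny instances only)."""
    for owners in product(range(num_agents), repeat=len(goods)):
        allocation = [
            frozenset(g for g, o in zip(goods, owners) if o == a)
            for a in range(num_agents)
        ]
        if is_efx(allocation, valuations):
            return allocation
    return None
```

The search space has size $n^m$, which is why exhaustive enumeration over valuations as well as allocations quickly becomes infeasible and a SAT solver is a natural tool for the existence question.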

Stochastic Matching via Local Sparsification

2026-05-13T23:25:15Z

The classic online stochastic matching problem typically requires immediate and irrevocable matching decisions. However, in many modern decentralized systems such as real-time ride-hailing and distributed cloud computing, the primary bottleneck is often local communication bandwidth rather than the timing of the match itself. We formalize this challenge by introducing a two-stage local sparsification framework. In this setting, arriving requests must prune their realized compatibility sets to a strict budget of $k$ edges before a central coordinator optimizes the global matching. This creates a "middle ground" between local information constraints and global optimization utility. We propose a local selection strategy, parametrized by a fractional solution of the expected instance. Theoretically, we quantify the approximation ratio as a function of the solution's {\em spread}. We prove that under sufficient spread, our sparsifier globally preserves the expected size of the maximum matching. Empirically, we demonstrate the robustness of our approach using the New York City ride-hailing datasets and adversarial synthetic benchmarks. Our results show that near-optimal global matching is achievable even with highly constrained local budgets, significantly outperforming standard online baselines.

Finite Sample Bounds for Learning with Score Matching

2026-05-13T22:48:18Z

Learning of continuous exponential family distributions with unbounded support remains an important area of research for both theory and applications in high-dimensional statistics. In recent years, score matching has become a widely used method for learning exponential families with continuous variables due to its computational ease when compared against maximum likelihood estimation. However, theoretical understanding of the statistical properties of score matching is still lacking. In this work, we provide a non-asymptotic sample complexity analysis for learning the structure of exponential families of polynomials with score matching. The derived sample bounds show a polynomial dependence on the model dimension. These bounds are the first of their kind, as all prior work has shown only asymptotic bounds on the sample complexity.
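The computational ease of score matching for exponential families can be seen from the standard form of its objective, sketched below under the usual smoothness and boundary conditions. The notation ($θ$ for natural parameters, $φ$ for sufficient statistics, $Dφ$ for the Jacobian of $φ$ with rows $\nabla_x φ_k$) is an assumption and need not match the paper, whose estimator may also include regularization.

```latex
% Score matching objective, up to an additive constant independent of \theta:
\begin{align*}
J(\theta) &= \mathbb{E}_{x}\!\left[ \tfrac{1}{2}\,\lVert \nabla_x \log p_\theta(x) \rVert_2^2
            + \Delta_x \log p_\theta(x) \right]. \\
% For an exponential family p_\theta(x) \propto \exp(\theta^\top \phi(x)),
% the normalizing constant does not depend on x and drops out:
\nabla_x \log p_\theta(x) &= D\phi(x)^\top \theta, \qquad
\Delta_x \log p_\theta(x) = \theta^\top \Delta_x \phi(x), \\
J(\theta) &= \tfrac{1}{2}\,\theta^\top\, \mathbb{E}_{x}\!\left[ D\phi(x)\, D\phi(x)^\top \right] \theta
            + \theta^\top\, \mathbb{E}_{x}\!\left[ \Delta_x \phi(x) \right].
\end{align*}
```

The objective is a convex quadratic in $θ$ and never involves the partition function, so the empirical minimizer solves a linear system, in contrast to maximum likelihood estimation.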

Improved Speed via Regional Fulfillment

2026-05-13T20:03:41Z

In e-retail, order fulfillment speed has become one of the most important metrics affecting customer satisfaction. While common wisdom dictates that maintaining a large global fulfillment network maximizes efficiency via economies of scale, recent evidence has shown that breaking up the network into smaller regions can yield significant speed improvements. In this paper, we consider a simple abstract model of order fulfillment by which we explain this phenomenon. We characterize fulfillment assignments satisfying an equilibrium condition based on the greedy fulfillment strategy, and quantify how the resulting fulfillment delay can be decreased by regionalizing the network. Finally, we provide some algorithmic results for computing low delay assignments, and some simulations supporting our equilibrium framework.

Near Optimal Alphabet-Soundness Tradeoff PCPs

2026-05-13T19:19:17Z

We show that for all $\varepsilon>0$, for sufficiently large $q\in\mathbb{N}$ power of $2$, for all $δ>0$, it is NP-hard to distinguish whether a given $2$-Prover-$1$-Round projection game with alphabet size $q$ has value at least $1-δ$, or value at most $1/q^{1-\varepsilon}$. This establishes a nearly optimal alphabet-to-soundness tradeoff for $2$-query PCPs with alphabet size $q$, improving upon a result of [Chan, Journal of the ACM 2016]. Our result has the following implications: 1) Near optimal hardness for Quadratic Programming: it is NP-hard to approximate the value of a given Boolean Quadratic Program within factor $(\log n)^{1 - o(1)}$ under quasi-polynomial time reductions. This improves upon a result of [Khot, Safra, ToC 2013] and nearly matches the performance of the best known algorithms due to [Megretski, IWOTA 2000], [Nemirovski, Roos, Terlaky, Mathematical Programming 1999] and [Charikar, Wirth, FOCS 2004] that achieve $O(\log n)$ approximation ratio. 2) Bounded degree $2$-CSPs: under randomized reductions, for sufficiently large $d>0$, it is NP-hard to approximate the value of $2$-CSPs in which each variable appears in at most $d$ constraints within factor $(1-o(1))\frac{d}{2}$, improving upon a result of [Lee, Manurangsi, ITCS 2024]. 3) Improved hardness results for connectivity problems: using results of [Laekhanukit, SODA 2014] and [Manurangsi, Inf. Process. Lett., 2019], we deduce improved hardness results for the Rooted $k$-Connectivity Problem, the Vertex-Connectivity Survivable Network Design Problem and the Vertex-Connectivity $k$-Route Cut Problem.

Non-Redundancy of Low-Arity Symmetric Boolean CSPs

2026-05-13T18:12:58Z

Non-redundancy, introduced by Bessiere, Carbonnel, and Katsirelos (AAAI 2020), is a structural parameter for Constraint Satisfaction Problems ($\mathsf{CSPs}$) that governs kernelization, exact and approximate sparsification, and exact streaming complexity. It is the largest size of a $\mathsf{CSP}$ instance admitting no smaller subinstance with the same satisfying assignments. We study non-redundancy $\mathsf{NRD}_n(R)$ for Boolean symmetric $\mathsf{CSPs}$ defined by an $r$-ary relation $R$ whose value depends only on Hamming weight. An instance of $\mathsf{CSP}(R)$ has $n$ variables and constraints given by $r$-tuples; a constraint is satisfied exactly when the induced tuple lies in $R$. This class includes natural predicates such as cuts and $k$-SAT clauses. Our main result is a near-complete classification of the asymptotic growth of $\mathsf{NRD}_n(R)$ for symmetric Boolean predicates of arity at most $5$. Using computational experiments and algebraic upper- and lower-bound criteria, we resolve every predicate of arity at most $4$ and all but two predicates of arity $5$. For upper bounds, we introduce $t$-balancedness, a lifted, higher-degree version of the balancedness notion of Chen, Jansen, and Pieterse (Algorithmica 2020). We prove that $t$-balancedness is equivalent to the existence of degree-$t$ multilinear polynomials capturing $R$, and hence implies $\mathsf{NRD}_n(R)=O(n^t)$. For lower bounds, we use Carbonnel's (CP 2022) framework: predicates admitting a special reduction from $k$-ary OR inherit OR's lower bound $Ω(n^k)$. The only unresolved arity-$5$ predicates in our framework have bounds $Ω(n^2)$ and $O(n^3)$; we reduce their exact classification to natural extremal set-system questions.

What is Learnable in Valiant's Theory of the Learnable?

2026-05-13T17:58:46Z

Valiant's 1984 paper is widely credited with introducing the PAC learning model, but it, in fact, introduced a different model: unlike PAC learning, the learner receives only positives, may issue membership queries, and must output a hypothesis with no false positives. Prior work characterized variants, including the case without queries. We revisit Valiant's original model and ask: *Which classes are learnable in it?* For every finite domain, including Valiant's Boolean-hypercube setting, we show that a class is learnable if and only if every realizable positive sample can be certified by a poly-size adaptive query-compression scheme. This is a new variant of sample compression where the learner certifies samples via a short interaction with the membership oracle. Our characterization shows that learnability in Valiant's model is strictly sandwiched between learnability in the PAC model and the variant of Valiant's model without membership queries. This is one of the rare cases where introducing membership queries changes the set of learnable classes, and not just the sample or computational complexity. Next, we study the natural extension of the model to arbitrary domains. While we do not obtain an exact characterization, our techniques readily generalize and show that the same strict sandwiching persists. Finally, we show that $d$-dimensional halfspaces, which are not learnable without queries, are learnable with queries: we give a $\mathrm{poly}(d) \tilde{O}(1/ε)$ sample and $\mathrm{poly}(d) \mathrm{polylog}(1/ε)$ query algorithm, and prove that at least $Ω(d)$ samples or queries are necessary. To our knowledge, this is the first algorithm for halfspaces in Valiant's model. Together, these results uncover a surprisingly rich theory behind Valiant's original notion of learnability and introduce ideas that may be of independent interest in learning theory.
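A small example may help fix the "positives only, no false positives" requirement. The sketch below is the classic elimination algorithm for learning a conjunction of literals from positive examples, shown as an illustration of that requirement; it uses no membership queries and is not the halfspace algorithm of the paper. The names `learn_conjunction` and `predict` are hypothetical.

```python
def learn_conjunction(positives, num_vars):
    """Most specific conjunction of literals consistent with the positive examples.

    A literal (i, True) stands for x_i and (i, False) for "not x_i".
    Each example is a sequence of num_vars Boolean values.
    """
    literals = {(i, b) for i in range(num_vars) for b in (True, False)}
    for x in positives:
        # Drop every literal that this positive example falsifies.
        literals = {(i, b) for (i, b) in literals if bool(x[i]) == b}
    return literals


def predict(literals, x):
    """Evaluate the learned conjunction on a point x."""
    return all(bool(x[i]) == b for (i, b) in literals)
```

If the target is itself a conjunction, none of its literals is ever dropped, so the hypothesis implies the target and can only err by rejecting true positives, never by accepting a negative.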

Characterizing Streaming Decidability of CSPs via Non-Redundancy

2026-05-13T17:42:05Z

We study the single-pass streaming complexity of deciding satisfiability of Constraint Satisfaction Problems (CSPs). A CSP is specified by a constraint language $Γ$, that is, a finite set of $k$-ary relations over the domain $[q] = \{0, \dots, q-1\}$. An instance of $\mathsf{CSP}(Γ)$ consists of $m$ constraints over $n$ variables $x_1, \ldots, x_n$ taking values in $[q]$. Each constraint $C_i$ is of the form $\{R_i,(x_{i_1} + λ_{i_1}, \ldots, x_{i_k} + λ_{i_k})\}$, where $R_i \in Γ$ and $λ_{i_1}, \ldots, λ_{i_k} \in [q]$ are constants; it is satisfied if and only if $(x_{i_1} + λ_{i_1}, \ldots, x_{i_k} + λ_{i_k}) \in R_i$, where addition is modulo $q$. In the streaming model, constraints arrive one by one, and the goal is to determine, using minimum memory, whether there exists an assignment satisfying all constraints. For $k$-SAT, Vu (TCS 2024) proves an optimal $Ω(n^k)$ space lower bound, while for general CSPs, Chou, Golovnev, Sudan, and Velusamy (JACM 2024) establish an $Ω(n)$ lower bound; a complete characterization has remained open. We close this gap by showing that the single-pass streaming space complexity of $\mathsf{CSP}(Γ)$ is precisely governed by its non-redundancy, a structural parameter introduced by Bessiere, Carbonnel, and Katsirelos (AAAI 2020). The non-redundancy $\mathsf{NRD}_n(Γ)$ is the maximum number of constraints over $n$ variables such that every constraint $C$ is non-redundant, i.e., there exists an assignment satisfying all constraints except $C$. We prove that the single-pass streaming complexity of $\mathsf{CSP}(Γ)$ is characterized, up to a logarithmic factor, by $\mathsf{NRD}_n(Γ)$.
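The definition of a non-redundant instance can be checked by brute force on tiny examples. The sketch below is a simplification for illustration: it assumes a single relation, ignores the shifts $λ_{i_j}$, and enumerates all assignments, so it is exponential in the number of variables. The function name `is_non_redundant` is hypothetical.

```python
from itertools import product


def is_non_redundant(num_vars, constraints, relation, domain=(0, 1)):
    """True if every constraint has a witness assignment that violates it
    while satisfying all other constraints.

    constraints: list of tuples of variable indices (the scopes).
    relation:    set of allowed value tuples, shared by all constraints.
    """
    def satisfied(assignment, scope):
        return tuple(assignment[v] for v in scope) in relation

    for idx, scope in enumerate(constraints):
        has_witness = any(
            not satisfied(a, scope)
            and all(satisfied(a, other)
                    for j, other in enumerate(constraints) if j != idx)
            for a in product(domain, repeat=num_vars)
        )
        if not has_witness:
            return False  # this constraint is implied by the others
    return True
```

With the binary OR relation, the constraints on `(0, 1)` and `(1, 2)` are non-redundant, whereas with the equality relation the triangle `(0, 1)`, `(1, 2)`, `(0, 2)` is redundant because any two of its constraints imply the third. $\mathsf{NRD}_n(Γ)$ is the largest number of constraints over $n$ variables for which such a check succeeds.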