https://arxiv.org/api/rnJDYN4d7PV5yJRrCPAGrXdSbVo2026-06-24T08:05:39Z5490566015http://arxiv.org/abs/2605.23424v1Sparse In-Network Learning via Shortest-Path Backpropagation and Finite-Rate Gating2026-05-22T09:35:05ZIn-network learning (INL) trains distributed neural modules by exchanging latent activations and backpropagated errors over a communication graph. This letter proposes Dijkstra-pruned INL (D-INL), which removes non-tree links by retaining a capacity-aware shortest-path tree rooted at the fusion node. To balance sparsity and predictive information, local routing (or aggregation) is modeled as a finite-rate stochastic gate with rate $R_g=I(Z; T)$. We derive a rate-distortion-generalization bound and validate the method on a reproducible distributed-classification experiment, where D-INL reduces training exchange by $70.4\%$ while preserving accuracy within the standard deviation of dense INL. Adding finite-rate regularization further reduces the estimated latent rate by $45.7\%$ relative to unregularized Dijkstra INL.2026-05-22T09:35:05ZMohammad Reza Deylam Salehihttp://arxiv.org/abs/2605.23421v1Stochastic Generalized Sampling2026-05-22T09:32:47ZReconstructing an infinite-dimensional signal from a finite set of measurements is a fundamental problem in approximation theory and signal processing. While the generalized sampling (GS) framework provides a robust methodology for recovering elements in arbitrary separable Hilbert spaces, deterministic approaches suffer from severe basis-dependent dimensionality constraints, often requiring a quadratic sample complexity $m \gtrsim n^2$ to avoid numerical instability. In this paper, we introduce a fully stochastic framework for GS that natively overcomes these deterministic barriers. By drawing measurements according to an optimal leverage-score probability distribution, we prove that stable recovery is guaranteed with high probability at a near-linear sample complexity of $m \gtrsim n\log n$. 
Crucially, this optimal rate is universal (independent of the specific choice of measurement and reconstruction bases) and holds even when the sensing system is a highly redundant frame. To establish these guarantees, we derive a novel matrix Bernstein inequality for random rectangular operators, allowing us to rigorously control the aliasing error governed by the empirical cross-term. Finally, we demonstrate the practical efficacy of our approach on the classical problem of recovering analytic functions from continuous Fourier measurements via Legendre polynomials, where our randomized method achieves near-exponential convergence rates.2026-05-22T09:32:47ZLuca FinottiMatteo Santacesariahttp://arxiv.org/abs/2605.23390v1Layered construction of Message-Wise Unequal Error Protection Codes2026-05-22T09:00:49ZConventional communication systems are mainly designed to reduce error rates and increase transmission rates, and therefore usually provide uniform protection to all transmitted messages. However, in intent-oriented applications, different messages may have different semantic meanings and importance levels, requiring different levels of reliability. This paper proposes a layered construction of message-level unequal error protection (UEP) codes for short-blocklength communication. Instead of appending an explicit protection tag to each codeword, the proposed method embeds the protection structure directly into the Hamming-distance structure of the codebook. By assigning larger minimum intra-level distances to higher-importance message groups and imposing suitable inter-level distance constraints, the proposed codebook provides differentiated error-correction capabilities while enabling reliable importance-level classification at the receiver. 
Theoretical conditions for correct group classification are derived, and simulations over AWGN and VLC-ISI channels show that the proposed scheme improves BER performance and group classification accuracy compared with a tag-based ECC baseline.2026-05-22T09:00:49Z6 pages, 5 figuresQiming LuShan LuTakaya Yamazatohttp://arxiv.org/abs/2605.23362v1Instance-Optimal Estimation with Multiple LLM Judges on a Budget2026-05-22T08:26:08ZEvaluating large language models increasingly relies on LLM-as-a-judge protocols, but such evaluations remain costly: different judges have different prices and reliabilities, and the difficulty of each prompt-response pair can vary substantially. This raises a basic allocation question: under a fixed budget, how should one distribute evaluation queries across heterogeneous judges and instances to obtain the most accurate score estimates? We formalize this question as *budgeted heteroskedastic multi-judge estimation*. Given $K$ prompt-response pairs, $J$ judges with known costs, and unknown query-judge variances, the goal is to estimate a bounded score vector while minimizing an $\ell_p$-error. Our first contribution is to analyze the inverse-variance weighted estimator (IVWE) and to derive the oracle allocation that minimizes its error rate. Since this allocation depends on the unknown variances, we then address the practical unknown-variance setting by proposing EST-IVWE, an adaptive algorithm that constructs and leverages *optimistically biased* variance estimates to stabilize the empirical allocation. We prove that EST-IVWE matches the oracle IVWE rate up to lower-order terms in the budget. Our second and central theoretical contribution is a matching *local* minimax lower bound, which establishes the instance-optimality of the proposed algorithms. 
A key technical insight is that Fano-type high-probability arguments are too coarse for this problem: their packing construction loses the local variance structure that governs the optimal allocation. We instead use an Assouad-type in-expectation argument, based on local perturbations, which preserves this structure and yields the sharp allocation-dependent lower bound. Finally, we numerically validate the superiority of our approach over naïve uniform allocation on synthetic and HelpSteer2 datasets.2026-05-22T08:26:08Z53 pages, 4 figures; the first two authors contributed equallyJunghyun LeeSanghwa KimYassir JedraAlexandre ProutièreSe-Young Yunhttp://arxiv.org/abs/2602.07235v2ArcMark: Distortion-Free Multi-Byte LLM Watermark via Optimal Transport2026-05-22T08:22:26ZWatermarking is an important tool for promoting the responsible use of large language models (LLMs). Existing watermarks insert a signal into generated tokens that either flags LLM-generated text (zero-bit watermarking) or encodes more complex messages (multi-bit watermarking). Though a number of recent approaches insert multiple bits into text without perturbing average next-token predictions, they largely extend design principles from the zero-bit setting, such as encoding a single bit per token. In contrast, a watermarker capable of embedding multiple bytes into the text would dramatically increase the potential applications, by embedding information such as the ID of the user who submitted the prompt, the precise model version that was used, or even the prompt itself. We address this problem by introducing ArcMark: a new watermark construction based on coding and information-theoretic principles that is capable of reliably embedding multiple bytes of information into just a few hundred tokens, without any distortion of the underlying LLM next-token distribution. 
We derive ArcMark by formulating the distortion-free watermarking problem as a channel coding problem, and deriving an information-theoretic channel capacity that establishes the fundamental limit of embedding information in LLM output in a distortion-free manner. This capacity formulation informs the design of ArcMark. In practice, ArcMark outperforms competing multi-bit distortion-free watermarks in terms of reconstruction accuracy, including in the face of attacks that alter a subset of the LLM text. ArcMark output is also shown to be indistinguishable from unwatermarked text in terms of perplexity, and in downstream task quality.2026-02-06T22:28:03ZAtefeh GilaniSajani VithanaCarol Xuan LongOliver KosutLalitha SankarFlavio P. Calmonhttp://arxiv.org/abs/2605.23329v1MDS and NMDS Codes from the Extended Twisted Generalized Reed-Solomon Codes2026-05-22T07:44:33ZThis paper contributes to the study of the maximum distance separable (MDS) and near MDS (NMDS) properties of the extended twisted generalized Reed-Solomon (TGRS) codes. Firstly, a family of extended TGRS (ETGRS) codes is constructed by appending three columns to the generator matrix of the original TGRS codes. Secondly, the necessary and sufficient conditions for these codes to be MDS or almost MDS (AMDS) codes are derived. Then, by analyzing the AMDS properties of their dual codes, the necessary and sufficient conditions for them to be NMDS codes are established. Furthermore, some examples are given to verify the main results. Finally, we determine their non-generalized Reed-Solomon (non-GRS) characteristics via the Schur product method.2026-05-22T07:44:33ZYanli WangYanxin ChenTongjiang Yanhttp://arxiv.org/abs/2504.09388v2The Rate-Immediacy Barrier in Explicit Tree Code Constructions2026-05-22T07:03:01ZSince the introduction of tree codes by Schulman (STOC 1993), explicit construction of asymptotically good tree codes has remained a notorious challenge. 
A work by Cohen, Haeupler and Schulman (STOC 2018), as well as the state-of-the-art construction by Ben Yaacov, Cohen, and Yankovitz (STOC 2022) have achieved codes with rate $Ω(1/\log\log n)$, exponentially improving upon the original rate $Ω(1/\log n)$ construction of Evans, Klugerman and Schulman from 1994. All of these constructions rely, at least in part, on increasingly sophisticated methods of combining (block) error-correcting codes.
In this work, we identify a fundamental barrier to constructing tree codes using known techniques. We introduce a key property which we call immediacy, that, while not required by the original definition of tree codes, is shared by all known constructions and inherently arises in recursive combinations of error-correcting codes. Our main technical contribution is the proof of a rate-immediacy trade-off, which, in particular, implies that any tree code with constant distance and non-trivial immediacy must necessarily have vanishing rate. By applying our rate-immediacy trade-off to existing constructions, we establish that their known rate analyses are essentially optimal given their actual error-correction properties. More broadly, our work highlights the need for fundamentally new ideas -- beyond the recursive use of error-correcting codes -- to achieve substantial progress in explicitly constructing asymptotically good tree codes.2025-04-13T00:47:48ZAdded further discussion and examples. To appear in proceedings of CCC 2026Gil CohenLeonard J. SchulmanPiyush Srivastavahttp://arxiv.org/abs/2605.23260v1MISO Downlink with Fluid Antenna Multiple Access2026-05-22T06:01:07ZFluid antenna multiple access (FAMA) enables each user to rapidly switch among several closely spaced ports and select the strongest received signal. Although this mechanism offers micro-scale spatial diversity, its behavior in multiuser downlink systems with spatial correlation and linear precoding is not well understood. This paper develops a unified analytical framework for the multiple-input single-output (MISO) downlink with FAMA users served via maximum ratio transmission (MRT) or zero-forcing (ZF). We show that the per-port signal-to-interference ratio
(SIR) follows a Beta-prime distribution with parameters
\((M_{\mathrm{eff}},L)\), where \(M_{\mathrm{eff}}=M\) under MRT and
\(M_{\mathrm{eff}}=M-U+1\) under ZF, and derive closed-form finite-sum cumulative distribution functions (CDFs) for both cases. We further provide the first analytical characterization of cross-port SIR correlation. Furthermore, we derive rigorous outage probability bounds that tightly bracket the exact performance and become exact in the limiting cases of fully correlated and independent ports. Asymptotic analyses reveal the fundamental diversity orders and tail behavior for each precoder. Numerical results confirm the accuracy of the SIR distributions, correlation model, and outage bounds, and show that MRT achieves weaker port correlation and larger selection gains than ZF when the base station (BS) has ample spatial degrees of freedom. The framework offers explicit guidelines for port configuration and precoder selection in practical FAMA systems.2026-05-22T06:01:07Zaccepted in IEEE Transactions on Wireless CommunicationsAnastasios Papazafeiropouloshttp://arxiv.org/abs/2604.07796v2Order-Optimal Sequential 1-Bit Mean Estimation in General Tail Regimes2026-05-22T05:47:06ZIn this paper, we study the problem of mean estimation under 1-bit communication constraints. We propose a novel adaptive mean estimator based solely on randomized threshold queries, where each 1-bit outcome indicates whether a given sample exceeds a sequentially chosen threshold. Our estimator is $(ε, δ)$-PAC for any distribution with a bounded mean $μ\in [-λ, λ]$ and a bounded $k$-th central moment $\mathbb{E}[|X-μ|^k] \le σ^k$ for any fixed $k > 1$. Moreover, our sample complexity is order-optimal in all such tail regimes, i.e., for every such $k$ value. For $k \neq 2$, our estimator's sample complexity matches the unquantized minimax lower bounds plus an unavoidable $O(\log(λ/σ))$ localization cost. 
For the finite-variance case ($k=2$), our estimator's sample complexity has an extra multiplicative $O(\log(σ/ε))$ penalty, and we establish a novel information-theoretic lower bound showing that this penalty is a fundamental limit of 1-bit quantization. We also establish a significant adaptivity gap: for both threshold queries and more general interval queries, the sample complexity of any non-adaptive estimator must scale linearly with the search space parameter $λ/σ$, rendering it vastly less sample efficient than our adaptive approach. Finally, we present algorithmic variants that (i) handle an unknown sampling budget, (ii) adapt to an unknown scale parameter $σ$ given (possibly loose) bounds, (iii) require only two stages of adaptivity to achieve order-optimal sample complexity at the expense of more general 1-bit queries, and (iv) leverage multiple local samples per 1-bit query to proportionally reduce communication costs.2026-04-09T04:49:21ZThis article substantially extends the AISTATS version, arXiv:2509.21940Ivan LauJonathan Scarletthttp://arxiv.org/abs/2605.23236v1A Posterior MWPM Decoding Boosts the XYZ Planar Code2026-05-22T05:08:57ZThe minimum-weight perfect matching (MWPM) decoder is a standard decoding strategy for surface codes, but its performance degrades considerably under biased noise. In this paper, a modified surface code, termed the XYZ planar code, is introduced, and the MWPM decoder is extended to posterior MWPM (pMWPM) with almost no increase in decoding complexity. The XYZ planar code exhibits higher and more stable thresholds than the planar code under almost all bias conditions, while also achieving significantly lower logical error rates. 
Specifically, in the infinite-bias case, the threshold of the XYZ planar code is improved by about \(36\%\) compared to that of the surface code, and it maintains comparable or higher thresholds under other biases -- for example, the threshold reaches approximately \(15.5\%\) at bias \(η= 1\) and \(14.2\%\) at \(η= 100\). Furthermore, pMWPM can be adapted to a wide range of modified surface codes, and the results presented in this work also indicate its excellent potential in other scenarios, such as configurations in which \(Y\) operators involve a larger number of data qubits.2026-05-22T05:08:57ZZhiwei WangLiqi Wanghttp://arxiv.org/abs/2605.23225v1Entropy Equivalence Testing2026-05-22T04:35:04ZWe introduce the problem of \emph{entropy equivalence testing} for probability distributions, a relaxation of the well-studied closeness testing problem, where the distribution testing algorithm is now only required to distinguish, given samples from two unknown distributions $p,q$ and a parameter $\varepsilon \in(0,1/2]$, between $p=q$ and $|H(p)-H(q)| \geq \varepsilon$ (where $H$ denotes the Shannon entropy). We provide a time- and sample-efficient algorithm for this task, showing that the optimal sample complexity for this task can be significantly lower than that of closeness testing. As an application, we leverage this result to provide the first non-trivial testing algorithm for (standard) closeness of low-degree \emph{Bayesian networks}, which significantly improves on either the sample or time complexity of a baseline based on full learning.2026-05-22T04:35:04ZClément L. CanonneYash PoteJonathan ScarlettJoy Qiping Yanghttp://arxiv.org/abs/2605.23224v1On APN Exponents and the Differential and Boomerang Properties of Binomials in Characteristic 32026-05-22T04:33:54ZRecent studies on binomials of the form $F_r(x) = x^r(1 + χ(x))$ over $\mathbb{F}_{p^n}$ have shown that these functions can exhibit very low boomerang uniformity. 
In this paper, we focus on the specific behavior of such binomials in characteristic $3$, where instances of extremely low boomerang uniformity, namely $0$ or $1$, seem to arise more frequently than in other characteristics. First, we provide a systematic analysis of Almost Perfect Nonlinear (APN) power functions in characteristic $3$. We present an explicit parametrization of APN exponents arising from the construction of Zha and Wang and demonstrate through numerical results for $n \le 13$ that this generalized framework accounts for several previously known and sporadic APN instances. Building on this classification, we identify and rigorously prove two classes of binomials $F_r$ that are locally-PN and possess the minimum possible boomerang uniformity of $0$. These classes involve exponents derived from the aforementioned APN construction and the differentially 4-uniform exponent $r = 2 \cdot 3^{\frac{n-1}{2}} + 1$. Furthermore, we analyze the binomial $F_r$ with $r = 3^n - 3$, proving that it is locally-APN with boomerang uniformity $1$ when $n\ge 5$ is odd, and completely determine its boomerang spectrum through the evaluation of character sums. Our results clarify and extend existing studies on the cryptographic properties of binomials, providing a systematic characterization of several classes of binomials with very low boomerang uniformity in characteristic $3$.2026-05-22T04:33:54ZNamhun KooSoonhak KwonMinwoo KoByunguk Kimhttp://arxiv.org/abs/2603.04005v2Training-Free Rate-Distortion-Perception Traversal With Diffusion2026-05-22T01:41:05ZThe rate-distortion-perception (RDP) tradeoff characterizes the fundamental limits of lossy compression by jointly considering bitrate, reconstruction fidelity, and perceptual quality. While recent neural compression methods have improved perceptual performance, they typically operate at fixed points on the RDP surface, requiring retraining to target different tradeoffs. 
In this work, we propose a training-free framework that leverages pre-trained diffusion models to traverse the entire RDP surface. Our approach integrates a reverse channel coding (RCC) module with a novel score-scaled probability flow ODE decoder. We theoretically prove that the proposed diffusion decoder is optimal for the distortion-perception tradeoff under AWGN observations and that the overall framework with the RCC module achieves the optimal RDP function in the Gaussian case. Empirical results across multiple datasets demonstrate the framework's flexibility and effectiveness in navigating the ternary RDP tradeoff using pre-trained diffusion models. Our results establish a practical and theoretically grounded approach to adaptive, perception-aware compression.2026-03-04T12:49:13ZAccepted by the Forty-Third International Conference on Machine Learning (ICML) 2026Yuhan WangSuzhi BiYing-Jun Angela Zhanghttp://arxiv.org/abs/2605.23124v1Deep-Learning-Aided Successive Cancellation List Flip Decoding for Polar Codes2026-05-22T00:48:01ZPolar codes are the first error-correcting codes proven to achieve channel capacity as the code length tends to infinity. The Successive Cancellation List Flip (SCLF) decoding algorithm was proposed to correct decoding failures by flipping an erroneous bit during the next decoding attempt. To identify the erroneous bits, the Log-Likelihood Ratio (LLR) is used to indicate the reliability of each decision bit. To improve the accuracy of the erroneous bit prediction, we propose deep-learning-aided (DL-aided) SCLF decoding algorithms. We first offer a stacked LSTM network that contains new features to train our models, which are able to improve the accuracy of the prediction of positions of erroneous bits. Then we separately train the stacked LSTM models to predict the position of both the first and second erroneous bits and whether to continue flipping. 
As a result, the DL-aided SCLF decoding algorithms based on the proposed stacked LSTM flip-1 model, stacked LSTM flip-2 model, and the stacked LSTM continue-flipping check (CFC) model are able to provide a better performance at a lower number of average decoding attempts when compared to other state-of-the-art decoding algorithms.2026-05-22T00:48:01Z14 pages, 17 figuresFu-Siang LiangShan LuYeong-Luh Ueng10.1109/TCCN.2023.3326330http://arxiv.org/abs/2605.23120v1The Closure of LCD-to-GI Reductions via Generalized Inner Products2026-05-22T00:43:13ZThe Permutation Equivalence Problem (PEP) for linear codes is a fundamental problem in coding theory and cryptography. A recent reduction shows that PEP for Linear Complementary Dual (LCD) codes reduces to Graph Isomorphism (GI) via orthogonal projectors, but is restricted to codes with trivial hull. We prove that this approach extends to bilinear forms $M = aI + bJ$, and that no other nondegenerate symmetric form yields a valid reduction. A code is reducible if and only if its hull dimension is at most $1$ with an explicit condition on the hull vector; in characteristic $2$, only LCD codes are reducible. This establishes the closure of the orthogonal projector method. We derive exact enumeration formulas via character sums over quadratic forms and provide a polynomial-time reduction algorithm.2026-05-22T00:43:13ZKeita Ishizuka