http://arxiv.org/abs/2605.23901v1 LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws 2026-05-22T17:59:38Z

Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute. We propose the Shannon Scaling Law, a unified theoretical framework that models LLM training as information transmission over a noisy channel, grounded in the Shannon-Hartley theorem. By mapping model parameters to channel bandwidth and training tokens to signal power, our formulation explicitly captures the interaction between learning signal and intrinsic noise. This perspective reveals a fundamental Shannon capacity for LLMs: scaling model size or data without preserving a sufficient signal-to-noise ratio (SNR) inevitably amplifies noise, inducing a transition from monotonic improvement to U-shaped performance degradation. We validate our theory through experiments on Pythia and OLMo2 under perturbations, including Gaussian noise, quantization and supervised fine-tuning on math, QA and code tasks. The Shannon Scaling Law consistently outperforms classical scaling laws and recent perturbation-aware laws, achieving strong $R^2$ scores and accurately capturing loss basins missed by prior approaches. It also extrapolates: fitted on $\leq$6.9B Pythia models with $\leq$180B tokens, it predicts the unseen 12B model up to 307B tokens at pooled $R^2{=}0.847$, while monotonic baselines collapse.
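
As a point of reference, the classical Shannon-Hartley capacity that the abstract builds on can be sketched in a few lines of Python. This is only the textbook formula $C = B\log_2(1 + S/N)$, not the paper's fitted scaling law; the function names and the white-noise variant (noise power $N_0 B$ growing with bandwidth) are illustrative assumptions.

```python
import math


def shannon_hartley_capacity(bandwidth, signal_power, noise_power):
    """Textbook AWGN capacity C = B * log2(1 + S/N), in bits per unit time."""
    if bandwidth < 0 or signal_power < 0 or noise_power <= 0:
        raise ValueError("need bandwidth >= 0, signal_power >= 0, noise_power > 0")
    return bandwidth * math.log2(1.0 + signal_power / noise_power)


def capacity_white_noise(bandwidth, signal_power, noise_density):
    """Capacity when noise power grows with bandwidth, N = N0 * B (assumption)."""
    return shannon_hartley_capacity(bandwidth, signal_power, noise_density * bandwidth)


if __name__ == "__main__":
    # At a fixed SNR of 15, capacity is linear in the bandwidth.
    print(shannon_hartley_capacity(1.0, 15.0, 1.0))
    # With fixed signal power, widening the band lowers the SNR, and the
    # capacity saturates near S / (N0 * ln 2) instead of growing without bound.
    for b in (1.0, 10.0, 1e3, 1e6):
        print(b, capacity_white_noise(b, 1.0, 1.0))
```

In this classical setting the capacity still increases monotonically with bandwidth; the U-shaped degradation described in the abstract comes from the paper's own mapping of parameters and tokens, which this sketch does not reproduce.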

2026-05-22T17:59:38Z Accepted by ICML 2026 Xu Ouyang Deyi Liu Yuhang Cai Jing Liu Yuan Yang Chen Zheng Thomas Hartvigsen Yiyuan Ma http://arxiv.org/abs/2605.23894v1 A Two-Branch Finite-Field Construction for Regular CSS LDPC Bases 2026-05-22T17:56:26Z

This paper develops a two-branch multiplicative-coset construction for regular Calderbank-Shor-Steane (CSS) quantum low-density parity-check base matrices. For a target column weight $J$ and an even row weight $L$, the method reduces regularity, CSS orthogonality, and same-type 4-cycle exclusion to explicit quotient-coset conditions over a finite field. A normalized exhaustive search for these conditions produces base matrices for several $(J,L)$ pairs, so the construction is not tied to a single degree distribution. The construction separates the finite-length design into two stages: the base matrix fixes the degree distribution and the first girth constraints, and a cyclic lift randomizes edge connections subject to exact algebraic checks. As a detailed example, we carry one $(3,10)$-regular base through the lift and decoding stages. For this example, the selected 64-fold lift gives a code whose same-type Tanner graphs have girth at least eight, and it also excludes a specified weight-16 nondegenerate logical-support orbit. The resulting instance is a $[[10240,4108,\,10\le d\le32]]$ CSS code. For decoding, we use joint log-domain belief propagation together with low-complexity deterministic post-processing rules for small residual syndromes, including repairs for residual patterns with two unsatisfied checks. The frame error rate (FER) measurements provide finite-length decoding data for this detailed example; at depolarizing probability $p=0.058$, the post-processing FER is $1.0\times10^{-7}$.
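
The CSS orthogonality condition mentioned above, $H_X H_Z^{\top} = 0$ over GF(2), and the standard count $k = n - \operatorname{rank}(H_X) - \operatorname{rank}(H_Z)$ can be checked mechanically. The sketch below uses the small Steane code (both check matrices equal to the [7,4] Hamming parity-check matrix) as a stand-in example; it does not implement the paper's coset construction, and all function names are hypothetical.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a binary matrix given as a list of 0/1 rows."""
    basis = []  # reduced rows stored as integers with distinct leading bits
    for row in rows:
        v = int("".join(map(str, row)), 2)
        for b in basis:
            v = min(v, v ^ b)  # clears the leading bit of b in v if it is set
        if v:
            basis.append(v)
    return len(basis)


def is_css_orthogonal(hx, hz):
    """True if every row of hx has even overlap with every row of hz."""
    return all(
        sum(a & b for a, b in zip(rx, rz)) % 2 == 0
        for rx in hx
        for rz in hz
    )


def css_logical_qubits(hx, hz):
    """Number of logical qubits k = n - rank(H_X) - rank(H_Z)."""
    if not is_css_orthogonal(hx, hz):
        raise ValueError("H_X and H_Z do not satisfy CSS orthogonality")
    return len(hx[0]) - gf2_rank(hx) - gf2_rank(hz)


# Parity-check matrix of the [7,4] Hamming code (Steane code: H_X = H_Z).
HAMMING = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

if __name__ == "__main__":
    print(is_css_orthogonal(HAMMING, HAMMING))
    print(css_logical_qubits(HAMMING, HAMMING))
```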

2026-05-22T17:56:26Z Koki Okada Kenta Kasai http://arxiv.org/abs/2605.11138v2 Field Theory of Data: Anomaly Detection via the Functional Renormalization Group. The 2D Ising Model as a Benchmark 2026-05-22T16:26:23Z

We establish a correspondence between anomaly detection in high-noise regimes and the renormalization group flow of non-equilibrium field theories. We provide a physical grounding for this framework by proving that the detection of phase transitions in interacting non-equilibrium systems maps to the study of an effective equilibrium field theory near its Gaussian fixed point, which we identify with the universal Marchenko-Pastur distribution. Applying the Functional Renormalization Group to the two-dimensional Model A, we demonstrate that the noise-to-signal ratio acts as a physical temperature, where the signal emerges as ordered domains within a thermalized background of fluctuations. Using the exact Onsager solution as a benchmark, we show that this approach identifies critical thresholds with an error below 4%, significantly outperforming standard information-theoretic metrics such as the Kullback-Leibler divergence. Our results provide a universal strategy for resolving structures in complex datasets near criticality, bridging the gap between statistical mechanics and statistical inference.
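
For readers unfamiliar with the Marchenko-Pastur law referenced above, the following sketch evaluates its bulk edges and density for a pure-noise sample covariance matrix with aspect ratio $q = p/n \le 1$ and noise variance $\sigma^2$. Flagging eigenvalues above the upper edge is a common random-matrix heuristic included here as an assumption; it is not the functional renormalization group procedure of the paper.

```python
import math


def mp_edges(q, sigma2=1.0):
    """Lower and upper bulk edges sigma^2 * (1 -/+ sqrt(q))^2 of the MP law."""
    root = math.sqrt(q)
    return sigma2 * (1.0 - root) ** 2, sigma2 * (1.0 + root) ** 2


def mp_density(x, q, sigma2=1.0):
    """Marchenko-Pastur density at x for 0 < q <= 1 (zero outside the bulk)."""
    lo, hi = mp_edges(q, sigma2)
    if x <= lo or x >= hi:
        return 0.0
    return math.sqrt((hi - x) * (x - lo)) / (2.0 * math.pi * sigma2 * q * x)


def flag_outliers(eigenvalues, q, sigma2=1.0):
    """Eigenvalues above the upper MP edge (heuristic signal candidates)."""
    _, hi = mp_edges(q, sigma2)
    return [lam for lam in eigenvalues if lam > hi]


if __name__ == "__main__":
    print(mp_edges(0.25))
    print(flag_outliers([0.5, 1.0, 2.0, 3.1], 0.25))
```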

2026-05-11T18:43:14Z 15 pages, 2 appendixes; correction of typos and captions, improved clarity Riccardo Finotello Vincent Lahoche Parham Radpay Dine Ousmane Samary http://arxiv.org/abs/2605.09655v2 Geometry of Rényi Entropy on the Majorization Lattice 2026-05-22T14:38:40Z

Majorization is a stochastic ordering relation that compares the relative diversity of probability distributions with numerous applications in econometrics, spectral theory, and ecology. It is well-known that the majorization partial order forms a complete lattice on the set of ordered probability distributions. In this work, we study the properties of Rényi entropy on the majorization lattice. We establish a fundamental relation between the comonotone coupling and the independent coupling associated with a collection of marginal distributions. Consequently, we show that, for every order $α\in [0,\infty]$, the Rényi entropy is subadditive on the majorization lattice. We further characterize the supermodular regime, showing that Rényi entropy is supermodular on the majorization lattice for $α\in \{0\} \,\cup \, [1,\infty]$. For the Tsallis entropy, we show that it also satisfies subadditivity on the majorization lattice, for every order $α\in [0,\infty)$. Finally, we show that, unlike the Rényi entropy, the Tsallis entropy is supermodular on the majorization lattice for every $α\in [0,\infty)$.
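
The two basic objects in this abstract, the Rényi entropy of order $α$ and the majorization order, are easy to compute for finite distributions. The sketch below (function names are illustrative) implements both and checks the well-known Schur-concavity property: if $p$ majorizes $q$, then the Rényi entropy of $p$ is at most that of $q$. It does not construct the lattice meet and join needed for the subadditivity and supermodularity results.

```python
import math


def renyi_entropy(p, alpha):
    """Rényi entropy (in nats) of a finite distribution p, for alpha in [0, inf]."""
    support = [x for x in p if x > 0]
    if alpha == 0:
        return math.log(len(support))  # Hartley entropy
    if alpha == 1:
        return -sum(x * math.log(x) for x in support)  # Shannon entropy
    if math.isinf(alpha):
        return -math.log(max(support))  # min-entropy
    return math.log(sum(x ** alpha for x in support)) / (1.0 - alpha)


def majorizes(p, q, tol=1e-12):
    """True if p majorizes q: sorted partial sums of p dominate those of q."""
    n = max(len(p), len(q))
    ps = sorted(p, reverse=True) + [0.0] * (n - len(p))
    qs = sorted(q, reverse=True) + [0.0] * (n - len(q))
    cum_p = cum_q = 0.0
    for a, b in zip(ps, qs):
        cum_p += a
        cum_q += b
        if cum_p < cum_q - tol:
            return False
    return True


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]      # more concentrated
    q = [0.4, 0.35, 0.25]    # more spread out
    print(majorizes(p, q))
    for alpha in (0, 0.5, 1, 2, math.inf):
        print(alpha, renyi_entropy(p, alpha), renyi_entropy(q, alpha))
```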

2026-05-10T16:58:44Z 20 pages, 2 figures Anuj Kumar Yadav Yanina Y. Shkel http://arxiv.org/abs/2605.23683v1 Multi-User MIMO with Rotatable Antennas and IRS: Joint Antenna Boresight and IRS Orientation Design 2026-05-22T14:32:47Z

In this paper, we investigate an intelligent reflecting surface (IRS)-assisted multi-user system, where the base station (BS) employs rotatable antennas (RAs) and the IRS can adjust its panel orientation. To alleviate the severe multiplicative path loss of the cascaded channel, the IRS is deployed near the BS, while the user-BS and user-IRS links remain in the far field. We formulate a sum-rate maximization problem by jointly optimizing the receive beamforming, IRS phase shifts, BS antenna boresights, and IRS panel orientation. To tackle the resulting highly coupled and non-convex problem, we first study a single-user case to reveal the structure of the dual-rotation gain, which is shown to be multiplicatively separable in the far field but coupled in the near field. For the general multi-user case, we develop an alternating optimization algorithm, where the receive beamforming is updated in closed form, the IRS phase shifts are optimized by an FP-assisted Riemannian conjugate gradient method, and the BS antenna boresights and IRS panel orientation are updated via projected gradient methods. Simulation results demonstrate the significant sum-rate gains achieved by the proposed coordinated rotation design over fixed-orientation and single-rotation benchmark schemes, and provide useful insights into near-field dual-rotation design.

2026-05-22T14:32:47Z Guoying Zhang Qingqing Wu Ziyuan Zheng Qiaoyan Peng Ailing Zheng Yanze Zhu Ying Gao Wen Chen http://arxiv.org/abs/2605.23638v1 List Reconstruction Problem with List Size Two 2026-05-22T13:52:06Z

The problem of computing the cardinality of the intersection of multiple balls in the Hamming space has attracted a lot of attention recently due to its applications in the list reconstruction problem and information retrieval in Associative Memories. In previous work, most results address the case where the radius $r$ of each ball and the distance $k$ between the centers of these balls are fixed as the codeword length $n$ tends to infinity. In this work, we focus on the case where $r = αn$ and $k=βn$ for some constants $α$ and $β$ and compute the maximum asymptotic rate of the cardinality of the intersection of three balls. We provide the maximum asymptotic rate as a function of the two parameters $α$ and $β$. We also provide numerical results and compare them with the intersection of two balls.
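
For small block lengths, the quantity studied here can be computed by brute force. The sketch below counts the binary words that lie within Hamming distance $r$ of every center; it is an exhaustive enumeration meant only to make the definition concrete, and the function names are illustrative. The asymptotic rate would then be $\frac{1}{n}\log_2$ of this count as $n$ grows.

```python
from itertools import product


def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def ball_intersection_size(centers, radius):
    """Count binary words within `radius` of every center (exhaustive search)."""
    n = len(centers[0])
    return sum(
        1
        for word in product((0, 1), repeat=n)
        if all(hamming_distance(word, c) <= radius for c in centers)
    )


if __name__ == "__main__":
    c1, c2, c3 = (0, 0, 0, 0), (1, 1, 0, 0), (1, 0, 1, 0)  # pairwise distance 2
    print(ball_intersection_size([c1], 1))          # one ball
    print(ball_intersection_size([c1, c2], 1))      # two balls
    print(ball_intersection_size([c1, c2, c3], 1))  # three balls
```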

2026-05-22T13:52:06Z 6 pages, 1 figure, submitted to ISITA 2026 Binh Vu VinUniversity, Hanoi, Vietnam Shuche Wang National University of Singapore, Singapore Van Khu Vu VinUniversity, Hanoi, Vietnam http://arxiv.org/abs/2605.23502v1 Distributed Two-Phase Processing for Modular XL-MIMO with Wireless Fronthaul under Hardware Impairments 2026-05-22T11:05:45Z

Modular extremely large-scale MIMO (XL-MIMO) architectures combined with wireless fronthaul provide a scalable alternative to monolithic arrays, but their performance is sensitive to hardware impairments and resource allocation strategies. In this paper, we consider a distributed two-phase processing framework for modular XL-MIMO systems employing amplify-and-forward wireless fronthaul under practical hardware constraints. We jointly model access-side and fronthaul-side distortions and formulate a weighted minimum mean-square error (WMMSE)-based optimization problem that maximizes the uplink sum spectral efficiency (SE) by jointly adjusting UE transmit powers and fronthaul amplification levels. The resulting algorithm alternates between distortion-aware receiver design and convex power-control updates. Numerical results demonstrate that the proposed joint optimization significantly improves spectral efficiency compared to fixed transmission strategies, particularly when the CPU has a moderate number of antennas, while also quantifying the relative impact of access and fronthaul impairments.

2026-05-22T11:05:45Z 5 pages, 2 figures, accepted to be presented at EUSIPCO 2026 Özlem Tuğfe Demir http://arxiv.org/abs/2605.23498v1 Constant-Envelope Quantized Precoding with Power Control for Cell-Free Massive MIMO-OFDM 2026-05-22T11:02:18Z

Cell-free massive MIMO has matured into a key candidate technology for 6G and beyond, owing to its ability to provide nearly uniform service quality to many user equipments (UEs) over the same time-frequency resources. Unlike conventional cellular massive MIMO, the core idea is to distribute a large number of low-cost access points (APs) across the network and enable joint coherent transmission and reception. While early works largely assumed ideal hardware, hardware impairments become inevitable when APs are implemented with low-cost components. In this context, this paper investigates the adverse impact of low-resolution digital-to-analog converters (DACs) on the downlink performance of cell-free massive MIMO-OFDM systems. In contrast to prior studies that mainly quantify spectral-efficiency degradation under low-resolution DACs, we consider the design of quantized constant-envelope (CE) precoding, which additionally enables the use of highly power-efficient amplifiers. To the best of our knowledge, this is the first work on quantized CE precoding for cell-free massive MIMO-OFDM. Beyond adapting the classical maximum-antenna-power method, we propose a novel power-control strategy across APs that mitigates the detrimental effects of severely quantized transmitters by reducing the contribution of harmful APs. Simulation results demonstrate that the proposed power-control mechanism significantly improves the uncoded bit error rate performance.
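
To make the notion of a quantized constant-envelope signal concrete, the sketch below maps each complex sample to a fixed amplitude with its phase rounded to one of $2^b$ uniformly spaced values. This phase-only quantizer is an illustrative assumption; the abstract does not specify the paper's precoder, DAC model, or power-control rule, and none of them is reproduced here.

```python
import cmath
import math


def quantized_constant_envelope(samples, bits, amplitude=1.0):
    """Project complex samples onto `amplitude * exp(j * 2*pi*k / 2**bits)`."""
    levels = 2 ** bits
    step = 2.0 * math.pi / levels
    out = []
    for x in samples:
        # Index of the nearest allowed phase, wrapped into 0..levels-1.
        k = round(cmath.phase(x) / step) % levels
        out.append(cmath.rect(amplitude, k * step))
    return out


if __name__ == "__main__":
    for y in quantized_constant_envelope([1 + 0.1j, -0.2 + 1j, -1 - 0.05j], bits=2):
        print(y, abs(y))
```

Every output sample has the same magnitude, which is the property that permits operating the power amplifier near saturation.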

2026-05-22T11:02:18Z 5 pages, 2 figures, accepted to be presented at EUSIPCO 2026 Özlem Tuğfe Demir Salih Gümüşbuğa http://arxiv.org/abs/2605.06958v2 Hybrid Multiport Receivers for Slow Fluid Antenna Multiple Access 2026-05-22T10:48:26Z

We propose a novel receiver architecture that preserves the performance benefits of multiport selection in fluid-antenna systems while requiring only a very small number of radio-frequency (RF) chains. The resulting fluid-antenna hybrid multiport (FAHM) receiver effectively decouples port selection from signal combining by integrating a low-complexity analog combining network similar to those used in conventional hybrid multiantenna designs. We develop a stopping criterion to determine the number of selected ports, which limits the performance loss associated with port selection, and then design the hybrid combiner for a given RF-chain budget. The FAHM architecture is evaluated in a multiuser setup operating under slow fluid-antenna multiple access (FAMA). In this scenario, a FAHM implementation with only 2 RF chains achieves performance comparable to a fully digital conventional multiport scheme with a much larger number of RF chains. Additionally, the proposed receiver architecture attains over 60% reduction in computational burden when integrated with a novel efficient implementation of the state-of-the-art generalized-eigenvector port-selection method.

2026-05-07T21:23:23Z 12 pages, 8 figures, 1 table. This work has been submitted to the IEEE for publication José P. González-Coma José David Vega-Sánchez F. Javier López-Martínez http://arxiv.org/abs/2605.23460v1 Self-Orthogonal Twisted Generalized Reed-Solomon Codes and Their Application to Quantum Error-Correcting Codes 2026-05-22T10:20:20Z

In this paper, two classes of twisted generalized Reed-Solomon (TGRS) codes with multi-twists are studied. First, necessary and sufficient conditions for these codes to be self-orthogonal and self-dual are established. Then several explicit constructions of self-orthogonal and self-dual codes are presented, from which quantum stabilizer codes are further derived. Finally, corresponding examples are given; notably, some of these codes are MDS, AMDS or NMDS, and some of the resulting quantum stabilizer codes are optimal, achieving the quantum Singleton bound.
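
Two checks implied by this abstract can be written down directly: whether a generator matrix spans a self-orthogonal code, and whether quantum parameters $[[n,k,d]]$ meet the quantum Singleton bound $2d \le n - k + 2$ with equality. The sketch assumes the Euclidean inner product over a prime field $\mathbb{F}_q$ (the abstract does not say which inner product the paper uses) and does not construct TGRS codes; function names are hypothetical.

```python
def is_self_orthogonal(generator, q):
    """True if all pairs of generator rows are Euclidean-orthogonal mod prime q."""
    return all(
        sum(a * b for a, b in zip(r1, r2)) % q == 0
        for r1 in generator
        for r2 in generator
    )


def quantum_singleton_slack(n, k, d):
    """Return n - k + 2 - 2d; zero means the quantum Singleton bound is tight."""
    slack = n - k + 2 - 2 * d
    if slack < 0:
        raise ValueError("parameters violate the quantum Singleton bound")
    return slack


if __name__ == "__main__":
    # Over F_5, (1, 2) is self-orthogonal because 1 + 4 = 5 = 0 mod 5.
    print(is_self_orthogonal([[1, 2, 0, 0], [0, 0, 1, 2]], 5))
    print(quantum_singleton_slack(5, 1, 3))  # the [[5,1,3]] code meets the bound
    print(quantum_singleton_slack(7, 1, 3))  # the Steane code does not
```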

2026-05-22T10:20:20Z Yanxin Chen Yanli Wang Tongjiang Yan http://arxiv.org/abs/2605.23424v1 Sparse In-Network Learning via Shortest-Path Backpropagation and Finite-Rate Gating 2026-05-22T09:35:05Z

In-network learning (INL) trains distributed neural modules by exchanging latent activations and backpropagated errors over a communication graph. This letter proposes Dijkstra-pruned INL (D-INL), which removes non-tree links by retaining a capacity-aware shortest-path tree rooted at the fusion node. To balance sparsity and predictive information, local routing (or aggregation) is modeled as a finite-rate stochastic gate with rate $R_g=I(Z; T)$. We derive a rate-distortion-generalization bound and validate the method on a reproducible distributed-classification experiment, where D-INL reduces training exchange by $70.4\%$ while preserving accuracy within the standard deviation of dense INL. Adding finite-rate regularization further reduces the estimated latent rate by $45.7\%$ relative to unregularized Dijkstra INL.
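
The pruning step described above, keeping only a shortest-path tree rooted at the fusion node, can be sketched with a standard Dijkstra implementation. The edge weight $1/\text{capacity}$ used to make the tree capacity-aware is an assumption made for illustration, as are the node names and capacities; the abstract does not state the paper's exact link cost.

```python
import heapq


def build_graph(capacity_edges):
    """Undirected graph with weight 1/capacity per link (assumed link cost)."""
    graph = {}
    for u, v, capacity in capacity_edges:
        graph.setdefault(u, {})[v] = 1.0 / capacity
        graph.setdefault(v, {})[u] = 1.0 / capacity
    return graph


def shortest_path_tree(graph, root):
    """Dijkstra from `root`; returns (distance, parent) dictionaries."""
    dist = {root: 0.0}
    parent = {root: None}
    heap = [(0.0, root)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent


def tree_edges(parent):
    """Links retained after pruning: one (parent, child) pair per non-root node."""
    return sorted((p, c) for c, p in parent.items() if p is not None)


EDGES = [("fusion", "a", 4), ("fusion", "b", 1), ("a", "b", 4),
         ("b", "c", 2), ("a", "c", 1)]

if __name__ == "__main__":
    dist, parent = shortest_path_tree(build_graph(EDGES), "fusion")
    print(tree_edges(parent))
    print(f"kept {len(tree_edges(parent))} of {len(EDGES)} links")
```

In this toy graph the direct low-capacity links fusion-b and a-c are pruned, and activations and backpropagated errors would travel only along the retained tree.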

2026-05-22T09:35:05Z Mohammad Reza Deylam Salehi http://arxiv.org/abs/2605.23421v1 Stochastic Generalized Sampling 2026-05-22T09:32:47Z

Reconstructing an infinite-dimensional signal from a finite set of measurements is a fundamental problem in approximation theory and signal processing. While the generalized sampling (GS) framework provides a robust methodology for recovering elements in arbitrary separable Hilbert spaces, deterministic approaches suffer from severe basis-dependent dimensionality constraints, often requiring a quadratic sample complexity $m \gtrsim n^2$ to avoid numerical instability. In this paper, we introduce a fully stochastic framework for GS that natively overcomes these deterministic barriers. By drawing measurements according to an optimal leverage-score probability distribution, we prove that stable recovery is guaranteed with high probability at a near-linear sample complexity of $m \gtrsim n\log n$. Crucially, this optimal rate is universal, that is, independent of the specific choice of measurement and reconstruction bases, and holds even when the sensing system is a highly redundant frame. To establish these guarantees, we derive a novel matrix Bernstein inequality for random rectangular operators, allowing us to rigorously control the aliasing error governed by the empirical cross-term. Finally, we demonstrate the practical efficacy of our approach on the classical problem of recovering analytic functions from continuous Fourier measurements via Legendre polynomials, where our randomized method achieves near-exponential convergence rates.
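
A finite-dimensional analogue of leverage-score sampling may help fix ideas. The sketch below computes the leverage scores of the rows of a tall matrix (squared row norms of an orthonormal basis of its column space) and draws rows with probability proportional to them, together with the usual $1/\sqrt{m p_i}$ reweighting. It is a toy matrix version under that assumption, not the paper's Hilbert-space framework, and all names are illustrative.

```python
import math
import random


def orthonormalize_columns(a):
    """Modified Gram-Schmidt on the columns of `a` (a list of rows)."""
    n_rows, n_cols = len(a), len(a[0])
    q_cols = []
    for j in range(n_cols):
        v = [a[i][j] for i in range(n_rows)]
        for q in q_cols:
            proj = sum(x * y for x, y in zip(v, q))
            v = [x - proj * y for x, y in zip(v, q)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm < 1e-12:
            raise ValueError("columns are linearly dependent")
        q_cols.append([x / norm for x in v])
    return [[q[i] for q in q_cols] for i in range(n_rows)]


def leverage_scores(a):
    """Leverage score of each row; the scores sum to the number of columns."""
    return [sum(x * x for x in row) for row in orthonormalize_columns(a)]


def sample_rows(scores, m, rng):
    """Draw m row indices with probability proportional to leverage scores."""
    total = sum(scores)
    probs = [s / total for s in scores]
    idx = rng.choices(range(len(scores)), weights=probs, k=m)
    weights = [1.0 / math.sqrt(m * probs[i]) for i in idx]
    return idx, weights


if __name__ == "__main__":
    design = [[1.0, float(t)] for t in range(4)]  # rows (1, t) for t = 0..3
    scores = leverage_scores(design)
    print(scores)
    print(sample_rows(scores, 3, random.Random(0)))
```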

2026-05-22T09:32:47Z Luca Finotti Matteo Santacesaria http://arxiv.org/abs/2605.23390v1 Layered construction of Message-Wise Unequal Error Protection Codes 2026-05-22T09:00:49Z

Conventional communication systems are mainly designed to reduce error rates and increase transmission rates, and therefore usually provide uniform protection to all transmitted messages. However, in intent-oriented applications, different messages may have different semantic meanings and importance levels, requiring different levels of reliability. This paper proposes a layered construction of message-level unequal error protection (UEP) codes for short-blocklength communication. Instead of appending an explicit protection tag to each codeword, the proposed method embeds the protection structure directly into the Hamming-distance structure of the codebook. By assigning larger minimum intra-level distances to higher-importance message groups and imposing suitable inter-level distance constraints, the proposed codebook provides differentiated error-correction capabilities while enabling reliable importance-level classification at the receiver. Theoretical conditions for correct group classification are derived, and simulations over AWGN and VLC-ISI channels show that the proposed scheme improves BER performance and group classification accuracy compared with a tag-based ECC baseline.
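
The distance structure described above can be inspected with two small helpers: the minimum Hamming distance within a message group and the minimum distance between two groups. The toy codebook below is made up for illustration (it is not from the paper) and is chosen so that the high-importance group has the larger intra-level distance.

```python
from itertools import combinations, product


def hamming_distance(a, b):
    """Number of positions in which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def min_intra_distance(group):
    """Minimum distance between distinct codewords of the same group."""
    return min(hamming_distance(a, b) for a, b in combinations(group, 2))


def min_inter_distance(group_a, group_b):
    """Minimum distance between a codeword of group_a and one of group_b."""
    return min(hamming_distance(a, b) for a, b in product(group_a, group_b))


# Hypothetical length-6 codebook with two importance levels.
HIGH = ["000000", "111111"]
LOW = ["000111", "001011"]

if __name__ == "__main__":
    print("high intra:", min_intra_distance(HIGH))
    print("low intra:", min_intra_distance(LOW))
    print("inter:", min_inter_distance(HIGH, LOW))
```

A minimum distance $d$ within a group allows correcting up to $\lfloor (d-1)/2 \rfloor$ errors among that group's codewords, which is why a larger intra-level distance gives stronger protection.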

2026-05-22T09:00:49Z 6 pages, 5 figures Qiming Lu Shan Lu Takaya Yamazato http://arxiv.org/abs/2605.23362v1 Instance-Optimal Estimation with Multiple LLM Judges on a Budget 2026-05-22T08:26:08Z

Evaluating large language models increasingly relies on LLM-as-a-judge protocols, but such evaluations remain costly: different judges have different prices and reliabilities, and the difficulty of each prompt-response pair can vary substantially. This raises a basic allocation question: under a fixed budget, how should one distribute evaluation queries across heterogeneous judges and instances to obtain the most accurate score estimates? We formalize this question as *budgeted heteroskedastic multi-judge estimation*. Given $K$ prompt-response pairs, $J$ judges with known costs, and unknown query-judge variances, the goal is to estimate a bounded score vector while minimizing an $\ell_p$-error. Our first contribution is to analyze the inverse-variance weighted estimator (IVWE) and to derive the oracle allocation that minimizes its error rate. Since this allocation depends on the unknown variances, we then address the practical unknown-variance setting by proposing EST-IVWE, an adaptive algorithm that constructs and leverages *optimistically biased* variance estimates to stabilize the empirical allocation. We prove that EST-IVWE matches the oracle IVWE rate up to lower-order terms in the budget. Our second and central theoretical contribution is a matching *local* minimax lower bound, which establishes the instance-optimality of the proposed algorithms. A key technical insight is that Fano-type high-probability arguments are too coarse for this problem: their packing construction loses the local variance structure that governs the optimal allocation. We instead use an Assouad-type in-expectation argument, based on local perturbations, which preserves this structure and yields the sharp allocation-dependent lower bound. Finally, we numerically validate the superiority of our approach over naïve uniform allocation on synthetic and HelpSteer2 datasets.
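
The inverse-variance weighted estimator at the core of this abstract combines independent unbiased estimates $x_j$ with variances $v_j$ as $\hat{x} = \sum_j (x_j / v_j) / \sum_j (1 / v_j)$, which has variance $1 / \sum_j (1/v_j)$. The sketch below assumes the variances are known (the oracle setting) and that a judge queried $n_j$ times contributes a sample mean with variance $\sigma_j^2 / n_j$; the numbers are made up, and neither the budget allocation nor EST-IVWE is implemented.

```python
def judge_mean_variances(per_query_variances, query_counts):
    """Variance of each judge's sample mean: sigma_j^2 / n_j."""
    return [v / n for v, n in zip(per_query_variances, query_counts)]


def inverse_variance_weighted_mean(estimates, variances):
    """Combine independent unbiased estimates; returns (estimate, variance)."""
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * x for w, x in zip(weights, estimates)) / total
    return estimate, 1.0 / total


if __name__ == "__main__":
    # Two hypothetical judges scoring the same prompt-response pair.
    means = [0.6, 0.8]                                      # per-judge averages
    variances = judge_mean_variances([0.04, 0.16], [4, 4])  # 0.01 and 0.04
    print(inverse_variance_weighted_mean(means, variances))
```

The more reliable judge receives four times the weight of the noisier one, and the combined variance is smaller than either judge's variance alone.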

2026-05-22T08:26:08Z 53 pages, 4 figures; the first two authors contributed equally Junghyun Lee Sanghwa Kim Yassir Jedra Alexandre Proutière Se-Young Yun http://arxiv.org/abs/2602.07235v2 ArcMark: Distortion-Free Multi-Byte LLM Watermark via Optimal Transport 2026-05-22T08:22:26Z

Watermarking is an important tool for promoting the responsible use of large language models (LLMs). Existing watermarks insert a signal into generated tokens that either flags LLM-generated text (zero-bit watermarking) or encodes more complex messages (multi-bit watermarking). Though a number of recent approaches insert multiple bits into text without perturbing average next-token predictions, they largely extend design principles from the zero-bit setting, such as encoding a single bit per token. In contrast, a watermarker capable of embedding multiple bytes into the text would dramatically increase the potential applications, by embedding information such as the ID of the user who submitted the prompt, the precise model version that was used, or even the prompt itself. We address this problem by introducing ArcMark: a new watermark construction based on coding and information-theoretic principles that is capable of reliably embedding multiple bytes of information into just a few hundred tokens, without any distortion of the underlying LLM next-token distribution. We derive ArcMark by formulating the distortion-free watermarking problem as a channel coding problem, and deriving an information-theoretic channel capacity that establishes the fundamental limit of embedding information in LLM output in a distortion-free manner. This capacity formulation informs the design of ArcMark. In practice, ArcMark outperforms competing multi-bit distortion-free watermarks in terms of reconstruction accuracy, including in the face of attacks that alter a subset of the LLM text. ArcMark output is also shown to be indistinguishable from unwatermarked text in terms of perplexity, and in downstream task quality.

2026-02-06T22:28:03Z Atefeh Gilani Sajani Vithana Carol Xuan Long Oliver Kosut Lalitha Sankar Flavio P. Calmon