http://arxiv.org/abs/2512.03767v3
CaFTRA: Frequency-Domain Correlation-Aware Feedback-Free MIMO Transmission and Resource Allocation for 6G and Beyond
2026-06-07T07:31:49Z
The fundamental design of wireless systems toward AI-Native 6G and beyond is driven by ever-increasing mobile data traffic and by the need for extreme spectral efficiency and adaptability across diverse service scenarios. To overcome the limitations posed by feedback-based multiple-input multiple-output (MIMO) transmission, we propose a novel frequency-domain Correlation-aware Feedback-free MIMO Transmission and Resource Allocation (CaFTRA) framework tailored for fully-decoupled radio access networks (FD-RAN) to meet the emerging requirements of AI-Native 6G and beyond. By leveraging artificial intelligence (AI), CaFTRA eliminates real-time uplink feedback by predicting channel state information (CSI) based solely on user geolocation. We introduce a Learnable Queries-driven Transformer Network for CSI mapping from user geolocation, which utilizes multi-head attention and learnable query embeddings to accurately capture frequency-domain correlations among resource blocks (RBs), thereby significantly improving the precision of CSI prediction. Once base stations (BSs) adopt feedback-free transmission, their downlink transmission coverage can be significantly expanded due to the elimination of frequent uplink feedback. To enable efficient resource scheduling under such extensive-coverage scenarios, we apply a low-complexity many-to-one matching theory-based algorithm for multi-BS association and multi-RB resource allocation, which is proven to converge to a stable matching within a limited number of iterations.
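As an illustration of the many-to-one matching idea mentioned above, the following is a minimal sketch of generic deferred acceptance, in which users propose and each base station holds at most a fixed number of users. It is not the CaFTRA algorithm itself: the function name, the preference lists, and the per-BS quotas are all hypothetical and chosen only for the example.

```python
def deferred_acceptance(user_prefs, bs_prefs, quota):
    """Generic many-to-one deferred acceptance (illustrative, not CaFTRA).

    user_prefs: dict user -> list of base stations, most preferred first.
    bs_prefs:   dict base station -> list of users, most preferred first.
    quota:      dict base station -> maximum number of users it can hold.
    Returns a dict base station -> list of users it holds at termination.
    """
    # Lower rank value means the base station prefers that user more.
    rank = {b: {u: i for i, u in enumerate(p)} for b, p in bs_prefs.items()}
    next_choice = {u: 0 for u in user_prefs}   # index of next BS to propose to
    held = {b: [] for b in bs_prefs}
    free = list(user_prefs)

    while free:
        u = free.pop()
        if next_choice[u] >= len(user_prefs[u]):
            continue                            # list exhausted: stays unmatched
        b = user_prefs[u][next_choice[u]]
        next_choice[u] += 1
        if u not in rank[b]:
            free.append(u)                      # b does not accept u at all
            continue
        held[b].append(u)
        if len(held[b]) > quota[b]:
            # Over quota: release the least preferred user currently held.
            worst = max(held[b], key=lambda x: rank[b][x])
            held[b].remove(worst)
            free.append(worst)
    return held


# Hypothetical toy instance: three users, two base stations.
users = {"u1": ["A", "B"], "u2": ["A", "B"], "u3": ["A", "B"]}
stations = {"A": ["u3", "u1", "u2"], "B": ["u1", "u2", "u3"]}
result = deferred_acceptance(users, stations, {"A": 1, "B": 2})
```

Every proposal advances a user's position in a finite preference list, so the loop terminates after at most as many proposals as there are entries across all user lists. That counting argument is the usual reason such procedures stop after a bounded number of iterations.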
Simulation results demonstrate that CaFTRA achieves stable matching convergence and significant gains in spectral efficiency and user fairness compared to 5G, underscoring its potential value for 6G standardization efforts.
2025-12-03T13:15:52Z
17 pages, 19 figures. Accepted by IEEE Transactions on Mobile Computing
IEEE Transactions on Mobile Computing, 2026
Bo Qian, Hanlin Wu, Jiacheng Chen, Yunting Xu, Xiaoyu Wang, Haibo Zhou, Yusheng Ji
10.1109/TMC.2026.3696984

http://arxiv.org/abs/2605.25085v2
Polynomial Context-Truncation Sensitivity in Autoregressive Language Models: Sequential Wyner-Ziv Bounds for KV Cache Compression
2026-06-07T06:58:29Z
We study the rate-distortion limits of online KV cache compression in autoregressive language models, formulating it as sequential Wyner-Ziv source coding on the filtration induced by the model, with the next-step query as decoder side information. Empirically, across four models spanning two families and $0.5$-$3$B parameters, we find that the next-token distribution's sensitivity to context truncation decays \emph{polynomially} rather than \emph{geometrically}: a power law improves on an exponential fit by an order of magnitude in extrapolation, the fitted exponent is recovered independently from a sink-plus-recent KL measurement, and the decay is verified to be free of positional-encoding artifacts by a position-preserving ablation. Under a corresponding \emph{polynomial truncation-sensitivity} assumption, our main result characterizes the per-token memory requirement of \emph{suffix-only} cache policies: a sliding-window scheme attains distortion $\varepsilon$ with window $w = O(\varepsilon^{-1/\alpha})$, and -- under an additional two-sided Bayes-risk condition -- a converse shows $w = \Omega(\varepsilon^{-1/\alpha})$ is necessary within this policy class, so the scaling is $\Theta(\varepsilon^{-1/\alpha})$ for suffix-only policies. Whether recurrent or propagating cache summaries can beat this scaling is left open.
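To make the window scaling above concrete, here is a minimal sketch under a simplifying assumption that is not taken from the paper: the distortion of a window of length $w$ is modelled as exactly $c\,w^{-\alpha}$. The function name `min_window` and the constant `c` are hypothetical.

```python
import math


def min_window(eps, alpha, c=1.0):
    """Smallest integer w >= 1 with c * w**(-alpha) <= eps.

    Assumes the idealized distortion model D(w) = c * w**(-alpha); the
    closed form is w = ceil((c / eps)**(1 / alpha)), i.e. w grows like
    eps**(-1 / alpha) as eps shrinks.
    """
    w = max(1, math.ceil((c / eps) ** (1.0 / alpha)))
    # Correct for floating-point rounding at the boundary in either direction.
    while c * w ** (-alpha) > eps:
        w += 1
    while w > 1 and c * (w - 1) ** (-alpha) <= eps:
        w -= 1
    return w


# A smaller exponent alpha (slower decay) demands a much longer window.
w_fast = min_window(0.03, alpha=1.0)
w_slow = min_window(0.03, alpha=0.5)
```

Under this toy model, halving the exponent roughly squares the required window for the same target distortion, which is the practical content of the $\varepsilon^{-1/\alpha}$ scaling.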
An explicit block-Markov scheme achieves the upper bound; its rate-of-convergence exponent matches the converse under additional forward-decay and regularity hypotheses (not implied by truncation sensitivity alone), and differs by a factor of two otherwise. Empirically, the polynomial law predicts the degradation curves of concrete cache policies: recency-based eviction (sliding, sink-plus-recent) suppresses distortion by roughly two orders of magnitude over random retention at equal budget, with a power-law decay in the budget.
2026-05-24T13:54:13Z
Munsik Kim

http://arxiv.org/abs/2605.18856v3
SPHERICAL KV: Angle-Domain Attention and Rate-Distortion Retention for Efficient Long-Context Inference
2026-06-07T06:58:25Z
Long-context inference is increasingly constrained by the KV cache: resident memory grows with context length, and decoding becomes limited by repeated High Bandwidth Memory (HBM) streaming rather than arithmetic. Existing methods such as eviction, windowing, quantization, and offloading reduce footprint, but often leave the critical-path bottleneck only partially addressed, especially when compressed states must still be reconstructed into dense vectors during decoding.
We present Spherical KV, a long-context inference method that treats KV allocation as a rate-distortion problem grounded in attention geometry for efficient decoding. The method is built on two ideas: (i) represent directional information cheaply in the decode hot loop, and (ii) allocate retention and precision according to estimated future utility. Its first component, Angle-Domain Attention (ADA), stores keys in a spherical parameterization consisting of a scalar radius and compact angle codes, and computes attention logits directly from these codes without reconstructing dense keys. This preserves a paged, block-local, fusion-friendly decode path and directly targets HBM traffic in realistic serving settings. Its second component, Rate-Distortion Retention (RDR), jointly chooses keep/drop decisions and precision tiers per token and head under a fixed budget, producing tier-homogeneous pages with lightweight metadata and coalesced reads. Together, ADA and RDR provide a deployment-oriented mechanism for reducing KV residency while preserving decode efficiency.
2026-05-13T18:48:48Z
Anay Chauhan, Gurucharan Marthi Krishna Kumar, Arion Das, Amit Dhanda, Vinija Jain, Aman Chadha, Amitava Das

http://arxiv.org/abs/2606.08463v1
Simplest Nontrivial Maxwellian Random Field Models for Stochastic LoS MIMO Using the Dyadic Green's Function
2026-06-07T05:58:23Z
This letter introduces a novel, full-wave, physics-compliant stochastic dyadic Green's function (SDGF) framework for modeling electromagnetic (EM) multiple-input-multiple-output (MIMO) channels under wavenumber uncertainty. Unlike conventional phenomenological fading models, the proposed approach provides what appear to be the simplest exact random field models of electromagnetic line-of-sight (LoS) propagation that are also exact solutions of Maxwell's equations. Hence, we dub them Maxwellian random field theoretic models.
These physically consistent stochastic models, including an analytically tractable wavenumber Gaussian model and a more general stochastic plane wave (SPW) model, serve as fundamental baseline models for stochastic LoS channel characterization. By preserving the vectorial structure of Maxwell's equations and the dispersion relation, the framework naturally incorporates both propagating and evanescent modes. Our analysis of ergodic capacity and degrees of freedom (DoF) reveals that the key results of the complex SPW model can be reproduced by the simpler Gaussian model with limited variance. Furthermore, we provide examples using 2D continuous MIMO systems, illustrating how the model's Maxwell-consistent stochasticity explains observed increases in channel capacity and DoF over the deterministic MIMO capacity baseline. These idealized Maxwellian random field theoretic models offer a physically grounded reference point for understanding fundamental limits in stochastic LoS propagation environments.
2026-06-07T05:58:23Z
Lumeng Xu, Said Mikki

http://arxiv.org/abs/2606.09922v1
The Bioelectrical Information Theory: Investigating the theoretical compression limit of bioelectrical signals under artificial intelligence
2026-06-07T04:18:07Z
Bioelectrical signals are increasingly acquired at scales that challenge the bandwidth of brain-computer interfaces. However, their compression is still often framed as a problem of waveform preservation, limited by the entropy of the raw signal. Here we propose an information-theoretic framework in which the effective information of bioelectrical data is determined not only by signal fidelity, but also by physiological structure, model capacity and downstream task requirements. We formulate bioelectrical compression as a three-level hierarchy. At the signal level, noisy signals are reduced to the information they carry about latent physiological sources.
At the physiological level, parametric encoders map purified signals into compact, structured and quantized representations. At the semantic level, task-irrelevant information is discarded, while deep learning models exploit causal dependencies to replace marginal entropy with conditional entropy. This perspective reframes the compression limit of bioelectrical signals as a model- and task-conditioned quantity rather than a fixed property of the waveform. As increasingly expressive models become integrated with neural and physiological interfaces, bioelectrical compression may shift from transmitting signals to transmitting only the residual information required for task-level interpretation.
2026-06-07T04:18:07Z
Jiawen Zou, Bo Yan

http://arxiv.org/abs/2606.08385v1
A Switching Beamformer for Highly Non-Stationary Environments
2026-06-07T00:44:39Z
Adaptive beamforming is a cornerstone of array signal processing, yet its performance often collapses in the face of complex, rapidly changing interference. When interferers appear or move unpredictably, conventional estimators encounter a fundamental memory trade-off: short windows enable rapid tracking but suffer from high estimation variance, while long windows provide stable rejection but fail to adapt to shifts. This challenge is resolved by introducing the Universal Switching Beamformer (USB), which integrates competitive sequential prediction into the beamforming architecture. By employing a linear transition diagram, the USB implicitly maintains an exponentially large family of candidate covariance histories and dynamically re-weights them based on their cumulative output power. This mechanism allows the beamformer to automatically vary its effective memory length without explicit change detection or heuristic parameter tuning. A theoretical upper bound is proven on the regret relative to an omniscient oracle that selects the best piecewise-stationary covariance model in hindsight.
Extensive simulations and experiments on the SwellEx-96 dataset demonstrate that the USB achieves the agility of short-window estimators and the precision of long-term integration, providing a principled solution for tracking highly non-stationary scenes.
2026-06-07T00:44:39Z
11 pages, 19 figures, under review
Manan Mittal, Ryan M. Corey, John R. Buck, Andrew C. Singer

http://arxiv.org/abs/2604.20897v2
Watts-per-Intelligence Part II: Algorithmic Catalysis
2026-06-06T17:22:07Z
We develop a thermodynamic theory of algorithmic catalysis within the watts-per-intelligence framework, identifying reusable computational structures that reduce irreversible operations for a task class while satisfying bounded restoration and structural selectivity constraints. We prove that any class-specific speed-up is upper-bounded by the algorithmic mutual information between the substrate and the class descriptor, and that encoding this information incurs a minimum thermodynamic cost via Landauer erasure. Combining these results yields a coupling theorem that lower-bounds the deployment horizon required for an algorithmic catalyst to be energetically favourable. The framework is illustrated on an affine SAT class and situates contemporary learned systems within an information-thermodynamic constraint on intelligent computation.
2026-04-21T13:36:33Z
Camera ready version, AGI-2026
Elija Perrier

http://arxiv.org/abs/2602.03363v2
Entropy Functions on Two-Dimensional Faces of Polymatroid Region Spanned by a Matroid and a Rank-One Matroid
2026-06-06T17:12:18Z
Characterization of entropy functions is of fundamental importance in information theory. By imposing constraints on their Shannon outer bound, i.e., the polymatroidal region, one obtains the faces of the region and entropy functions on them with special structures. In this paper, we characterize entropy functions on 2-dimensional faces of the polymatroidal region of degree $n$ spanned by a matroid and a rank-1 matroid.
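For readers unfamiliar with the polymatroidal region referred to above, the following minimal sketch checks the three polymatroid axioms (normalization, monotonicity, submodularity) for a set function given as a table. The helper names and the uniform-matroid example are illustrative choices and are not taken from the paper.

```python
from itertools import combinations


def subsets(ground):
    """All subsets of `ground` as frozensets, smallest first."""
    items = sorted(ground)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_polymatroid(h, ground, tol=1e-9):
    """Check h(empty) = 0, monotonicity and submodularity on all subset pairs.

    h maps every frozenset subset of `ground` to a real number.
    """
    all_sets = subsets(ground)
    if abs(h[frozenset()]) > tol:
        return False
    for a in all_sets:
        for b in all_sets:
            # Monotone: A subset of B implies h(A) <= h(B).
            if a <= b and h[a] > h[b] + tol:
                return False
            # Submodular: h(A) + h(B) >= h(A union B) + h(A intersect B).
            if h[a] + h[b] + tol < h[a | b] + h[a & b]:
                return False
    return True


# Rank function of the uniform matroid U_{2,3}: r(S) = min(|S|, 2).
ground = {1, 2, 3}
u23 = {s: min(len(s), 2) for s in subsets(ground)}
u23_ok = is_polymatroid(u23, ground)
```

The brute-force check visits all pairs of subsets, so it is only practical for small ground sets. It is meant to show what membership in the region means, not how faces of the region are characterized.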
We classify all such 2-dimensional faces into four types.
2026-02-03T10:35:44Z
Kaizhe He, Qi Chen

http://arxiv.org/abs/2601.06077v2
One if by Land, Two if by Sea, Three if by Four Seas, and More to Come -- Values of Perception, Prediction, Communication, and Common Sense in Decision Making
2026-06-06T15:59:47Z
This work aims to rigorously define the values of perception, prediction, communication, and common sense in decision making. The defined quantities are decision-theoretic, but have information-theoretic analogues, e.g., they share some simple but key mathematical properties with Shannon entropy and mutual information, and can reduce to these quantities in particular settings. One interesting observation is that the value of perception without prediction can be negative, while the value of perception together with prediction and the value of prediction alone are always nonnegative. The defined quantities suggest answers to practical questions arising in the design of autonomous decision-making systems. Example questions include: Do we need to observe and predict the behavior of a particular agent? How important is it? What is the best order to observe and predict the agents? The defined quantities may also provide insights into cognitive science and neural science, toward the understanding of how natural decision makers make use of information gained from different sources and operations.
2025-12-29T19:18:19Z
Aolin Xu

http://arxiv.org/abs/2603.15249v2
On the Nonasymptotic Bounds of Joint Source-Channel Coding with Hierarchical Sources
2026-06-06T12:40:41Z
This paper establishes tractable bounds for joint source-channel coding with hierarchical sources in the finite blocklength regime. In this setting, both the indirect source and observable source must be reconstructed under correlated distortion constraints, leading to a joint excess-distortion event.
First, to build computable tight bounds, we introduce a novel $\mathsf{d}(\cdot)$-functional distortion relaxation, which enables tractable and tight bounding of the joint excess-distortion probability induced by correlated sources. By this approach, the nonasymptotic converse and achievability bounds are given. Second, Gaussian approximations for the proposed bounds are obtained, which are optimal for the transmission of a Gaussian memoryless source over an additive white Gaussian noise channel with mean-square error distortion. The optimal scheme is obtained via a structured analysis that captures the intrinsic tradeoff between semantic and observable reconstructions. Furthermore, for the transmission of Gaussian memoryless sources over AWGN channels, we obtain explicit and computable bounds, by providing a new geometric structure involving three correlated spherical regions. These results extend the classical two-spherical region analysis for a single distortion constraint. Numerical simulations demonstrate that the proposed achievability and converse bounds tightly sandwich the Gaussian approximation and align closely with Monte Carlo numerical results.
2026-03-16T13:19:54Z
Shuo Shao, Chao Qi, Jincheng Dai, Wenrui Dai, Hongkai Xiong

http://arxiv.org/abs/2606.08124v1
Soft Covering via Hypothesis Testing: Typical-Code Exponents and Mismatched Detection
2026-06-06T11:57:39Z
We study the typical-code (quenched) behavior of the false-alarm (FA) and missed-detection (MD) error exponents of the Neyman-Pearson test associated with soft covering, complementing the average-code (annealed) analysis that has been carried out in a companion paper [1]. We prove that, as the block-length tends to infinity, for almost every randomly selected fixed-composition codebook, the negative normalized logarithms of both error probabilities converge to their respective average-code exponents. In other words, the error exponents are self-averaging.
We then extend the scope and study a mismatched likelihood ratio test that assumes the wrong channel model. Here, we derive the mismatched error exponents, show that self-averaging persists under mismatch, and characterize the degradation. In particular, we characterize the coding rate beyond which the two kinds of error exponents cannot be positive at the same time, which, in the matched case, is given by the channel input-output mutual information rate.
2026-06-06T11:57:39Z
20 pages, submitted for publication
Neri Merhav

http://arxiv.org/abs/2606.08017v1
Fluid Antenna System-Enabled Mitigation of Asynchronous Reception in Cell-Free Massive MIMO Systems
2026-06-06T07:13:49Z
Practical distributed deployments inherently suffer from asynchronous signal arrivals, which exacerbate multi-user interference and degrade system performance, especially for coherent transmission. To natively mitigate the asynchronous reception effect, this paper proposes integrating fluid antenna systems (FASs) into distributed cell-free massive MIMO systems, exploiting their reconfigurable spatial positions to release additional spatial degrees of freedom (DoFs). We establish the FAS-enabled data transmission model with asynchronous reception, i.e., delay phases. We also derive the analytical downlink spectral efficiency (SE) performance of the proposed system under coherent and non-coherent transmissions, using low-complexity Maximum Ratio (MR) precoding to provide fundamental theoretical bounds. Specifically, we propose a novel nonmonotone accelerated projected gradient ascent algorithm to jointly optimize FAS positions and power control coefficients, maximizing the downlink sum SE.
Numerical results demonstrate that while asynchronous reception severely degrades system performance for coherent transmission, the spatial DoFs unlocked by optimized FAS positions, along with efficient power control, can significantly counteract the effects of unknown delay phases and outperform traditional fixed-position antennas. For non-coherent transmission, which inherently bypasses asynchronous reception, the application of FAS leverages spatial reconfigurability to natively maximize signal strength and achieve more pronounced SE gains. Ultimately, our proposed FAS-enabled system, coupled with efficient power control, mitigates performance degradation due to asynchronous reception and outperforms traditional fixed-position antennas, paving the way for the practical deployment of FASs in robust, highly efficient 6G cell-free massive MIMO systems.
2026-06-06T07:13:49Z
13 pages, 6 figures. This work has been submitted to the IEEE for possible publication
Jun Qian, Zan Li, Junhui Rao, Ross Murch, Khaled B. Letaief

http://arxiv.org/abs/2602.11788v2
The Arithmetic Singleton Bound on the Hamming Distances of Simple-rooted Constacyclic Codes over Finite Fields
2026-06-06T03:03:52Z
In this work, we introduce a new upper bound on the Hamming distance of simple-root constacyclic codes over finite fields, which we call the arithmetic Singleton bound. The main technical tool is the notion of a multiple equal-difference (MED) representation. Via the MED representations of the defining set of the generator polynomial of a simple-root constacyclic code, we obtain a family of upper bounds on its Hamming distance, among which the weakest one coincides with the Singleton bound, while the strongest one is defined to be the arithmetic Singleton bound for this code. Consequently, the arithmetic Singleton bound is always at least as strong as the classical Singleton bound, and is in fact strictly stronger in numerous nontrivial cases.
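For reference, the classical Singleton bound that serves as the weakest member of the family above can be stated as follows. This is the standard textbook bound with a standard example, not the arithmetic Singleton bound of the paper; the symbols $n$, $k$, $d$ and $q$ are the usual code length, dimension, minimum distance, and field size, and are assumptions of this note rather than notation from the abstract.

```latex
% Classical Singleton bound for a linear [n, k, d] code over F_q:
\[
  d \;\le\; n - k + 1 .
\]
% Example: any [7, 3] code satisfies d <= 7 - 3 + 1 = 5.
% A [7, 3] Reed-Solomon code over F_8 attains d = 5 (it is MDS).
```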
The arithmetic Singleton bound partially measures the restriction on the Hamming distance of a simple-root constacyclic code imposed by its arithmetic structure. In particular, for an irreducible constacyclic code, the MED representations of the defining set of its generator polynomial are completely determined, via which the arithmetic Singleton bound is computed concretely. Finally, for any simple-root cyclic code, the arithmetic Singleton bound and the BCH bound are compared.
2026-02-12T10:09:56Z
Li Zhu, Hongfeng Wu

http://arxiv.org/abs/2606.07933v1
Finite-Blocklength Lossy Joint Source-Channel Coding over Unknown Channels
2026-06-06T01:50:58Z
We analyze the finite-blocklength performance of lossy joint source-channel codes (JSCC) in an unknown-channel framework, where the true channel is unknown but the source distribution is known. We establish achievability results for mismatched-design JSCC, where the code design is based on a channel $Q_{Y|X}$ but deployed over a different channel $P_{Y|X}$. Our mismatched-design achievability result allows nonstationary channel laws and arbitrary standard Borel alphabets for the source, reproduction, channel input and channel output. The achievability bound is given in terms of the rate-distortion and rate-dispersion functions, as well as two channel-dependent quantities that we call the mismatched-design rate and mismatched-design rate-dispersion. For block erasure channels, our result shows that channel mismatch incurs no penalty. We then show a second-order universal family of source-channel codes over the set of block erasure channels. Our code construction uses Poisson functional representations of suitable conditional probability measures to produce the encoder and decoder outputs.
We use a parameterized family of Gibbs posteriors as the decoder-side kernels, whose envelope recovers the generalized mutual information.
2026-06-06T01:50:58Z
Adeel Mahmood, Harish Viswanathan, Jinfeng Du

http://arxiv.org/abs/2606.07931v1
Pointwise Complexity for Gaussian Fields: Upper Envelopes, Algorithmic Lower Bounds, and Separation
2026-06-06T01:50:06Z
We prove a variance-aware pointwise majorizing-measure theorem for centered Gaussian processes. Classical generic chaining characterizes the scalar quantity $\mathbb E\sup_{x\in T}X_x$; the theorem here gives a simultaneous high-probability envelope for the entire field. For an ambient prior $\mu$, the envelope at $x$ is governed by a pointwise Fernique-Talagrand functional \[\Phi_\mu(x):=\int_0^{4\sigma(x)}\sqrt{\log\frac{1}{\mu(B_d(x,\varepsilon))}}\,d\varepsilon,\] together with the corresponding Gaussian tail term. The theorem provides a reusable field-level refinement of classical generic chaining and a Gaussian-process counterpart of pointwise empirical-process bounds for deep neural networks.
We also record a Bayesian algorithmic lower envelope from the interactive Fano/data-processing principle. For a known prior $\pi$, an observation channel, and a concrete estimator $\widehat t(Y)$, the lower bound is expressed through the exact ghost small-ball mass $\mathbb E_{Y\sim Q}\pi(B_d(\widehat t(Y),\Delta))$, rather than a worst-case covering number. In Gaussian location experiments, comparison decoders convert Bayes location error into lower bounds on decision-aligned Gaussian ranges. We then construct an elementary weighted-basis example separating the usual Fano relaxation for a fixed prior, the Bayesian algorithmic lower envelope, the pointwise Gaussian envelope on the selected subatlas, and the full-class minimax risk/global Gaussian scale. Together, these results show that algorithmic lower bounds provide local-geometric certificates of pointwise complexity for fixed estimators in overparameterized ambient classes, precisely in regimes where classical minimax theory becomes either too coarse or oracle-dependent.
2026-06-06T01:50:06Z
Yunbei Xu