https://arxiv.org/api/0nFPvWC2Z1bDxqs5ko9pZFldnN4
2026-06-09T21:32:30Z

http://arxiv.org/abs/2606.03570v2
STC: Reversible Digit-Context Decomposition for BWT-Family Text Compression
2026-06-08T08:15:16Z
Burrows-Wheeler-transform-based compressors rely on local context regularity, but structured text also contains dates, counters, identifiers, coordinates, and other digit runs whose values vary differently from their surrounding tokens. STC is a practical BWT-family compressor that separates this source of variation before the component BWT stage. It replaces digit runs in the main stream with an unambiguous placeholder and stores the removed digits in length- and context-conditioned side streams. The side streams use stable bucket ordering and compact digit packing, so the decoder can reconstruct the original run order from the normalized main stream without storing a separate permutation. The resulting components are encoded by a fixed internal BWT/M03-style component coder. On enwik9, STC produces a 157,388,188-byte archive with a 183,174-byte decoder source package, giving a local LTCB-style total of 157,571,362 bytes. A full-enwik9 same-coder ablation shows that the digit-context decomposition reduces the archive by 2,629,561 bytes relative to the no-split control. The result is locally verified by full decode and SHA-256 matching; official benchmark status requires independent maintainer-side verification.
2026-06-02T12:38:58Z
16 pages, 3 figures, 6 tables. Code and data: https://github.com/thu-nmrc/STC-for-BWT-FamilyText-Compression
Jingyang Du, Yang Shen, Anling Xiang

http://arxiv.org/abs/2606.09176v1
Performance Evaluation of Social Learning
2026-06-08T08:14:56Z
Social Learning is a decentralized decision-making paradigm in which spatially dispersed agents collect streaming observations regulated by one of a finite number of models (the hypotheses). The agents are interested in assigning probability scores (the beliefs) to the possible hypotheses.
To this end, the agents exchange their beliefs according to a certain communication graph. It has been shown that, under reasonable conditions on the identifiability of the decision model and the network connectivity, each agent ultimately places all the belief mass on the true hypothesis governing the data. However, several questions remain unanswered regarding the evaluation of the social learning performance. One recently adopted performance metric is the rejection rate, i.e., the rate at which the beliefs about the erroneous hypotheses vanish. One contribution of this work is to establish that the rejection rate leads to several paradoxes, which make it unsuitable as a valid performance measure. We then focus on studying the error probability measure. For a binary Gaussian problem, we derive an analytical formula characterizing the ratio between the individual agents' probabilities and the optimal Bayesian probability. The formula shows that this ratio is expressed by the product of two terms quantifying the effect of the network connectivity and the role of the prior information. As a result, an irreducible gap emerges between the decentralized and the centralized error probabilities, which is agent-dependent and does not disappear asymptotically.
2026-06-08T08:14:56Z
This work has been submitted to the IEEE for possible publication
Felice Scala, Marco Carpentiero, Vincenzo Matta, Ali H. Sayed

http://arxiv.org/abs/2606.09161v1
Extreme Points of the $(0,δ)$-LDP Polytope with Small Input Size and Arbitrary Output Sizes
2026-06-08T07:53:59Z
The structure of locally differentially private (LDP) mechanisms can be understood through the geometry of the corresponding privacy polytope. While the extreme points of the \((ε,0)\)-LDP polytope are well characterized (Kairouz \emph{et al.}, 2014; Holohan \emph{et al.}, 2017; Pensia \emph{et al.}, 2017), comparatively little is known for the \((ε,δ)\)-LDP polytope with \(δ>0\).
Recent work (Elangovan and Jog, 2024) has shown that even in the special case \(ε=0\), the \((0,δ)\)-LDP privacy polytope exhibits fundamentally different behaviour. In this work, we provide complete characterizations of the extreme points for the low-input-alphabet regime \(k=2\) and \(k=3\) with arbitrary output alphabet size \(m\). We also identify new extreme mechanisms for larger input alphabet sizes \(k\), of the star configuration type, as introduced by Elangovan and Jog (2024).
2026-06-08T07:53:59Z
Supriya Rawat, Myna Vajha, Gowtham R. Kurri, Anand Sarwate

http://arxiv.org/abs/2606.09014v1
Deterministic versus Stochastic Optimization for Joint Path Planning and Dynamic Time Splitting in Multiple-UAV-Cached IoT Networks
2026-06-08T04:27:10Z
This paper examines wireless-powered Internet of Things (IoT) networks involving multiple unmanned aerial vehicles (UAVs) equipped with backscatter and caching technologies to relay and transmit signals. For data communication and energy harvesting (EH), the source transmits information and power to UAVs using the dynamic time splitting (DTS) method. UAVs use harvested energy for passive communication (backscatter) and for active communication (transmitting information) to the destination. The primary objective is to maximize the total throughput by jointly optimizing the DTS ratio, trajectory, and transmission power, leveraging the UAVs' caching capability. This optimization problem is challenging due to its non-convexity. Therefore, an efficient alternating algorithm using the block coordinate descent (BCD) method is proposed to optimize each variable given the fixed values of the other parameters. By applying the Karush-Kuhn-Tucker (KKT) conditions, we derive a closed-form expression for the optimal DTS ratio, significantly reducing computation time. The optimal values for the other two parameters are determined using the BCD.
In order to thoroughly assess the effectiveness of various solutions for the original problem, this paper introduces an approach leveraging a genetic algorithm (GA). The GA in this context employs a one-point crossover method, value mutation, and rank-based selection based on fitness values. Numerical results show that the BCD and GA achieve at least 31% throughput improvement over the benchmarks, with reduced computational time. These findings demonstrate the performance gain and practical feasibility of our solutions in caching-enabled UAV-aided IoT networks.
2026-06-08T04:27:10Z
15 pages, 6 figures, and 7 tables. Accepted by the IEEE IoT Journal
Trinh Van Chien, Dinh Thanh Tung, Waqas Khalid, Ngo Cong Dung, Banh Thi Quynh Mai, Symeon Chatzinotas

http://arxiv.org/abs/2606.08964v1
Embedding linear codes over Z4 into self-orthogonal codes
2026-06-08T03:04:08Z
The purpose of this paper is to investigate the self-orthogonal embedding problem for linear codes over Z4. We propose several tight bounds on the length of the shortest self-orthogonal embedding over Z4, and determine the exact shortest self-orthogonal embedding length under specific conditions. As an example satisfying these conditions, we establish the exact length of the shortest self-orthogonal embedding for the quaternary Preparata codes. Furthermore, to establish these results, we completely classify the exact length of the shortest doubly even self-orthogonal embedding for binary linear codes in every possible case. Finally, when the shortest self-orthogonal embedding length of a given free code over Z4 is equal to the shortest doubly even self-orthogonal embedding length of its residue code, we present an algorithm to construct all possible shortest self-orthogonal embeddings.
With our algorithm, we found twelve linear codes over Z4 whose minimum Lee distances are higher than those of the Z4-linear codes in Aydin's database.
2026-06-08T03:04:08Z
Junmin An, Jon-Lark Kim, San Ling

http://arxiv.org/abs/2601.22526v2
Flexible FTN-Aided OTFS Modulation for High-Mobility LEO Satellite-to-Ground Communications
2026-06-08T02:59:54Z
In low Earth orbit (LEO) satellite communications, the link quality fluctuates drastically during a satellite pass, exhibiting a wide dynamic range from the horizon to the zenith. Moreover, the high relative velocity induces severe Doppler shifts. While orthogonal time frequency space (OTFS) modulation effectively resolves the doubly-selective fading, its spectral efficiency is fundamentally bounded by the Nyquist limit. To break this bottleneck while adapting to dynamic channel variations, this paper proposes a LEO satellite-assisted flexible faster-than-Nyquist (FFTN)-OTFS (LEO-FFTN-OTFS) scheme. Conventional fixed-parameter FTN signaling suffers from severe inter-symbol interference at low elevation angles or spectral inefficiency at the zenith. To overcome this, a low-complexity Look-Up Table (LUT) mechanism is designed to adaptively optimize the time-domain compression factor based on the instantaneous signal-to-noise ratio. At the receiver, a linear minimum mean-square error (LMMSE) detector is formulated to suppress the colored noise and structured interference with minimal computational overhead.
Besides, a rigorous theoretical framework is established incorporating 3GPP Tapped Delay Line (TDL) channel models to derive analytical expressions for effective throughput, energy efficiency, and bit error rate (BER) bounds. Simulation results demonstrate that the proposed adaptive scheme eliminates the irreducible error floor inherent in aggressive static FTN configurations at low SNRs, and maximizes the effective throughput across the entire elevation trajectory, achieving a superior trade-off between spectral efficiency and transmission reliability.
2026-01-30T04:00:01Z
Submitted to IEEE Journal
Chaorong Zhang, Benjamin K. Ng, Hui Xu, Yue Liu, Chan-Tong Lam, Halim Yanikomeroglu

http://arxiv.org/abs/2603.18077v2
A New Approach to Code Smoothing Bounds
2026-06-08T02:56:32Z
Code smoothing is a phenomenon in which an error distribution makes a code statistically close to the uniform distribution over the ambient space. This closeness is measured by total variation distance. Recently, Debris-Alazard et al.\ introduced a smoothing bound, which is an upper bound on this total variation distance. Although the smoothing bound evaluates how the error distribution smooths a code, this bound applies only to linear codes. In this paper, we generalize this bound to not only linear codes but also specific non-linear codes. While the smoothing bound in previous work was obtained by Fourier analysis over finite abelian groups, we derive this bound using a graph-theoretic approach.
To derive the smoothing bound, we consider code smoothing as the mixing of random walks on a specific graph, and use the concept of equitable partitions, which is well-studied in graph theory.
2026-03-18T06:56:21Z
Tsuyoshi Miezaki, Yusaku Nishimura, Katsuyuki Takashima

http://arxiv.org/abs/2601.20600v2
Shortest LCD embeddings of binary, ternary and quaternary linear codes
2026-06-08T02:54:01Z
In recent years, there has been active research on self-orthogonal embeddings of linear codes since they yielded some optimal self-orthogonal codes. LCD codes have a trivial hull, so they are counterparts of self-orthogonal codes. It is therefore a natural question whether one can embed linear codes into optimal LCD codes. To answer it, we first determine the number of columns to be added to a generator matrix of a linear code in order to embed the given code into an LCD code. Then we characterize all possible forms of shortest LCD embeddings of a linear code. As examples, we start from binary and ternary Hamming codes of small lengths and obtain optimal LCD codes with minimum distance 4. Furthermore, we find new ternary LCD codes with parameters including $[23, 4, 14]$, $[23, 5, 12]$, $[24, 6, 12]$, and $[25, 5, 14]$ and a new quaternary LCD $[21, 10, 8]$ code, each of which has minimum distance one greater than those of known codes. This shows that our shortest LCD embedding method is useful in finding optimal LCD codes over various fields.
2026-01-28T13:38:17Z
Junmin An, Ji-Hoon Hong, Jon-Lark Kim, Haeun Lim

http://arxiv.org/abs/2606.08895v1
Optimal Regret Exponents for Bayesian Statistical Decision Problems
2026-06-08T00:44:24Z
We study finite-state finite-action Bayesian statistical decision problems. While exact error-exponent characterizations are known for several special cases, including hypothesis testing and hypothesis exclusion, the asymptotic behavior of the optimal Bayes regret is largely unknown for general decision problems.
In this paper, we show that the optimal regret always decays exponentially fast and characterize its exact exponent for arbitrary loss functions. The exponent is given by the minimum multivariate Chernoff information over the minimal incompatible subsets of states, where an incompatible subset is a collection of states for which no single action is optimal for all states in the subset. Our result recovers the classical pairwise-minimum Chernoff exponent for symmetric multiple hypothesis testing and the multivariate Chernoff exponent for hypothesis exclusion, while also yielding, to the best of our knowledge, the first exact exponent characterization for list hypothesis testing.
2026-06-08T00:44:24Z
5 pages. This work has been submitted to the IEEE for possible publication
Hyun-Young Park, Si-Hyeon Lee

http://arxiv.org/abs/2505.04753v2
Hybrid-Field 6D Movable Antenna for Terahertz Communications: Channel Modeling and Estimation
2026-06-07T22:58:55Z
In this work, we study a six-dimensional movable antenna (6DMA)-enhanced Terahertz (THz) network that supports a large number of users with a few antennas by controlling the three-dimensional (3D) positions and 3D rotations of antenna surfaces/subarrays at the base station (BS). However, the short wavelength of THz signals combined with a large 6DMA movement range extends the near-field region. As a result, a user can be in the far-field region relative to the antennas on one 6DMA surface, while simultaneously residing in the near-field region relative to other 6DMA surfaces. Moreover, 6DMA THz channel estimation suffers from increased computational complexity and pilot overhead due to uneven power distribution across the large number of candidate position-rotation pairs, as well as the limited number of radio frequency (RF) chains in THz bands.
To address these issues, we propose an efficient hybrid-field generalized 6DMA THz channel model, which accounts for planar wave propagation within individual 6DMA surfaces and spherical waves among different 6DMA surfaces. Furthermore, we propose a low-overhead channel estimation algorithm that leverages directional sparsity to construct a complete channel map for all potential antenna position-rotation pairs.
Numerical results show that the proposed hybrid-field channel model achieves a sum rate close to that of the ground-truth near-field channel model and confirm that the channel estimation method yields accurate results with low complexity.
2025-05-07T19:33:08Z
Xiaodan Shao, Yixiao Zhang, Shisheng Hu, Zhixuan Tang, Mingcheng He, Xinyu Huang, Weihua Zhuang, Xuemin Shen

http://arxiv.org/abs/2606.08829v1
Flexible Coupler Antenna Enhanced Wireless Communication: Modeling and Coupler Position Optimization
2026-06-07T20:48:10Z
This paper proposes a novel flexible coupler antenna (FCA) that translates passive coupling elements around a fixed-position active antenna to reshape the induced currents on the passive elements for radiation. A new form of mechanical beamforming is achieved by moving only the passive coupling elements while keeping the active antenna stationary. The proposed design significantly reduces the antenna and radio-frequency (RF) chain costs of conventional active array beamforming with low mechanical control complexity and energy consumption. For the purpose of exposition, we consider a point-to-point communication system with one FCA at the transmitter and one fixed antenna at the receiver. Specifically, based on multi-port circuit theory, we establish both the line-of-sight (LoS) and multipath channel models and derive the mechanical beamforming weights of the passive couplers as functions of their positions. Then, we formulate a new problem to maximize the received signal-to-noise ratio (SNR) by optimizing the positions of passive couplers at the transmitter, subject to coupler movement and transmit power constraints.
Solving the resulting problem is inherently difficult because coupled channel and mechanical beamforming create non-linearity in the objective function. To tackle this problem, we propose an efficient block-coordinate conditional gradient method to search for the best positions of all passive couplers by sequentially optimizing the position of each coupler with those of the other couplers fixed in an iterative manner. Simulation results demonstrate that the proposed system significantly outperforms benchmark schemes in terms of achievable rate, while using significantly fewer active antennas and RF chains.
2026-06-07T20:48:10Z
13 pages
Xiaodan Shao, Chuangye Shan, Yunlong Du, Junling Li, Rui Zhang, Cheng-Xiang Wang

http://arxiv.org/abs/2606.08774v1
Non-Uniform Codebook Design for Optical IRS-Assisted VLC Systems
2026-06-07T18:38:44Z
Optical intelligent reflecting surfaces (OIRS) can improve the coverage of indoor visible light communication (VLC) systems; however, practical deployment requires a finite offline codebook to avoid repeated real-time optimisation of mirror orientations. A uniform codebook with fixed angular steps does not provide uniform coverage on the user plane, because the mapping from steering angles to reflection locations on the user plane is nonlinear. To address this problem, this paper proposes a geometric-optics-based non-uniform codebook design for OIRS-assisted VLC systems. The proposed method constructs an individual codebook for each IRS element according to its geometric position, so that the reflected beams are distributed more uniformly over the user plane. The codebook accuracy is evaluated using the Frobenius norm of the channel error matrix. Simulation results show that the proposed design provides more uniform spatial mapping with fewer codewords than the uniform codebook, and that the sweep-angle resolution has a stronger effect on the codebook accuracy than the tilt-angle resolution.
2026-06-07T18:38:44Z
Rashid Iqbal, Dimitrios Bozanis, Dimitrios Tyrovolas, Christos K. Liaskos, Muhammad Ali Imran, George K. Karagiannidis, Hanaa Abumarshoud

http://arxiv.org/abs/2605.19228v2
Diagnosing Multi-step Reasoning Failures in Black-box LLMs via Stepwise Confidence Attribution
2026-06-07T18:34:11Z
Large Language Models have achieved strong performance on reasoning tasks with objective answers by generating step-by-step solutions, but diagnosing where a multi-step reasoning trace might fail remains difficult.
Confidence estimation offers a diagnostic signal, yet existing methods are restricted to final answers or require internal model access. In this paper, we introduce Stepwise Confidence Attribution (SCA), a framework for closed-source LLMs that assigns step-level confidence based only on generated reasoning traces. SCA applies the Information Bottleneck principle: steps aligning with consensus structures across correct solutions receive high confidence, while deviations are flagged as potentially erroneous. We propose two complementary methods: (1) NIBS, a non-parametric IB approach measuring consistency without graph structures, and (2) GIBS, a graph-based IB model that learns subgraphs through a differentiable mask to capture logical variability. Extensive experiments on mathematical reasoning and multi-hop question answering show that SCA reliably identifies low-confidence steps strongly correlated with reasoning errors. Moreover, using step-level confidence to guide self-correction improves the correction success rate by up to 13.5\% over answer-level feedback.
2026-05-19T00:57:51Z
Accepted by ICML 2026
Xiaoou Liu, Tiejin Chen, Dengjia Zhang, Yaqing Wang, Lu Cheng, Hua Wei

http://arxiv.org/abs/2606.08771v1
Algebra of Bivariate-Bicycle Surface Codes
2026-06-07T18:24:32Z
We relate the properties of bivariate-bicycle-surface (BBS) codes, constructed from a pair of bivariate polynomials over a finite field, to the number and location of their common roots in the extension field. The number of roots $(x,y)$ with finite, non-zero coordinates -- counted with algebraic multiplicity -- determines the dimension of the codes. This dimension is invariant under monomial automorphisms of the Laurent polynomial ring. Conversely, roots with zero or infinite $x$- or $y$-coordinates indicate that specialized generators are required near the corresponding boundary (e.g., the left or right boundary for a root where $x$ is zero or infinite, respectively).
These roots can appear or disappear under monomial transformations, which reveals the structure of tilted boundaries. Based on these results, we formulate a prescription for constructing BBS codes that works for regions with rectangular, diagonal, and arbitrarily tilted boundaries. A key advantage of this approach is that no corner corrections are needed, provided the polynomials satisfy orientation-specific edge conditions.
2026-06-07T18:24:32Z
22 pages, 8 figures included
Renyu Wang, Leonid P. Pryadko

http://arxiv.org/abs/2606.08750v1
New Codes from Cyclic and Negacyclic Codes of Even Length over $\mathbb{Z}_4$
2026-06-07T17:48:47Z
This paper uses theoretical results previously established in the literature to design search algorithms to find new linear codes over $\mathbb{Z}_4$ from cyclic and negacyclic codes of even length. As a result of these searches, we have found 2500 new cyclic codes and 730 negacyclic codes. These new codes exhibit improved parameters compared to previously known codes. Additionally, we have obtained binary quantum codes with good parameters from such $\mathbb{Z}_4$ codes.
2026-06-07T17:48:47Z
Nuh Aydin, Mohamed O. Belghith, Godwin Idowu, Trang T. T. Nguyen, Long B. Tran