Masked Neural Detection for Constrained Channel Coding in Molecular Communication

2026-06-10T11:09:11Z

Molecular communication (MC) suffers from severe diffusion memory because molecules released for one symbol may arrive during later symbols. Neural sequence detectors, especially sliding bidirectional recurrent neural networks (SBRNNs), can substantially outperform threshold detectors in such channels. This raises a central question for MC channel coding: does a code whose advantage was established under threshold detection retain it when both coded and uncoded transmission are evaluated with neural detection? This letter answers this question for run-length-limited ISI-mitigation (RLIM) codes, a class of constrained codes previously shown to provide large BER gains in MC. Across the tested operating points, the best RLIM-SBRNN receiver beats the best uncoded receiver, chosen between threshold and SBRNN detection, in $46$ of $59$ cases, with a mean gain of $10.36\times$ over those wins. We also propose an RLIM-tailored training mask for compact SBRNN detectors, improving the unmasked RLIM-SBRNN in $227$ of $236$ comparisons with $3.267\times$ mean gain when masking is beneficial. Finally, the compact masked RLIM-SBRNN is competitive with channel-state-aware MLSE despite using no channel knowledge.

STCC: A Unified Source-Channel Semantic Token Coding Framework for Semantic Communications

2026-06-10T08:51:51Z

Deep Joint Source-Channel Coding (JSCC) has emerged as a promising paradigm for overcoming the ``cliff effect'' in wireless communications. However, existing Deep JSCC frameworks operate directly on raw analog data such as image pixels rather than the discrete semantic tokens that foundation models require. Moreover, traditional systems employ fixed, hand-designed constellations that treat all tokens equally, leading to catastrophic random errors under channel noise. In this paper, Semantic Token Codebook Communication (STCC) is proposed as a unified source-channel semantic token coding framework designed to transmit the discrete semantic tokens of foundation models over noisy channels. The core of STCC is the Semantic Token Codec (STC). It accepts discrete tokens as input, which maintains compatibility with foundation models, and employs a residual multilayer perceptron (MLP)-based encoder that learns geometrically structured constellations optimized with a triple-loss objective. This learned mapping forces the channel topology to align with the semantic embedding space, ensuring that channel noise results in topological errors rather than random corruption. This phenomenon is theoretically and empirically characterized, identifying ``Semantic Drift'' in symbolic modalities and ``Structural Distortion'' in perceptual modalities, where errors shift predictions to semantically or structurally similar tokens. Extensive experiments demonstrate that STCC significantly outperforms traditional systems in low-SNR regimes, effectively converting channel noise into semantic variations without requiring receiver-side modification.

What Limits Does Quantization Place on Dense Top-$k$ Retrieval? A Theoretical Study

2026-06-10T08:11:41Z

We establish conditions for embedding a corpus of $N$ documents as $d$-dimensional vectors such that every $k$-subset $S \subseteq [N]$ is realizable as a result of top-$k$ retrieval by some query vector. Recent work shows that $d = O(k)$ suffices for such embeddings to exist in $\mathbb{R}^d$, independently of $N$. We prove that this corpus-independent bound is specific to infinite precision. With $B$ bits per coordinate, perfect top-$k$ retrieval requires $Bd = Ω(k \ln N)$; thus, at any fixed precision, the dimension must grow at least logarithmically with $N$. Specializing to an $\ell_2$-normalized $B$-bit uniform scalar quantization model, we also identify a threshold on the precision $B^{*} = O(\ln \ln N)$ below which no dimension suffices, together with two further regimes that bound the feasible $(B, d)$ pairs. Our result implies that in practical vector databases and dense retrieval systems where quantization is standard, the embedding dimension and possibly the precision must grow with the corpus size.

Segment-Wise Soft Robotics Inspired Flexible Antenna Arrays: Design and Optimization

2026-06-10T07:56:47Z

In this paper, we propose a segment-wise soft robotic antenna (SRA) system, where each soft robotic arm, referred to as a tentacle, comprises multiple independently controllable segments with bending, elongation-retraction, and sweeping motions. By adjusting segment motion parameters, the positions of surface-mounted antennas are reconfigured, distinguishing it from conventional reconfigurable antenna (RA) systems. Based on this model, we propose two antenna deployment schemes: the segmented end-antenna configuration (SEAC), where fixed antennas are mounted at the segment ends and reconfigured via segment motions; and the hybrid end-and-intermediate antenna configuration (HEIAC), where RAs are further integrated as intra-segment antennas. In HEIAC, soft-robot segment deformation provides large-scale spatial reconfiguration, while RAs enable fine-grained adjustment. For SEAC, we formulate a sum-rate maximization problem accounting for inter-segment connectivity and the nonlinear mapping from segment deformation parameters to antenna coordinates, and develop a penalty dual decomposition-projected gradient ascent (PDD-PGA) algorithm. For HEIAC, we jointly optimize segment deformation, intra-segment antenna positions, and antenna activation using a block coordinate descent (BCD)-PDD-PGA algorithm with greedy backward antenna selection. Simulation results demonstrate that the proposed schemes substantially outperform fixed-position antenna arrays and conventional RA baselines. In particular, SEAC and HEIAC achieve 37.9% and 32.1% sum-rate gains over conventional 3D reconfigurable arrays, respectively, while SEAC provides up to a 49.3% gain in compact array deployments.

Non-special Divisors, LCPs of Codes, and LCD Codes on Kummer Extensions

2026-06-10T07:43:09Z

Recently, constructions of linear complementary pairs (LCPs) of codes and linear complementary dual (LCD) codes on function fields have attracted considerable attention due to the wide range of applications of these codes. Such constructions rely on non-special divisors of degrees $g$ and $g-1$. In this work, we investigate Kummer extensions defined by $y^m = f(x)$ with $f(x)\in\mathbb{F}_q(x)$ and establish an arithmetic characterization of non-special divisors whose support can contain non-totally ramified places. Based on this characterization, we explicitly construct non-special divisors of degree $g-1$ on the GK curve. Moreover, utilizing pure gaps, we explicitly provide several families of effective non-special divisors of degree $g$ on Kummer extensions with the same multiplicities. We then develop a general framework for constructing LCPs of algebraic geometry (AG) codes on Kummer extensions. By virtue of canonical divisors, we show that the security parameters of LCPs of AG codes can be determined within this framework, which also enables the construction of LCD AG codes. Finally, we illustrate our results with representative examples, including LCPs of codes on the GK curve and LCD codes on quotients of the Hermitian curve.

A New Approach to Code Smoothing Bounds

2026-06-10T05:40:50Z

Code smoothing is a phenomenon in which an error distribution makes a code statistically close to the uniform distribution over the ambient space. This closeness is measured by total variation distance. Recently, Debris-Alazard et al.\ introduced a smoothing bound, which is an upper bound on this total variation distance. Although the smoothing bound evaluates how the error distribution smooths a code, it applies only to linear codes. In this paper, we generalize this bound to cover not only linear codes but also certain non-linear codes. While the smoothing bound in previous work was obtained by Fourier analysis over finite abelian groups, we derive this bound using a graph-theoretic approach. To derive the smoothing bound, we consider code smoothing as the mixing of random walks on a specific graph, and use the concept of equitable partitions, which is well studied in graph theory.

On the independence number of de Bruijn graphs

2026-06-10T05:31:59Z

We derive the asymptotic formula $α(k,q)=λ_{k-1}q^k+o(q^k)$, where $α(k,q)$ is the independence number of the de Bruijn graph $B(k,q)$, and $λ_{k-1}$ is a constant arising from a variational problem on the unit $(k-1)$-dimensional cube. When $k=4$, we show the bounds $91/240\le λ_3\le 11/28$. For odd prime $k$, we analyse the binary case $q=2$ via a phase reduction on rotation orbits. For $k=11,13,17$ this yields compact orbit-marker certificates for optimal constructions. Combined with a lifting theorem by Lichiardopol, these certificates give exact formulas for $α(11,q)$, $α(13,q)$, and $α(17,q)$ for all $q\ge2$, extending the known cases $k=3,5,7$.

Vision-Language-Action Models Meet World Models: Embodied Agentic AI for Low-Altitude Wireless Networks

2026-06-10T03:30:05Z

Low-Altitude Wireless Networks (LAWNs), composed of Unmanned Aerial Vehicles (UAVs) and other aerial platforms, provide integrated perception, communication, and computation services in low-altitude airspace. However, deploying large generative models in this domain faces three major challenges: 1) Limited embodied action mapping; 2) Inadequate physical environment modeling; 3) Insufficient closed-loop optimization. To address these challenges, this study proposes an Embodied Agentic UAV framework. Centered on a Vision-Language-Action (VLA) model as the execution core, the framework establishes an end-to-end embodied decision-making pipeline from multimodal environmental perception to continuous control generation. In addition, a World Model (WM) is introduced to capture the coupling between UAV actions and environmental state evolution, thereby supporting environment prediction, policy verification, and dynamic optimization. Furthermore, memory and reflection mechanisms are incorporated to form an adaptive closed-loop optimization paradigm of decision, execution, evaluation, and update, thereby enhancing the system's autonomous decision-making capability and continual evolution ability in complex dynamic environments. Experimental results validate its effectiveness in enabling robust, predictive, and sustainable autonomous control in LAWNs.

Superspace Concentration and Adversarial Robustness in Quantum Algorithms

2026-06-10T02:13:40Z

We study superspace concentration as a quantum resource, formalized through the focus measure F(ρ) = λ_max(ρ_super) - the largest eigenvalue of the reduced superspace state - which quantifies the capacity of a quantum system to concentrate informational weight into a preferred subspace of an extended degree-of-freedom space. We develop a complete resource-theoretic framework around this measure and validate its properties through GPU-accelerated numerical simulation. Analytic decoherence predictions are confirmed to machine precision (1.11 × 10^{-16}) for superspace dimensions dS in {2,4,8,16,32}. Focus monotonicity holds across 10,000 random states with zero violations under four focus-non-generating channels across six system configurations. Focused quantum states resist coherent unitary attacks with significantly greater resilience than standard fidelity predicts, with focus remaining above 0.9 at attack strength ε = 0.302 versus ε = 0.174 for fidelity. We further demonstrate that the focus measure and the U(dS)-asymmetry measure are operationally distinct: asymmetry remains near zero and provides no robustness signal under coherent and targeted attacks while focus tracks spectral concentration and remains robust until ε > 0.3. The connection between Grover's algorithm and superspace concentration is made explicit via the identity F(|ψ_k><ψ_k|) = P(marked), providing a resource-theoretic interpretation of oracle query complexity. Finally, we provide the first numerical characterization of the focus capacity gap ΔF, identifying a log_2(dS) scaling law confirmed for both product and correlated noise channels.
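
The focus measure described above, the largest eigenvalue of the reduced superspace state, can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that the full state lives on a tensor product of a superspace factor and a system factor and that the reduced state is obtained by a partial trace over the system; the function names and the tensor ordering are hypothetical and not taken from the paper.

```python
import numpy as np

def reduced_super_state(rho, d_super, d_sys):
    """Partial trace over the system factor (assumed ordering: super (x) sys)."""
    rho = np.asarray(rho, dtype=complex).reshape(d_super, d_sys, d_super, d_sys)
    # Summing over the matching system indices leaves a d_super x d_super matrix.
    return np.einsum("ajbj->ab", rho)

def focus(rho, d_super, d_sys):
    """Largest eigenvalue of the reduced superspace state."""
    rho_super = reduced_super_state(rho, d_super, d_sys)
    # eigvalsh returns the eigenvalues of a Hermitian matrix in ascending order.
    return float(np.linalg.eigvalsh(rho_super)[-1])

def pure_state(psi):
    """Density matrix of a normalized pure state."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

# Product state |0>|0>: all weight sits in a single superspace direction.
f_product = focus(pure_state([1, 0, 0, 0]), d_super=2, d_sys=2)
# State (|00> + |11>)/sqrt(2): the reduced state is maximally mixed.
f_entangled = focus(pure_state([1, 0, 0, 1]), d_super=2, d_sys=2)
print(f_product, f_entangled)
```

Under these assumptions the measure ranges from 1/dS for a maximally mixed reduced state up to 1 for a fully concentrated one, which is why the two example states sit at opposite ends of the scale for dS = 2.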

Prime Event Languages: An Information-Theoretic Investigation of Twin-Prime Event Structure

2026-06-10T01:37:04Z

Prime numbers are traditionally studied through numerical, probabilistic, and analytic frameworks. In this work, we introduce the concept of a prime event language, in which arithmetic phenomena are represented as symbolic event sequences and analyzed using tools from information theory and stochastic processes. Using all primes up to $N = 5 \times 10^9$ (234,954,223 primes), we construct event languages based on twin-prime occurrences and record prime-gap events. We investigate their statistical properties through finite-order Markov models, train/test validation, mutual-information analysis, and information-horizon measurements. For the Twin Prime Event Language, first-order Markov modeling reduces test-set cross entropy from 0.325350 bits to 0.319949 bits, corresponding to an information gain of approximately 0.0054 bits. This gain survives out-of-sample validation and therefore reflects genuine statistical structure rather than overfitting. Mutual-information analysis independently confirms the Markov results and shows that measurable dependence is concentrated almost entirely at lag 1. The mutual information decreases from approximately $5.96 \times 10^{-3}$ bits at lag 1 to approximately $5.07 \times 10^{-7}$ bits at lag 2, an approximately 11,700-fold reduction (more than four orders of magnitude). Beyond lag 2, residual information fluctuates near the statistical noise floor. These results indicate that prime event languages are neither perfectly memoryless nor strongly predictable. Instead, they exhibit weak but reproducible short-range statistical structure characterized by first-order dependence and an effective information horizon of approximately one event.
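
The train/test comparison between a memoryless model and a first-order Markov model can be sketched at small scale as follows. The event encoding used here (symbol 1 when the gap to the next prime equals 2, else 0), the limit of 200,000, the half/half split, and the add-one smoothing are all illustrative assumptions; the abstract does not specify the encoding, and this sketch will not reproduce the reported figures.

```python
import math
from collections import Counter

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def twin_events(primes):
    """Assumed encoding: 1 if the gap to the next prime is 2, else 0."""
    return [1 if q - p == 2 else 0 for p, q in zip(primes, primes[1:])]

def order0_cross_entropy(train, test):
    """Test cross entropy (bits/symbol) of a memoryless model fit on train."""
    counts = Counter(train)
    total = len(train) + 2  # add-one smoothing over the alphabet {0, 1}
    return -sum(math.log2((counts[x] + 1) / total) for x in test) / len(test)

def order1_cross_entropy(train, test):
    """Test cross entropy (bits/symbol) of a first-order Markov model."""
    pair_counts = Counter(zip(train, train[1:]))
    context_counts = Counter(train[:-1])
    pairs = list(zip(test, test[1:]))
    return -sum(
        math.log2((pair_counts[(a, b)] + 1) / (context_counts[a] + 2))
        for a, b in pairs
    ) / len(pairs)

events = twin_events(primes_up_to(200_000))
split = len(events) // 2
train, test = events[:split], events[split:]
# Score the same symbols with both models (the first test symbol has no context).
h0 = order0_cross_entropy(train, test[1:])
h1 = order1_cross_entropy(train, test)
print(f"order-0: {h0:.4f} bits, order-1: {h1:.4f} bits, gain: {h0 - h1:.4f} bits")
```

Under this encoding, part of the lag-1 dependence has a simple arithmetic source: for primes above 3, two consecutive gaps of 2 are impossible because one of p, p+2, p+4 is divisible by 3, so a twin event is never followed immediately by another.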

SDQM: Synthetic Data Quality Metric for Object Detection Dataset Evaluation

2026-06-10T01:28:46Z

The performance of machine learning models depends heavily on training data. The scarcity of large-scale, well-annotated datasets poses significant challenges in creating robust models. To address this, synthetic data generated through simulations and generative models has emerged as a promising solution, enhancing dataset diversity and improving the performance, reliability, and resilience of models. However, evaluating the quality of this generated data requires an effective metric. We introduce the Synthetic Dataset Quality Metric (SDQM) to assess data quality for object detection tasks without requiring model training to converge. This metric enables more efficient generation and selection of synthetic datasets, addressing a key challenge in resource-constrained object detection tasks. In our experiments, SDQM demonstrated a strong correlation with the mean average precision (mAP) scores of YOLO11, a leading object detection model, whereas previous metrics only exhibited moderate or weak correlations. In addition, it provides actionable insights into improving dataset quality, minimizing the need for costly iterative training. This scalable and efficient metric sets a new standard for evaluating synthetic data. The code for SDQM is available at https://github.com/ayushzenith/SDQM

Measuring language complexity from hierarchical reuse of recurring patterns

2026-06-10T00:29:25Z

We introduce the ladderpath index as a measure of language complexity grounded in algorithmic information theory. It counts the minimum steps needed to reconstruct a sequence through hierarchical reuse of repeated substructures, capturing an exactly computable but constrained form of algorithmic compressibility related to, but distinct from, Kolmogorov complexity. We apply the ladderpath approach to 21 parallel corpora from the Parallel Universal Dependencies dataset. The ladderpath index is approximately invariant across the languages, and varies much less than the corpus length. This is more pronounced when all corpora are mapped to a unified binary representation, providing evidence for the equi-complexity hypothesis from a representation-independent perspective. We also observe trade-offs between character inventory size and corpus length, and between vocabulary-level and corpus-level reconstruction complexity, supporting the trade-off hypothesis that total complexity is conserved and redistributed across linguistic levels. The reusable substructures identified by the ladderpath approach, without any linguistic input, overlap with words and morphological components attested in the natural vocabulary. The hierarchical reuse captured by the ladderpath approach parallels the chunking mechanisms proposed in cognitive science, where the human cognitive system compresses linguistic input into nested, reusable units under shared memory and processing constraints. This connection between cognitive chunking and the ladderpath approach provides a new interpretation for the equi-complexity and trade-off hypotheses, grounding both in the shared cognitive architecture that underlies language processing across human languages.

Maximizing Connectivity of Uplink RIS-Assisted UAV Networks

2026-06-10T00:00:37Z

In this paper, we present a new approach for unmanned aerial vehicle (UAV) positioning and reconfigurable intelligent surface (RIS) partitioning to enhance connectivity of uplink RIS-assisted UAV networks. To achieve this, our approach optimizes RIS-aided link selection, RIS partitioning, and UAV positions to maximize network connectivity characterized by its Fiedler value. Meanwhile, it maintains a specific signal-to-interference-plus-noise ratio (SINR) constraint for user equipment (UE), which is influenced by RIS partitioning and UAV reliability. The network connectivity optimization problem is formulated using the Fiedler value subject to RIS element allocation and SINR constraints. This problem is a computationally expensive combinatorial optimization, necessitating an efficient iterative approach. In particular, we propose a perturbation method for RIS-aided link selection and derive a closed-form solution for RIS partitioning, with each partition tailored to optimize the SINR of an individual UAV. For the given RIS-aided links and RIS partitioning, we then show that the problem of UAV positioning can be formulated as a low-complexity semi-definite programming (SDP) problem, which can be solved using off-the-shelf CVX solvers. Our simulations show the potential gain of UAV positioning and RIS partitioning compared to the benchmark schemes from the literature.
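
The Fiedler value used as the connectivity metric is the second-smallest eigenvalue of the graph Laplacian. The sketch below computes it for small unweighted graphs; the 4-node topologies and the idea of one added link standing in for an RIS-aided link are hypothetical illustrations, not the network model of the paper.

```python
import numpy as np

def fiedler_value(adjacency):
    """Second-smallest eigenvalue of the graph Laplacian L = D - A."""
    a = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(a.sum(axis=1)) - a
    eigenvalues = np.linalg.eigvalsh(laplacian)  # ascending order
    return float(eigenvalues[1])

# Hypothetical 4-node path topology 0-1-2-3.
path = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

# Same nodes with one extra link 0-3, turning the path into a ring.
ring = path.copy()
ring[0, 3] = ring[3, 0] = 1

# Two disconnected pairs 0-1 and 2-3.
disconnected = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])

f_path = fiedler_value(path)
f_ring = fiedler_value(ring)
f_disconnected = fiedler_value(disconnected)
print(f_path, f_ring, f_disconnected)
```

A Fiedler value of zero indicates a disconnected graph, and adding a link can only keep or raise the value, which is what makes it a natural objective for link selection.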

Joint Movable Antenna Positioning and RIS Partitioning for Sum-Rate Maximization

2026-06-09T23:40:05Z

This paper investigates the utility of the movable antenna (MA) and reconfigurable intelligent surface (RIS) framework for downlink wireless communications. In the considered scenario, a base station (BS) equipped with two sub-arrays of MAs transmits signals to the users via the RIS. By jointly exploiting the antenna-positioning flexibility of MAs and the RIS element selection capability, the proposed joint MA-RIS framework introduces additional design degrees of freedom to enhance desired signals and mitigate inter-user interference, thereby maximizing the network sum-rate. To this end, we formulate a joint optimization problem involving MA positioning, sub-array beamforming, and RIS element selection, subject to the minimum antenna separation and transmit power constraints. The resulting problem is highly non-convex and challenging to solve directly. To address this issue, an alternating optimization framework is developed that decomposes the problem into three tractable subproblems. Specifically, zero-forcing beamforming is employed for transmit beamformer design, a low-complexity one-dimensional search is derived for RIS element selection, and the MA positioning problem is solved using block coordinate descent (BCD) and convex optimization techniques implemented via CVX. Simulation results demonstrate that the proposed joint MA-RIS framework significantly improves the achievable sum-rate compared with conventional fixed MAs and benchmark schemes with random configurations.
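
The zero-forcing step can be illustrated with the textbook right pseudo-inverse construction. This is a generic sketch, not the paper's sub-array design: the random matrix `h` stands in for whatever effective BS-RIS-user channel results from the current MA positions and RIS selection, and the joint power normalization is an assumption made for simplicity.

```python
import numpy as np

def zero_forcing_beamformer(h, total_power=1.0):
    """Return W (antennas x users) such that H @ W is diagonal.

    h: effective channel matrix of shape (users, antennas), full row rank.
    """
    h = np.asarray(h, dtype=complex)
    # Right pseudo-inverse H^H (H H^H)^{-1} nulls inter-user interference.
    w = h.conj().T @ np.linalg.inv(h @ h.conj().T)
    # Scale all beams jointly so that ||W||_F^2 equals the power budget.
    return w * np.sqrt(total_power / np.linalg.norm(w, "fro") ** 2)

# Hypothetical setup: 2 users, 4 transmit antennas, Rayleigh-like channel.
rng = np.random.default_rng(0)
num_users, num_antennas = 2, 4
h = (rng.standard_normal((num_users, num_antennas))
     + 1j * rng.standard_normal((num_users, num_antennas))) / np.sqrt(2)

w = zero_forcing_beamformer(h, total_power=1.0)
effective = h @ w  # diagonal up to numerical error: no inter-user leakage
print(np.round(np.abs(effective), 6))
```

With interference removed, each user's rate depends only on its diagonal gain and the noise power, which is what makes the remaining subproblems easier to handle in an alternating scheme.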

Fluid Antenna Systems Enabling 6G HRLLC With Port Switching Delay

2026-06-09T23:07:25Z

Fluid antenna systems (FAS) exploit antenna position reconfigurability to unlock massive spatial diversity within compact form factors, making them a promising enabler for 6G user terminals (UTs). However, practical port switching incurs latency and signaling overhead, which can be particularly detrimental to hyper-reliable low-latency communications (HRLLC) under finite blocklength operation. This paper investigates FAS-enabled HRLLC by explicitly capturing the coupled effects of spatial correlation, port switching delay, and finite blocklength coding. We derive exact closed-form expressions for the average block error rate (BLER) and average achievable rate over spatially correlated fading channels. The resulting analysis reveals a fundamental design trade-off: increasing the number of ports improves diversity but linearly reduces the effective blocklength, thereby intensifying finite-blocklength penalties. A key theoretical contribution is a rigorous proof that reliability, achievable rate, and energy efficiency are strictly unimodal in the port dimension, ensuring a unique optimal port configuration. Furthermore, we characterize an explicit switching-delay threshold that separates regimes where FAS yields net gains over fixed-position antenna (FPA) systems. Numerical results validate the analysis and show that substantial HRLLC performance gains are achievable when the switching latency remains below the derived bound.
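
The blocklength side of this trade-off can be sketched with the standard normal approximation for the AWGN channel at finite blocklength. The sketch deliberately holds the SNR fixed, so it shows only the penalty from a shrinking effective blocklength and ignores the diversity gain that additional ports provide; the overhead model (a fixed number of channel uses per port) and all numerical values are illustrative assumptions, not the paper's closed-form expressions for correlated fading.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def normal_approx_bler(snr, info_bits, blocklength):
    """Normal approximation to the AWGN block error rate at finite blocklength."""
    capacity = math.log2(1 + snr)
    dispersion = (1 - (1 + snr) ** -2) * math.log2(math.e) ** 2
    rate = info_bits / blocklength
    return q_function((capacity - rate) / math.sqrt(dispersion / blocklength))

def effective_blocklength(total_uses, num_ports, uses_per_switch):
    """Assumed overhead model: every port costs a fixed number of channel uses."""
    return total_uses - num_ports * uses_per_switch

# Illustrative values only: 200 channel uses, 120 information bits, SNR of 0 dB.
total_uses, info_bits, snr = 200, 120, 1.0
port_counts = (1, 8, 16, 32)
blers = []
for num_ports in port_counts:
    n_eff = effective_blocklength(total_uses, num_ports, uses_per_switch=2)
    blers.append(normal_approx_bler(snr, info_bits, n_eff))
    print(f"ports={num_ports:2d}  n_eff={n_eff:3d}  BLER~{blers[-1]:.2e}")
```

At fixed SNR the error rate grows monotonically as ports are added. In a full model the SNR after port selection would improve with the number of ports, and the balance between the two effects is what yields an interior optimum.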