When Efficient Communication Explains Convexity

2026-05-08T22:52:17Z

Much recent work has argued that the variation in the languages of the world can be explained from the perspective of efficient communication; in particular, languages can be seen as optimally balancing competing pressures to be simple and to be informative. Focusing on the expression of meaning -- semantic typology -- the present paper asks what factors are responsible for successful explanations in terms of efficient communication. Using the Information Bottleneck (IB) approach to formalizing this trade-off, we first demonstrate and analyze a correlation between optimality in the IB sense and a novel generalization of convexity to this setting. In a second experiment, we manipulate various modeling parameters in the IB framework to determine which factors drive the correlation between convexity and optimality. We find that the convexity of the communicative need distribution plays an especially important role. These results move beyond showing that efficient communication can explain aspects of semantic typology into explanations for why that is the case by identifying which underlying factors are responsible.

On Reducing Decoding Complexity of Successive-Cancellation List Flip Decoding of Polar Codes

2026-05-08T21:14:32Z

The recently proposed successive-cancellation list flip (SCLF) decoding algorithm for polar codes improves the error-correcting performance of state-of-the-art successive-cancellation list (SCL) decoding. However, it comes at the cost of higher complexity. In this paper, partitioned polar codes tailored for the proposed PSCLF decoding algorithm are used to reduce the complexity of SCLF. Indeed, compared to SCLF, PSCLF allows early termination and is able to restart by skipping part of the decoding tree traversed sequentially. To maximize the coding gain, a partition design tailored to PSCLF is proposed. In this extended paper, a dynamic flip metric is used, along with the possibility of flipping multiple times during SCL decoding. An analysis of the impact of this strategy on early termination and on the CRC collisions encountered in PSCLF is carried out. Error-correction performance is shown for multiple code rates and multiple partition strategies. With SCL with $L=2$ as the baseline algorithm, a degradation of $0.05$ dB with respect to SCL-64 is shown, using $ω=3$ flips per trial and $T_{max}=300$ trials. Numerical results show that the proposed PSCLF algorithm has an error-correction performance gain of up to 0.1 dB with respect to SCLF with the same decoding parameters. This work is also compared with existing techniques to reduce the complexity of the SCLF decoding algorithm. The proposed algorithm reduces the complexity by up to 77% at a frame-error rate of $0.01$ with respect to SCLF, and it can further reduce the decoding complexity of SCLF decoders that also embed a restart mechanism. The average execution time of PSCLF matches the latency of SCL at $\text{FER}=4\cdot10^{-3}$ and lower.

On Observation Time for Recovering Latent Hawkes Networks

2026-05-08T19:06:26Z

Dynamics of interacting systems in engineering, society, and nature often evolve over latent networks that govern which entities can interact. We study the problem of inferring these networks from event-based observations, which arise naturally in finance, seismology, and neuroscience. While there is substantial algorithmic work addressing this important problem, theoretical results are scarce. In this paper we ask the following fundamental question: what is the minimum time that one must observe the dynamics in order to exactly recover the underlying network, as a function of the number $d$ of interacting entities? For a class of stationary Hawkes processes with sparse, weak interactions, we prove that an observation time of order $\log d$ is sufficient and necessary. For the upper bound we construct a two-stage estimator that uses clipped and binned event data for screening, followed by a least-squares refinement, and apply concentration bounds derived from the Poisson cluster representation. For the lower bound we combine Fano's inequality with Jacod's Girsanov formula for point processes on a suitable subclass of networks.

Finite-State Dimension and The Davenport Erdős Theorem

2026-05-08T19:03:50Z

A 1952 result of Davenport and Erdős states that if $p$ is an integer-valued polynomial, then the real number $0.p(1)p(2)p(3)\dots$ is Borel normal in base ten. A later result of Nakai and Shiokawa extends this result to polynomials with arbitrary real coefficients and all bases $b\geq 2$. It is well known that finite-state dimension, a finite-state effectivization of the classical Hausdorff dimension, characterizes the Borel normal sequences as precisely those sequences of finite-state dimension 1. For an infinite set $A$ of natural numbers and a base $b\geq 2$, the base $b$ Copeland-Erdős sequence of $A$, $CE_b(A)$, is the infinite sequence obtained by concatenating the base $b$ expressions of the numbers in $A$ in increasing order. In this work we investigate the possible relationships between the finite-state dimensions of $CE_b(A)$ and $CE_b(p(A))$ where $p$ is a polynomial. We show that, if the polynomial is permitted to have arbitrary real coefficients, then for any $s,s^\prime$ in the unit interval, there is a set $A$ of natural numbers and a linear polynomial $p$ so that the finite-state dimensions of $CE_b(A)$ and $CE_b(p(A))$ are $s$ and $s^\prime$ respectively. The corresponding result for strong finite-state dimension is also shown. We demonstrate that linear polynomials with rational coefficients do not change the finite-state dimension of any Copeland-Erdős sequence, but there exist polynomials with rational coefficients of every larger integer degree that change the finite-state dimension of some sequence. We also prove the surprising fact that there exist sets $A$ and integer-valued monomials $p$ such that $CE_b(A)$ is normal, but $CE_b(p(A))$ has finite-state dimension strictly less than one.
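
As a small illustration of the definition of $CE_b(A)$ above, the following Python sketch builds a finite prefix of a Copeland-Erdős sequence from a finite part of $A$. The function names `to_base` and `copeland_erdos_prefix` are hypothetical and chosen only for this example, and the polynomial is assumed to be integer-valued; how $p(A)$ is formed for real coefficients is not specified here.

```python
def to_base(n: int, b: int) -> str:
    """Return the base-b expansion of a non-negative integer n (digits 0-9, then a-z)."""
    if not 2 <= b <= 36:
        raise ValueError("this sketch supports bases 2 through 36 only")
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = "0123456789abcdefghijklmnopqrstuvwxyz"
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, b)
        out.append(digits[r])
    return "".join(reversed(out))


def copeland_erdos_prefix(finite_part, b: int) -> str:
    """Concatenate the base-b expansions of the given numbers in increasing order.

    `finite_part` is assumed to be an initial segment of the infinite set A,
    so the result is a prefix of CE_b(A).
    """
    return "".join(to_base(n, b) for n in sorted(set(finite_part)))


if __name__ == "__main__":
    primes = [2, 3, 5, 7, 11, 13, 17, 19]
    print(copeland_erdos_prefix(primes, 10))                    # prefix of CE_10(primes)
    print(copeland_erdos_prefix([n * n for n in primes], 10))   # prefix of CE_10(p(A)) for p(n) = n^2
```

The second call applies the monomial $p(n)=n^2$ to the same finite part, which is the kind of pair $CE_b(A)$, $CE_b(p(A))$ whose finite-state dimensions are compared in the abstract.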

Learning to Transmit Over Unknown Erasure Channels with Empirical Erasure Rate Feedback

2026-05-08T17:46:00Z

We address the problem of reliable data transmission within a finite time horizon $T$ over a binary erasure channel with unknown erasure probability. We consider a feedback model wherein the transmitter can query the receiver infrequently and obtain the empirical erasure rate experienced by the latter. We aim to minimize a regret quantity, i.e. how much worse a strategy performs compared to an oracle who knows the probability of erasure, while operating at the same block error rate. A learning vs. exploitation dilemma manifests in this scenario -- specifically, we need to balance between (i) learning the erasure probability with reasonable accuracy and (ii) utilizing the channel to transmit as many information bits as possible. We propose two strategies: (i) a two-phase approach using rate estimation followed by transmission that achieves an $O({T}^{\frac 23})$ regret using only one query, and (ii) a windowing strategy using geometrically-increasing window sizes that achieves an $O({\sqrt{T}})$ regret using $O(\log(T))$ queries.
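
The two query schedules mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: the exploration length $\lceil T^{2/3}\rceil$, the back-off margin, and the doubling ratio are choices made for this example (they follow the usual explore-then-commit balance), not parameters taken from the paper, and all function names are hypothetical.

```python
import math
import random


def geometric_windows(T: int, first: int = 1, ratio: int = 2):
    """Split a horizon of T channel uses into windows of geometrically growing length.

    One feedback query at the end of each window gives O(log T) queries in total.
    """
    windows, used, size = [], 0, first
    while used < T:
        length = min(size, T - used)  # the last window is truncated to fit the horizon
        windows.append(length)
        used += length
        size *= ratio
    return windows


def empirical_erasure_rate(n: int, eps: float, rng: random.Random) -> float:
    """Simulate n uses of a binary erasure channel BEC(eps); return the erased fraction."""
    erased = sum(rng.random() < eps for _ in range(n))
    return erased / n


def two_phase_plan(T: int, eps: float, rng: random.Random):
    """Estimate the erasure rate once, then pick a transmission rate for the remainder.

    Assumption: explore for about T^(2/3) channel uses and back off by about
    1/sqrt(n_explore) from the estimated capacity 1 - eps_hat.
    """
    n_explore = min(T, max(1, math.ceil(T ** (2.0 / 3.0))))
    eps_hat = empirical_erasure_rate(n_explore, eps, rng)  # the single query
    margin = 1.0 / math.sqrt(n_explore)
    rate = max(0.0, 1.0 - eps_hat - margin)
    return n_explore, eps_hat, rate


if __name__ == "__main__":
    rng = random.Random(0)
    print(geometric_windows(1000))
    print(two_phase_plan(10_000, eps=0.3, rng=rng))
```

With doubling windows the number of queries grows like $\log_2 T$, which matches the $O(\log(T))$ query count stated for the windowing strategy.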

Secure Integrated Sensing and Communication against Communication and Sensing Eavesdropping

2026-05-08T17:39:12Z

Sensing privacy and communication confidentiality play fundamentally different but interconnected roles in adversarial wireless environments. Capturing this interplay within a single physical-layer framework is particularly challenging in integrated sensing and communication (ISAC) systems, where the same waveform simultaneously serves dual purposes. We study a secure ISAC system in which a monostatic transmitter simultaneously sends a confidential message to a legitimate receiver and senses an environmental state, while a passive adversary attempts both message decoding and state estimation. We partially characterize the fundamental trade-offs among three performance measures: the transmitter's secrecy rate, its detection exponent, and the adversary's detection exponent. Beyond the joint input distribution that governs overall performance, the trade-offs are further shaped by the transmitter's ability to extract keys via feedback and hide both the content and structure of the codewords via wiretap and resolvability codes. We derive an achievable region, and illustrate the resulting design trade-offs through a numerical example.

Semantic Smoothing for Language Models via Distribution Estimation and Embeddings

2026-05-08T16:50:25Z

We propose semantic smoothing, a smoothing method for language models that uses embeddings to share statistical observations across semantically similar contexts. The starting point is a decomposition of log-perplexity that motivates smoothing as a collection of distribution-estimation problems under Kullback-Leibler (KL) loss. We then show that, under a Lipschitz-logit model for embedding-based language generation, proximity of context embeddings implies proximity of the corresponding next-word distributions in KL divergence. Combining these observations, we formulate semantic smoothing as distribution estimation in KL loss with KL-proximity side information. For $n$ samples on a $d$-symbol alphabet with a side-information distribution at KL distance $Δ$, we give an interpolation estimator with worst-case KL risk $O(\min\{Δ,d/n\})$, and prove a matching-order lower bound for uniform side information. We extend the estimator to multiple and empirically estimated synonymous distributions. Experiments on synthetic Markov data and WikiText-103 bigram models using Word2Vec, GloVe, and GPT-2 embeddings show that semantic smoothing consistently reduces test perplexity when applied to add-constant and Kneser-Ney estimates.
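
To make the idea of interpolating between empirical counts and a side-information distribution concrete, here is a minimal Python sketch. It is not the estimator from the paper: it uses a standard pseudo-count mixture, and the weight `prior_weight` (how strongly to trust the side information) is a free, hypothetical parameter whose principled choice as a function of $Δ$, $d$, and $n$ is not given here.

```python
import math
from collections import Counter


def interpolate_with_side_info(samples, side_info, prior_weight):
    """Mix empirical counts with a side-information distribution.

    samples:      observed symbols (all assumed to lie in the support of side_info)
    side_info:    dict symbol -> probability, assumed to sum to 1 with full support
    prior_weight: pseudo-sample count given to side_info (assumption of this sketch)
    """
    n = len(samples)
    total = n + prior_weight
    if total <= 0:
        raise ValueError("need at least one sample or a positive prior_weight")
    counts = Counter(samples)
    return {x: (counts.get(x, 0) + prior_weight * q) / total
            for x, q in side_info.items()}


def kl_divergence(p, q):
    """KL(p || q) in nats for distributions given as dicts over the same symbols."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)


if __name__ == "__main__":
    side = {"a": 0.5, "b": 0.25, "c": 0.25}
    est = interpolate_with_side_info(["a", "a", "b"], side, prior_weight=1.0)
    print(est)
    print(kl_divergence(side, est))
```

A large `prior_weight` keeps the estimate close to the side information (useful when $Δ$ is small), while a small one lets the data dominate, which mirrors the two regimes in the $\min\{Δ,d/n\}$ bound.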

When Diffusion Model Can Ignore Dimension: An Entropy-Based Theory

2026-05-08T16:30:37Z

Diffusion models perform remarkably well on high-dimensional data such as images, often using only a modest number of reverse-time steps. Despite this practical success, existing convergence theory does not fully explain why such samplers remain efficient in high dimensions. Many prior KL guarantees bound the discretization error in terms of the ambient dimension, while other improved results replace this dependence using intrinsic-dimensional or geometric structure assumptions. In this work, we develop an alternative information-theoretic perspective on diffusion sampler convergence. We prove that, for Gaussian mixture targets, the discretization error is controlled by the Shannon entropy of the latent mixture component rather than by the ambient dimension. Consequently, the leading step complexity scales linearly with latent entropy and depends only logarithmically on the second moment of the data. Our analysis also extends to discrete target distributions, where the relevant complexity is the entropy of the target rather than the dimension of the embedding space. These results suggest that diffusion sampling can remain efficient in high-dimensional spaces when the data distribution admits a compact latent representation, as is widely believed to be the case for natural images.

Geometric Characteristics and Stable Guarantees for Phaseless Operators and Structured Matrix Restoration

2026-05-08T15:39:27Z

In this paper, we first propose a unified framework for analyzing the stability of phaseless operators for both amplitude and intensity measurements on an arbitrary geometric set, thereby characterizing the robust performance of phase retrieval via the empirical minimization method. We introduce the random embedding of concave lifting operators to unify the analysis over any geometric set. Similarly, we investigate the robust performance of the structured matrix restoration problem through the robust injectivity of a linear rank-one measurement operator on an arbitrary matrix set. The core of our analysis is to establish a unified characterization of empirical chaos processes for various matrix sets. Talagrand's $γ_α$-functionals are employed to characterize the connection between the geometric constraints and the number of measurements required for stability or robust injectivity. We also construct adversarial noise to demonstrate the sharpness of the recovery bounds derived through the empirical minimization method in both scenarios.

Beam-Aware Radio Map Estimation With Physics-Consistent Parametric Modeling for Unknown Multiple Satellites

2026-05-08T14:05:47Z

Satellite networks with dense low Earth orbit (LEO) constellations rely on aggressive spectrum reuse, making co-channel interference a dominant and rapidly varying factor that limits link availability and complicates spectrum sharing and compliance. Satellite radio map (RM) construction is therefore essential for interference cognition, yet it is challenging because the active satellite set is unknown, beam footprints and pointing are not directly observable, and received signal strength (RSS) measurements are difficult to calibrate under coupled link budget variations and noise. These latent uncertainties yield a severely underdetermined inverse problem with strong signature coherence, where existing methods often trade detection recall for precision and still fail to recover a faithful continuous RSS field. This paper proposes a beam-aware RM estimation framework that unifies active satellite identification and RSS field reconstruction through physics-consistent parametric modeling. An interpretable structural prior links geometry and beam shaping to spatial RSS formation, and an adaptive model order selection strategy infers the number of active satellites from measurements by balancing fit and complexity. Extensive experiments across varying signal to noise ratio (SNR), total satellite count, and active satellite count demonstrate consistently higher RSS spatial correlation, lower root mean squared error (RMSE), and improved F1 score, validating the proposed approach for interference-aware satellite RM construction in satellite networks.

Affine Subcode Ensemble Decoding for Degeneracy-Aware Quantum Error Correction

2026-05-08T13:30:02Z

Quantum low-density parity-check codes are promising candidates for low-overhead fault-tolerant quantum computing, but degeneracy is known to impair the convergence of belief-propagation (BP) decoding of these codes. In this work, we show that appending linearly independent rows to a check matrix of a stabilizer code can reduce the search space for a valid degenerate solution. Motivated by this, we extend the recently proposed affine subcode ensemble decoding technique from the classical to the quantum setting. Moreover, we employ overcomplete matrices for each decoding path. Monte-Carlo simulations on toric and generalized bicycle codes demonstrate improved convergence and reduced logical error rate.

Future Validity is the Missing Statistic: From Impossibility to $Φ$-Estimation for Grammar-Faithful Speculative Decoding

2026-05-08T13:08:18Z

Grammar-constrained generation is often combined with local vocabulary masking and speculative decoding, but the resulting sampling law is not the grammar-conditional distribution users usually intend. We show that any speculative decoder with local mask access, Leviathan rejection, and rollback soundness samples from the locally projected distribution $μ^{\mathrm{proj}}$ rather than the grammar-conditional distribution $μ^\star$. This extends the GAD impossibility result to speculative decoding; on Dyck grammars with Qwen3-8B, the total-variation gap can reach 0.996. We identify the future-validity function $Φ_t(y)=\Pr_p[\mathrm{valid\ completion}\mid y]$ as the missing correction statistic. The target distribution is a Doob transform of the base model with $h=Φ$, while local masking corresponds to setting $h$ to one. With exact $Φ$, our oracle decoder FVO-Spec samples exactly from $μ^\star$; with approximate $Φ$, we bound the resulting total-variation error. Because exact future validity is hard for general context-free grammars, we evaluate estimator hierarchies on tractable Dyck and finite JSON languages. OneStep reduces Dyck TV by 14% with under 1% throughput overhead, exact dynamic programming reduces it by 97%, and finite-language correction closes JSON gaps to numerical precision. All fidelity claims are scoped to enumerable grammars and token tries.
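
The gap between local masking and the grammar-conditional distribution can be reproduced on a toy case. The Python sketch below assumes an i.i.d. base model over the two tokens `(` and `)` and takes the grammar to be balanced strings of a fixed even length `n`; both are simplifications chosen for this example, and none of the function names come from the paper. It computes $Φ$ exactly by dynamic programming, reweights each step by $Φ$ (the Doob transform with $h=Φ$), and compares the result with locally masked sampling.

```python
from functools import lru_cache
from itertools import product


def make_model(n: int, q: float):
    """Toy base model: tokens are i.i.d. with P('(') = q; valid strings are balanced, length n."""
    base = {"(": q, ")": 1.0 - q}
    step = {"(": 1, ")": -1}

    @lru_cache(maxsize=None)
    def phi(t: int, depth: int) -> float:
        """Future validity: base-model probability that n - t more tokens complete validly."""
        if depth < 0 or depth > n - t:
            return 0.0  # prefix already invalid, or too deep to close in time
        if t == n:
            return 1.0  # depth is 0 here, so the string is balanced
        return sum(base[tok] * phi(t + 1, depth + step[tok]) for tok in base)

    def doob_prob(y: str) -> float:
        """Probability of y when each step is reweighted by phi (h = phi)."""
        prob, depth = 1.0, 0
        for t, tok in enumerate(y):
            denom = phi(t, depth)
            if denom == 0.0:
                return 0.0
            nxt = depth + step[tok]
            prob *= base[tok] * phi(t + 1, nxt) / denom
            depth = nxt
        return prob

    def masked_prob(y: str) -> float:
        """Probability of y under local masking: drop non-viable tokens, renormalize."""
        prob, depth = 1.0, 0
        for t, tok in enumerate(y):
            allowed = {a: base[a] for a in base
                       if 0 <= depth + step[a] <= n - (t + 1)}
            if tok not in allowed:
                return 0.0
            prob *= allowed[tok] / sum(allowed.values())
            depth += step[tok]
        return prob

    return phi, doob_prob, masked_prob


def total_variation(n: int, q: float) -> float:
    """Total-variation distance between the phi-corrected and locally masked laws."""
    _, doob_prob, masked_prob = make_model(n, q)
    strings = ["".join(s) for s in product("()", repeat=n)]
    return 0.5 * sum(abs(doob_prob(y) - masked_prob(y)) for y in strings)


if __name__ == "__main__":
    print(total_variation(6, 0.5))
```

In this toy setting the $Φ$-corrected law equals the base model conditioned on validity, while the locally masked law over-weights strings whose prefixes leave few viable choices, so the two differ even though both only ever emit valid strings.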

Optical Communications with Relative Intensity Noise: Channel Modeling and Information Rates

2026-05-08T11:55:56Z

We consider optical communications with intensity modulation and direct detection affected by laser relative intensity noise (RIN). Starting from a continuous-time waveform model, we derive an equivalent discrete-time channel model. As a result of RIN, the resulting channel model exhibits signal-dependent noise with memory. Unlike the commonly-assumed model in the literature, the conditional variance of this noise term has a polynomial dependence on the symbol of interest. Finally, we study achievable information rates for this channel under practically-relevant system parameters. We take a mismatched decoding approach and compute the generalized mutual information (GMI) using a memoryless decoding metric. Our numerical results show that when the memory in the channel is ignored by the receiver, GMI saturates as the constellation size increases, and thus, dense constellations do not offer gains. We also show that this saturation results from nonsymmetric nonvanishing contributions of the symbols to the GMI.

Secure Beamforming and Reflection Design for RIS-ISAC Systems Under Collusion of Passive and Active Eavesdroppers

2026-05-08T11:24:35Z

In this paper, physical-layer security for a reconfigurable intelligent surface (RIS)-aided integrated sensing and communication (ISAC) system is studied. The system faces an active eavesdropper (AE) and a passive eavesdropper (PE) that cooperate with each other. Through joint base station beamforming and RIS reflection design, we aim to achieve the best secure data communication with guaranteed sensing performance. Mathematically, taking the constraints on sensing performance and transmission power into consideration, the system secrecy rate maximization problem is formulated with respect to transmit beamforming, RIS reflection, and receive beamforming. The formulated problem is non-convex and is decomposed into three subproblems by applying alternating optimization (AO). For the decomposed subproblems, we utilize the quadratic penalty method and successive convex approximation (SCA) to derive the solutions. Thereafter, an iterative numerical algorithm, referred to as the joint beamforming and reflection design (JBRD) algorithm, is proposed. Finally, numerical results demonstrate the effectiveness and superiority of the proposed algorithm.

A Syndrome-Space Approach to Proximity Gaps and Correlated Agreement for Random Linear Codes

2026-05-08T11:11:17Z

Proximity gaps and correlated agreement have become central tools in the analysis of interactive oracle proofs of proximity (IOPPs) and code-based SNARKs. Informally, a proximity-gap statement says that for a structured set of words -- such as a line, an affine space, or a curve -- either all points are close to the code, or most are far from it. Such statements are essential in sampling-based proof systems, where a verifier queries only a few random locations on a structured object but must still obtain a global soundness guarantee. In Reed--Solomon-based proof systems, one would ideally like the proximity parameter to approach the information-theoretic limit $1-R$, since this is the largest possible radius for a rate-$R$ code and directly affects protocol efficiency. While recent work has substantially strengthened the picture for algebraic codes and linked proximity gaps to decoding-related structural properties, it remains unclear whether analogous results for random linear codes can be proved directly, rather than through decoding-theoretic surrogates. In this work, we establish a direct approach to proximity gaps and correlated agreement for random linear codes in the random parity-check-matrix model, without relying on list decoding as the main engine of the proof. Our approach is based on a syndrome-space reformulation together with a witness-based reduction mechanism, and it yields strong results for affine lines, affine spaces, and polynomial curves. It is conceptually different from the existing decoding-driven route for random linear codes, and it also leads to sharper parameters, including the optimal-up-to-$\varepsilon$ large-alphabet radius bound $ρ<1-R-\varepsilon$ for $q=Θ(n)$, as well as near-capacity bounds over constant alphabets with improved alphabet-size requirements.