Synthetic Counterfactual Labels for Efficient Conformal Counterfactual Inference

2026-05-07T15:16:20Z

This work addresses the problem of constructing reliable prediction intervals for individual counterfactual outcomes. Existing conformal counterfactual inference (CCI) methods provide marginal coverage guarantees but often produce overly conservative intervals, particularly under treatment imbalance when counterfactual samples are scarce. We introduce synthetic data-powered CCI (SP-CCI), a new framework that augments the calibration set with synthetic counterfactual labels generated by a pre-trained counterfactual model. To ensure validity, SP-CCI incorporates synthetic samples into a conformal calibration procedure based on risk-controlling prediction sets (RCPS) with a debiasing step informed by prediction-powered inference (PPI). We prove that SP-CCI achieves tighter prediction intervals while preserving marginal coverage, with theoretical guarantees under both exact and approximate importance weighting. Empirical results on different datasets confirm that SP-CCI consistently reduces interval width compared to standard CCI across all settings.
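
For orientation, the calibration step that conformal methods build on can be sketched as follows. This is plain split conformal prediction with absolute-residual scores, not SP-CCI itself: it omits the importance weighting, the synthetic labels, and the RCPS/PPI debiasing described above. The function names and the calibration scores are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Returns +inf when the calibration set is too small for the requested
    level, which corresponds to an uninformative (infinite) interval.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def prediction_interval(point_prediction, scores, alpha):
    """Symmetric interval around a point prediction (absolute-residual scores)."""
    q = conformal_quantile(scores, alpha)
    return point_prediction - q, point_prediction + q


# Hypothetical calibration residuals |y - f(x)| from held-out samples.
calibration_scores = [0.3, 1.2, 0.7, 0.1, 0.9, 0.5, 1.5, 0.4, 0.8]
low, high = prediction_interval(2.0, calibration_scores, alpha=0.2)
print(low, high)
```

The infinite-quantile branch shows why scarce calibration samples lead to conservative intervals: with few scores, the required rank can exceed the sample size.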

Efficient Online Random Sampling via Randomness Recycling

2026-05-07T14:32:09Z

This article studies the fundamental problem of using i.i.d. coin tosses from an entropy source to efficiently generate random variables $X_i \sim P_i$ $(i \ge 1)$, where $(P_1, P_2, \dots)$ is a random sequence of rational discrete probability distributions subject to an \textit{arbitrary} stochastic process. Our method achieves an amortized expected entropy cost within $\varepsilon > 0$ bits of the information-theoretically optimal Shannon lower bound using $O(\log(1/\varepsilon))$ space. This result holds both pointwise in terms of the Shannon information content conditioned on $X_i$ and $P_i$, and in expectation to obtain a rate of $\mathbb{E}[H(P_1) + \dots + H(P_n)]/n + \varepsilon$ bits per sample as $n \to \infty$ (where $H$ is the Shannon entropy). The combination of space, time, and entropy properties of our method improves upon the Knuth and Yao (1976) entropy-optimal algorithm and Han and Hoshi (1997) interval algorithm for online sampling, which require unbounded space. It also uses exponentially less space than the more specialized methods of Kozen and Soloviev (2022) and Shao and Wang (2025) that generate i.i.d. samples from a fixed distribution. Our online sampling algorithm rests on a powerful algorithmic technique called \textit{randomness recycling}, which reuses a fraction of the random information consumed by a probabilistic algorithm to reduce its amortized entropy cost. On the practical side, we develop randomness recycling techniques to accelerate a variety of prominent sampling algorithms. We show that randomness recycling enables state-of-the-art runtime performance on the Fisher-Yates shuffle when using a cryptographically secure pseudorandom number generator, and that it reduces the entropy cost of discrete Gaussian sampling. Accompanying the manuscript is a performant software library in the C programming language.
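
The idea of reusing leftover randomness can be illustrated with a small sketch for uniform integers. This is a generic textbook-style construction written for illustration, not the algorithm or the C library from the article; the class name, the `refill_bound` parameter, and the bit source are assumptions. The sampler keeps a state `value` that is uniform on `range(size)`; after each draw, the unused part of the state is kept instead of discarded.

```python
import random


class RecyclingUniformSampler:
    """Draw uniform integers from coin flips while keeping leftover randomness."""

    def __init__(self, bit_source, refill_bound=1 << 32):
        self.bit_source = bit_source    # callable returning 0 or 1
        self.refill_bound = refill_bound
        self.value = 0                  # invariant: uniform on range(self.size)
        self.size = 1
        self.bits_used = 0

    def _refill(self):
        # Append fresh coin flips until the state is large enough.
        while self.size < self.refill_bound:
            self.value = 2 * self.value + self.bit_source()
            self.size *= 2
            self.bits_used += 1

    def sample(self, n):
        """Return an integer uniform on range(n); requires 1 <= n <= refill_bound."""
        while True:
            self._refill()
            q, r = divmod(self.size, n)
            if self.value < q * n:
                out = self.value % n
                # The quotient is uniform on range(q) and independent of out.
                self.value //= n
                self.size = q
                return out
            # Rejected: value - q * n is uniform on range(r), so keep it.
            self.value -= q * n
            self.size = r


rng = random.Random(0)
sampler = RecyclingUniformSampler(lambda: rng.getrandbits(1))
rolls = [sampler.sample(6) for _ in range(1000)]
print(sampler.bits_used / len(rolls))  # average coin flips per die roll
```

For comparison, rejection sampling a six-sided die from three fresh bits per attempt needs at least 3 flips per roll, whereas $\log_2 6 \approx 2.585$ bits is the entropy of one roll.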

A Low-Complexity Framework for Multi-access Coded Caching Systems with Arbitrary User-cache Access Topology

2026-05-07T14:13:19Z

This paper studies the multi-access coded caching (MACC) problem with arbitrary user-cache access topology, which extends existing MACC models that rely on highly structured and combinatorially designed topologies. We consider a MACC system consisting of a single server, $\Lambda$ cache-nodes, and $K$ user-nodes. The server stores $N$ equal-size files, each cache-node has a storage capacity of $M$ files, and each user-node $k\in[K]$ can access an arbitrary subset of cache-nodes $\mathcal{A}_k\subseteq[\Lambda]$ and retrieve the content cached there. The objective is to design a universal framework for the MACC delivery problem. Decoding conflicts among the requested packets are captured by a conflict graph, and the design of the delivery is reduced to a graph coloring problem, where achieving a lower transmission load corresponds to coloring the graph using fewer colors. Under this formulation, the classical DSatur algorithm achieves a transmission load close to the index-coding (IC) converse bound, thereby providing a practical benchmark. However, its computational complexity becomes prohibitive for large-scale graphs. To overcome this limitation, we develop a learning-driven approach using graph neural networks (GNNs) that efficiently constructs coded multicast transmissions with performance close to the theoretical bounds and generalizes across different user-cache access topologies and numbers of users. In addition, we extend the IC converse bound to MACC systems with arbitrary access topology and propose a low-complexity greedy approximation that closely matches the IC converse bound. Numerical results demonstrate that the proposed approach achieves performance close to the DSatur algorithm and the IC converse bound, while significantly reducing computational complexity, making it well-suited for large-scale MACC systems.
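
The reduction of delivery design to graph coloring can be sketched with a compact DSatur implementation. This is a simplified variant written for illustration (ties in saturation are broken by total degree), and the conflict graph below is a hypothetical 5-cycle rather than one derived from an actual MACC instance.

```python
def dsatur_coloring(adjacency):
    """Greedy DSatur coloring.

    adjacency: dict mapping each vertex to the set of its neighbours
    (undirected graph). Returns a dict vertex -> colour index.
    """
    colors = {}
    neighbour_colors = {v: set() for v in adjacency}
    uncolored = set(adjacency)
    while uncolored:
        # Pick the vertex that sees the most distinct colours (saturation),
        # breaking ties by degree.
        v = max(
            uncolored,
            key=lambda u: (len(neighbour_colors[u]), len(adjacency[u])),
        )
        c = 0
        while c in neighbour_colors[v]:
            c += 1
        colors[v] = c
        uncolored.remove(v)
        for w in adjacency[v]:
            neighbour_colors[w].add(c)
    return colors


def is_proper(adjacency, colors):
    """Check that no edge joins two vertices of the same colour."""
    return all(colors[u] != colors[v] for u in adjacency for v in adjacency[u])


# Hypothetical conflict graph: packets 0..4 with conflicts forming a 5-cycle.
conflicts = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = dsatur_coloring(conflicts)
num_transmissions = len(set(coloring.values()))
print(coloring, num_transmissions)
```

Under the formulation in the abstract, the number of colours used plays the role of the transmission load, so a better coloring means fewer transmissions.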

Post-Selection Distributional Model Evaluation

2026-05-07T13:42:24Z

Formal model evaluation methods typically certify that a model satisfies a prescribed target key performance indicator (KPI) level. However, in many applications, the relevant target KPI level may not be known a priori, and the user may instead wish to compare candidate models by analyzing the full trade-offs between performance and reliability that the models can achieve at test time. This task, which requires reliable estimates of the test-time KPI distributions, is complicated by the fact that the same data must often be used both to pre-select a subset of candidate models and to estimate their KPI distributions, which can introduce post-selection bias. In this work, we introduce post-selection distributional model evaluation (PS-DME), a general framework for statistically valid distributional model assessment after arbitrary data-dependent model pre-selection. Building on e-values, PS-DME controls the post-selection false coverage rate (FCR) for the distributional KPI estimates. We also establish explicit conditions under which it is provably more sample efficient than a baseline method based on sample splitting. Experiments on synthetic data, text-to-SQL decoding with large language models, and telecom network performance evaluation demonstrate that PS-DME enables reliable comparison of candidate configurations across a range of reliability levels, supporting the statistically reliable exploration of performance--reliability trade-offs.

Locally Repairable Codes with Availability via Elliptic Function Fields

2026-05-07T13:00:58Z

Locally repairable codes with availability have become essential components in modern large-scale distributed cloud storage systems and numerous other applications. In this paper, we focus on the construction of locally repairable codes with one or two recovering sets via elliptic function fields. Prior pioneering work by Li et al. (IEEE Trans. Inf. Theory, vol. 65, no. 1, 2019) and Ma and Xing (J. Comb. Theory Ser. A., vol. 193, 2023) employed maximal supersingular elliptic curves to obtain several optimal (classical) locally repairable codes. In contrast, we consider ordinary elliptic curves with many rational points. This approach yields several new families of $q$-ary optimal locally repairable codes with length $O(q+2\sqrt{q})$ and flexible locality. Consequently, our work broadens the selection of curves available for the construction of optimal locally repairable codes. Furthermore, we present a general framework for constructing locally repairable codes with two recovering sets via automorphism groups of elliptic function fields. To realize this framework, we devise a novel construction for determining the functions $e_i$ in the construction of locally repairable codes. By employing both supersingular and ordinary elliptic curves, we obtain several families of locally repairable codes with two recovering sets. In particular, we construct a family of $q^2$-ary locally repairable codes with two recovering sets, achieving length $O(q^2+2q)$ and Singleton-defect $O\!\left(\frac{2\ell}{q^2+2q-8\ell}\right)$, where $\ell \mid\mid q + 2$ with $4\ell < q$.

Identification for Inverse Gaussian Channels

2026-05-07T12:20:19Z

We derive lower and upper bounds on the identification capacity of inverse Gaussian channels, a fundamental model for molecular communications in fluid environments. The analysis considers deterministic encoding schemes under a peak time constraint and characterizes the asymptotic growth of codebook sizes. A central result reveals that, under a mild regularity condition on the noise (the stochastic first arrival time of an information-carrying molecule propagating via diffusion and drift to the receiver), the identification capacity exhibits super-exponential growth in the codeword length $n$, i.e., $\sim 2^{(n \log n)R}$, where $R$ is the coding rate.
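
The noise model can be made concrete with a short sketch of the inverse Gaussian density. It assumes the standard first-passage parametrization for one-dimensional diffusion with positive drift, $\mu = d/v$ and $\lambda = d^2/(2D)$, where the distance $d$, drift velocity $v$, and diffusion coefficient $D$ are symbols introduced here for illustration and do not appear in the abstract. The numerical values are hypothetical.

```python
import math


def inverse_gaussian_pdf(t, mu, lam):
    """Density of the inverse Gaussian distribution IG(mu, lam) at time t."""
    if t <= 0:
        return 0.0
    scale = math.sqrt(lam / (2 * math.pi * t**3))
    return scale * math.exp(-lam * (t - mu) ** 2 / (2 * mu**2 * t))


def arrival_time_parameters(distance, drift, diffusion):
    """Assumed mapping from physical quantities: mu = d / v, lam = d^2 / (2 D)."""
    return distance / drift, distance**2 / (2 * diffusion)


def trapezoid(f, a, b, steps):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


# Hypothetical values: distance 1, drift 1, diffusion coefficient 0.25.
mu, lam = arrival_time_parameters(1.0, 1.0, 0.25)
total_mass = trapezoid(lambda t: inverse_gaussian_pdf(t, mu, lam), 0.0, 60.0, 60000)
mean_time = trapezoid(lambda t: t * inverse_gaussian_pdf(t, mu, lam), 0.0, 60.0, 60000)
print(total_mass, mean_time)
```

The numerical integrals serve as a sanity check that the density integrates to one and that the mean arrival time matches $\mu$.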

On the Intersection and Composition properties of conditional independence

2026-05-07T12:18:51Z

Compositional graphoids are fundamental discrete structures which appear in probabilistic reasoning, particularly in the area of graphical models. They are semigraphoids which satisfy the Intersection and Composition properties. These important properties, however, are not enjoyed by general probability distributions. This paper surveys what is known about them, providing systematic constructions of examples and counterexamples as well as necessary and sufficient conditions. Novel sufficient conditions for both properties are derived in the context of discrete random variables via information-theoretic tools.
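
A standard example of a distribution that violates Composition can be checked directly. In the sketch below, $Y$ and $W$ are independent fair bits and $X = Y \oplus W$; then $X$ is independent of $Y$ and of $W$ separately, but not of the pair $(Y, W)$. The helper names are hypothetical, the conditioning set is empty, and the example is a well-known one rather than a construction taken from the paper.

```python
from fractions import Fraction
from itertools import product

# Joint distribution of (X, Y, W) with Y, W independent fair bits and X = Y xor W.
joint = {(y ^ w, y, w): Fraction(1, 4) for y, w in product((0, 1), repeat=2)}


def marginal(dist, indices):
    """Marginal distribution over the coordinates listed in indices."""
    out = {}
    for outcome, p in dist.items():
        key = tuple(outcome[i] for i in indices)
        out[key] = out.get(key, 0) + p
    return out


def independent(dist, a, b):
    """Exact test of independence between coordinate groups a and b."""
    pa, pb, pab = marginal(dist, a), marginal(dist, b), marginal(dist, a + b)
    return all(pab.get(ka + kb, 0) == pa[ka] * pb[kb] for ka in pa for kb in pb)


print(independent(joint, (0,), (1,)))    # X vs Y
print(independent(joint, (0,), (2,)))    # X vs W
print(independent(joint, (0,), (1, 2)))  # X vs (Y, W)
```

Composition would require the third statement to follow from the first two, which fails here.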

Geometric Means and Lebesgue-type Decomposition of Completely Positive Maps

2026-05-07T11:14:32Z

We introduce the geometric mean and the parallel sum of completely positive (CP) maps on von Neumann algebras, based on the Pusz--Woronowicz theory of positive sesquilinear forms. We provide a concrete characterization via a block matrix positivity condition and establish their fundamental properties, including the AM--GM--HM inequality with respect to the CP order. In finite-dimensional settings, our construction is compatible with the Choi--Jamiolkowski correspondence, under which the geometric mean of CP maps corresponds to the Kubo--Ando geometric mean of their Choi matrices. This yields a natural operator-theoretic framework for interpolating quantum channels. As an application, we obtain index-type inequalities for conditional expectations in subfactor theory. Finally, we establish a Lebesgue-type decomposition of CP maps via a parallel sum construction, thereby providing a unified framework that simultaneously generalizes Ando's decomposition of bounded positive operators and Kosaki's decomposition of normal positive functionals on von Neumann algebras.

Reliable one-bit quantization of bandlimited graph data via single-shot noise shaping

2026-05-07T08:38:42Z

Graph data are ubiquitous in the natural sciences and machine learning. In this paper, we consider the problem of quantizing graph-structured, bandlimited data to a few bits per entry while preserving its information under low-pass filtering. We propose an efficient single-shot noise shaping method that achieves state-of-the-art performance and comes with rigorous error bounds. In contrast to existing methods, it allows reliable quantization to arbitrary bit-levels, including the extreme case of a single bit per data coefficient.

TGPP: Trajectory-Guided Plug-and-Play Priors for Sparse Radio Map Reconstruction

2026-05-07T08:21:32Z

Radio map (RM) reconstruction is essential for environment-aware wireless networks, but practical measurements are often collected along mobility trajectories rather than randomly scattered over the target region. Such trajectory-sampled observations induce spatially heterogeneous uncertainty: near-trajectory regions are directly constrained, whereas distant or occluded regions remain weakly observed, leading to degraded reconstruction accuracy in under-constrained areas. To address this problem, we propose Trajectory-Guided Plug-and-Play Priors (TGPP), a general guidance module for sparse RM reconstruction. TGPP learns an explicit guidance map as an interpretable input-space risk prior, and an implicit guide feature that is projected and fused with backbone hidden representations. TGPP can be attached to different reconstruction backbones without changing their original task formulation. We further introduce RadioFlow-LDM, a latent flow-based generative backbone, and apply TGPP to deterministic, adversarial, graph-based, and latent generative reconstruction models. Experiments on RadioMapSeer with five trajectory sampling rates show that trajectory-sampled reconstruction differs substantially from random sparse interpolation. TGPP improves most reconstruction metrics across backbones, achieving up to 43.1% NMSE reduction relative to the corresponding base backbone without trajectory-guided priors.

An Additive Approximation Scheme for Generating Dyadic Codings for the Outputs of an LLM

2026-05-07T08:11:37Z

We study the problem of approximating a discrete probability distribution, such as the next-token distribution of a large language model, by a dyadic distribution induced by a binary tree under encoding rate constraints. The objective is to partition the support of the distribution and assign dyadic probabilities to minimize total variation distance while achieving a prescribed rate. We formulate this task as a tree-based partitioning problem and develop a polynomial-time additive approximation scheme for the rate-constrained setting in the constant-rate regime. Our results provide provable guarantees for near-optimal dyadic approximations and, as an application, yield a principled framework for LLM-based steganography, where the rate maps to bits of hidden information embedded per token and the total variation bound controls statistical detectability.
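
To make the objects concrete, the sketch below builds one dyadic distribution induced by a binary tree and measures its total variation distance to the target. It uses Huffman code lengths as the tree, which is an illustrative stand-in: it ignores the rate constraint and is not the approximation scheme developed in the paper. The function names and the example distribution are hypothetical.

```python
import heapq
from fractions import Fraction


def huffman_code_lengths(probs):
    """Return the leaf depths of a Huffman tree built for probs."""
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)  # unique tie-breaker so the lists are never compared
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1  # every symbol under the merged node gets one level deeper
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths


def dyadic_from_lengths(lengths):
    """Dyadic distribution that assigns probability 2^(-depth) to each leaf."""
    return [Fraction(1, 2**length) for length in lengths]


def total_variation(p, q):
    """Total variation distance between two distributions on the same support."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2


# Hypothetical next-token distribution over four tokens.
p = [Fraction(2, 5), Fraction(3, 10), Fraction(1, 5), Fraction(1, 10)]
q = dyadic_from_lengths(huffman_code_lengths(p))
print(q, total_variation(p, q))
```

Exact rational arithmetic keeps the distance computation free of rounding error, which is convenient when comparing candidate trees.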

On the Rate-Distortion-Complexity Tradeoff for Semantic Communication

2026-05-07T07:45:32Z

Semantic communication is a novel communication paradigm that focuses on conveying the user's intended meaning rather than on the bit-wise transmission of source signals. One of the key challenges is to effectively represent and extract the semantic meaning of a given source signal. While deep learning (DL)-based solutions have shown promising results in extracting implicit semantic information from a wide range of sources, existing work often overlooks the high computational complexity inherent in both model training and inference for the DL-based encoder and decoder. To bridge this gap, this paper proposes a rate-distortion-complexity (RDC) framework that extends classical rate-distortion theory by incorporating constraints on semantic distance, which covers both the traditional bit-wise distortion metric and a statistical difference-based divergence metric, and on a complexity measure adopted from the theory of minimum description length and the information bottleneck. We derive closed-form expressions for the minimum achievable rate under given constraints on semantic distance and complexity for both Gaussian and binary semantic sources. Our theoretical results show a fundamental three-way tradeoff among achievable rate, semantic distance, and model complexity. Extensive experiments on real-world image and video datasets validate this tradeoff and further demonstrate that our information-theoretic complexity measure effectively correlates with practical computational costs, guiding efficient system design in resource-constrained scenarios.
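
As a reference point, the classical rate-distortion functions that such a framework extends have well-known closed forms: $R(D) = \tfrac{1}{2}\log_2(\sigma^2/D)$ for a Gaussian source under squared error, and $R(D) = h(p) - h(D)$ for a Bernoulli($p$) source under Hamming distortion. The sketch below evaluates these classical formulas only; it does not include the semantic-distance or complexity constraints of the RDC framework, and the function names are hypothetical.

```python
import math


def gaussian_rate_distortion(variance, distortion):
    """Classical R(D) in bits for a Gaussian source with squared-error distortion.

    Assumes distortion > 0.
    """
    if distortion >= variance:
        return 0.0
    return 0.5 * math.log2(variance / distortion)


def binary_entropy(p):
    """Binary entropy function h(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bernoulli_rate_distortion(p, distortion):
    """Classical R(D) in bits for a Bernoulli(p) source with Hamming distortion."""
    if distortion >= min(p, 1 - p):
        return 0.0
    return binary_entropy(p) - binary_entropy(distortion)


print(gaussian_rate_distortion(1.0, 0.25))
print(bernoulli_rate_distortion(0.5, 0.1))
```

Both functions decrease to zero as the allowed distortion grows, which is the two-way tradeoff that the three-way rate-distortion-complexity tradeoff generalizes.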

A PAC-Bayesian Analysis of Channel-Induced Degradation in Edge Inference

2026-05-07T05:11:28Z

In the emerging paradigm of edge learning, neural networks (NNs) are partitioned across distributed edge devices that collaboratively perform inference via wireless transmission. However, deploying NNs for edge inference over wireless channels inevitably leads to performance degradation, as the exact channel realizations in the inference stage are not known in the training stage. In this paper, we establish a theoretical framework to evaluate and bound this performance degradation. Inspired by statistical learning theory, we define a wireless generalization error to characterize the gap between the empirical performance during training and the expected inference performance under the true stochastic channel. To enable theoretical analysis, we introduce an augmented NN model that incorporates channel statistics directly into the weight space. Leveraging the PAC-Bayesian framework, we derive a high-probability bound on this error, which provides theoretical guarantees for wireless inference performance. Furthermore, we propose a channel-aware training algorithm that minimizes a tractable surrogate objective based on the derived bound. Simulations demonstrate that the proposed algorithm effectively improves wireless inference performance and model robustness under various channel conditions.

Near-field Channel Estimation for XL-RIS-aided mmWave MIMO Systems

2026-05-07T04:13:13Z

Extremely large-scale reconfigurable intelligent surfaces (XL-RISs) have emerged as a promising technology for millimeter-wave (mmWave) communications. However, the exceedingly large aperture of XL-RISs renders the RIS-user links likely to operate in the near-field region, where the conventional planar-wave assumption and angular-domain sparse representation become invalid, thus making channel estimation significantly more challenging. In this paper, we investigate cascaded channel estimation for an XL-RIS-aided multi-user multiple-input multiple-output (MU-MIMO) system, in which the BS-RIS channel is modeled in the far field, while the RIS-user channels exhibit near-field spherical-wave characteristics. To tackle the resulting hybrid-field estimation problem, we propose a low-overhead two-stage channel estimation scheme by jointly exploiting the common BS-RIS link shared by all users and the polar-domain sparsity of the RIS-user channels. Specifically, the multi-antenna users are first decomposed into multiple virtual single-antenna users, based on which the common BS-RIS parameters are extracted from a typical virtual user and the RIS-user channels are initialized via compensated polar-domain sparse recovery. Then, an alternating least-squares refinement procedure is developed to jointly improve the common BS-RIS operator and the user-specific RIS-side channels. Simulation results show that the proposed scheme achieves competitive channel estimation performance with substantially reduced pilot overhead compared with existing near-field benchmarks.

Relativistic Hamiltonian as an emergent structure from information geometry

2026-05-07T03:33:17Z

We show that the relativistic energy-momentum relation can emerge as an effective ensemble-averaged structure from a multiplicative Hamiltonian when fluctuations of an auxiliary parameter are treated using maximum entropy inference. The resulting probability distribution is uniquely fixed by scale-invariant constraints, which are shown to arise naturally from the Fisher-Rao geometry of the associated statistical manifold. Within this information-geometric framework, the relativistic dispersion relation is not obtained by imposing Lorentz symmetry at the outset; it appears as a consequence of statistical averaging and geometric invariance.