Anytime-Valid Quantum State Tomography via Confidence Sequences

2026-05-15T10:48:28Z

In this letter, we address the problem of developing quantum state tomography (QST) methods that remain valid at any time during a sequence of measurements. Specifically, the aim is to provide a rigorous quantification of the uncertainty associated with the current state estimate as data are acquired incrementally. To this end, the proposed framework augments existing QST techniques by associating current point estimates of the state with confidence sets that are guaranteed to contain the true quantum state with a user-defined probability. The methodology is grounded in recent statistical advances in anytime-valid confidence sequences. Numerical results confirm the theoretical coverage properties of the proposed anytime-valid QST.
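
To make the notion of an anytime-valid confidence sequence concrete, here is a minimal sketch for a single measurement-outcome probability, not the construction used in the letter. It swaps in a plain Hoeffding bound combined with a union bound over time (error budget `alpha / (t * (t + 1))` at step `t`, which sums to `alpha`), and it assumes i.i.d. outcomes in [0, 1]. The function name and the simulated data are hypothetical.

```python
import math
import random

def hoeffding_confidence_sequence(samples, alpha=0.05):
    """Yield (t, lower, upper) for the mean of i.i.d. samples in [0, 1].

    The intervals hold simultaneously for all t with probability >= 1 - alpha,
    because the per-step error budgets alpha / (t * (t + 1)) sum to alpha.
    """
    total = 0.0
    for t, x in enumerate(samples, start=1):
        total += x
        mean = total / t
        alpha_t = alpha / (t * (t + 1))
        radius = math.sqrt(math.log(2.0 / alpha_t) / (2.0 * t))
        yield t, max(0.0, mean - radius), min(1.0, mean + radius)

# Hypothetical data: repeated two-outcome measurements with outcome probability 0.3.
rng = random.Random(0)
true_p = 0.3
outcomes = [1.0 if rng.random() < true_p else 0.0 for _ in range(2000)]
intervals = list(hoeffding_confidence_sequence(outcomes, alpha=0.05))
```

Because the guarantee is uniform in time, the experimenter may stop at any `t` and report the current interval. Tighter sequences exist; this union-bound version is chosen only because it is easy to check.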

Conditional Entropy of Heat Diffusion on Temporal Networks

2026-05-15T10:47:44Z

Many complex systems can be modeled by temporal networks, whose organization often evolves through distinct structural phases. Detecting the change points that delimit these phases is both important and challenging. In this work, we extend the conditional entropy of heat diffusion from static graphs to temporal networks and study its properties. We provide an upper bound and explain how discrepancies from it arise from the presence of asymmetric temporal paths. Moreover, we show that this quantity is monotone in time, yielding an information-theoretic analog of the second law of thermodynamics for inhomogeneous diffusion on temporal networks. We then introduce a local version of conditional entropy, designed to probe diffusion over finite temporal windows, and show that it provides an informative signal for change-point detection in continuous-time temporal networks. We evaluate the proposed methodology on synthetic benchmarks, including comparative experiments with existing nonparametric baselines in the snapshot setting, and then apply it to a real-world temporal contact network from a French primary school. Finally, we show how to use detected change points to perform community detection on targeted sub-intervals, improving the quality and interpretability of the clustering results.

Active Redundancy Allocation Strategy at Component and System Level

2026-05-15T10:20:44Z

Researchers and practitioners in reliability engineering and optimization frequently use active redundancy techniques to improve system performance. In this article, we study allocation strategies for non-matching active redundancies (spares) in coherent systems consisting of possibly dependent and identical components, with the aim of achieving better system reliability. The dependence of the components is modeled through copulas using the distortion function. Sufficient conditions are derived to establish optimal allocation strategies for two heterogeneous active redundancies at the component or system levels. Moreover, the results hold for component lifetimes following a general family of parametric distributions. The results guarantee the likelihood ratio (respectively, reversed hazard) ordering between coherent systems with active redundancies at the component level (respectively, system level). Some aging properties are also established. Several examples are provided to demonstrate the theoretical results.
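
As a small illustration of component-level versus system-level active redundancy, the sketch below compares the two allocations for a two-component series system. It assumes independent components with fixed reliabilities, which is simpler than the copula-based dependent setting of the article, and all numerical values are hypothetical.

```python
import math

def series(reliabilities):
    """Reliability of a series system of independent components."""
    return math.prod(reliabilities)

def parallel(reliabilities):
    """Reliability of an active-redundant (parallel) group of independent components."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)

# Hypothetical reliabilities at a fixed mission time.
components = [0.9, 0.8]
spares = [0.7, 0.6]

# Component level: each component is paired in parallel with its own spare.
component_level = series([parallel([c, s]) for c, s in zip(components, spares)])

# System level: the whole original system is duplicated by a system of spares.
system_level = parallel([series(components), series(spares)])
```

For these values the component-level allocation gives 0.8924 and the system-level allocation gives 0.8376, in line with the classical fact that, for coherent systems of independent components, redundancy at the component level is at least as reliable as redundancy at the system level.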

Real-Time Reconstruction and Actuation Error Analysis for Markov Sources over MPR Channels

2026-05-15T09:50:54Z

We study real-time reconstruction and actuation for two binary Markov sources that share a wireless multi-packet reception (MPR) channel. Each sensor follows a stationary randomized sampling policy, and the receiver maintains source estimates using the most recently decoded updates. We derive closed-form expressions for the steady-state real-time reconstruction error (RTE) and the cost of actuation error (CAE) as functions of the source transition probabilities and the effective update probabilities. We then characterize these update probabilities under randomized sampling, linking the physical-layer MPR model to task-oriented reconstruction and actuation metrics. Using these expressions, we formulate a sampling-constrained optimization problem with a weighted-error objective. The resulting analysis reveals how source dynamics, semantic weights, and MPR coupling affect the allocation of sampling resources. Numerical results show that optimized randomized sampling outperforms random, greedy, and time-sharing baselines.
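
The link between source dynamics and update probability can be illustrated with a heavily simplified, single-source sketch. It is not the paper's two-source MPR model: it assumes one symmetric binary Markov source that flips with probability `p` in each slot, a single lumped probability `q` that a fresh update is decoded in a slot, and the event order "source transition, then update". Under those assumptions the mismatch probability satisfies the one-step recursion e' = (1 - q) [e (1 - 2p) + p], whose fixed point is compared with a Monte Carlo estimate. All names are hypothetical.

```python
import random

def simulate_error(p, q, slots, seed=0):
    """Fraction of slots in which the receiver's estimate differs from the source."""
    rng = random.Random(seed)
    state, estimate, errors = 0, 0, 0
    for _ in range(slots):
        if rng.random() < p:      # source flips
            state ^= 1
        if rng.random() < q:      # an update is decoded, so the estimate is refreshed
            estimate = state
        errors += state != estimate
    return errors / slots

def fixed_point_error(p, q):
    """Fixed point of e' = (1 - q) * (e * (1 - 2p) + p) under the stated assumptions."""
    return (1 - q) * p / (1 - (1 - q) * (1 - 2 * p))

p, q = 0.1, 0.3
simulated = simulate_error(p, q, slots=200_000)
predicted = fixed_point_error(p, q)
```

In the paper's setting the effective update probability would itself depend on the sampling policies of both sensors and on the MPR channel; here it is simply taken as given.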

Optimum Peer-Turbo: A Scalable and Efficient Solution for P2P Broadcasting

2026-05-15T08:07:04Z

Blockchain systems such as Solana or Monad employ tree- or star-shaped broadcast topologies in which a single source node disseminates message shards to a set of target peers within a strictly bounded time window. In these architectures, shard propagation must complete before the next consensus step, making timely delivery to a large fraction of the validator set essential. A fundamental limitation of such designs is that the outbound bandwidth of the source node constitutes the primary system bottleneck. In this paper, we introduce peer-Turbo, a technique that allows target nodes to exchange shards using Random Linear Network Coding (RLNC), thereby assisting each other in completing decoding without requiring explicit shard state coordination. We use a tractable fluid approximation of the degree-of-freedom distribution of peer-Turbo-enabled systems to show that this approach reduces the source bandwidth required for a given service quality by up to one order of magnitude, or equivalently reduces propagation latency by one order of magnitude under fixed bandwidth constraints.
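
The idea of peers helping each other decode without tracking which shards the other side holds can be sketched with a toy RLNC implementation. This is an illustration under simplifying assumptions, not the paper's protocol: coefficients are taken over GF(2) (practical systems often use a larger field), shards are 32-bit integers, there are only two peers, and the class and function names are hypothetical.

```python
import random

def encode(shards, rng):
    """Source side: return (mask, payload), a random nonzero GF(2) combination."""
    mask = 0
    while mask == 0:
        mask = rng.getrandbits(len(shards))
    payload = 0
    for i, shard in enumerate(shards):
        if (mask >> i) & 1:
            payload ^= shard
    return mask, payload

class Decoder:
    """Keeps received combinations in reduced row-echelon form over GF(2)."""

    def __init__(self, k):
        self.k = k
        self.rows = {}  # pivot bit index -> (mask, payload)

    def receive(self, mask, payload):
        """Absorb a coded packet; return True if it added a degree of freedom."""
        for pivot, (m, p) in self.rows.items():
            if (mask >> pivot) & 1:
                mask ^= m
                payload ^= p
        if mask == 0:
            return False  # linearly dependent on what is already held
        pivot = mask.bit_length() - 1
        for other, (m, p) in list(self.rows.items()):
            if (m >> pivot) & 1:  # clear the new pivot from existing rows
                self.rows[other] = (m ^ mask, p ^ payload)
        self.rows[pivot] = (mask, payload)
        return True

    def decoded(self):
        """Return the original shards once k degrees of freedom are held."""
        if len(self.rows) < self.k:
            return None
        return [self.rows[i][1] for i in range(self.k)]

def recode(decoder, rng):
    """Peer side: random combination of held rows, with no knowledge of the receiver."""
    mask, payload = 0, 0
    for m, p in decoder.rows.values():
        if rng.random() < 0.5:
            mask ^= m
            payload ^= p
    return mask, payload

rng = random.Random(1)
k = 8
shards = [rng.getrandbits(32) for _ in range(k)]
a, b = Decoder(k), Decoder(k)
source_packets = 0
for round_idx in range(10_000):
    if a.decoded() is not None and b.decoded() is not None:
        break
    target = a if round_idx % 2 == 0 else b   # source serves one peer per round
    target.receive(*encode(shards, rng))
    source_packets += 1
    b.receive(*recode(a, rng))                # peers exchange recoded packets
    a.receive(*recode(b, rng))
```

Each peer forwards random combinations of whatever it currently holds, so neither side needs to know which shards the other is missing. A packet that brings nothing new is simply discarded by `receive`.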

Overfitting has a limitation: a model-independent generalization gap bound based on Rényi entropy

2026-05-15T06:45:39Z

Will further scaling up of machine learning models continue to bring success? A significant challenge in answering this question lies in understanding the generalization gap, which quantifies the impact of overfitting. Understanding how the generalization gap behaves for increasingly large-scale machine learning models remains a significant area of investigation, as conventional analyses often link error bounds to model complexity and thus fail to fully explain the success of extremely large architectures. This research introduces a novel perspective by establishing a model-independent upper bound on the generalization gap, applicable to algorithms whose outputs are determined solely by the data's histogram, such as empirical risk minimization or gradient-based methods. Crucially, this bound is shown to depend only on the Rényi entropy of the data-generating distribution, suggesting that a small generalization gap can be maintained even with arbitrarily large models, provided the data quantity is sufficient relative to this entropy. This framework offers a direct explanation for the phenomenon where generalization performance degrades significantly upon injecting random noise into data: the degradation is attributed to the consequent increase in the data distribution's Rényi entropy. Furthermore, we adapt the no-free-lunch theorem to be data-distribution-dependent, demonstrating that an amount of data corresponding to the Rényi entropy is indeed essential for successful learning, thereby highlighting the tightness of our proposed generalization bound.

A Finite-State Gibbs Construction from a Recognition Cost

2026-05-15T06:44:40Z

On a finite outcome space, the canonical Gibbs distribution is usually obtained by maximizing Shannon entropy at fixed mean of an externally supplied energy functional. This paper studies the finite-state consequences of a ratio-cost construction instead: after adopting the normalized d'Alembert degree-two closure called the Recognition Composition Law (RCL), with unit log-curvature calibration at the reference ratio, the continuous nontrivial positive branch is $J(x)=\tfrac12(x+x^{-1})-1=\cosh(\log x)-1$. Given the induced cost vector $X_ω=J(r_ω)$, multinomial counting and convex duality recover the finite-state Gibbs weights and the identity $F_{\mathrm{R}}(q)-F_{\mathrm{R}}(p)=T_{\mathrm{R}}\,D_{\mathrm{KL}}(q\Vert p)$; the entropy-maximization steps are classical once the cost is fixed. New technical content includes a non-asymptotic Stirling bound and soft-shell constrained-type theorems for real-valued costs. A three-state example compares the Gibbs law to squared-log, affinity-as-energy, and Tsallis alternatives at the same cost vector and mean-cost constraint, with sample-size power calculations at fixed RCL ground truth. The framework is conditional on axioms (A1)--(A3) and restricted to finite outcome spaces with strictly positive weights; it does not derive the composition law from a more primitive principle.
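
The two identities quoted above can be checked numerically on a small example. The sketch below assumes the standard free-energy functional $F(q)=\sum_ω q_ω X_ω + T\sum_ω q_ω\log q_ω$ (the paper's $F_{\mathrm{R}}$ may be normalized differently), and the ratios, temperature, and test distribution are hypothetical values chosen for illustration.

```python
import math

def J(x):
    """Ratio cost J(x) = (x + 1/x)/2 - 1 from the abstract."""
    return 0.5 * (x + 1.0 / x) - 1.0

def gibbs(costs, T):
    """Finite-state Gibbs weights p_w proportional to exp(-X_w / T)."""
    weights = [math.exp(-c / T) for c in costs]
    Z = sum(weights)
    return [w / Z for w in weights]

def free_energy(q, costs, T):
    """Assumed functional: mean cost minus T times Shannon entropy (in nats)."""
    mean_cost = sum(qi * c for qi, c in zip(q, costs))
    neg_entropy = sum(qi * math.log(qi) for qi in q if qi > 0)
    return mean_cost + T * neg_entropy

def kl(q, p):
    """Kullback-Leibler divergence D(q || p) in nats."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

ratios = [1.0, 2.0, 4.0]      # hypothetical ratios r_w
T = 0.7                       # hypothetical temperature
costs = [J(r) for r in ratios]
p = gibbs(costs, T)
q = [0.5, 0.3, 0.2]           # arbitrary comparison distribution
gap = free_energy(q, costs, T) - free_energy(p, costs, T)
```

Here `gap` agrees with `T * kl(q, p)` up to floating-point error, and `J(x)` agrees with `math.cosh(math.log(x)) - 1`. Both facts are classical once the cost vector is fixed, as the abstract notes.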

Statistical two-round search for one excellent element

2026-05-15T04:50:05Z

We formulate and study a statistical version of Katona's two-round search problem of finding at least one excellent element in a set. A population of $n$ elements is considered, where each element is independently excellent with probability $λ/n$, $λ> 0$. A subset test is noiseless: it returns positive exactly when the queried subset contains at least one excellent element. The goal is to minimize the expected number of tests subject to finding one excellent element with probability at least $1-α$, where $0<α<1$, under the restriction that testing is performed in two rounds. Unlike classical group testing, the objective is not to recover the full set of excellent elements, but only to identify one of them. We first show that success is fundamentally limited by the possibility that no excellent element exists. In the sparse Poisson regime, this imposes the necessary feasibility condition $α\ge e^{-λ}$. When the target success probability is feasible, we prove that the optimal expected number of tests grows logarithmically with the population size. The upper bound is obtained by combining an initial existence test with a second-round separating design; the lower bound follows from an information-counting argument. Numerical illustrations show the feasibility boundary and the resulting logarithmic scaling.
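
The feasibility condition can be checked with a few lines of arithmetic. With each of the $n$ elements independently excellent with probability $λ/n$, the probability that no excellent element exists is $(1-λ/n)^n$, which tends to $e^{-λ}$. No procedure can succeed in that event, so the failure probability $α$ cannot be pushed below this value. The helper names below are hypothetical.

```python
import math

def prob_no_excellent(n, lam):
    """Exact probability that none of the n elements is excellent."""
    return (1.0 - lam / n) ** n

def is_feasible(alpha, lam):
    """Limiting feasibility condition alpha >= exp(-lam) from the abstract."""
    return alpha >= math.exp(-lam)

lam = 2.0
finite_n = prob_no_excellent(10**6, lam)   # close to exp(-2)
limit = math.exp(-lam)
```

For $λ=2$ the limit is about 0.135, so a target failure probability of 0.2 is feasible while 0.1 is not.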

Dual-Scale Antenna Deployment for Pinching Antenna Systems

2026-05-15T03:38:38Z

A dual-scale deployment (DSD) framework is proposed for pinching antenna systems (PASS), under which four protocols are provided. 1) For the coarse-scale deployment, the pinching antenna (PA) is transferred over a large-scale range at the waveguide level. 2) For the fine-scale deployment, the PA is adjusted with high precision within a small-scale region. By simultaneously optimizing both scales, the proposed DSD framework can unleash the full potential of PA deployment, while maintaining low computational complexity. Based on this framework, we establish a practical power consumption model and derive theoretical energy efficiency expressions for PASS. Then, an energy-efficiency maximization problem is formulated to jointly optimize the transmit precoding, PA radiation power, and dual-scale PA deployment. To solve this non-convex, highly coupled problem, a low-complexity penalty-based alternating optimization algorithm is proposed. Simulation results validate the accuracy of theoretical results and the convergence of the proposed algorithm. It is demonstrated that the proposed DSD framework is highly effective for PASS, delivering about $70\%$ higher energy efficiency than the conventional cell-free architecture and nearly a twofold improvement relative to MIMO systems.

On the Generalization of Knowledge Distillation: An Information-Theoretic View

2026-05-15T03:25:35Z

Knowledge distillation is widely used to improve generalization in practice, yet its theoretical understanding remains elusive. In the standard distillation setting, a teacher model provides soft predictions to guide the training of a student model. We model teacher and student training as coupled stochastic processes and introduce a distillation divergence, defined as the Kullback-Leibler divergence between these two stochastic kernels. Within this framework, we derive two generalization bounds for the student model relative to the teacher's generalization gap: an upper bound under a sub-Gaussian assumption via algorithmic stability, and a lower bound under a central condition with sharper dependence on the distillation divergence. We further develop a loss-sharpness-aware bound with an explicit tightness regime, showing that the teacher's local flatness can strictly tighten the bound. Additionally, in a linear Gaussian case study, the distillation divergence admits an interpretable decomposition into bias, variance, and rank-bottleneck costs, yielding practical guidance for distillation design.

PrismQuant: Rate-Distortion-Optimal Vector Quantization for Gaussian-Mixture Sources

2026-05-15T01:05:28Z

For a Gaussian source under mean-squared error (MSE), classical transform coding is rate--distortion (RD) optimal: the Karhunen--Loève transform (KLT) diagonalizes the covariance, reverse waterfilling allocates the bits, and scalar quantization closes the loop. This elegant story breaks down for multimodal sources, where no single covariance can capture heterogeneous local geometries, and the RD function loses its closed form. We revisit this problem through Gaussian-mixture sources and develop a constructive RD theory for them. Our key finding is that the mixture structure incurs only a component label cost. Conditioned on the active mixture component, each branch is Gaussian; the challenge is allocating bits across heterogeneous branches. We prove that the genie-aided conditional RD function is governed by a single global reverse-waterfilling level shared across all components and eigenmodes. Building on this result, we introduce PrismQuant, which transmits the component label losslessly and encodes the residual using the component-matched KLT, followed by scalar quantization, achieving a rate within $H(C)/n$ bits per source dimension of the converse, with a vanishing asymptotic gap. We further develop a practical implementation based on EM-driven Gaussian-mixture learning, component-adaptive KLTs, and entropy-constrained scalar quantization (ECSQ). Experiments on synthetic Gaussian mixtures show that PrismQuant closely approaches the theoretical RD bound, while experiments on real-world channel-state-information (CSI) data demonstrate competitive or superior performance compared with transformer-based learned codecs at more than one order of magnitude smaller model size.
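
The single global reverse-waterfilling level can be sketched as follows. This is an illustrative computation under stated assumptions, not the PrismQuant implementation: the decoder is assumed to know the active component (the genie-aided setting), each component is described only by its probability and covariance eigenvalues, distortion is the expected total squared error per source vector, and the water level is found by bisection. Function and variable names are hypothetical.

```python
import math

def reverse_waterfill(weights, eigvals, target_distortion, iters=100):
    """Return (theta, rate_bits) for one water level shared by all components.

    weights[c] is the probability of component c and eigvals[c] its covariance
    eigenvalues. Requires 0 < target_distortion <= expected total variance.
    """
    def distortion(theta):
        return sum(w * sum(min(theta, lam) for lam in lams)
                   for w, lams in zip(weights, eigvals))

    lo, hi = 0.0, max(max(lams) for lams in eigvals)
    for _ in range(iters):                 # bisection on the water level
        mid = 0.5 * (lo + hi)
        if distortion(mid) < target_distortion:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    # Modes with eigenvalue below theta get zero rate; the others get
    # 0.5 * log2(lambda / theta) bits, averaged over components.
    rate = sum(w * sum(0.5 * math.log2(lam / min(theta, lam)) for lam in lams)
               for w, lams in zip(weights, eigvals))
    return theta, rate

# Single Gaussian with eigenvalues 4 and 1 and total distortion 1: theta = 0.5.
theta_single, rate_single = reverse_waterfill([1.0], [[4.0, 1.0]], 1.0)

# Hypothetical two-component mixture sharing one water level.
theta_mix, rate_mix = reverse_waterfill([0.5, 0.5], [[4.0, 1.0], [2.0, 0.5]], 1.0)
```

In the single-Gaussian check the rate is 0.5·log2(8) + 0.5·log2(2) = 2 bits per vector. For the mixture, the same `theta` applies to every eigenmode of every component, which is why a one-dimensional search suffices; the label cost is not included in `rate`.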

Additivity Results for the Rényi-2 Entanglement of Purification

2026-05-14T21:41:31Z

We reformulate the Rényi entanglement of purification as a constrained minimum output Rényi entropy problem. Equivalently, for $p>1$, this formulation can be expressed in terms of a constrained maximal output Schatten $p$-norm. More precisely, for a completely positive map $Ω:L(B')\to L(A)$, we consider the quantity $\upsilon_p(Ω)$ defined by optimizing $\|(Ω\otimes \mathrm{id}_E)(σ^{B'E})\|_p$ over all bipartite states $σ^{B'E}$ whose $B'$-marginal is maximally mixed. We focus on the case $p=2$. First, we compute $\upsilon_2$ for the transpose-depolarizing channel and prove that it is multiplicative under tensor powers. We then establish a general multiplicativity criterion: whenever a completely positive map $N:L(B')\to L(A)$ satisfies $N^{\dagger} \mathbin{\circ} N=a\,\mathrm{id}_A+b\,\mathrm{Tr}[\cdot]\,I_d$ for some constants $a,b\ge 0$, where $N^{\dagger}$ denotes the Hilbert-Schmidt adjoint of $N$, the quantity $\upsilon_2(N)$ is multiplicative under tensor powers. Examples of channels satisfying this criterion include the transpose-depolarizing channel, the depolarizing channel, and their respective complementary channels. Furthermore, we show that, for every completely positive map $Ω$, multiplicativity of $\upsilon_p(Ω)$ implies multiplicativity for its complementary map. This yields the corresponding additivity statements for the associated Rényi-2 entanglement of purification.

Orthogonal Polynomials and the MacWilliams Transform for Permutation-Invariant Qudit Codes

2026-05-14T19:58:50Z

We derive an explicit formula for the intrinsic MacWilliams transform for permutation-invariant qudit codes. Such codes naturally live in symmetric power representations, where the relevant error sectors are determined by the irreducible decomposition of the conjugation action on the associated operator space. Using the multiplicity-free structure of this decomposition and the corresponding intertwiner algebra, we identify the intrinsic MacWilliams matrix with a finite Racah transform. The entries are given by a terminating hypergeometric series, and the rows of the matrix are Racah orthogonal polynomials with parameters determined explicitly by the block length and local dimension. Computing the spectrum of the degree-one twirl reveals that this spectrum lies on an affine quadratic lattice. Then we derive a tridiagonal multiplication rule from the representation theory of the adjoint sector. As consequences, we obtain closed-form orthogonality, detailed-balance, and involutivity identities for the transform. The resulting formula supplies an explicit MacWilliams matrix for computing linear programming bounds on permutation-invariant qudit codes.

Mitigation of UE Antenna Calibration Errors via Differential STBC in Cell-Free Massive MIMO

2026-05-14T18:33:41Z

This letter investigates the use of differential space-time block coding (DSTBC) to address antenna array calibration impairments at multi-antenna user equipment (UE) in the downlink (DL) of cell-free massive MIMO (CF-mMIMO) systems. We show that, by exploiting DSTBC, reliable DL communication can be achieved without explicit UE-side calibration or channel phase knowledge. Simulation results demonstrate that the proposed DSTBC-based transmission effectively mitigates the impact of antenna-dependent phase offsets, restoring near-coherent performance in CF-mMIMO networks.

Universal quantum resource distillation via composite generalised quantum Stein's lemma

2026-05-14T17:57:05Z

The performance of quantum resource manipulation protocols, including key examples such as distillation of quantum entanglement, is measured in terms of the rate at which desired target states can be produced from a given noisy state. However, to achieve optimal rates, known protocols require precise tailoring to the quantum state in question, demanding a perfect knowledge of the input and allowing no errors in its preparation. Here we show that distillation of quantum resources in the framework of resource non-generating operations can be performed universally: optimal rates of distillation can be achieved with no knowledge of the input state whatsoever, certifying the robustness of quantum resource distillation. The findings apply in particular to the purification of quantum entanglement under non-entangling maps, where the optimal rates are governed by the regularised relative entropy of entanglement. Our result relies on an extension of the generalised quantum Stein's lemma in quantum hypothesis testing to a composite setting where the null hypothesis is no longer a fixed quantum state, but is rather composed of i.i.d. copies of an unknown state. The solution of this asymptotic problem is made possible through new developments in one-shot quantum information and a refinement of the blurring technique from [Lami, arXiv:2408.06410].