https://arxiv.org/api/9nSyOGi17pK4UIsU/fxdZFML/Is
Updated: 2026-06-22T01:14:56Z

http://arxiv.org/abs/2606.07852v1
Affine Filtering Measurements and Their Applications to Quantum Decoding
Updated: 2026-06-05T21:25:34Z
Unambiguous state discrimination (USD) measurements are attractive because outcomes are either marked as conclusive (i.e., error free) or inconclusive (i.e., erased). We study affine filtering measurements, a structured variant of USD for decoding classical linear codes over pure-state classical-quantum channels, where a conclusive outcome identifies an affine subspace containing the transmitted codeword and an inconclusive outcome is treated as an erasure. For a group-covariant indexing of pure-state codewords, we show that the optimal design of affine filtering measurements is a semidefinite program that can be reduced to a linear program via character-based diagonalization. We use the resulting measurement to build a quantum decoding framework for local codes, and we demonstrate (via simulations on regular LDPC codes from Gallager ensembles using single parity check local constraints) that affine filtering based decoding can outperform symbol-wise USD and symbol-wise pretty good measurement based decoding methods on i.i.d. pure-state channels. In an independent and concurrent work, Buzet and Chailloux study similar fine-grained USD measurements for symmetric families of states. Their focus is on the code-agnostic setting whereas our focus is on code-aware constructions and decoding.
Published: 2026-06-05T21:25:34Z
Authors: Avijit Mandal, Noah Shutty, Henry D. Pfister, Stephen P. Jordan

http://arxiv.org/abs/2602.18364v2
Quantum Maximum Likelihood Prediction via Hilbert Space Embeddings
Updated: 2026-06-05T18:59:00Z
Maximum likelihood prediction (MLP) is a core task at the heart of modern large language models. Here, we study a quantum version of this task for a simplified data model consisting of independent and identically distributed samples, as a first step.
The quantum maximum likelihood predictor is obtained by embedding empirical probability distributions into quantum states and minimizing the quantum relative entropy over a given class of states. We provide an interpretation of this predictor in terms of the quantum reverse information projection and the quantum Pythagorean theorem when the class of quantum models is sufficiently expressive. We further derive non-asymptotic performance guarantees in terms of convergence rates and concentration inequalities, both in trace norm and quantum relative entropy. Our approach provides a unified framework to handle MLP within both classical and quantum LLMs.
Published: 2026-02-20T17:16:38Z
Comments: 31+3 pages, 1 figure
Authors: Sreejith Sreekumar, Nir Weinberger

http://arxiv.org/abs/2511.18945v4
MIST: Mutual Information Estimation Via Supervised Training
Updated: 2026-06-05T17:37:11Z
We propose a fully data-driven approach to designing mutual information (MI) estimators. Since any MI estimator is a function of the observed sample from two random variables, we parameterize this function with a neural network (MIST) and train it end-to-end to predict MI values. Training is performed on a large meta-dataset of 625,000 synthetic joint distributions with known ground-truth MI. To handle variable sample sizes and dimensions, we employ a two-dimensional attention scheme ensuring permutation invariance across input samples. To quantify uncertainty, we optimize a quantile regression loss, enabling the estimator to approximate the sampling distribution of MI rather than return a single point estimate. This research program departs from prior work by taking a fully empirical route, trading universal theoretical guarantees for flexibility and efficiency. Empirically, the learned estimators largely outperform classical baselines across sample sizes and dimensions, including on joint distributions unseen during training.
The resulting quantile-based intervals are well-calibrated and more reliable than bootstrap-based confidence intervals, while inference is orders of magnitude faster than existing neural baselines. Beyond immediate empirical gains, this framework yields trainable, fully differentiable estimators that can be embedded into larger learning pipelines. Moreover, exploiting MI's invariance to invertible transformations, meta-datasets can be adapted to arbitrary data modalities via normalizing flows, enabling flexible training for diverse target meta-distributions.
Published: 2025-11-24T09:55:28Z
Authors: German Gritsai, Megan Richards, Maxime Méloux, Kyunghyun Cho, Maxime Peyrard

http://arxiv.org/abs/2507.15805v2
Sparse Discovery of Functional Relationships in Solutions to Systems of Differential Equations
Updated: 2026-06-05T17:33:50Z
This work develops a framework to discover relations between the components of the solution to a given initial-value problem for a first-order system of ordinary differential equations. This is done by using sparse identification techniques on the data represented by the numerical solution of the initial-value problem at hand. The only assumption is that there are only a few terms that connect the components, so that the mathematical relations to be discovered are sparse in the set of possible functions. We illustrate the method through examples of applications.
Published: 2025-07-21T17:05:12Z
Comments: 11 pages
Authors: Nicolae Tarfulea

http://arxiv.org/abs/2507.12927v2
Trace Reconstruction with Language Models
Updated: 2026-06-05T17:29:13Z
The general trace reconstruction problem seeks to recover an original sequence from its noisy copies independently corrupted by insertions, deletions, and substitutions. This problem arises in applications such as DNA data storage, a promising storage medium due to its high information density and longevity.
However, errors introduced during DNA synthesis, storage, and sequencing require correction through algorithms and codes, with trace reconstruction often used as part of data retrieval. In this work, we propose TReconLM, a decoder-only transformer that solves trace reconstruction as a next-token prediction task. TReconLM outperforms state-of-the-art trace reconstruction algorithms, including prior deep-learning approaches, recovering a substantially higher fraction of sequences without error. We pretrain on synthetic data generated from a simple error model and fine-tune on real-world data to adapt to technology-specific error patterns. Code is available at https://github.com/MLI-lab/TReconLM.
Published: 2025-07-17T09:08:41Z
Authors: Franziska Weindel, Michael Girsch, Reinhard Heckel

http://arxiv.org/abs/2504.00613v2
LLM-Guided Search for Deletion-Correcting Codes
Updated: 2026-06-05T17:17:48Z
Finding deletion-correcting codes of maximum size has been an open problem for over 70 years, even for a single deletion. We adapt FunSearch, a large language model (LLM)-guided evolutionary search, to discover functions that construct deletion-correcting codes at short code lengths. For a single deletion, our search finds a function that we prove constructs the conjectured-optimal Varshamov-Tenengolts code. For multiple deletions and quaternary edit codes, the discovered functions improve on prior explicit, search-based, and neural constructions but remain empirical heuristics without new theoretical insights. We study design choices for LLM-guided evolutionary search and find that, for our problem, compute is better allocated to sampling more functions than to longer reasoning traces per function, and that co-evolving natural language descriptions with code hurts search quality. We propose deduplicating logically identical functions during evolution, which we find critical for search diversity.
Our results demonstrate the potential of LLM-guided evolutionary search for information theory and code design and represent the first application of such methods for constructing error-correcting codes. However, in our current formulation, evaluating a function scales exponentially with code length, limiting the approach to short codes.
Published: 2025-04-01T10:11:32Z
Authors: Franziska Weindel, Reinhard Heckel

http://arxiv.org/abs/2606.07443v1
Sort, Partition, Randomize: Optimal Binary Hypothesis Testing under Local Differential Privacy
Updated: 2026-06-05T16:41:31Z
We study optimal design of $\varepsilon$-locally differentially private mechanisms for binary hypothesis testing. Each observation is drawn from one of two known distributions $P_0,P_1$ on a finite alphabet of size $k$, privatized by a mechanism $Q$, and then used to infer which distribution generated the data. We measure testing utility using an $f$-divergence, including total variation, KL, and hockey-stick divergences, between the two induced output distributions. Previous work established structural properties of optimal mechanisms, but only yielded exponential-time algorithms. We prove a sharp structure: for every $\varepsilon$ and every $f$-divergence objective, after sorting the alphabet by likelihood ratio, there exists an optimal mechanism that partitions the sorted alphabet into contiguous blocks and applies randomized response to the block label. We call this class Sort-Partition-Randomize (SPR). This characterization yields an exact dynamic program that computes an optimal mechanism in $O(k^3)$ time, and more generally in $O(\ell k^2)$ time with an $\ell$-output budget.
Our results make it possible to efficiently compute and characterize the exact optimum across the full privacy range, beyond asymptotic privacy regimes.
Published: 2026-06-05T16:41:31Z
Comments: 42 pages, 6 figures
Authors: Elena Ghazi, Jawad Nasser, Flavio Calmon, Ibrahim Issa

http://arxiv.org/abs/2606.07409v1
Rate Loss in Quantum Channels with Classical State and Applications for Quantum Broadcast Channels
Updated: 2026-06-05T15:57:33Z
We consider the problem of \textit{rate loss} - a strict penalty suffered in achievable rates due to the lack of channel state information at the receiver (Rx) of a classical-quantum (CQ) channel. First, we identify non-commutative CQ channels and analytically prove a rate loss. Building on this, we next prove that coset-code-based strategies can strictly outperform conventional unstructured IID-code-based strategies for non-commutative 3-user CQ broadcast channels.
Published: 2026-06-05T15:57:33Z
Authors: Igor Bernard, Arun Padakandla

http://arxiv.org/abs/2109.12586v5
Distributed Instrument Simulation with Quantum Side Information in the One-Shot Regime
Updated: 2026-06-05T15:46:06Z
Three distributed parties, two transmitters (Txs) and a receiver (Rx), hold one component each of a tripartite quantum state \(ρ^{A_1A_2C}\). The goal is to simulate the action of a separable instrument acting on the \(A_1\) and \(A_2\) components, with the Rx recovering the classical outcome. To enable this, each Tx \(k\) can transfer bits on a noiseless bit pipe and share randomness at rates \(R_k\) and \(C_k\), respectively, with the Rx. Undertaking a Shannon-theoretic study, we characterize two new sets of inner bounds. The first set, derived for the one-shot regime, is based on instrument simulation protocols built using unstructured IID codes, while the second set, derived for the asymptotic regime, relies on coset codes and new decoding POVMs. The first set of bounds recovers currently known inner bounds for instrument and measurement simulation in all previously studied scenarios.
Our protocols are based on likelihood POVMs, and our analysis leverages Sen's smooth multiparty covering and simultaneous decoding, while handling the distributed-component scenario via a compatible operator sliding trick.
Published: 2021-09-26T12:44:15Z
Authors: Igor Bernard, Arun Padakandla

http://arxiv.org/abs/2606.07390v1
Exact output statistics of Icart's encoding in the exceptional \(j=0\) case
Updated: 2026-06-05T15:27:21Z
Icart's encoding is a classical deterministic map from finite fields to elliptic curves and a basic ingredient in early hash-to-curve constructions. We determine the exact one-output distribution of this map in the exceptional \(j=0\) case. More precisely, for \[
E_{0,b}:Y^2=X^3+b,\ q\equiv2\pmod3, \] we compute the complete fibre distribution of \(f_{0,b}:\mathbb F_q\to E_{0,b}(\mathbb F_q)\). This gives closed formulae for the image size, total variation distance from uniform, collision probability, power sums, entropy measures and basic batch statistics. We also derive the exact second moment of all nontrivial character sums of the output distribution. Via the Weil pairing, this becomes an exact energy formula for pairing-character tests on the supersingular \(j=0\) family whose odd prime order subgroups have embedding degree two.
Published: 2026-06-05T15:27:21Z
Comments: 11 pages
Authors: David Kumallagov

http://arxiv.org/abs/2606.07346v1
Geometric Factorization of Sufficient Harmonic Representations
Updated: 2026-06-05T14:56:15Z
For tasks of likelihood families invariant under the action of a Lie group, the quotient is the minimal sufficient invariant representation. On compact homogeneous spaces, this quotient representation admits a harmonic realization through spherical Fourier coefficients; for finite-band harmonic exponential families, the empirical harmonic coefficients are minimal sufficient statistics. The partition function can be expressed algebraically by extracting the trivial representation component through Clebsch-Gordan decomposition.
Published: 2026-06-05T14:56:15Z
Authors: Kennon Stewart

http://arxiv.org/abs/2606.07325v1
A Temporal Spatial Minimax Rate for Smoothly-Varying Distributions in Wasserstein Space
Updated: 2026-06-05T14:43:10Z
We study the minimax rate of estimating a future value $μ_{t_n+h}$ of a curve $t\mapstoμ_t$ in the $2$-Wasserstein space $\mathcal{P}_2(\mathbb{R}^d)$ from finitely many noisy snapshots of its past, under an adiabatic bound $\|\nabla_t^k v\|\le\varepsilon$ on the $k$-th covariant derivative of the velocity field.
Our central result is a unified temporal-spatial minimax lower bound: over regular, locally transport-rich subclasses, every estimator incurs $W_2$-risk with $M$-exponent $γ_d(k+1)/(k+1+γ_d)$, $γ_d=\min(1/d,1/2)$ ($M$ the total sample size). It follows from a temporal-to-spatial reduction: the smoothness budget defines a reachable $W_2$-ball into which a transport packing is embedded along the time axis, and the information of the entire snapshot experiment is controlled by a Fano argument -- the spatial packing is classical, but its smoothness-admissible temporal embedding and the full-window analysis are new. The bound interpolates a dimension-free extrapolation floor of order $\varepsilon h^{k+1}$ -- the irreducible cost of an unobserved future, present even with the exact past -- and the spatial estimation curse $M^{-γ_d}$, recovering the static distribution-estimation rate as $k\to\infty$. We state the lower bound in a design-dependent form -- with a design-weighted effective sample size -- valid for arbitrary observation times, and obtain the closed-form exponent in the dense (equispaced) regime. The matching upper bound is established at $k=0$ (rate $M^{-1/(d+1)}$, $d\ge3$) and, in a translation submodel, for all $k$; for $k\ge1$ a covariant estimator attains the rate conditionally on two estimates (a comparison-geometry bias bound and an optimal-transport map-estimation rate), leaving the unconditional general-$k$ upper bound as an open problem. Numerical experiments on synthetic curved and flat families corroborate the predicted exponents.
Published: 2026-06-05T14:43:10Z
Authors: Munsik Kim

http://arxiv.org/abs/2606.07321v1
Letting Homogeneity Entropy Select S-Pairs in Buchberger's Algorithm
Updated: 2026-06-05T14:39:35Z
We present a novel S-pair selection strategy called Homogeneity Entropy, for deciding the sequence of S-polynomials to construct in Buchberger's algorithm to compute a Groebner basis.
The strategy uses an information theoretic measure derived from the distribution of degrees among the monomials of the S-polynomial: a very different approach to the classical heuristics such as Degree, Normal and Sugar, or indeed the more recent machine learning approaches to the problem. We implement this strategy and evaluate it on two different datasets: (1) variations of randomly generated polynomial systems with controlled numbers of variables, degrees, and densities; and (2) the PHCpack benchmark dataset sourced from real world problems. The Homogeneity Entropy strategy significantly outperforms classical strategies on random polynomial datasets, but on the PHCpack dataset the classical strategies perform better. This suggests the right strategy varies with the shape of the data and we explore this in several experiments. The new strategy offers practically meaningful gains on certain distributions, and represents the first use of such information-theoretic guidance in the optimisation of symbolic computation algorithms.
Published: 2026-06-05T14:39:35Z
Authors: Uzma Shafiq, Matthew England, AmirHosein Sadeghimanesh, Nayyar Zaidi

http://arxiv.org/abs/2606.07277v1
The Capacity of Information-Theoretic Secure Aggregation in Federated Learning
Updated: 2026-06-05T13:53:27Z
Secure aggregation allows a server to aggregate users' local updates while preserving update privacy. Existing information-theoretic problems typically assume that correlated random keys are provided by a trusted third party (TTP) or generated via prescribed groupwise structures, while the communication cost for establishing such correlated keys is often ignored. Consequently, the fundamental limits under general key-distribution mechanisms remain unknown. In this paper, we study the $T$-colluding information-theoretic secure aggregation problem with $N$ users under a general two-phase framework consisting of a key distribution phase and an update aggregation phase.
Unlike prior work, we model key distribution through user-to-user communication and allow arbitrary user-generated key-distribution mechanisms, eliminating TTP or prescribed structures. This enables a joint characterization of three resources: randomness for security, key-distribution communication, and aggregation communication. We completely characterize the capacity region among these three resources by constructing a novel secure aggregation scheme together with a matching information-theoretic converse. In particular, we develop an explicit deterministic capacity-achieving construction over any finite field of size at least $N$, whereas most existing schemes either rely on TTP or employ randomized or existential constructions over sufficiently large finite fields. We further show that the optimal performance can be achieved using only pairwise shared keys, enabling implementation via Diffie--Hellman key exchange. Compared with Google's seminal secure aggregation scheme, the proposed scheme requires fewer random masking keys while preserving the same aggregation communication overhead.
Published: 2026-06-05T13:53:27Z
Authors: Lanxin Yi, Jinbao Zhu, Kai Wan, Xiaohu Tang

http://arxiv.org/abs/2601.07622v2
Clipped Affine Policy: Low-Complexity Near-Optimal Online Power Control for Energy Harvesting Communications over Fading Channels
Updated: 2026-06-05T13:46:41Z
This paper studies online power control for battery-limited point-to-point energy harvesting communications over slow block-fading channels. A linear-policy-based approximation is developed for the relative-value function in the Bellman equation of the power control problem. This approximation leads to two fundamental parameterized clipped affine policies: an optimistic policy derived from a certainty-equivalence-type approximation and a robust policy derived from worst-case analysis.
For independent and identically distributed energy arrivals and channel states, two families of power control schemes are developed based on the optimistic clipped affine (OCA) and robust clipped affine (RCA) policies, respectively. The proposed adaptive RCA policy based on reinforcement learning (RCA-RL) is further extended to address four scenarios with contextual information: one-step energy lookahead, one-step channel lookahead, one-step joint energy-channel lookahead, and Markov energy arrivals. Extensive simulation results show that the proposed schemes provide a favorable tradeoff between computational complexity and performance. The adaptive RCA policy based on the maximin optimal linear-policy-slope approximation (RCA-OLA-A) and the RCA-RL scheme achieve the best overall performance, while the RCA policy based on the maximin optimal linear policy (RCA-OL) is the best-performing closed-form policy. In particular, RCA-OLA-A, RCA-RL, and the aforementioned RCA-RL extensions achieve less than 2% performance loss relative to the optimal policy across a range of scenarios, consistently outperforming the considered benchmark approaches, including generic reinforcement learning baselines. The RCA-OL policy also performs well with less than 4% performance loss.
Published: 2026-01-12T15:06:39Z
Comments: 29 pages, 15 figures, v1.0
Authors: Hao Wu, Shengtian Yang, Huiguo Gao, Diao Wang, Jun Chen, Guanding Yu