Sequences with thirteen-valued cross correlations

2026-05-19T02:30:06Z

In this paper, we completely determine the cross correlation distribution between an $m$-sequence $(s_t)$ of period $p^n-1$ and its $d$-decimated sequence $(s_{dt})$, where $d = \frac{p^n-1}{3} + p^i$, $p \equiv 1 \pmod{3}$, $\frac{1}{3}p^{-i}(p^n-1) \not\equiv 2 \pmod{3}$, and $0 \leq i < n$. It is shown that the cross correlation is $13$-valued. To the best of our knowledge, this is the first time that a cross correlation distribution with so many values has been determined.
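
To make the objects concrete, the following is a minimal sketch of the standard cross correlation $C_d(\tau) = \sum_{t} \omega^{s_{t+\tau} - s_{dt}}$ with $\omega = e^{2\pi i/p}$, evaluated on a toy case. The choice $p = 13$, $n = 1$, $i = 0$ and the primitive root $g = 2$ are assumptions made only for illustration; for $n = 1$ the $m$-sequence reduces to $s_t = g^t \bmod p$. This case is far too small to exhibit the 13-valued distribution and is not taken from the paper.

```python
import cmath
from math import gcd


def cross_correlation(p, g, d):
    """Return [C_d(tau) for tau in 0..p-2] for the sequence s_t = g^t mod p."""
    period = p - 1
    omega = cmath.exp(2j * cmath.pi / p)  # primitive complex p-th root of unity
    s = [pow(g, t, p) for t in range(period)]
    values = []
    for tau in range(period):
        total = sum(
            omega ** (s[(t + tau) % period] - s[(d * t) % period])
            for t in range(period)
        )
        values.append(total)
    return values


# Toy parameters (assumed for illustration only).
p, n, i = 13, 1, 0
g = 2  # 2 is a primitive root modulo 13
d = (p**n - 1) // 3 + p**i

# Conditions from the abstract, specialised to i = 0.
assert p % 3 == 1
assert ((p**n - 1) // 3) % 3 != 2
# Here d happens to be coprime to the period, so (s_dt) has full period.
assert gcd(d, p**n - 1) == 1

spectrum = cross_correlation(p, g, d)
for tau, value in enumerate(spectrum):
    print(tau, round(value.real, 6), round(value.imag, 6))
```

Because $d$ is coprime to the period in this toy case, the values $C_d(\tau)$ summed over all shifts equal $1$, which is a convenient sanity check on the implementation.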

Temporary Power Adjusting Withholding Attack

2026-05-19T01:53:11Z

We consider the block withholding attacks on pools, more specifically the state-of-the-art Power Adjusting Withholding (PAW) attack. We propose a generalization called Temporary PAW (T-PAW), in which the adversary withholds a fPoW from pool mining for at most time $T$, even when no other block is mined. We show that the PAW attack corresponds to $T\to\infty$ and is not optimal. In fact, the extra reward of T-PAW compared to PAW improves by an unbounded factor as the adversarial hash fraction $α$, pool size $β$ and adversarial network influence $γ$ decrease. For example, the extra reward of T-PAW is 22 times that of PAW when an adversary targets a pool with $(α,β,γ)=(0.05,0.05,0)$. We show that honest mining is sub-optimal compared to T-PAW even when there is no difficulty adjustment, and that the adversarial revenue increase is non-trivial, e.g., for most $(α,β)$ at least $1\%$ within $2$ weeks in Bitcoin even when $γ=0$ (for PAW it was at most $0.01\%$). Hence, T-PAW exposes a significant structural weakness in pooled mining: its primary participants, small miners, are not only contributors but can easily turn into potential adversaries with immediate non-trivial benefits.

Minimax optimal submatrix detection: Sharp non-asymptotic rates

2026-05-19T00:41:34Z

Given an observation $\mathbf Y \in \mathbb{R}^{d_1\times d_2}$ from the model $\mathbf Y = \mathbf X + \mathbf E$ where $\mathbf X$ is constant and $\mathbf E$ has i.i.d. $N(0,1)$ entries, we consider the problem of detecting a planted submatrix in the mean matrix $\mathbf X$. Specifically, we aim to distinguish the null hypothesis $\mathbf X = 0$ from the alternative hypothesis in which $\mathbf X$ is non-zero only on a submatrix of size $s_1 \times s_2$ with elevated entries bounded below by $μ>0$. We establish a minimax lower bound characterizing how large $μ$ must be to ensure that the two hypotheses are distinguishable with high probability. Furthermore, we derive novel minimax-optimal tests achieving the lower bound, and describe extensions of these tests that are adaptive to unknown sparsity levels $s_1$ and $s_2$. In contrast with previous work, which required restrictive assumptions on $s_1,s_2, d_1$ and $d_2$, our non-asymptotic upper and lower bounds match for any configuration of these parameters.
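
The observation model can be simulated in a few lines. The sketch below draws $\mathbf Y = \mathbf X + \mathbf E$ and evaluates a normalized total-sum statistic. This statistic is a textbook baseline chosen here for illustration, not one of the minimax-optimal tests of the paper, and the dimensions, seed and signal strength are arbitrary assumptions.

```python
import random


def sample_observation(d1, d2, s1, s2, mu, rng):
    """Draw Y = X + E, where X equals mu on a random s1 x s2 submatrix and 0 elsewhere."""
    rows = set(rng.sample(range(d1), s1))
    cols = set(rng.sample(range(d2), s2))
    return [
        [(mu if (r in rows and c in cols) else 0.0) + rng.gauss(0.0, 1.0)
         for c in range(d2)]
        for r in range(d1)
    ]


def total_sum_statistic(Y):
    """Sum of all entries divided by sqrt(d1 * d2); distributed N(0, 1) when X = 0."""
    d1, d2 = len(Y), len(Y[0])
    return sum(map(sum, Y)) / (d1 * d2) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible
# Null hypothesis: no planted submatrix.
null_stat = total_sum_statistic(sample_observation(50, 50, 0, 0, 0.0, rng))
# Alternative: a 20 x 20 block with mean 1, so the statistic has mean 400 / 50 = 8.
alt_stat = total_sum_statistic(sample_observation(50, 50, 20, 20, 1.0, rng))
print(null_stat, alt_stat)
```

The total-sum statistic is only sensitive when the planted block is large; for small $s_1, s_2$ a scan over candidate submatrices is the usual alternative, at a much higher computational cost.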

The Thermodynamic Costs of Simple Linear Regression

2026-05-18T23:51:02Z

The construction of models from data is a significant contributor to the energetic costs of computation. Because of this, understanding how foundational thermodynamic bounds apply to modeling algorithms will be increasingly important. Here, we study the thermodynamic costs of a basic and fundamental modeling algorithm: simple linear regression. Following Landauer, we approximate the thermodynamic lower bound on irreversibly performing both exact linear regression and linear regression via stochastic gradient descent as implemented on floating-point numbers. From this, we derive energy-cost-aware scaling laws for the optimal dataset size for training a linear regression model given a generalization-error-dependent demand for inference. Additionally, we discuss a method to lower bound the entropy production from the mismatch cost for algorithms with continuous input variables.
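
For orientation, Landauer's bound states that erasing one bit at temperature $T$ dissipates at least $k_B T \ln 2$. The sketch below evaluates that bound for a hypothetical workload. The bit count used (overwriting one million samples of two 32-bit floats) is an assumption made for illustration and is not the accounting used in the paper.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact value in the SI


def landauer_bound_joules(bits_erased, temperature_kelvin):
    """Minimum heat dissipated when erasing `bits_erased` bits at the given temperature."""
    return bits_erased * BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


# Hypothetical workload: erase a dataset of (x, y) pairs stored as 32-bit floats.
n_samples = 1_000_000
bits = n_samples * 2 * 32
energy = landauer_bound_joules(bits, 300.0)
print(f"{energy:.3e} J")
```

At room temperature the bound is roughly $2.87 \times 10^{-21}$ J per bit, many orders of magnitude below what current hardware dissipates.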

Correcting Tail Deletions in Rank Modulated Composite Encoding for Data Storage in DNA

2026-05-18T21:59:18Z

We study the combination of two recent coding approaches in the context of DNA-based data storage. Composite DNA alphabets leverage properties of the DNA synthesis and sequencing process. A composite symbol does not represent a single nucleotide, but rather a designed mixture of DNA nucleotides. Using the high multiplicity that is intrinsic to synthesis and sequencing, a composite symbol consists of frequencies in the mixture. Rank modulation codes use permutations to represent information. Combining the two, we construct an encoding that uses permutations of nucleotide frequencies rather than the exact frequency values. Codes for this approach were addressed in previous work, under Kendall's tau distances. In this work we study deletion and insertion codes. We present bounds and constructions of efficient codes defined over partial permutations.
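
The idea of storing a ranking rather than exact frequencies can be sketched as follows. The frequency values and helper names are hypothetical; the example only illustrates the rank representation and the Kendall tau distance mentioned above, not the deletion and insertion codes constructed in the paper.

```python
def frequencies_to_permutation(freqs):
    """Order the nucleotides of a composite symbol from most to least frequent."""
    if len(set(freqs.values())) != len(freqs):
        raise ValueError("ties make the ranking ambiguous")
    return tuple(sorted(freqs, key=freqs.get, reverse=True))


def kendall_tau_distance(perm_a, perm_b):
    """Count the pairs of letters that appear in opposite order in the two permutations."""
    position = {letter: idx for idx, letter in enumerate(perm_b)}
    mapped = [position[letter] for letter in perm_a]
    return sum(
        1
        for i in range(len(mapped))
        for j in range(i + 1, len(mapped))
        if mapped[i] > mapped[j]
    )


# Hypothetical designed mixture and a noisy estimate obtained from sequencing reads.
designed = {"A": 0.40, "C": 0.30, "G": 0.20, "T": 0.10}
observed = {"A": 0.38, "C": 0.22, "G": 0.29, "T": 0.11}

sent = frequencies_to_permutation(designed)
received = frequencies_to_permutation(observed)
print(sent, received, kendall_tau_distance(sent, received))
```

In this example the noise swaps the ranks of C and G, so the two rankings differ by a single adjacent transposition even though every individual frequency has moved.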

Function-Correcting Codes With Data Protection

2026-05-18T20:50:01Z

Function-correcting codes (FCCs) are designed to provide error protection for the value of a function computed on the data. Existing work typically focuses solely on protecting the function value and not the underlying data. In this work, we propose a general framework that offers protection for both the data and the function values. Since protecting the data inherently contributes to protecting the function value, we focus on scenarios where the function value requires stronger protection than the data itself. We first introduce a more general approach and a framework for function-correcting codes that incorporates data protection along with protection of function values. A two-step construction procedure for such codes is proposed, and bounds on the optimal redundancy of general FCCs with data protection are reported. Using these results, we exhibit examples that show that data protection can be added to existing FCCs without increasing redundancy. Using our two-step construction procedure, we present explicit constructions of FCCs with data protection for specific families of functions, such as locally bounded functions and the Hamming weight function. We associate a graph, called the minimum-distance graph, with a code and use it to show that perfect codes and maximum distance separable (MDS) codes cannot provide additional protection to function values over and above the amount of protection for data for any function. Then we focus on linear FCCs and provide some results for linear functions, leveraging their inherent structural properties. To the best of our knowledge, this is the first instance of FCCs with a linear structure. Finally, we generalize the Plotkin and Hamming bounds well known in classical error-correcting coding theory to FCCs with data protection.

Mode-Tensorized Canonical Polyadic Decomposition for MIMO Channel Estimation

2026-05-18T19:18:08Z

This paper proposes a channel estimation method for Multiple-Input Multiple-Output (MIMO) systems based on Canonical Polyadic (CP) decomposition applied to a mode-factorized tensor representation of the channel. The proposed approach reshapes the original low-order channel tensor into a higher-order tensor by factorizing its modes into multiple virtual modes, thereby introducing additional dimensions. By exploiting the sparse structure of MIMO channels and the plane-wave propagation model in the far-field regime, the proposed mode tensorization enhances the separability of individual propagation paths. It is shown that increasing the number of tensor modes improves component separation and provides inherent denoising effects. Building on these properties, a mode-tensorized CP decomposition (MTCPD) algorithm is developed. In addition, a metric for analyzing the virtual factors obtained from MTCPD is proposed, enabling estimation of the canonical rank and selection of the most informative components contributing to overall system performance. Numerical results demonstrate that the proposed method improves channel estimation accuracy compared to conventional tensor-based approaches, particularly under low signal-to-noise ratio conditions.
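
Why factorizing a mode helps can be seen on a single plane wave. The sketch below checks that a length-8 steering vector $a_n = z^n$, viewed as a $2\times2\times2$ tensor, is the outer product of three length-2 virtual factors, i.e. has rank one. The array size, the phase value and the function names are assumptions for illustration; this is not the MTCPD algorithm itself.

```python
import cmath


def steering_vector(num_antennas, phase):
    """Far-field plane-wave response a_n = exp(j * phase * n) of a uniform linear array."""
    z = cmath.exp(1j * phase)
    return [z ** n for n in range(num_antennas)]


def virtual_factors(phase):
    """Length-2 factors for the index split n = 4*n1 + 2*n2 + n3 of an 8-element array."""
    z = cmath.exp(1j * phase)
    return [[1, z ** 4], [1, z ** 2], [1, z]]


def outer_product_flat(factors):
    """Outer product of the factors, flattened with the last factor varying fastest."""
    flat = [1]
    for factor in factors:
        flat = [x * y for x in flat for y in factor]
    return flat


phase = 0.7  # arbitrary spatial frequency in radians
a = steering_vector(8, phase)
rebuilt = outer_product_flat(virtual_factors(phase))
max_error = max(abs(u - v) for u, v in zip(a, rebuilt))
print(max_error)
```

A channel made of a few such paths therefore becomes a sum of a few rank-one terms in the higher-order tensor, which is the structure a CP decomposition looks for.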

4D and 5D Layer Codes through Color Routing

2026-05-18T18:00:09Z

We introduce an explicit Calderbank-Shor-Steane (CSS) code construction that generalizes the Layer codes to $D=4,5$ dimensions. Much like its predecessor, the present construction is based on embedding quantum low-density parity check (qLDPC) codes; from an $[[n,k,d]]$ code with energy barrier $Δ$, we obtain a $D=4,5$ dimensional Layer code with parameters $[[Θ(n^{D/(D-2)}), k, Θ(dn^{1/(D-2)})]]$ and energy barrier $Ω(Δ)$. Using good qLDPC codes as input, our construction saturates the $D=4,5$ dimensional BPT bounds exactly. The higher dimensional Layer Codes are modular, and thus well suited to architectures composed of modular network patches, despite our physical limitation to three dimensions. We overcome the hurdles encountered by previous generalization attempts through the use of \textit{color routing}, allowing us to resolve the structure of the check layers and line defects.
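
As a worked check of the stated parameters, the exponents can be specialised to each dimension and compared with the Bravyi-Poulin-Terhal (BPT) bounds $d = O(N^{(D-1)/D})$ and $k\,d^{2/(D-1)} = O(N)$ for a $D$-dimensional local code on $N$ qubits. The symbols $N$ and $d_{\mathrm{out}}$ below are notation introduced here for the output code; the input is assumed to be a good qLDPC code with $k, d = \Theta(n)$.

```latex
\begin{align*}
D = 4:&\quad [[\Theta(n^{2}),\; k,\; \Theta(d\,n^{1/2})]], \\
D = 5:&\quad [[\Theta(n^{5/3}),\; k,\; \Theta(d\,n^{1/3})]]. \\[4pt]
\text{With } k, d = \Theta(n),\ N = \Theta(n^{D/(D-2)}):&\quad
k = \Theta\!\left(N^{(D-2)/D}\right), \qquad
d_{\mathrm{out}} = \Theta\!\left(n^{(D-1)/(D-2)}\right) = \Theta\!\left(N^{(D-1)/D}\right), \\
&\quad k\, d_{\mathrm{out}}^{\,2/(D-1)} = \Theta\!\left(N^{(D-2)/D} \cdot N^{2/D}\right) = \Theta(N).
\end{align*}
```

Both BPT bounds are therefore met up to constant factors: for $D=4$ this gives $k = \Theta(N^{1/2})$, $d_{\mathrm{out}} = \Theta(N^{3/4})$, and for $D=5$ it gives $k = \Theta(N^{3/5})$, $d_{\mathrm{out}} = \Theta(N^{4/5})$.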

Two-Dimensional Quantization for Geometry-Aware Audio Coding

2026-05-18T17:15:31Z

Recent neural audio codecs have achieved impressive reconstruction quality, typically relying on quantization methods such as Residual Vector Quantization (RVQ), Vector Quantization (VQ) and Finite Scalar Quantization (FSQ). However, these quantization techniques limit the geometric structure of the latent space and make it harder to capture correlations between features, leading to inefficiency in representation learning, codebook utilization and token rate. In this paper we introduce Two-Dimensional Quantization (Q2D2), a quantization scheme in which feature pairs are projected onto structured 2D grids, such as hexagonal, rhombic, or rectangular tilings, and quantized to the nearest grid values, yielding an implicit codebook defined by the product of grid levels, with codebook sizes comparable to conventional methods. Despite its simple geometric formulation, Q2D2 improves audio compression efficiency, with low token rates and high codebook utilization, while maintaining state-of-the-art reconstruction quality. Specifically, compared to state-of-the-art models, Q2D2 achieves competitive to superior performance on various objective and subjective reconstruction metrics across extensive experiments in the speech, audio and music domains. Comprehensive ablation studies further confirm the effectiveness of our design choices.
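
Snapping a feature pair to a hexagonal grid can be sketched as below. The approach treats the hexagonal lattice as the union of two offset rectangular lattices and keeps the closer of the two rounded candidates. The function name and the `step` parameter are assumptions; the sketch ignores grid bounds and training details (such as how gradients pass through the rounding) and is not the Q2D2 implementation.

```python
import math

SQRT3 = math.sqrt(3.0)


def quantize_hexagonal(x, y, step=1.0):
    """Return the point of a hexagonal lattice with spacing `step` nearest to (x, y)."""
    u, v = x / step, y / step
    candidates = []
    # The lattice is the union of a 1 x sqrt(3) rectangular grid and a shifted copy.
    for off_x, off_y in ((0.0, 0.0), (0.5, SQRT3 / 2)):
        cx = round(u - off_x) + off_x
        cy = round((v - off_y) / SQRT3) * SQRT3 + off_y
        candidates.append((cx, cy))
    best = min(candidates, key=lambda c: (c[0] - u) ** 2 + (c[1] - v) ** 2)
    return best[0] * step, best[1] * step


# Two hypothetical feature pairs.
print(quantize_hexagonal(0.10, 0.10))
print(quantize_hexagonal(0.45, 0.80))
```

A rectangular grid corresponds to rounding each coordinate independently; the hexagonal case differs only in the second, shifted candidate.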

Low-Rank Toeplitz Matrix Restoration: Descent Cone Analysis and Structured Random Matrix

2026-05-18T16:33:17Z

This note demonstrates that we can stably recover all symmetric Toeplitz matrices $\pmb{X}_0\in\mathbb{R}^{n\times n}$ of rank at most $r$ from a number of rank-one subgaussian measurements on the order of $r\log^{2} n$ with an exponentially decreasing failure probability by employing a nuclear norm minimization program. Our approach utilizes descent cone analysis through Mendelson's small ball method with the Toeplitz constraint. The key ingredient is to determine the spectral norm of a random matrix with Toeplitz structure, which may be of independent interest. This improves upon earlier analyses and resolves the conjecture in Chen et al. (IEEE Transactions on Information Theory, 61(7):4034--4059, 2015).

Theory of Minimal Weight Perturbations in Deep Networks and its Applications for Low-Rank Activated Backdoor Attacks

2026-05-18T15:16:55Z

The minimal-norm weight perturbations of DNNs required to achieve a specified change in output are derived, and the factors determining their size are discussed. These single-layer exact formulae are contrasted with more generic multi-layer Lipschitz-constant-based robustness guarantees; both are observed to be of the same order, which indicates similar efficacy in their guarantees. These results are applied to precision-modification-activated backdoor attacks, establishing provable compression thresholds below which such attacks cannot succeed, and we show empirically that low-rank compression can reliably activate latent backdoors while preserving full-precision accuracy. These expressions reveal how back-propagated margins govern layer-wise sensitivity and provide certifiable guarantees on the smallest parameter updates consistent with a desired output shift.
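
The simplest instance of such a formula is the purely linear layer $y = Wx$: the smallest Frobenius-norm update $\Delta W$ with $(W + \Delta W)x = y + \delta$ is the rank-one matrix $\delta x^\top / \lVert x \rVert^2$, of norm $\lVert \delta \rVert / \lVert x \rVert$. The sketch below verifies this textbook least-norm fact numerically. It is an assumption that the paper's single-layer formulae reduce to this in the linear case; the nonlinear, multi-layer setting is not reproduced, and the input values are arbitrary.

```python
import math


def minimal_perturbation(x, delta):
    """Least Frobenius-norm dW with dW @ x == delta, namely delta x^T / ||x||^2."""
    squared_norm = sum(v * v for v in x)
    return [[d_i * x_j / squared_norm for x_j in x] for d_i in delta]


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def frobenius_norm(matrix):
    return math.sqrt(sum(v * v for row in matrix for v in row))


x = [3.0, 4.0]        # layer input (arbitrary example values)
delta = [0.5, -1.0]   # desired change in the layer output
dW = minimal_perturbation(x, delta)

achieved = matvec(dW, x)
predicted_norm = math.sqrt(sum(d * d for d in delta)) / math.sqrt(sum(v * v for v in x))
print(achieved, frobenius_norm(dW), predicted_norm)
```

The required perturbation shrinks as the input norm grows, which is one concrete sense in which activation magnitudes set layer-wise sensitivity.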

Complex Analysis of Channel Polarization on Discrete BMS Channels

2026-05-18T14:23:04Z

We develop component evolution (CE), a framework based on complex function theory for finite-blocklength channel polarization on discrete binary-input memoryless output-symmetric (BMS) channels. In this view, the Bhattacharyya parameter is treated as a real-valued instance of a broader class of complex-valued channel functionals. CE systematically derives analytic expressions for the Bhattacharyya parameters of the bit-channels of a given discrete BMS channel at arbitrary polarization levels. CE also enables structural analysis, providing new evidence of extremality of the binary erasure channel (BEC) and binary symmetric channel (BSC), and revealing new channel-dependent recursions for a class of BSC bit-channels.
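
For reference, the Bhattacharyya parameter of a binary-input channel $W$ is $Z(W) = \sum_y \sqrt{W(y|0)\,W(y|1)}$. The sketch below evaluates it for the BSC and BEC and applies the well-known exact BEC recursion $Z^- = 2Z - Z^2$, $Z^+ = Z^2$. The function names are hypothetical, and the CE framework itself (which targets general discrete BMS channels) is not reproduced here.

```python
import math


def bhattacharyya(transition):
    """Z(W) for a channel given as a list of pairs (W(y|0), W(y|1)), one per output y."""
    return sum(math.sqrt(p0 * p1) for p0, p1 in transition)


def bsc(p):
    """Binary symmetric channel with crossover probability p."""
    return [(1 - p, p), (p, 1 - p)]


def bec(eps):
    """Binary erasure channel with erasure probability eps (middle output is the erasure)."""
    return [(1 - eps, 0.0), (eps, eps), (0.0, 1 - eps)]


def bec_polarize(z, levels):
    """Bhattacharyya parameters of all 2**levels bit-channels of a BEC with Z(W) = z."""
    values = [z]
    for _ in range(levels):
        values = [w for v in values for w in (2 * v - v * v, v * v)]
    return values


print(bhattacharyya(bsc(0.1)), bhattacharyya(bec(0.3)))
print(bec_polarize(0.5, 3))
```

For the BEC each polarization step preserves the average of the parameters, since $(2Z - Z^2 + Z^2)/2 = Z$; for other BMS channels the minus-branch relation is only an inequality, which is where channel-specific analysis becomes necessary.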

Functional Multi-Reference Alignment via Deconvolution

2026-05-18T13:16:13Z

This paper studies the multi-reference alignment (MRA) problem of estimating a signal function from shifted, noisy observations. Our functional formulation reveals a new connection between MRA and deconvolution: the signal can be estimated from second-order statistics via Kotlarski's formula, an important identification result in deconvolution with replicated measurements. To design our MRA algorithms, we extend Kotlarski's formula to general dimension and study the estimation of signals with vanishing Fourier transform, thus also contributing to the deconvolution literature. We validate our deconvolution approach to MRA through both theory and numerical experiments.

Existence and Counting Bounds for High-Memory Spatially-Coupled Codes via the Combinatorial Nullstellensatz

2026-05-18T12:41:53Z

The finite-length performance of spatially-coupled low-density parity-check (SC-LDPC) codes is strongly affected by short cycle configurations and the harmful structures induced by them. This paper studies SC-LDPC code design directly at the protograph level, where the design variables are the edge-spreading assignments specified by the partition matrix. In contrast to CLLL/Moser--Tardos based constructive frameworks for QC-SC-LDPC codes, we focus on sharper nonconstructive existence and counting bounds. By encoding cycle-activation conditions as polynomial vanishing constraints over finite grids, we apply the Combinatorial Nullstellensatz to derive sufficient memory conditions for eliminating prescribed cycle-induced harmful structures. For fully connected $(γ,κ)$ base graphs, the resulting bounds explicitly characterize the memory required to destroy all $4$-cycles as well as all $4$- and $6$-cycles, and for fixed $γ$, they are asymptotically tight up to a constant factor compared with known lower bounds. We further apply the Alon--Füredi theorem to obtain lower bounds on the number of feasible edge-spreading assignments, including an explicit counting bound for assignments that eliminate all $4$-cycles and hence yield girth at least six. These results provide a refined algebraic-combinatorial characterization of the feasible design space for high-memory SC-LDPC codes, although no corresponding construction algorithm is provided.

Spectral Conditions for the Ingleton Inequality

2026-05-18T12:41:12Z

The Ingleton inequality is a classical linear information inequality that holds for representable matroids but fails to be universally valid for entropic vectors. Understanding the extent to which this inequality can be violated has been a longstanding problem in information theory. In this paper, we show that for a broad class of jointly distributed random variables $(X,Y)$ the Ingleton inequality holds up to a small additive error, even though the mutual information between $X$ and $Y$ is far from being extractable. Contrary to common intuition, strongly non-extractable mutual information does not lead to large violations of the Ingleton inequality in this setting. More precisely, we consider pairs $(X,Y)$ that are uniformly distributed on their joint support and whose associated biregular bipartite graph is an expander. For all auxiliary random variables $A$ and $B$ jointly distributed with $(X,Y)$, we establish a lower bound on the Ingleton quantity $I(X;Y | A) + I(X;Y | B) + I(A;B) - I(X;Y)$ in terms of the spectral parameters of the underlying graph. Our proof combines the expander mixing lemma with a partitioning technique for finite sets.
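
The Ingleton quantity can be evaluated directly from a joint distribution. The sketch below does so for small hand-made examples; the coordinate order $(x, y, a, b)$ and the function names are assumptions of this illustration, and the example distributions are not taken from the paper.

```python
import math
from collections import defaultdict

X, Y, A, B = (0,), (1,), (2,), (3,)  # coordinate positions inside each outcome tuple


def entropy(pmf, coords):
    """Shannon entropy in bits of the marginal of `pmf` on the given coordinates."""
    marginal = defaultdict(float)
    for outcome, prob in pmf.items():
        marginal[tuple(outcome[c] for c in coords)] += prob
    return -sum(q * math.log2(q) for q in marginal.values() if q > 0)


def mutual_information(pmf, u, v, given=()):
    """I(U;V|G) = H(U,G) + H(V,G) - H(U,V,G) - H(G)."""
    g = tuple(given)
    return (entropy(pmf, u + g) + entropy(pmf, v + g)
            - entropy(pmf, u + v + g) - entropy(pmf, g))


def ingleton_quantity(pmf):
    """I(X;Y|A) + I(X;Y|B) + I(A;B) - I(X;Y); negative values violate Ingleton."""
    return (mutual_information(pmf, X, Y, A) + mutual_information(pmf, X, Y, B)
            + mutual_information(pmf, A, B) - mutual_information(pmf, X, Y))


# All four variables equal to one fair bit.
all_equal = {(0, 0, 0, 0): 0.5, (1, 1, 1, 1): 0.5}
# X = Y is a fair bit, A and B are constant.
constant_aux = {(0, 0, 0, 0): 0.5, (1, 1, 0, 0): 0.5}

print(ingleton_quantity(all_equal), ingleton_quantity(constant_aux))
```

Both examples satisfy the inequality. Distributions that violate it exist but require more carefully chosen auxiliary variables $A$ and $B$.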