Automated Heuristic Design for Network Operations

2026-05-27T09:20:51Z

Network operation relies on heuristics to solve many tasks rapidly and efficiently across the protocol stack. These heuristics are the result of thorough human-driven design rooted in expert knowledge of the target system and problem. Recently, approaches powered by Artificial Intelligence have shown promising results in devising solutions that outperform long-established heuristics in classical problems. We explore the possibility of applying such Automated Heuristic Design (AHD) frameworks to network environments by (i) discussing the general integration of AHD with network operation and the associated challenges, as well as (ii) proposing a practical implementation of AHD for a specific networking task, i.e., 5G decoding. Initial results show how modern AHD tools can devise heuristics for Low-Density Parity Check decoding on par with state-of-the-art solutions implemented in production systems.

Recursive Vision Transformer with Dynamic Depth and Width Adjustment for Resource-Efficient Image Semantic Communication

2026-05-27T08:40:19Z

Image semantic communication is a critical component in next-generation wireless communication systems. However, such systems typically suffer from large memory footprints and high computational complexity, making them difficult to deploy on resource-constrained devices. To address these challenges, we propose a vision transformer (ViT)-enabled image semantic communication system. In this system, a recursive structure is introduced to iteratively refine semantic features and reduce the parameter count. In addition, three dynamic adjustment strategies are designed to adaptively reduce computational complexity: dynamic depth adjustment, dynamic width adjustment, and joint width-depth optimization. Dynamic depth adjustment adaptively determines the number of recursive modules according to image content and channel conditions, while dynamic width adjustment selectively preserves important neurons and attention heads. The joint width-depth optimization further enables flexible computation configurations. Simulation results verify that the proposed recursive ViT-based system, combined with the three dynamic adjustment strategies, reduces the parameter count by 48.7% and achieves higher reconstruction quality than existing baselines under comparable computational complexity.

Sequential Neural Probabilistic Amplitude Shaping: Learning the Channel's Language

2026-05-27T08:25:18Z

We present the first neural probabilistic amplitude shaping scheme that outperforms existing methods while accounting for all implementation losses. It uses a block-less, easily implementable sequential autoregressive encoder compatible with arithmetic distribution matching, yielding reduced rate loss and higher achievable information rates.

A Unified Fractional Regularization Framework for Sparse Recovery

2026-05-27T07:10:41Z

We propose a unified fractional regularization framework for sparse signal recovery based on the $\ell_1/\ell_p^q$ model. This model generalizes several widely used sparsity-promoting regularizers and provides additional flexibility through the parameters $p$ and $q$. Our main theoretical contribution is the characterization of the equivalence between the first-order stationary points of the $\ell_1/\ell_p^q$ formulation and the subtractive $\ell_1 - α\ell_p$ model, thereby offering a unified perspective on these nonconvex regularizers. In addition, we establish a new sufficient recovery condition under the Restricted Isometry Property (RIP), which shows that the proposed framework can provide relaxed recovery guarantees and improved robustness. To solve the resulting nonconvex problem, we develop a majorization-minimization (MM) algorithm and prove its convergence by using the Kurdyka-Łojasiewicz (KL) property. Numerical experiments on sparse recovery problems with different sensing matrices and MRI reconstruction demonstrate that the proposed approach outperforms existing methods in recovery accuracy.
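As a rough illustration of why a ratio of this kind promotes sparsity, the sketch below evaluates a penalty of the form $\|x\|_1/\|x\|_p^q$ on a sparse and a dense vector. Reading the $\ell_1/\ell_p^q$ notation as this ratio is an assumption made here, and the function names `lp_norm` and `fractional_penalty` are hypothetical; the paper's exact formulation may differ.

```python
from typing import Sequence


def lp_norm(x: Sequence[float], p: float) -> float:
    """Return (sum_i |x_i|^p)^(1/p) for p > 0 (a quasi-norm when p < 1)."""
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def fractional_penalty(x: Sequence[float], p: float, q: float) -> float:
    """Assumed reading of the l1/lp^q penalty: ||x||_1 / ||x||_p^q."""
    denom = lp_norm(x, p) ** q
    if denom == 0.0:
        raise ValueError("penalty is undefined for the zero vector")
    return lp_norm(x, 1.0) / denom


# Two vectors with the same largest entry but different sparsity.
sparse = [1.0, 0.0, 0.0, 0.0]
dense = [1.0, 1.0, 1.0, 1.0]

# With p = 2 and q = 1 this reduces to the familiar l1/l2 ratio,
# which is smaller for the sparser vector.
sparse_value = fractional_penalty(sparse, p=2.0, q=1.0)
dense_value = fractional_penalty(dense, p=2.0, q=1.0)
print(sparse_value, dense_value)
```

For $q = 1$ the ratio is invariant to rescaling of $x$; for other values of $q$ it is not, which is one way the extra parameter changes the behavior of the penalty.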

Construction of Minimal Ternary Linear Codes with Dimension $m+2$ Via Krawtchouk Polynomials

2026-05-27T06:57:07Z

Recently, minimal linear codes have been extensively studied due to their applications in secret sharing schemes, secure two-party computation, and related areas. Constructing minimal linear codes that violate the Ashikhmin-Barg condition and determining their weight distributions are problems of interest in coding theory and cryptography. In this paper, a generic construction for ternary linear codes with dimension $m+2$ is presented, where $m$ is an integer, and a necessary and sufficient condition for such a ternary linear code to be minimal is derived. Based on this condition and Krawtchouk polynomials, a new class of minimal ternary linear codes violating the Ashikhmin-Barg condition is obtained, and their complete weight enumerators are determined.

A Geometric Profile of Semantic Information in Text: Frame-Conditional Uniqueness and a Trade-Off Triangle for Scalar Summaries

2026-05-27T04:37:26Z

How much meaning does a text carry? Shannon's theory measures uncertainty over symbols and is intentionally indifferent to meaning, while pairwise metrics such as BERTScore compare two texts rather than characterizing one. We develop a geometric framework that measures semantic content from the structure of a text's sentence embeddings. The framework has three parts. First, within a fixed embedding and baseline, six natural axioms uniquely determine a scalar measure up to scale, a frame-conditional uniqueness theorem. The resulting scalar is empirically too coarse, motivating a richer representation. Second, we propose a three-coordinate semantic profile capturing novelty (displacement from generic discourse), breadth (diversity of distinct ideas), and integration (connectedness among them), together with a discrete minimal unit (the semantic quantum) whose resolution is fixed by a clustering threshold $τ$. Third, we prove a no-go theorem: no scalar summary of the profile can simultaneously satisfy analytic stability under paraphrase and concatenation, ordinal robustness across text scales, and cross-representation comparability. We exhibit two practical scalars, $S_{\mathrm{minmax}}$ and $S_{\mathrm{rank}}$, each occupying a distinct corner of this trade-off triangle. Validation across 23 synthetic categories, 5 Project Gutenberg novels, and 3 embedding models confirms the trade-off. The recommended rank-normalized configuration passes 25 of 28 ordinal checks as point estimates (21 of 28 after Benjamini-Hochberg correction), outperforming seven baselines including unigram entropy and a BERTScore-based novelty signal. A separate variational result connects the breadth coordinate to the log-determinant of a determinantal point process (Spearman $ρ= 0.985$ over 507 Gutenberg chapters), giving an optimization-theoretic foundation for breadth.

Good Integers: (T,k)-Subclasses and Applications to Galois Duality in Coding Theory

2026-05-27T04:05:36Z

The notion of good integers, namely the divisors of the sequence $(a^s+b^s)_{s\ge 1}$ for nonzero coprime integers $a$ and $b$, together with their subfamilies such as oddly-good and evenly-good integers, has become an important arithmetic tool in the study of Euclidean and Hermitian dualities for abelian and cyclic codes. Building on this perspective, this paper introduces and studies another interesting subclass of good integers arising from the sequence $\bigl(a^{ks+T}+b^{ks+T}\bigr)_{s\ge 1}$ for some integers $0\leq T

Optimization of CF-mMIMO Systems for the Coexistence between eMBB+ and mMTC+: From Analytical to GNN-Aided Designs

2026-05-27T04:01:33Z

This paper investigates uplink multiple access for the coexistence of enhanced mobile broadband+ (eMBB+) and massive machine-type communications+ (mMTC+) in terminal-centric cell-free massive MIMO (CF-mMIMO) systems. We propose a non-orthogonal scheme in which low-rate mMTC+ transmissions are spread across the time-frequency grid shared with eMBB+ users, enabling efficient resource reuse. In the presence of imperfect channel state information, we derive closed-form expressions for the achievable rates of both services based solely on statistical channel knowledge. For mMTC+ devices, the analysis also incorporates finite blocklength (FBL) modeling to capture short-packet transmissions. To support heterogeneous service requirements, we formulate a power-control problem that maximizes the minimum energy efficiency of mMTC+ devices subject to quality-of-service constraints on eMBB+ users. The resulting nonconvex problem is solved via sequential fractional programming, accounting for both the Shannon and FBL regimes. To enable real-time operation, we further propose a graph neural network (GNN) with multi-head attention to approximate the model-based solution. Constraint satisfaction during training is enforced via an augmented Lagrangian loss. Numerical results demonstrate effective multiplexing of the two data services and show that the proposed GNN algorithm achieves near-optimal performance with a significantly lower computational complexity.

Tighter Information-Theoretic Generalization Bounds via a Novel Class of Change of Measure Inequalities

2026-05-26T23:40:51Z

Change of measure inequalities translate divergences between probability measures into explicit bounds on event probabilities, and play an important role in deriving probabilistic guarantees in learning theory, information theory, and statistics. We propose novel change of measure inequalities via a unified framework based on the data processing inequality, which is surprisingly elementary yet powerful enough to yield novel, tighter inequalities. We provide change of measure inequalities in terms of a broad family of information measures, including $f$-divergences (with Kullback-Leibler divergence and $χ^2$-divergence as special cases), Rényi divergence, and $α$-mutual information (with maximal leakage as a special case). We apply these results to generalization error analysis, PAC-Bayesian theory, differential privacy, and data memorization, obtaining stronger guarantees while recovering best-known results through simplified analyses.

Smoothed Score Queries and the Complexity of Sampling

2026-05-26T23:38:20Z

We study the query complexity of sampling from high-dimensional Gaussian distributions using gradient information. In the standard oracle model, exact gradients expose only matrix-vector products with the precision matrix, leading to polynomial approximation barriers and a characteristic $\sqrt{κ}$ dependence on the condition number. We show that this barrier disappears when the sampler is allowed to query \emph{smoothed scores}, namely gradients of the logarithms of the Gaussian-convolved densities. For a Gaussian target with precision matrix $Λ$, a smoothed-score query at noise level $τ$ gives access to the resolvent $(Λ+τ^{-1}I)^{-1}$. Combining geometrically spaced noise levels with sinc-quadrature rational approximation, we obtain a sampler with $q=O\!\left(\bigl(\log κ+\log(e\sqrt{d}/δ_{\rm TV})\bigr)\log(e\sqrt{d}/δ_{\rm TV})\right)$ smoothed-score queries for total variation error $δ_{\rm TV}$, improving the condition-number dependence from $\sqrt{κ}$ to logarithmic. We also study finite-bit gradient oracles. Using coordinatewise quantization of the transformed smoothed-score answers and a final dithering step, we obtain a sampling scheme whose total communicated gradient information is polylogarithmic in $κ$; in particular, for fixed dimension and accuracy, the bit complexity is $O(\log^2 κ)$. To complement these upper bounds, we introduce a channel-synthesis, or reverse-Shannon, converse technique for sampling lower bounds. This converts total-variation simulation guarantees into communication requirements and yields an $Ω(\log κ)$ lower bound on the required gradient information. Together, these results identify smoothed scores as a provably more informative oracle for sampling and give nearly matching upper and lower bounds for its finite-bit complexity.
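The link between a smoothed-score query and the resolvent can be checked with a short calculation. The derivation below is a sketch under two assumptions not stated in the abstract: the target is zero-mean, and "noise level $τ$" means convolution with a Gaussian of covariance $τ I$. The symbols $p_τ$ and $s_τ$ are introduced here for illustration only.

```latex
% Assumptions (not from the abstract): target N(0, \Lambda^{-1});
% smoothing at noise level \tau is convolution with N(0, \tau I).
\begin{align*}
p_\tau &= \mathcal{N}\bigl(0,\ \Lambda^{-1} + \tau I\bigr), \\
s_\tau(x) &= \nabla \log p_\tau(x) = -\bigl(\Lambda^{-1} + \tau I\bigr)^{-1} x, \\
\bigl(\Lambda^{-1} + \tau I\bigr)^{-1}
  &= \Lambda (I + \tau\Lambda)^{-1}
   = \tau^{-1}\Bigl(I - \tau^{-1}\bigl(\Lambda + \tau^{-1} I\bigr)^{-1}\Bigr), \\
\bigl(\Lambda + \tau^{-1} I\bigr)^{-1} x &= \tau x + \tau^{2}\, s_\tau(x).
\end{align*}
```

Under these assumptions, one smoothed-score evaluation at $x$ yields the resolvent applied to $x$ after an affine transformation, which is consistent with the access pattern the abstract describes.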

Adaptive Multi-Head Finite-State Gamblers

2026-05-26T22:42:03Z

Multi-head finite-state dimensions and predimensions quantify the predictability of a sequence by a gambler with trailing heads acting as "probes to the past." These additional heads allow the gambler to exploit patterns that are simple but non-local, such as in a sequence $S$ with $S[n]=S[2n]$ for all $n$. In the original definitions of Huang, Li, Lutz, and Lutz (2025), the head movements were required to be oblivious (i.e., data-independent). Here, we introduce a model in which head movements are adaptive (i.e., data-dependent) and compare it to the oblivious model. We establish that for each $h\geq 2$, adaptivity enhances the predictive power of $h$-head finite-state gamblers, in the sense that there are sequences whose oblivious $h$-head finite-state predimensions strictly exceed their adaptive $h$-head finite-state predimensions. We further prove that adaptive finite-state predimensions admit a strict hierarchy as the number of heads increases, and in fact that for all $h\geq 1$ there is a sequence whose adaptive $(h+1)$-head finite-state predimension is strictly less than its adaptive $h$-head predimension.
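To make the non-local pattern $S[n]=S[2n]$ concrete, the sketch below builds a finite binary prefix with this property and checks that every even position is determined by the position at half its index, which is the kind of information a trailing head could supply. This is an illustration only: the helper names are hypothetical, random-access indexing stands in for the trailing head, and the code does not implement the finite-state gambler model or the predimensions from the paper.

```python
import random
from typing import List


def odd_part(n: int) -> int:
    """Return the largest odd divisor of a positive integer n."""
    if n <= 0:
        raise ValueError("n must be positive")
    while n % 2 == 0:
        n //= 2
    return n


def build_sequence(length: int, seed: int = 0) -> List[int]:
    """Build bits S[0..length] with S[n] == S[2n] whenever 2n <= length.

    Bits at odd indices (and at index 0) are chosen at random; every other
    index n >= 1 copies the bit at the odd part of n.
    """
    rng = random.Random(seed)
    s = [0] * (length + 1)
    s[0] = rng.randint(0, 1)  # S[0] == S[0] holds trivially
    for n in range(1, length + 1, 2):
        s[n] = rng.randint(0, 1)
    for n in range(2, length + 1, 2):
        s[n] = s[odd_part(n)]
    return s


def predict_even(s: List[int], n: int) -> int:
    """Predict S[n] for even n >= 2 by reading S[n // 2]."""
    return s[n // 2]


S = build_sequence(64, seed=1)
even_positions = range(2, 65, 2)
hits = sum(predict_even(S, n) == S[n] for n in even_positions)
accuracy = hits / len(even_positions)
print(accuracy)
```

In this construction the odd positions carry free random bits, so only the even positions are predictable from the past.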

The Sharma-Mittal Entropy is Subadditive and Supermodular on the Majorization Lattice

2026-05-26T19:14:48Z

We prove that Sharma-Mittal entropy is a subadditive and supermodular function on the lattice of all $n$-dimensional probability distributions, ordered according to the partial order relation defined by majorization among vectors. Our result unifies and extends analogous results presented in the literature for the Shannon entropy, the Tsallis entropy, and the Rényi entropy.
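For readers who want to see how one entropy family can cover the others, the sketch below uses a common two-parameter form of the Sharma-Mittal entropy, $H_{α,β}(p) = \frac{1}{1-β}\bigl[(\sum_i p_i^{α})^{(1-β)/(1-α)} - 1\bigr]$, and checks numerically that it gives the Tsallis entropy at $β=α$ and approaches the Rényi entropy as $β\to 1$. The parametrization is an assumption made here; the paper may use a different but equivalent convention.

```python
import math
from typing import Sequence


def sharma_mittal(p: Sequence[float], alpha: float, beta: float) -> float:
    """Sharma-Mittal entropy (natural log units) for alpha != 1, beta != 1."""
    if alpha == 1.0 or beta == 1.0:
        raise ValueError("use the limiting forms for alpha = 1 or beta = 1")
    power_sum = sum(v ** alpha for v in p if v > 0)
    return (power_sum ** ((1.0 - beta) / (1.0 - alpha)) - 1.0) / (1.0 - beta)


def tsallis(p: Sequence[float], alpha: float) -> float:
    """Tsallis entropy of order alpha != 1."""
    return (sum(v ** alpha for v in p if v > 0) - 1.0) / (1.0 - alpha)


def renyi(p: Sequence[float], alpha: float) -> float:
    """Renyi entropy of order alpha != 1, in nats."""
    return math.log(sum(v ** alpha for v in p if v > 0)) / (1.0 - alpha)


dist = [0.5, 0.3, 0.2]
alpha = 2.0

# beta = alpha gives the Tsallis entropy.
sm_tsallis = sharma_mittal(dist, alpha, beta=alpha)
# beta close to 1 approximates the Renyi entropy.
sm_renyi = sharma_mittal(dist, alpha, beta=1.0 + 1e-6)

print(sm_tsallis, tsallis(dist, alpha))
print(sm_renyi, renyi(dist, alpha))
```

The Shannon entropy arises in the further limit where both parameters tend to 1, which this sketch does not evaluate.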

Unified Fourier transform on graphs sampled from stochastic block models

2026-05-26T19:10:54Z

Recently, an approach to graph signal processing based on graphons was proposed. Here we show how such a graphon-driven approach to the Fourier transform can be used on graphs sampled from a stochastic block model (SBM). In particular, we show how a Fourier basis can be easily calculated from the block sizes and the block probability matrix. Using perturbation theory, we derive bounds on the sensitivity of the basis with respect to variations in the block sizes. We then consider SBMs constructed from weighted Cayley graphs. When block sizes are equal, a nice Fourier basis can be derived from the representation theory of the underlying group. When block sizes are nearly uniform, we demonstrate that this Fourier basis closely approximates the SBM Fourier basis. For highly non-uniform block sizes, the group-based Fourier basis is no longer applicable, though, as we show, the underlying group still provides partial information about the SBM Fourier basis.

On the Automorphism Groups of Berman Codes and associated Abelian Codes

2026-05-26T17:23:25Z

The automorphism group of a code is the group of permutations that map the code to itself. Berman codes are a class of binary linear codes characterized by two integer parameters $n\geq 2$ and $m\geq 1$, and this class includes the Reed-Muller codes. Berman codes and their duals were recently shown to achieve the capacity of the binary erasure channel. A number of abelian codes that arise from the intersection and subspace sums of Berman and dual Berman codes were also identified recently, for odd $n\geq 3$. A subclass of these abelian codes was shown to have good short block-length performance on AWGN channels, with efficient decoding algorithms. In this work, we identify the exact automorphism group for Berman codes and their duals. Further, we find the exact automorphism group for the above-mentioned abelian codes when $n\geq 5$. In the case of such abelian codes with $n=3$, we present partial characterizations of the automorphism groups for a large collection of parameter choices, and complete characterizations for a few.

PersianMedQA: Evaluating Large Language Models on a Persian-English Bilingual Medical Question Answering Benchmark

2026-05-26T14:36:48Z

Large Language Models (LLMs) have achieved remarkable performance on a wide range of Natural Language Processing (NLP) benchmarks, often surpassing human-level accuracy. However, their reliability in high-stakes domains such as medicine, particularly in low-resource languages, remains underexplored. In this work, we introduce PersianMedQA, a large-scale dataset of 20,785 expert-validated multiple-choice Persian medical questions from 14 years of Iranian national medical exams, spanning 23 medical specialties and designed to evaluate LLMs in both Persian and English. We benchmark 41 state-of-the-art models, including general-purpose, Persian, and medical LLMs, in zero-shot and chain-of-thought (CoT) settings. Our results show that closed-weight general models (e.g., GPT-4.1) consistently outperform all other categories, achieving 83.09% accuracy in Persian and 80.7% in English, while Persian LLMs such as Dorna underperform significantly (e.g., 34.9% in Persian), often struggling with both instruction-following and domain reasoning. We also analyze the impact of translation, showing that while English performance is generally higher, 3-10% of questions can only be answered correctly in Persian due to cultural and clinical contextual cues that are lost in translation. Finally, we demonstrate that model size alone is insufficient for robust performance without strong domain or language adaptation. PersianMedQA provides a foundation for evaluating bilingual and culturally grounded medical reasoning in LLMs. The dataset, along with a bilingual medical dictionary, is available: https://huggingface.co/datasets/MohammadJRanjbar/PersianMedQA .