Model Merging by Output-Space Projection

2026-05-27T21:01:29Z

Model merging combines fine-tuned checkpoints into a single multi-task model without retraining. Existing methods, such as task arithmetic, model soups, TIES, and DARE, are computationally efficient and empirically successful, but rely on heuristic design choices and lack formal optimality guarantees. We show that merging can be formulated as a convex quadratic programme over residual updates, yielding weights that minimise a squared-output calibration objective using calibration inputs and fine-tuned model outputs, and subsuming existing methods as special cases. Our framework yields a closed-form diagnostic, the fraction of residual energy captured by a chosen basis, that predicts downstream merge quality using only the calibration set. Empirically, the QP matches or outperforms existing methods in the single-layer setting, and we characterise when the optimal basis provides significant gains over the cheaper diagonal QP. We extend to multi-layer merging via a sequential layer-wise algorithm and demonstrate consistent gains across language and vision benchmarks.
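As a rough illustration of a squared-output calibration objective, the sketch below fits one scalar coefficient per task for a single scalar-output linear layer by unconstrained least squares. This is a simplified assumption, not the paper's formulation: the abstract does not specify the parameterisation, basis, or constraints, and the names `merge_coefficients` and `solve_linear` are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve_linear(G, b):
    """Solve G a = b by Gauss-Jordan elimination with partial pivoting.
    Assumes G is invertible (illustration only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(G)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def merge_coefficients(deltas, calib_inputs):
    """deltas[i]: task-i update (fine-tuned minus base weights) of a linear layer y = w . x.
    calib_inputs[i]: calibration inputs for task i.
    Returns alpha minimising
        sum_i sum_{x in calib_inputs[i]} (x . sum_j alpha_j deltas[j] - x . deltas[i])^2,
    i.e. the squared gap between merged and fine-tuned outputs (base weights cancel)."""
    k = len(deltas)
    G = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for i, inputs in enumerate(calib_inputs):
        for x in inputs:
            feats = [dot(x, d) for d in deltas]   # output contribution of each update
            target = feats[i]                     # fine-tuned model i's output residual
            for r in range(k):
                b[r] += feats[r] * target
                for c in range(k):
                    G[r][c] += feats[r] * feats[c]
    return solve_linear(G, b)

# Two tasks whose updates and calibration inputs do not interfere.
deltas = [[1.0, 0.0], [0.0, 1.0]]
calib = [[[1.0, 0.0], [2.0, 0.0]], [[0.0, 1.0], [0.0, 3.0]]]
alpha = merge_coefficients(deltas, calib)
```

In this non-interfering toy case the fitted coefficients are all one, which coincides with plain task arithmetic; interference between tasks would move them away from one.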

Similarity-Sensitive Entropy under Representation Change and Inference

2026-05-27T20:48:13Z

Similarity-sensitive entropy measures the uncertainty of a probability law relative to a similarity kernel that encodes the distinguishability between states. We develop a measure-theoretic treatment covering both finite similarity matrices and general probability spaces, and study how the law and similarity kernel transform under measurable maps, Markov kernels (channels), and conditioning operations. This yields deterministic and channel data-processing inequalities, so a reduction in entropy quantifies how much distinguishability is lost under representation change. We also define a conditional similarity-sensitive entropy theory, give a counterexample to a recent conjecture on concavity, and identify a useful one-dimensional Laplace pullback class where concavity holds.
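For the finite case, a minimal sketch is given below. It assumes the standard order-1 form $H_K(p) = -\sum_i p_i \log (Kp)_i$ for a similarity matrix $K$ with unit diagonal; whether the paper uses exactly this normalisation is an assumption, and the function name is hypothetical.

```python
import math

def similarity_entropy(p, K):
    """Order-1 similarity-sensitive entropy -sum_i p_i log((K p)_i), in nats.
    p: probability vector; K: similarity matrix with K[i][i] = 1 (assumed form)."""
    total = 0.0
    for i, p_i in enumerate(p):
        if p_i > 0:
            typicality = sum(K[i][j] * p[j] for j in range(len(p)))
            total -= p_i * math.log(typicality)
    return total

p = [0.5, 0.5]
h_distinct = similarity_entropy(p, [[1.0, 0.0], [0.0, 1.0]])   # fully distinguishable states
h_partial = similarity_entropy(p, [[1.0, 0.5], [0.5, 1.0]])    # partially similar states
h_identical = similarity_entropy(p, [[1.0, 1.0], [1.0, 1.0]])  # indistinguishable states
```

With the identity kernel this reduces to Shannon entropy, and with the all-ones kernel it is zero, so the partially similar case falls strictly between the two.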

A New Class of Geometric Analog Error Correction Codes for Crossbar Based In-Memory Computing

2026-05-27T19:00:38Z

Analog error correction codes have been proposed for analog in-memory computing on resistive crossbars, which can accelerate vector-matrix multiplication for machine learning. Unlike traditional communication or storage channels, this setting involves a mixed noise model with small perturbations and outlier errors. A number of analog codes have been proposed for handling a single outlier, and several constructions have also been developed to address multiple outliers. However, the set of available code families remains limited, covering only a narrow range of code lengths and dimensions. In this paper, we study a recently proposed family of geometric codes capable of handling multiple outliers, and develop a geometric analysis that characterizes their m-height profiles.

Intermediate Constacyclic Codes and Scalar-Residue Reed--Muller Layers

2026-05-27T18:01:44Z

A 2024 paper of Sun, Ding and Wang introduced a second class of constacyclic codes over finite fields, denoted $C(q,m,r,\ell)$, with length $(q^m-1)/r$, where $r\mid(q-1)$ and the defining monomials have total $q$-ary degree congruent to $r-1$ modulo $r$. In the non-projective intermediate range $2

A Local Valuation Criterion for Quadratic-Permutation Interleaved Zadoff--Chu Sequences

2026-05-27T17:39:29Z

Berggren and Popović introduced quadratic-permutation-polynomial interleaved Zadoff--Chu sequences and, from exhaustive data, conjectured that all normalized QPP-interleaved Zadoff--Chu sequences are inequivalent to ordinary Zadoff--Chu sequences precisely for prime-power lengths $N=p^n$ with $p>3$ and $n>1$. We give an exact local arithmetic criterion. For a normalized QPP $π_{a,b}(k)=ak^2+bk\pmod N$, the interleaved sequence is equivalent, under the standard five CAZAC-preserving operations, to a Zadoff--Chu sequence if and only if, for every prime power $p^α\Vert N$, the valuation of $a$ satisfies \[ ν_p(a)\ge \begin{cases} 0, & p=2,\ α=1,\\ α-1, & p=2,\ α\ge2,\\ α-1, & p=3,\\ α, & p>3. \end{cases} \] The proof is based on a third finite-difference invariant of the lifted Zadoff--Chu phase, namely \[ Δ^3\bigl((ak^2+bk+\varepsilon_N+2q)(ak^2+bk)\bigr) =12a(2ak+3a+b). \] As a consequence, the conjectured prime-power boundary is not correct: the exact non-vacuous condition for all nonzero normalized QPPs to be inequivalent to Zadoff--Chu sequences is that $N$ is odd, $9\nmid N$, and $p^2\mid N$ for at least one prime $p\ge5$. In particular, $N=75=3\cdot5^2$ is the smallest non-prime-power counterexample to the conjectured ``only if'' direction. A second corollary records the corresponding statement for irreducible QPPs.
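The valuation criterion stated above can be evaluated mechanically. The sketch below only checks the inequality on $ν_p(a)$ for each $p^α\Vert N$; it does not verify that $π_{a,b}$ is a permutation or normalized, and the function names and the sample values of $a$ are illustrative assumptions.

```python
def factorize(n):
    """Return {prime: exponent} for n >= 1 by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def valuation_capped(a, p, cap):
    """p-adic valuation of a, capped at `cap` (a = 0 mod p^cap counts as cap).
    Capping is enough because a is only defined modulo N and every threshold is <= alpha."""
    a %= p ** cap
    if a == 0:
        return cap
    v = 0
    while a % p == 0:
        a //= p
        v += 1
    return v

def required_valuation(p, alpha):
    """Lower bound on nu_p(a) from the case distinction in the abstract."""
    if p == 2:
        return 0 if alpha == 1 else alpha - 1
    if p == 3:
        return alpha - 1
    return alpha

def satisfies_criterion(a, N):
    """True if nu_p(a) meets the bound for every prime power p^alpha exactly dividing N."""
    return all(
        valuation_capped(a, p, alpha) >= required_valuation(p, alpha)
        for p, alpha in factorize(N).items()
    )

# N = 75 = 3 * 5^2: sample leading coefficients divisible by 15 but not by 25.
results_75 = [satisfies_criterion(a, 75) for a in (15, 30, 45, 60)]
```

For $N=75$ the bound at $p=5$ requires $25\mid a$, so none of the sample coefficients passes, in line with the counterexample named in the abstract.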

Algebraic Resolutions of Seven Open Problems on Cyclic and Negacyclic Codes Supporting Designs

2026-05-27T17:33:50Z

This paper gives a unified algebraic solution to seven open problems of Wang, Tang and Ding on cyclic, negacyclic and constacyclic codes supporting designs. For the cyclic code \[ C\left(\frac{p^s-1}{2},\frac{p^s+1}{2}\right), \] a Cayley parametrization of the unit circle reduces the trace-zero condition to a semilinear equation on $\PG(1,q)$. Its large root sets are exactly the $\F_{p^{\gcd(m,s)}}$-sublines, yielding the complementary design \[ \overline{S(3,q_0+1,q+1)}. \] For the length $q^2+1$ negacyclic code, a quotient transport from $\U_{2(q^2+1)}$ to $\U_{q^2+1}$ and a unit-circle parametrization show that the minimum zero sets are precisely the Baer sublines of $\PG(1,q^2)$. Equivalently, the corresponding support design is the complement of the non-tangent plane sections of an elliptic quadric $\Q^-(3,q)$. For constacyclic ovoid codes of length $q^2+1$ over $\F_q$, the exact existence criterion is \[ λ\in\F_q^*,\qquad \exists\ λ\text{-constacyclic ovoid code} \Longleftrightarrow λ\notin(\F_q^*)^2. \] In particular, negacyclic ovoid codes exist exactly when $q\equiv3\pmod4$. The proof uses the corrected projective-order congruence \[ a=(q+1)c,\qquad c\equiv b\pmod{q-1},\qquad \operatorname{ord}(θ\F_q^*)=\frac{q^2+1}{\gcd(q^2+1,c)}. \] The paper also derives a universal weight enumerator for lifted ovoid codes over extension fields, independent of the chosen ovoid. Finally, consecutive-root negacyclic MDS codes are constructed to give complete simple $5$-designs, including a proper negacyclic $[11,5,7]_{23}$ code whose minimum supports form the complete $5-(11,7,15)$ design.
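The step from the existence criterion to the negacyclic case is the choice $λ=-1$: for odd $q$, $-1$ is a non-square in $\F_q$ exactly when $q\equiv3\pmod4$. The sketch below checks this numerically with Euler's criterion, restricted to odd prime $q$ for simplicity (an assumption; the paper covers prime powers). The function name is hypothetical.

```python
def is_nonsquare_mod_prime(value, q):
    """Euler's criterion for an odd prime q: a nonzero residue is a non-square
    iff value^((q-1)/2) = -1 (mod q)."""
    value %= q
    if value == 0:
        raise ValueError("value must be nonzero modulo q")
    return pow(value, (q - 1) // 2, q) == q - 1

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
# For each prime: does "-1 is a non-square" agree with "q = 3 (mod 4)"?
agreement = [is_nonsquare_mod_prime(-1, q) == (q % 4 == 3) for q in odd_primes]
```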

Non-binary LDPC codes for Data Storage

2026-05-27T17:32:01Z

In modern data storage systems, non-binary LDPC codes for recovering from disk failures are increasingly considered strong competitors to MDS codes such as Reed-Solomon codes. Since disk failures can be modeled as erasures, we analyze non-binary LDPC codes over a $q$-ary field in the $q$-ary erasure channel, relative to MDS codes. Our focus is on non-binary LDPC codes whose parity-check matrix is obtained by replacing the non-zero entries of a binary base matrix by elements of a $q$-ary finite field. For such LDPC codes, we introduce the notion of ultimate distance, which upper-bounds their minimum distance. We derive a random-coding bound on the number of non-correctable erasure patterns for the Gallager ensemble of regular non-binary LDPC codes under maximum-likelihood decoding. An algorithm for finding the ultimate distance is presented. A low-complexity algorithm for searching for the minimum distance of the non-binary LDPC code is proposed. Finally, we construct examples of non-binary LDPC codes achieving the ultimate distance.
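On the erasure channel, maximum-likelihood decoding recovers an erasure pattern exactly when the parity-check columns at the erased positions are linearly independent. The sketch below tests this by Gaussian elimination over a prime field (an assumption for simplicity; extension fields would need proper field arithmetic). The small matrices are made-up examples, not codes from the paper.

```python
def rank_mod_q(rows, q):
    """Rank of a matrix over GF(q), q prime, by Gaussian elimination."""
    M = [[x % q for x in row] for row in rows]
    n_cols = len(M[0]) if M else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(M)) if M[r][col]), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        inv = pow(M[rank][col], -1, q)  # modular inverse (Python 3.8+)
        M[rank] = [(x * inv) % q for x in M[rank]]
        for r in range(len(M)):
            if r != rank and M[r][col]:
                f = M[r][col]
                M[r] = [(x - f * y) % q for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank

def is_correctable(H, erased, q):
    """True if the columns of H indexed by `erased` are linearly independent over GF(q)."""
    sub = [[row[j] for j in erased] for row in H]
    return rank_mod_q(sub, q) == len(erased)

# Same binary base matrix, with and without non-binary labels over GF(5).
H_binary_labels = [[1, 1, 1, 0], [0, 1, 1, 1]]
H_gf5_labels = [[1, 1, 1, 0], [0, 1, 2, 1]]
binary_ok = is_correctable(H_binary_labels, [1, 2], 5)
gf5_ok = is_correctable(H_gf5_labels, [1, 2], 5)
```

In this toy example the two erased columns coincide under all-ones labels but become independent once one label is changed to 2, which is the kind of effect the choice of non-zero field elements can have on a fixed base matrix.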

Linear and cyclic codes over some special rings

2026-05-27T15:56:55Z

In this paper, we describe linear and cyclic codes over the rings of the form $R_{s,p}=\mathbb{Z}_{p}[u]/\left( f\left(u\right) /\left( u-s\right) \right)$, where $p$ is a prime number and $f\left( u\right) =u^{p}-u$, with $s\in \{0,1,...,p-1\}$.

Locally recoverable codes from elliptic surfaces with availability and hierarchical locality

2026-05-27T13:27:43Z

In this paper, we propose several constructions of Locally Recoverable Codes from elliptic surfaces. In particular, we obtain codes with availability $t>2$, codes with hierarchical locality and, finally, codes which combine availability and hierarchical locality. Our constructions rely on the properties of the torsion groups of elliptic curves and on the fibered structure of elliptic surfaces. Specifically, the geometry of the surface is used to introduce a multi-dimensional setting, allowing for more recovery sets, possibly nested one within another.

Information Age-Controllability Trade-offs in Communication-Constrained Networks

2026-05-27T12:35:53Z

We investigate the trade-off between controllability, channel access, and age-related performance in a wireless network of control systems. Controllers share a random-access channel to transmit control inputs to actuators over slotted blocks. We measure reliable control via block controllability, where a block is controllable if it contains a required number of consecutive successful transmissions. In parallel, we capture information freshness via the age of information. To enable efficient allocation of channel resources over time, we introduce adaptive access probabilities at the block level, prioritizing controllers that have not yet achieved controllability. We then derive closed-form expressions for block controllability probability, the peak latency between inter-block consecutive successes, and peak age of information. We further characterize the peak control latency, defined as the time between consecutive controllable blocks. Finally, we optimize access probabilities to jointly balance controllability and age-related metrics. Numerical results illustrate the effectiveness of the proposed adaptive access policies in managing this trade-off in interference-limited wireless control networks.
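To make the block-controllability definition concrete, the sketch below computes the probability that a block of slots contains a run of consecutive successes of the required length. It assumes a fixed, independent success probability per slot, which is a simplification: the paper's adaptive access probabilities and closed-form expressions are not reproduced here, and the function name is hypothetical.

```python
def block_controllable_prob(n_slots, run_needed, p_success):
    """Probability that n_slots i.i.d. Bernoulli(p_success) slots contain at least
    run_needed consecutive successes (dynamic programme over the trailing run length)."""
    # state[r]: probability that the trailing success run has length r
    # and no run of length run_needed has occurred yet.
    state = [1.0] + [0.0] * (run_needed - 1)
    done = 0.0  # probability that the required run has already occurred
    for _ in range(n_slots):
        new_state = [0.0] * run_needed
        for r, prob in enumerate(state):
            new_state[0] += prob * (1.0 - p_success)   # failure resets the run
            if r + 1 == run_needed:
                done += prob * p_success               # run completed: block controllable
            else:
                new_state[r + 1] += prob * p_success   # run grows by one
        state = new_state
    return done

prob_short_block = block_controllable_prob(3, 2, 0.5)
prob_long_block = block_controllable_prob(10, 2, 0.5)
```

Longer blocks raise the controllability probability for a fixed run length, at the cost of occupying the channel for more slots.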

Score Based Error Correcting Code Decoder

2026-05-27T11:55:27Z

Error-correcting codes enable reliable communication, yet practical soft decoding remains challenging across code families and block lengths. We propose SB-ECC, a score-based decoder that casts decoding as continuous-time denoising. A neural denoiser defines a probability-flow ordinary differential equation (ODE) that iteratively updates the noisy channel observation toward a valid codeword, guided by parity constraints. The model is trained across noise levels without time/SNR conditioning, enabling inference without SNR estimation and supporting a direct latency-accuracy trade-off controlled by the ODE solver budget. We use the raw signed channel observation as input for learning a continuous denoising field. Across 42 code/SNR settings, SB-ECC achieves the best BER in 39/42 entries, with an average SNR gain of 0.17 dB and a maximum gain of 0.46 dB over the strongest competing baseline. We also show that swapping the solver from Euler to DPM preserves -ln(BER) while reducing end-to-end decoding time by 8.86% on average (up to 12.82%).

Noise Scheduling as Information-Guided Allocation in Diffusion Training

2026-05-27T11:26:23Z

We introduce InfoNoise, an online adaptive noise schedule for diffusion training that reallocates optimization effort toward noise levels where denoising is most informative. Together with loss weighting, a noise schedule induces an effective allocation across denoising problems, often fixed before informative noise levels are known. InfoNoise makes this allocation data-adaptive by estimating a conditional-entropy-rate profile from denoising losses during training, without auxiliary models or offline search. Through I--MMSE, this profile identifies where noisy observations rapidly reduce uncertainty about the clean sample and guides adaptation of the training noise distribution. It changes only this distribution, keeping the objective, weighting, and parameterization fixed. On image benchmarks, where schedules have been extensively tuned, InfoNoise matches or slightly exceeds strong baselines and can reach the same quality with fewer updates. On representation, sequence, and modality shifts, including DNA and language generation, InfoNoise improves over fixed and adaptive baselines and reaches target quality with up to $3\times$ less training compute. These results establish the conditional-entropy-rate profile as the data-dependent target for noise schedule design and make online adaptation a practical alternative to manual schedule search.

ISAC Privacy: Challenges and Solutions for 6G

2026-05-27T11:25:39Z

Integrated sensing and communication (ISAC) is a promising feature of future communication networks. While spatial sensing can improve network performance and enable external services, it also creates privacy challenges that go beyond the confidentiality of communication content. Future networks using millimeter-wave (mmWave) and sub-terahertz (THz) frequencies may collect or infer detailed information about people, devices, bystanders, passive objects, and environments in a sixth-generation (6G) deployment area. Such sensing can reveal location and environment data, support behavioral profiling such as movement or activity recognition, and, in advanced cases, expose physiological information such as breathing frequency or heart-rate-related data. Thus, the capabilities of spatial sensing must be controlled to satisfy privacy requirements. In this work, we organize privacy-sensitive ISAC data into three sensing levels: location and environment data, behavioral data, and physiological data, and use this classification as the organizing principle throughout the paper. Based on this classification, we discuss internal and external ISAC applications, identify privacy challenges related to consent, transparency, data ownership, profiling, bystander exposure, and sensitive sensing data, review representative solution directions, and outline future research directions for privacy-preserving ISAC.

Fluid Antenna System Meets Low-Resolution ADCs in Energy-Efficient Cell-Free Massive MIMO

2026-05-27T11:19:08Z

This paper proposes a novel fluid antenna system (FAS)-enabled architecture to improve energy efficiency (EE) without sacrificing capacity. Specifically, we integrate FAS into cell-free massive MIMO systems to counteract low-resolution ADCs. We establish a comprehensive uplink transmission model and derive analytical expressions for spectral efficiency (SE) and EE. These expressions explicitly capture the quantization error under slow fluid antenna multiple access and quantify the benefits of low-resolution ADCs on EE. Furthermore, we formulate a joint optimization problem to maximize EE performance. To solve this, we develop an efficient alternating optimization framework. This framework leverages the Dinkelbach algorithm-based fractional programming for power control, alongside novel accelerated projected gradient ascent (APGA) algorithms to optimize both continuous FAS positions and discrete ADC bit allocations. Numerical results reveal that low-resolution ADCs aggressively compress signals to save hardware power, which inevitably degrades SE but maintains EE. However, FASs can recover this SE loss thanks to their spatial flexibility and significantly boost EE by improving the received signal prior to destructive quantization. Furthermore, optimized power control can prevent quantization-induced multi-user interference, while efficient bit allocation can reduce exponential hardware power. Ultimately, our proposed FAS-enabled system, coupled with efficient power control and bit allocation, effectively improves system performance and outperforms traditional fixed-position antennas. It establishes a highly robust and energy-efficient paradigm for 6G networks.

The Well-Tempered Classifier: Some Elementary Properties of Temperature Scaling

2026-05-27T11:05:26Z

Temperature scaling is a simple method for controlling the uncertainty of probabilistic models. It is mostly used in two contexts: improving the calibration of classifiers and tuning the stochasticity of large language models (LLMs). In both cases, temperature scaling is the most popular method for the job. Despite its popularity, a rigorous theoretical analysis of the properties of temperature scaling has remained elusive. We investigate here some of these properties. For classification, we show that increasing the temperature increases the uncertainty in the model in a very general sense (and in particular increases its entropy). However, for LLMs, we challenge the common claim that increasing temperature increases diversity. Furthermore, we introduce two new characterisations of temperature scaling. The first one is geometric: the tempered model is shown to be the information projection of the original model onto the set of models with a given entropy. The second characterisation clarifies the role of temperature scaling as a submodel of more general linear scalers such as matrix scaling and Dirichlet calibration: we show that temperature scaling is the only linear scaler that does not change the hard predictions of the model.
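Two of the classification properties mentioned above can be observed on a toy example: dividing the logits by a larger temperature raises the entropy of the softmax output, while the arg-max prediction stays the same. The sketch below uses made-up logits and hypothetical function names; it illustrates the statements and is not a proof of them.

```python
import math

def tempered_softmax(logits, temperature):
    """Softmax of logits / temperature, with the usual max-shift for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

logits = [2.0, 1.0, 0.1]          # illustrative values only
temperatures = [0.5, 1.0, 2.0, 4.0]
entropies = [entropy(tempered_softmax(logits, t)) for t in temperatures]
predictions = [
    max(range(len(logits)), key=lambda i: tempered_softmax(logits, t)[i])
    for t in temperatures
]
```

Here `entropies` increases with the temperature and `predictions` is constant, since dividing by a positive temperature preserves the ordering of the logits.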