https://arxiv.org/api/G2Dw1xGd+yTQhxrnBhdt9mOmhRE
2026-06-21T10:36:08Z
548419015

http://arxiv.org/abs/2606.16592v1
Rate-Distortion for Reversible Causal Nets under Closure-Preserving Fidelity
2026-06-15T11:41:01Z
We develop a semantic rate-distortion theory for reversible logging under a closure-preserving fidelity criterion. An execution history is modeled as a finite set of logged facts, and rollback-relevant meaning is captured by a monotone semantic closure induced by an effective rule system such as Datalog. We introduce a bounded distortion that edits one logged fact and measures the resulting change in closure. A canonical deletion scan decomposes the log into an irredundant core and a redundant remainder; under admissible reconstructions, redundant facts become information-theoretically invisible, yielding a core-only rate-distortion reduction. At perfect fidelity, overlaps among zero-distortion reconstructions induce a confusability hypergraph that determines the minimum rate. We instantiate the framework on reversible causal nets and reversible prime event structures under multiple reversing disciplines, and validate the predictions numerically.
2026-06-15T11:41:01Z
Jianfeng Xu

http://arxiv.org/abs/2509.07605v2
Beyond Rebalancing: Benchmarking Binary Classifiers Under Class Imbalance Without Rebalancing Techniques
2026-06-15T11:26:38Z
Class imbalance poses a significant challenge to supervised classification, particularly in critical domains like medical diagnostics and anomaly detection where minority class instances are rare. While numerous studies have explored rebalancing techniques to address this issue, less attention has been given to evaluating the performance of binary classifiers under imbalance when no such techniques are applied. Therefore, the goal of this study is to assess the performance of binary classifiers "as-is", without performing any explicit rebalancing.
Specifically, we systematically evaluate the robustness of a diverse set of binary classifiers across both real-world and synthetic datasets, under progressively reduced minority class sizes, using one-shot and few-shot scenarios as baselines. Our approach also explores varying data complexities through synthetic decision boundary generation to simulate real-world conditions. In addition to standard classifiers, we include experiments using undersampling, oversampling strategies, and one-class classification (OCC) methods to examine their behavior under severe imbalance. The results confirm that classification becomes more difficult as data complexity increases and the minority class size decreases. While traditional classifiers deteriorate under extreme imbalance, advanced models like TabPFN and boosting-based ensembles retain relatively higher performance and generalize better. Visual interpretability and evaluation metrics further validate these findings. Our work offers valuable guidance on model selection for imbalanced learning, providing insights into classifier robustness without dependence on explicit rebalancing techniques.
2025-09-09T11:28:34Z
Ali Nawaz, Amir Ahmad, Shehroz S. Khan

http://arxiv.org/abs/2604.02343v2
Haiku to Opus in Just 10 bits: LLMs Unlock Large Compression Gains
2026-06-15T11:17:56Z
We study the compression of LLM-generated text across lossless and lossy regimes, characterizing a compression-compute frontier where more compression is possible at the cost of more compute. For lossless compression, domain-adapted LoRA adapters can improve LLM-based arithmetic coding by 2x over compression with the base LLM alone. For lossy compression, prompting a model for a succinct rewrite and then applying arithmetic coding can achieve compression ratios of approximately 0.03, a 2x improvement over compressing the original response.
We further introduce Question-Asking compression (QA), an interactive lossy protocol inspired by the game 'Twenty Questions'. A small model iteratively refines its response by asking yes/no questions to a stronger model, transferring exactly one bit per answer. On 8 benchmarks spanning math, science, and code, 10 binary questions recover 23% to 72% of the capability gap between a small and large model on standard benchmarks and 7% to 38% on harder benchmarks, achieving compression ratios of 0.0006 to 0.004. This is over 100x smaller than prior LLM-based compression (Delétang et al., 2024), suggesting that interactive protocols can transfer knowledge far more efficiently than transmitting full responses.
2026-02-09T18:52:02Z
Roy Rinberg, Annabelle Michael Carrell, Simon Henniger, Nicholas Carlini, Keri Warr

http://arxiv.org/abs/2606.17117v1
Sensing-Native Over-the-Air Federated Learning
2026-06-15T10:23:37Z
Over-the-air federated learning (FL) leverages the superposition property of multiple-access channels to enable communication-efficient distributed model training. Existing integrated sensing, communication, and computation (ISCC)-enabled over-the-air FL systems typically require dedicated resources for the sensing module, inevitably compromising FL performance due to resource competition. In this paper, we propose a sensing-native over-the-air FL framework that explores built-in distributed wireless sensing capability with zero overhead per model aggregation. Specifically, the high-dimensional local gradient signals, which possess a favorable autocorrelation property, are concurrently leveraged for target distance estimation, while the gradient statistics already required for over-the-air FL serve as a ready-made gateway to deliver locally-sensed results to the edge server for cooperative localization.
To combat inter-device interference, channel fading, and communication noise, we put forth a robust trilateration-based target positioning method building upon efficient matched-filtering-based distance estimation. Then, by explicitly characterizing the impact of imperfect model aggregation and noisy gradient-statistics transmission on the sensing-native over-the-air FL convergence, we develop a statistics-aware communication-learning co-design approach. We first derive the closed-form optimal power budgets allocated to local gradients and their statistics, based on which an efficient successive convex approximation method is proposed for receiver beamforming optimization. Simulation results show that the proposed framework simultaneously achieves superior learning and sensing performance compared to representative baselines.
2026-06-15T10:23:37Z
Peiyuan Huang, Shijian Gao, Jia Yan, Georgios B. Giannakis

http://arxiv.org/abs/2606.16466v1
Information aging in massive MIMO systems affected by phase noise
2026-06-15T09:34:44Z
In massive MIMO systems, phase noise can spoil the performance of the usual receiver techniques. The problem arises because of the aging of phase-noise information based on pilots. In this paper, in a realistic 5G uplink scenario, we quantify the impact of information aging and we propose an iterative receiver based on expectation-maximization (EM). Simulation results show that the iterative receiver is robust to information aging related to phase noise.
2026-06-15T09:34:44Z
4 pages, 4 figures, Proceedings of the 2021 XXXIVth General Assembly and Scientific Symposium of the International Union of Radio Science (URSI GASS). Available at https://www.ursi.org/proceedings/procGA21/papers/URSIGASS2021-We-C14-PM3-1.pdf
2021 XXXIVth General Assembly and Scientific Symposium of the International Union of Radio Science (URSI GASS), Rome, Italy, 2021, pp. 1-4
Alberto Tarable, Francisco J.
Escribano
10.23919/URSIGASS51995.2021.9560263

http://arxiv.org/abs/2606.16362v1
Input-Dependent Fisher Information for Local Sensitivity Analysis of Medical Image Classifiers
2026-06-15T07:57:50Z
Deep neural networks have achieved strong performance in medical image classification, but often behave like black boxes. Commonly used post-hoc interpretation methods often provide heuristic visualizations whose relationship to the classifier's predictive distribution is indirect. This work introduces a local sensitivity analysis framework based on the input-dependent Fisher Information Matrix (iFIM) of a trained classifier. The iFIM characterizes how the classifier's predictive distribution changes under infinitesimal perturbations of the input image. By using a Gram-matrix formulation, the nonzero eigenspectrum of the iFIM can be recovered without explicitly forming the full image-dimensional Fisher matrix. The leading iFIM eigenspace is then used to project an input image into a high local-sensitivity component and its orthogonal component. These components provide a model-intrinsic description of local predictive sensitivity, rather than a conventional pixel-wise attribution heatmap or a causal segmentation of task-relevant anatomy. The framework is evaluated on controlled and clinical medical image classification tasks using multiple classifier architectures. Perturbation-based experiments show that high-sensitivity iFIM components are more strongly coupled to changes in predictive confidence and classification performance than lower-sensitivity complementary components. The results support the iFIM framework as a principled tool for analyzing local decision sensitivity and for complementing existing attribution-based interpretability methods in medical imaging.
2026-06-15T07:57:50Z
Sourya Sengupta, Mark A.
Anastasio

http://arxiv.org/abs/2606.03570v3
STC: Reversible Digit-Context Decomposition for BWT-Family Text Compression
2026-06-15T01:41:15Z
Burrows-Wheeler-transform-based compressors rely on local context regularity, but structured text also contains dates, counters, identifiers, coordinates, and other digit runs whose values vary differently from their surrounding tokens. STC is presented as a new algorithm found by the authors through the self-evolving AI system zeelin. It is a practical BWT-family compressor that separates this source of variation before the component BWT stage. It replaces digit runs in the main stream with an unambiguous placeholder and stores the removed digits in length- and context-conditioned side streams. The side streams use stable bucket ordering and compact digit packing, so the decoder can reconstruct the original run order from the normalized main stream without storing a separate permutation. The resulting components are encoded by a fixed internal BWT/M03-style component coder. On enwik9, STC produces a 157,388,188-byte archive with a 183,174-byte decoder source package, giving a local LTCB-style total of 157,571,362 bytes. A full-enwik9 same-coder ablation shows that the digit-context decomposition reduces the archive by 2,629,561 bytes relative to the no-split control. The result is locally verified by full decode and SHA-256 matching; official benchmark status requires independent maintainer-side verification.
2026-06-02T12:38:58Z
16 pages, 3 figures, 6 tables.
Code and data: https://github.com/thu-nmrc/STC-for-BWT-FamilyText-Compression
Jingyang Du, Yang Shen, Anling Xiang

http://arxiv.org/abs/2606.01602v2
Estimating Mutual Information between Time Series and Temporal Event Sequences Across Diverse Analysis Tasks
2026-06-14T22:45:34Z
Pairwise dependence measures such as correlation and causality are fundamental to temporal data mining, yet there is still no principled and robust way to quantify dependence between heterogeneous data types, especially between continuous time series and discrete temporal event sequences. Existing approaches rely on ad hoc transformations or mutual-information estimators that are highly sensitive to quantization, repeated values, and event redundancy, leading to biased or unstable results in practice. We propose a nonparametric mutual information estimator that directly measures the dependence between time series and event sequences without data transformation, learning, or ad hoc discretization. Our method models the continuous-discrete duality of real-world time series to handle quantization and repeated-value artifacts and introduces a latent event clustering strategy to mitigate bias from event co-occurrence and redundancy. Together, these yield a robust and unified framework that bridges discrete and continuous mutual information. We evaluate the proposed estimator on four representative tasks: discrete-continuous time-delayed mutual information for causality analysis, global and local temporal repetition discovery, discrete covariate selection for time series forecasting, and continuous feature selection for classification. Experiments on synthetic and real-world datasets show consistent improvements over existing methods in accuracy, robustness, and interpretability, positioning our approach as a general-purpose dependence operator for heterogeneous temporal data, similar to Pearson correlation for homogeneous time series.
Code available at: https://github.com/HaojiHu/Multimodal-Temporal-Data-Quantification
2026-06-01T02:56:50Z
Haoji Hu, Huaqing Mao, Yijun Lin, Xiaowei Jia, Jinwei Zhou, Minoh Jeong, Yao-Yi Chiang
10.1145/3770855.3817693

http://arxiv.org/abs/2606.16028v1
The Information-Theoretic Benefit of Shared Representations under Orthogonality Constraints
2026-06-14T21:25:58Z
Modern deep learning architectures are increasingly multi-task and multi-modal, using a pretrained foundation model combined with task-specific, fine-tuned models. Empirically, exploiting similarity across different problems, instead of solving them individually, can significantly improve overall performance. While the generalization and sample complexity properties of multitask learning have been widely studied, the parametric complexity of joint approximation in comparison to separate approximation remains less well understood. The question is particularly relevant in modern deep learning, where models are increasingly required to satisfy structural constraints such as equivariance, conservation laws, or orthogonality. We prove lower and upper bounds on the description-length for separate and joint approximation classes, respectively, in uniform norm. We build a class of orthogonal functions by composing a shared hard feature, realized by a Rademacher-Haar wavelet series, with Sawtooth-Walsh readouts to enforce orthogonality of output coordinates. The dyadic tree structure of the Rademacher-Haar wavelet concentrates the approximation hardness in the common feature component, while the readouts act as task-specific heads. Using an information-theoretic framework, we obtain a sharp gap between the optimal approximation rates achievable by joint and separate coding. Finally, we realize this separation in a neural network model using Heaviside activations via reduction to triangle-wave approximation.
Our results show that, even under an orthogonality constraint, joint approximation requires strictly fewer bits in compositional architectures, provided the tasks share a latent hard feature. This provides theoretical insight into the description-length-efficiency of compositional multi-output architectures and clarifies how neural networks can retain expressivity under geometric constraints.
2026-06-14T21:25:58Z
Thomas Dittrich, Oliver Potocki, Philipp Grohs

http://arxiv.org/abs/2606.15929v1
Wavelet Localisation and Local Modulation Freezing in MRW Unwrapping
2026-06-14T17:11:00Z
We develop a localised wavelet formulation of multifractal random walk unwrapping based on local multiplicative modulation freezing. The framework is motivated by the observation that finite-support wavelet localisation may induce approximate local factorisation of multiplicatively modulated stochastic fields, allowing the modulation component to become effectively frozen within sufficiently localised probing domains. Within this regime, logarithmic wavelet amplitudes admit an approximate additive decomposition linking local wavelet statistics directly to the underlying modulation field. This viewpoint reformulates covariance-based MRW unwrapping as a localised multiscale operator problem in which wavelet coefficients act as finite-support probes of multiplicative organisation. The validity of the approximation depends explicitly on support geometry, scale-dependent overlap, and residual multiscale mixing generated by internal modulation variability. We show that these effects naturally produce finite-scale deviations from ideal logarithmic covariance scaling and lead to structured covariance distortions whose form depends on the interaction between the modulation field and the geometry of the wavelet representation. In the resulting framework, localisation itself becomes the operational mechanism enabling multiscale probing of local stochastic organisation.
Numerical investigations using orthonormal wavelet decompositions support the proposed interpretation and demonstrate the emergence of scale-dependent freezing regimes, residual covariance mixing, and finite-support breakdown effects consistent with the theory. The proposed framework suggests a broader connection between wavelet localisation, local regularity organisation, and finite-support multiscale stochastic operators. Wavelet localisation becomes an operational mechanism for probing localised multiscale structure.
2026-06-14T17:11:00Z
48 pages, 12 figures
Mateusz Polakowski, Zbigniew R. Struzik

http://arxiv.org/abs/2604.26819v2
Sharp One-Dimensional Sub-Gaussian Comparison in Convex Order
2026-06-14T15:51:19Z
We prove that any random variable $X$ whose moment generating function is point-wise upper bounded by that of $ G \sim \mathcal{N}(0,1) $ must be dominated by $ G/\mathbb{E}[|G|] $ in convex order, meaning $ \mathbb{E}[f(X)] \le \mathbb{E}[f(G/\mathbb{E}[|G|])] $ for all convex $f$. This is sharp as witnessed by $ X \sim \mathrm{Unif}(\{-1,1\}) $ and $ f(x) = |x| $.
2026-04-29T15:47:30Z
Yihan Zhang

http://arxiv.org/abs/2606.15826v1
Geometrically Constrained Decentralized Independent Vector Analysis for Distributed Microphone Arrays
2026-06-14T14:04:18Z
This paper proposes a geometrically constrained decentralized independent vector analysis (GC-Dec-IVA) method for distributed microphone arrays. The recently proposed Dec-IVA method enables source separation by exchanging only power-related statistics to exploit cross-array information. However, this initial attempt often provides negligible improvement over applying IVA locally at each array, mainly due to the potential permutation inconsistency among arrays and the strong cross-array dependency implied by its source model. To address these limitations, we incorporate direction-of-arrival (DOA) information to derive GC-Dec-IVA, which mitigates permutation mismatch across arrays and enhances source alignment.
Furthermore, a new source model is introduced to weaken cross-array dependency, improving robustness against permutation inconsistency in noisy environments. Experiments show that the proposed method improves both the separation performance and cross-array permutation consistency.
2026-06-14T14:04:18Z
Accepted to Interspeech 2026
Changda Chen, Yichen Yang, Wei Liu, Bing Zhu, Gongping Huang, Shoji Makino, Shuai Wang

http://arxiv.org/abs/2605.25699v2
Volume-Refined Achievability and Converse Approximations for Noisy Permutation Channels
2026-06-14T13:39:15Z
We study volume-refined achievability and converse bounds for noisy permutation channels generated by strictly positive DMCs, allowing the reachable output polytope to have arbitrary affine dimension $d\ge 1$. The reachable output polytope may be lower-dimensional than the output simplex, whereas existing refined achievability analyses and fixed-error converses are not adapted to this intrinsic affine geometry. On the achievability side, we develop an affine-coordinate simplex-lattice construction adapted to the reachable output polytope, together with a nearest-neighbor decoder and a geometric error-reduction argument in the same coordinate space. This yields a Gaussian achievability approximation with an $o(1)$ remainder. On the converse side, we first use a meta-converse combined with a KL covering and a local testing estimate to obtain a fixed-error converse with a bounded remainder, which implies the logarithmic $\varepsilon$-capacity $d/2$. We then apply the meta-converse with a stratified Jeffreys-mixture auxiliary output distribution. Using a local Laplace approximation and a local likelihood-ratio approximation, this choice identifies the Fisher-volume term and an explicit Gaussian testing constant, yielding a constant-order converse approximation with an $o(1)$ remainder.
The achievability and converse constants arise from different constructions and are not claimed to match in general.
2026-05-25T10:57:35Z
47 pages, 3 figures
Lugaoze Feng, Guocheng Lv, Xunan Li, Ye Jin

http://arxiv.org/abs/2506.20015v2
Neuromorphic Wireless Split Computing with Resonate-and-Fire Neurons
2026-06-14T09:25:17Z
Neuromorphic computing offers an energy-efficient alternative to conventional deep learning accelerators, particularly for real-time processing of time-series data. However, many edge applications, such as wireless sensing and audio recognition, generate streaming signals with rich spectral features that are not effectively captured by conventional leaky integrate-and-fire (LIF) spiking neurons. This paper investigates a wireless split computing architecture that employs resonate-and-fire (RF) neurons with oscillatory dynamics to process time-domain signals directly, eliminating the need for costly spectral pre-processing. By resonating at tunable frequencies, RF neurons extract time-localized spectral features while maintaining low spiking activity. This temporal sparsity translates into significant savings in both computation and transmission energy. Assuming an OFDM-based analog wireless interface for spike transmission, we present a complete system design and evaluate its performance on audio classification and modulation classification tasks. Experimental results show that the proposed RF-SNN architecture achieves accuracy comparable to conventional LIF-SNNs and ANNs, while substantially reducing spike rates and total energy consumption during inference and communication.
2025-06-24T21:14:59Z
Dengyu Wu, Jiechen Chen, H.
Vincent Poor, Bipin Rajendran, Osvaldo Simeone

http://arxiv.org/abs/2606.15688v1
Twin-in-the-Loop Optimization and Fundamental Limits of Position--Velocity Estimation in Cell-Free ISAC Systems
2026-06-14T09:09:30Z
Digital twin (DT) networks require tight integration with wireless sensing, yet the fundamental limits of such coupling in cell-free integrated sensing and communication (ISAC) systems remain largely unexplored, particularly in the presence of fluid intelligent metasurfaces (FIM). This paper establishes a joint position-velocity Cramér-Rao bound (CRB) framework, operationalized through a twin-in-the-loop architecture. By leveraging a scatter-matrix decomposition of the velocity Fisher information, we show that single-base-station systems are inherently rank-deficient for two-dimensional velocity estimation, whereas cell-free deployments with multiple access-point pairs achieve full observability. The resulting CRB reveals a spatio-temporal decoupling: FIM shape optimization significantly improves position accuracy but does not affect the velocity CRB under isotropic waveforms, while Doppler coupling asymmetrically enhances position estimation accuracy. Building on this analysis, we develop a closed-loop DT framework, deriving the critical mismatch angle in closed form and showing that angular diversity in cell-free systems mitigates DT prediction errors. We further characterize the optimal synchronization period and propose a confidence-aware scheduling strategy that reduces the DT update rate. Numerical results demonstrate substantial performance gains over single-base-station systems, with improvements attributed to angular diversity, Doppler-position coupling, and FIM adaptation.
2026-06-14T09:09:30Z
Changhao He, Xiaojuan Zhang, Geoffrey Ye Li