http://arxiv.org/abs/2504.06903v2 Network Cross-Validation and Model Selection via Subsampling 2026-03-11T15:18:03Z

Large and complex networks are becoming increasingly prevalent in scientific applications across many domains. Although a number of models and methods exist for such networks, cross-validation on networks remains challenging due to the unique structure of network data. In this paper, we propose a general cross-validation procedure called NETCROP (NETwork CRoss-Validation using Overlapping Partitions). The key idea is to divide the original network into multiple subnetworks with a shared overlap part, producing training sets consisting of the subnetworks and a test set consisting of the node pairs between the subnetworks. This train-test split provides the basis for a network cross-validation procedure that can be applied to a wide range of model selection and parameter tuning problems for networks. The method is computationally efficient for large networks because it uses smaller subnetworks for the training step. We provide methodological details and theoretical guarantees for several model selection and parameter tuning tasks using NETCROP. Numerical results demonstrate that NETCROP performs accurate cross-validation on a diverse set of network model selection and parameter tuning problems. The results also indicate that NETCROP is computationally much faster, and often more accurate, than the existing methods for network cross-validation.
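The train-test split described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `overlapping_partition_split`, the uniform random choice of the overlap set, and the round-robin assignment of the remaining nodes are all hypothetical choices made for the example.

```python
import random
from itertools import combinations


def overlapping_partition_split(n_nodes, n_subnets, overlap_size, seed=0):
    """Split node indices into subnetworks that share one overlap set.

    Hypothetical helper: returns the overlap nodes, the node set of each
    training subnetwork, and the held-out node pairs that connect two
    different non-overlap groups.
    """
    rng = random.Random(seed)
    nodes = list(range(n_nodes))
    rng.shuffle(nodes)

    overlap = nodes[:overlap_size]   # nodes shared by every subnetwork
    rest = nodes[overlap_size:]      # nodes split across subnetworks

    # Deal the remaining nodes round-robin into disjoint groups.
    groups = [rest[i::n_subnets] for i in range(n_subnets)]

    # Each training subnetwork is induced by the overlap plus one group.
    train_node_sets = [sorted(overlap + g) for g in groups]

    # Test set: node pairs whose endpoints lie in two different groups,
    # so no training subnetwork ever sees them.
    test_pairs = [
        (min(u, v), max(u, v))
        for g1, g2 in combinations(groups, 2)
        for u in g1
        for v in g2
    ]
    return overlap, train_node_sets, test_pairs


if __name__ == "__main__":
    overlap, train_sets, test_pairs = overlapping_partition_split(10, 2, 2)
    print(len(overlap), [len(s) for s in train_sets], len(test_pairs))
```

A model would then be fitted on each induced subnetwork, and its predictions on `test_pairs` scored; how the fits are aligned through the overlap nodes is specific to the paper and is not reproduced here.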

2025-04-09T14:03:40Z Sayan Chakrabarty Srijan Sengupta Yuguo Chen http://arxiv.org/abs/2509.18149v2 Tensor Train Completion from Fiberwise Observations Along a Single Mode 2026-03-11T12:02:52Z

Tensor completion is an extension of matrix completion aimed at recovering a multiway data tensor by leveraging a given subset of its entries (observations) and the pattern of observation. The low-rank assumption is key in establishing a relationship between the observed and unobserved entries of the tensor. The low-rank tensor completion problem is typically solved using numerical optimization techniques, where the rank information is used either implicitly (in the rank minimization approach) or explicitly (in the error minimization approach). Current theories concerning these techniques often study probabilistic recovery guarantees under conditions such as random uniform observations and incoherence requirements. However, if an observation pattern exhibits some low-rank structure that can be exploited, more efficient algorithms with deterministic recovery guarantees can be designed by leveraging this structure. This work shows how to use only standard linear algebra operations to compute the tensor train decomposition of a specific type of ``fiber-wise'' observed tensor, where some of the fibers of a tensor (along a single specific mode) are either fully observed or entirely missing, unlike the usual entry-wise observations. From an application viewpoint, this setting is relevant when it is easier to sample or collect a multiway data tensor along a specific mode (e.g., temporal). The proposed completion method is fast and is guaranteed to work under reasonable deterministic conditions on the observation pattern. Through numerical experiments, we showcase interesting applications and use cases that illustrate the effectiveness of the proposed approach.

2025-09-16T09:42:33Z 26 pages, 12 figures Mathematics 2026, 14(5), 922 Shakir Showkat Sofi Lieven De Lathauwer 10.3390/math14050922 http://arxiv.org/abs/2603.10687v1 A Python implementation of some geometric tools on Kendall 3D shape space for practical applications 2026-03-11T11:57:43Z

This work addresses the challenge of analyzing geometric structures using Kendall's 3D Shape Space. While Riemannian geometry provides a robust framework for shape analysis (independent of scale, position, and orientation), the transition from theoretical manifolds to practical computational workflows remains difficult. Although Geomstats is currently the leading Python library for manifold-based statistics, it lacks specific utilities required for advanced 3D shape analysis. This article introduces tools designed to bridge this gap, translating complex mathematical abstractions into efficient, accessible software solutions for researchers.

2026-03-11T11:57:43Z Jorge Valero Vicent Gimeno i Garcia M. Victoría Ibáñez Pau Martinavarro Amelia Simó http://arxiv.org/abs/2408.09155v2 Learning Robust Treatment Rules for Censored Data 2026-03-11T11:33:58Z

There is a fast-growing literature on estimating optimal treatment rules directly by maximizing the expected outcome. In biomedical studies and operations applications, censored survival outcomes are frequently observed, in which case the truncated mean survival time and survival probability are of great interest. In this paper, we propose two robust criteria for learning optimal treatment rules with censored survival outcomes: the first targets an optimal treatment rule maximizing the truncated mean survival time, where the cutoff is specified by a given quantile such as the median; the second targets an optimal treatment rule maximizing buffered survival probabilities, where the predetermined threshold is adjusted to account for the truncated mean survival time. We develop a sampling-based difference-of-convex algorithm for learning the proposed optimal treatment rules, and provide theoretical justifications for them. In simulation studies, our estimators show improved performance compared to existing methods. We also demonstrate the proposed method using AIDS clinical trial data.

2024-08-17T09:58:58Z Yifan Cui Junyi Liu Tao Shen Zhengling Qi Xi Chen http://arxiv.org/abs/2508.03059v3 Two-sample comparison through additive tree models for density ratios 2026-03-11T04:12:52Z

The ratio of two densities provides a direct characterization of their differences. We consider the two-sample comparison problem by estimating this ratio given i.i.d. observations from two distributions. To this end, we propose additive tree models for density ratio estimation along with efficient algorithms using a new loss function, the balancing loss. The loss allows tree-based models to be trained using several algorithms originally designed for supervised learning, such as forward-stagewise optimization and gradient boosting. Moreover, the balancing loss resembles an exponential family kernel, and it can serve as a pseudo-likelihood with conjugate priors. This property enables generalized Bayesian inference on the density ratio using backfitting samplers designed for Bayesian additive regression trees (BART). Our Bayesian strategy provides uncertainty quantification for the inferred density ratio, which is critical for applications involving high-dimensional and data-limited distributions with potentially substantial uncertainty. We further show connections of the balancing loss to the exponential loss in binary classification and to the variational form of f-divergence, particularly the squared Hellinger distance. Numerical experiments demonstrate that our method achieves both accuracy and computational efficiency, while uniquely providing uncertainty quantification. Finally, we demonstrate its application to assessing the quality of generative models for microbiome compositional data.

2025-08-05T04:08:49Z Naoki Awaya Yuliang Xu Li Ma http://arxiv.org/abs/2603.10382v1 Gimbal Regression: Orientation-Adaptive Local Linear Regression under Spatial Heterogeneity 2026-03-11T03:51:57Z

Local regression is widely used to explore spatial heterogeneity, but anisotropic or effectively low-dimensional neighborhoods can produce ill-conditioned local solves, causing coefficient variation driven by numerical artifacts rather than substantive structure. Such instability is often hidden when estimation relies on implicit tuning or optimization without exposing local diagnostics. This paper proposes Gimbal Regression (GR), a deterministic, geometry-aware local regression framework for stable and auditable estimation. GR constructs directional weights from neighborhood geometry using explicit orientation objects and deterministic safeguards, and computes local coefficients by a closed-form solve. Theoretical results are stated conditional on the realized neighborhood configuration, under which the estimator is a deterministic linear operator with finite-perturbation stability bounds. Simulations and empirical examples demonstrate predictable computation, transparent diagnostics, and improved numerical stability relative to common local regression baselines.

2026-03-11T03:51:57Z Yuichiro Otani http://arxiv.org/abs/2603.10318v1 Optimising two-block averaging kernels to speed up Markov chains 2026-03-11T01:40:02Z

We study the problem of selecting optimal two-block partitions to accelerate the mixing of finite Markov chains under group-averaging transformations. The main objectives considered are the Kullback-Leibler (KL) divergence and the Frobenius distance to stationarity. We establish explicit connections between these objectives and the induced projection chain. In the case of the KL divergence, this reduction yields explicit decay rates in terms of the log-Sobolev constant. For the Frobenius distance, we identify a Cheeger-type functional that characterises optimal cuts. This formulation recasts two-block selection as a structured combinatorial optimisation problem admitting difference-of-submodular decompositions. We further propose several algorithmic approximations, including majorisation-minimisation and coordinate descent schemes, as computationally feasible alternatives to exhaustive combinatorial search. Our numerical experiments reveal that optimal cuts under the two objectives can substantially reduce total variation distance to stationarity and demonstrate the practical effectiveness of the proposed approximation algorithms.

2026-03-11T01:40:02Z 45 pages, 5 figures Ryan J. Y. Lim Michael C. H. Choi http://arxiv.org/abs/2411.08821v3 Conditional Local Importance by Quantile Expectations 2026-03-10T23:53:48Z

Global variable importance measures are commonly used to interpret the results of machine learning models. Local variable importance techniques assess how variables contribute to individual observations. Current, popular methods, including LIME and SHAP, typically fail to accurately reflect locally dependent relationships between variables and instead focus on marginal importance values. Additionally, they are not natively adapted for multi-class classification problems. We propose a new model-agnostic method for calculating local variable importance, CLIQUE, that captures locally dependent relationships, provides improvements over permutation-based methods, and can be directly applied to multi-class classification problems. Simulated and real-world examples show that CLIQUE emphasizes locally dependent information, captures interaction behavior beyond what can be evaluated by correlations, and properly reduces bias in regions where variables do not affect the response.

2024-11-13T17:59:44Z 22 pages, 18 figures Kelvyn K. Bladen Adele Cutler D. Richard Cutler Kevin R. Moon http://arxiv.org/abs/2505.09828v2 Optimally balancing exploration and exploitation to automate multi-fidelity statistical estimation 2026-03-10T19:03:43Z

Multi-fidelity methods that use an ensemble of models to compute a Monte Carlo estimator of the expectation of a high-fidelity model can significantly reduce computational costs compared to single-model approaches. These methods use oracle statistics, specifically the covariance between models, to optimally allocate samples to each model in the ensemble. However, in practice, the oracle statistics are estimated using additional model evaluations, whose computational cost and induced error are typically ignored. To address this issue, this paper proposes an adaptive algorithm to optimally balance the resources between oracle statistics estimation and final multi-fidelity estimator construction, leveraging ideas from multilevel best linear unbiased estimators in Schaden and Ullmann (2020) and a bandit-learning procedure in Xu et al. (2022). Under mild assumptions, we demonstrate that the multi-fidelity estimator produced by the proposed algorithm exhibits mean-squared error commensurate with that of the best linear unbiased estimator under the optimal allocation computed with oracle statistics. Our theoretical findings are supported by detailed numerical experiments, including a parametric elliptic PDE and an ice-sheet mass-change modeling problem.
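To make the role of the estimated covariance concrete, here is a minimal two-model control-variate sketch. It is a deliberately simpler, classical technique swapped in for illustration, not the multilevel BLUE or the bandit-learning procedure of the paper; the function name `control_variate_estimate` and its parameters are hypothetical. Note that the weight `alpha` is estimated from the same samples that build the estimator, which is exactly the kind of extra cost and error the abstract says is usually ignored.

```python
import random


def control_variate_estimate(f_hi, f_lo, sampler, n_hi, n_lo, seed=0):
    """Estimate E[f_hi(X)] using a cheap correlated model f_lo.

    Hypothetical helper. n_hi inputs are evaluated with both models;
    n_lo additional inputs are evaluated with the cheap model only.
    """
    rng = random.Random(seed)
    xs_shared = [sampler(rng) for _ in range(n_hi)]
    xs_extra = [sampler(rng) for _ in range(n_lo)]

    hi = [f_hi(x) for x in xs_shared]
    lo = [f_lo(x) for x in xs_shared]
    lo_all = lo + [f_lo(x) for x in xs_extra]

    def mean(values):
        return sum(values) / len(values)

    m_hi, m_lo, m_lo_all = mean(hi), mean(lo), mean(lo_all)

    # Sample covariance and variance play the role of the "oracle
    # statistics"; here they are estimated, not known.
    cov = sum((h - m_hi) * (l - m_lo) for h, l in zip(hi, lo)) / (n_hi - 1)
    var = sum((l - m_lo) ** 2 for l in lo) / (n_hi - 1)
    alpha = cov / var if var > 0 else 0.0

    # Correct the high-fidelity mean with the low-fidelity discrepancy.
    return m_hi - alpha * (m_lo - m_lo_all)


if __name__ == "__main__":
    estimate = control_variate_estimate(
        f_hi=lambda x: x * x + 0.1 * x ** 3,  # stand-in "expensive" model
        f_lo=lambda x: x * x,                 # stand-in "cheap" model
        sampler=lambda rng: rng.random(),
        n_hi=50,
        n_lo=5000,
    )
    print(estimate)
```

When the two models coincide, `alpha` equals one and the estimate reduces to the cheap-model mean over all inputs, which is a convenient sanity check.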

2025-05-14T22:15:32Z 40 pages Thomas Dixon Alex Gorodetsky John Jakeman Akil Narayan Yiming Xu http://arxiv.org/abs/2506.09762v2 Parallel computations for Metropolis Markov chains with Picard maps 2026-03-10T07:41:55Z

We develop parallel algorithms for simulating zeroth-order (aka gradient-free) Metropolis Markov chains based on the Picard map. For Random Walk Metropolis Markov chains targeting log-concave distributions $π$ on $\mathbb{R}^d$, our algorithm generates samples close to $π$ in $\mathcal{O}(\sqrt{d})$ parallel iterations with $\mathcal{O}(\sqrt{d})$ processors, therefore speeding up the convergence of the corresponding sequential implementation by a factor $\sqrt{d}$. Furthermore, a modification of our algorithm generates samples from an approximate measure $π_r$ in $\mathcal{O}(1)$ parallel iterations with $\mathcal{O}(d)$ processors. We empirically assess the performance of the proposed algorithms in high-dimensional regression problems, an epidemic model where the gradient is unavailable, and a real-world application in precision medicine. Our algorithms are straightforward to implement and may constitute a useful tool for practitioners seeking to sample from a prescribed distribution $π$ using only point-wise evaluations of $\log π$ and parallel computing.
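For reference, the sequential Random Walk Metropolis chain that this abstract takes as its baseline can be sketched as below. This is only the standard gradient-free sequential sampler, using point-wise evaluations of the log-density; it does not implement the Picard-map parallelisation. The function name and its parameters are assumptions made for the example.

```python
import math
import random


def random_walk_metropolis(log_pi, x0, n_steps, step_size, seed=0):
    """Sequential Random Walk Metropolis with Gaussian proposals.

    Hypothetical helper: log_pi maps a list of floats to the log of an
    unnormalised target density. Returns the chain and acceptance rate.
    """
    rng = random.Random(seed)
    x = list(x0)
    log_px = log_pi(x)
    samples = []
    accepted = 0

    for _ in range(n_steps):
        # Symmetric proposal: perturb every coordinate independently.
        proposal = [xi + step_size * rng.gauss(0.0, 1.0) for xi in x]
        log_pp = log_pi(proposal)

        # Accept with probability min(1, pi(proposal) / pi(x)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_pp - log_px:
            x, log_px = proposal, log_pp
            accepted += 1
        samples.append(list(x))

    return samples, accepted / n_steps


if __name__ == "__main__":
    # Standard Gaussian target in two dimensions (log-concave).
    log_pi = lambda x: -0.5 * sum(xi * xi for xi in x)
    chain, rate = random_walk_metropolis(log_pi, [0.0, 0.0], 5000, 1.0)
    print(len(chain), rate)
```

Each iteration depends on the previous state, which is the sequential bottleneck that the parallel algorithms in the abstract aim to remove.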

2025-06-11T14:03:55Z 37 pages, 9 figures Sebastiano Grazzi Giacomo Zanella http://arxiv.org/abs/2601.05355v2 An AI-powered Bayesian Generative Modeling Approach for Arbitrary Conditional Inference 2026-03-10T04:22:45Z

Modern data analysis increasingly requires flexible conditional inference P(X_B | X_A), where (X_A, X_B) is an arbitrary partition of the observed variables X. Existing approaches are either restricted to a fixed conditioning structure or depend strongly on the distribution of conditioning masks during training. To address these limitations, we introduce Bayesian generative modeling (BGM), a unified framework for arbitrary conditional inference. BGM learns a generative model of X via a stochastic iterative Bayesian updating algorithm in which model parameters and latent variables are updated until convergence. Once trained, any conditional distribution can be obtained without retraining. Empirically, BGM achieves superior predictive performance with posterior predictive intervals, demonstrating that a single learned model can serve as a universal engine for conditional prediction with principled uncertainty quantification. We provide theoretical guarantees for convergence of the stochastic iterative algorithm, statistical consistency, and conditional risk bounds. The proposed BGM framework leverages modern AI to capture complex relationships among variables while adhering to Bayesian principles, offering a promising approach for a wide range of applications in modern data science. Code for BGM is available at https://github.com/liuq-lab/bayesgm. Documentation for BGM is available at https://bayesgm.readthedocs.io.

2026-01-08T20:14:30Z Qiao Liu Wing Hung Wong http://arxiv.org/abs/2603.09089v1 Sampling on Discrete Spaces with Temporal Point Processes 2026-03-10T01:58:49Z

Temporal point processes offer a powerful framework for sampling from discrete distributions, yet they remain underutilized in existing literature. We show how to construct, for any target multivariate count distribution with downward-closed support, a multivariate temporal point process whose event-count vector in a fixed-length sliding window converges in distribution to the target as time tends to infinity. Structured as a system of potentially coupled infinite-server queues with deterministic service times, the sampler exhibits a discrete form of momentum that suppresses random-walk behaviour. The admissible families of processes permit both reversible and non-reversible dynamics. As an application, we derive a recurrent stochastic neural network whose dynamics implement sampling-based computation and exhibit some biologically plausible features, including relative refractory periods and oscillatory dynamics. The introduction of auxiliary randomness reduces the sampler to a birth-death process, establishing the latter as a degenerate case with the same limiting distribution. In simulations on 63 target distributions, our sampler always outperforms these birth-death processes and frequently outperforms Zanella processes in multivariate effective sample size, with further gains when normalized by CPU time.
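The simplest instance of the sliding-window idea can be sketched with a single, uncoupled queue. Assuming a homogeneous Poisson arrival process of rate `rate` and a deterministic service time equal to the window length, the number of events in the window is Poisson distributed with mean `rate * window` (a standard property of the infinite-server queue with deterministic service). The code below illustrates only this special case; the coupled constructions and general targets of the paper are not reproduced, and the function names are hypothetical.

```python
import bisect
import random


def poisson_event_times(rate, horizon, seed=0):
    """Event times of a homogeneous Poisson process on [0, horizon)."""
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(rate)  # exponential gaps give Poisson arrivals
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def window_count(times, t, window):
    """Number of events in the sliding window (t - window, t].

    Equivalently, the number of customers still in service at time t
    when every service lasts exactly `window` time units.
    """
    return bisect.bisect_right(times, t) - bisect.bisect_right(times, t - window)


if __name__ == "__main__":
    times = poisson_event_times(rate=3.0, horizon=2001.0)
    # Non-overlapping windows give independent draws from the target.
    counts = [window_count(times, float(t), 1.0) for t in range(1, 2001)]
    print(sum(counts) / len(counts))
```

Reading the window at closely spaced times instead would give correlated draws, which is where the sampler's dynamics and effective sample size become relevant.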

2026-03-10T01:58:49Z 20 pages, 1 figure Cameron A. Stewart Gatsby Computational Neuroscience Unit, University College London, London, U.K Maneesh Sahani Gatsby Computational Neuroscience Unit, University College London, London, U.K http://arxiv.org/abs/2603.08676v1 Momentum SVGD-EM for Accelerated Maximum Marginal Likelihood Estimation 2026-03-09T17:47:36Z

Maximum marginal likelihood estimation (MMLE) can be formulated as the optimization of a free energy functional. From this viewpoint, the Expectation-Maximisation (EM) algorithm admits a natural interpretation as a coordinate descent method over the joint space of model parameters and probability measures. Recently, a significant body of work has adopted this perspective, leading to interacting particle algorithms for MMLE. In this paper, we propose an accelerated version of one such procedure, based on Stein variational gradient descent (SVGD), by introducing Nesterov acceleration in both the parameter updates and in the space of probability measures. The resulting method, termed Momentum SVGD-EM, consistently accelerates convergence in terms of required iterations across various tasks of increasing difficulty, demonstrating effectiveness in both low- and high-dimensional settings.

2026-03-09T17:47:36Z Accepted to AISTATS 2026 Adam Rozzio Rafael Athanasiades O. Deniz Akyildiz http://arxiv.org/abs/2405.08290v3 MCMC using $\textit{bouncy}$ Hamiltonian dynamics: A unifying framework for Hamiltonian Monte Carlo and piecewise deterministic Markov process samplers 2026-03-09T16:42:11Z

Piecewise-deterministic Markov process (PDMP) samplers constitute a state-of-the-art Markov chain Monte Carlo paradigm in Bayesian computation, with examples including the zig-zag and bouncy particle sampler (bps). Recent work on the zig-zag has indicated its connection to Hamiltonian Monte Carlo (HMC), a version of the Metropolis algorithm that exploits Hamiltonian dynamics. Here we establish that, in fact, the connection between the two paradigms extends far beyond the specific instance. The key lies in (1) the fact that any time-reversible deterministic dynamics provides a valid Metropolis proposal and (2) how PDMPs' characteristic velocity changes constitute an alternative to the usual acceptance-rejection. We turn this observation into a rigorous framework for constructing rejection-free Metropolis proposals based on bouncy Hamiltonian dynamics which simultaneously possess Hamiltonian-like properties and generate discontinuous trajectories similar in appearance to PDMPs. When combined with periodic refreshment of the inertia, the dynamics converge strongly to PDMP equivalents in the limit of increasingly frequent refreshment. We demonstrate the practical implications of this new framework with a sampler based on a bouncy Hamiltonian dynamics closely related to the bps. The resulting sampler exhibits competitive performance on challenging real-data posteriors involving tens of thousands of parameters. As the sampler of choice in modern probabilistic programming languages, HMC plays a critical role in applied Bayesian modeling; by generalizing the paradigm and elucidating its connection to the leading competitor, our framework opens up opportunities for cross-pollination and innovation to further scale Bayesian inference.

2024-05-14T03:13:55Z Andrew Chin Akihiko Nishimura http://arxiv.org/abs/2409.09787v5 BNEM: A Boltzmann Sampler Based on Bootstrapped Noised Energy Matching 2026-03-09T15:51:28Z

Developing an efficient sampler capable of generating independent and identically distributed (IID) samples from a Boltzmann distribution is a crucial challenge in scientific research, e.g. molecular dynamics. In this work, we intend to learn neural samplers given energy functions instead of data sampled from the Boltzmann distribution. By learning the energies of the noised data, we propose a diffusion-based sampler, Noised Energy Matching (NEM), which theoretically has lower variance and more complexity compared to related works. Furthermore, a novel bootstrapping technique is applied to NEM, yielding Bootstrapped NEM (BNEM), to balance bias and variance. We evaluate NEM and BNEM on a 2-dimensional 40-mode Gaussian Mixture Model (GMM) and a 4-particle double-well potential (DW-4). The experimental results demonstrate that BNEM can achieve state-of-the-art performance while being more robust.

2024-09-15T16:41:30Z Camera-ready version for TMLR (03/2026) Transactions on Machine Learning Research (TMLR), 2026 RuiKang OuYang Bo Qiang José Miguel Hernández-Lobato