http://arxiv.org/abs/2607.15208v1 Delocalization of bias in unadjusted Hamiltonian Monte Carlo and underdamped Langevin 2026-07-16T17:07:42Z

Unadjusted samplers such as unadjusted Hamiltonian Monte Carlo and underdamped Langevin are well known to be biased. Metropolis--Hastings adjustment has been conventionally incorporated into Hamiltonian Monte Carlo to eliminate the bias. However, this adjustment can significantly increase the iteration complexity due to the small step size required for reasonable Metropolis acceptance rates. In this work, we extend the \emph{delocalization of bias} phenomenon, previously established for the overdamped Langevin algorithm, to these two unadjusted algorithms. We show that to control the $W_2$ bias of any $K$-dimensional marginal of a high-dimensional distribution, $O(\sqrt{K})$ integration steps suffice up to $\log d$ terms, assuming either weak or sparse interactions among variables. The discrete-time integrators here introduce technical difficulties beyond those of the overdamped setting, which we address through a broadly applicable matrix-polynomial framework that characterizes their propagators. Our result for the underdamped Langevin algorithm is valid for all large friction parameters, implying that the Leimkuhler-Matthews integrator for the overdamped Langevin dynamics also exhibits delocalization of bias.
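To make the opening claim about bias concrete, here is a minimal sketch on a textbook case that is not taken from the paper: unadjusted HMC with a leapfrog integrator on a one-dimensional standard Gaussian target. In this special case the leapfrog step exactly conserves a modified energy, so the unadjusted chain has stationary variance 1/(1 - h^2/4) instead of 1. The function names, step size, and number of leapfrog steps are illustrative assumptions.

```python
import numpy as np

def leapfrog(x, v, h):
    """One leapfrog step for the potential U(x) = x**2 / 2 (standard Gaussian target)."""
    v_half = v - 0.5 * h * x           # half kick, using grad U(x) = x
    x_new = x + h * v_half             # full drift
    v_new = v_half - 0.5 * h * x_new   # second half kick
    return x_new, v_new

def modified_energy(x, v, h):
    """Quadratic form conserved exactly by leapfrog on this Gaussian target."""
    return 0.5 * (1.0 - h**2 / 4.0) * x**2 + 0.5 * v**2

def unadjusted_hmc(n_iters, h, n_leapfrog, rng):
    """Unadjusted HMC: refresh the velocity, integrate, never accept/reject."""
    x = 0.0
    samples = np.empty(n_iters)
    for i in range(n_iters):
        v = rng.standard_normal()
        for _ in range(n_leapfrog):
            x, v = leapfrog(x, v, h)
        samples[i] = x
    return samples

h = 0.5  # illustrative step size (assumption)
# Stationary variance of the unadjusted chain; the target variance is 1.
biased_variance = 1.0 / (1.0 - h**2 / 4.0)

# Check conservation of the modified energy over one step.
x0, v0 = 0.7, -1.3
x1, v1 = leapfrog(x0, v0, h)
energy_drift = abs(modified_energy(x1, v1, h) - modified_energy(x0, v0, h))

samples = unadjusted_hmc(n_iters=20000, h=h, n_leapfrog=3,
                         rng=np.random.default_rng(0))
print(biased_variance, samples.var())
```

The empirical variance should settle near `biased_variance` rather than 1, which is the kind of step-size-dependent bias that Metropolis adjustment removes and that the abstract analyzes for low-dimensional marginals.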

2026-07-16T17:07:42Z Yifan Chen Xiaoou Cheng Jonathan Niles-Weed Jonathan Weare

http://arxiv.org/abs/2607.15018v1 cGAP: Generalized Association Plots with HOMALS-Guided Heatmaps for Visualization of High-Dimensional Categorical Data 2026-07-16T14:05:37Z

High-dimensional categorical data arise in genetics, biomedicine, and the social sciences, yet visualization tools for such data remain far less developed than those for continuous variables. Existing methods either scale poorly, rely heavily on low-dimensional displays detached from the original data matrix, or prioritize predictive accuracy over interpretability. To address this gap, we introduce categorical Generalized Association Plots (cGAP), a visualization framework for nominal, ordinal, and binary data that preserves the original data matrix while augmenting it with interpretable geometric structure. cGAP uses Homogeneity Analysis (HOMALS) to embed subjects and category levels in a three-dimensional Euclidean space and maps the embedding to red-green-blue coordinates so that similar patterns receive similar colors. The framework integrates three coordinated views: a HOMALS-guided heatmap of the raw data matrix, a subject proximity matrix, and a variable proximity matrix. Seriation algorithms are then used to reorder rows and columns to reveal coherent clusters, outliers, and local-to-global structure. We also derive barycentric traceability, projection-distortion, and contrast-preservation properties that clarify how embedding geometry is transferred to the display. We demonstrate the versatility of cGAP through applications to student-animal classification data, mammalian dentition profiles, mushroom records from the UCI Machine Learning Repository, and the Clusters of Orthologous Genes database. These examples show that cGAP supports transparent exploratory analysis by maintaining traceability between derived visual structure and the original categorical observations. cGAP provides a full-matrix, heatmap-based visualization environment for investigating complex categorical datasets across scientific domains.

2026-07-16T14:05:37Z 23 pages, 9 figures, 3 tables Chun-houh Chen Shun-Chuan Chang Chiun-How Kao Yi-Ju Lee Shang-Ying Shiu Yin-Jing Tien ShengLi Tzeng Han-Ming Wu

http://arxiv.org/abs/2602.14616v2 Higher-Order Hit-&-Run Samplers for Linearly Constrained Densities 2026-07-16T12:20:51Z

Markov chain Monte Carlo (MCMC) sampling of densities restricted to linearly constrained domains is an important task arising in Bayesian treatment of inverse problems in the natural sciences. While efficient algorithms for uniform polytope sampling exist, much less work has dealt with more complex constrained densities. In particular, gradient information as used in unconstrained MCMC is not necessarily helpful in the constrained case, where the gradient may push the proposal's density out of the polytope. In this work, we propose a novel constrained sampling algorithm, which combines strengths of higher-order information, like the target's log-density's gradients and curvature, with the Hit-&-Run proposal, a simple mechanism which guarantees the generation of feasible proposals, fulfilling the linear constraints. Our extensive experiments demonstrate improved sampling efficiency on complex constrained densities over various constrained and unconstrained samplers.

2026-02-16T10:20:52Z Accepted at UAI'26 Richard D. Paul Anton Stratmann Johann F. Jadebeck Martin Beyß Hanno Scharr David Rügamer Katharina Nöh

http://arxiv.org/abs/2604.12334v2 On additive averaging kernels for finite Markov chains 2026-07-16T07:40:36Z

We study additive mixtures of Markov kernels of the form $A_α= αP + (1-α)G$, where $α\in [0,1]$, $P$ is a baseline sampler and $G$ is a Gibbs kernel induced by a partition of the state space. We first motivate the study of $A_α$, which can be interpreted as the projection of a lifted Markov chain. We then consider the minimisation of distance to stationarity under two objectives: the squared Frobenius norm and the Kullback-Leibler (KL) divergence. For the Frobenius objective, we derive explicit trace formulae and identify a Cheeger-type functional that characterises optimal two-block partitions. This yields a structured combinatorial optimisation problem admitting a difference-of-submodular decomposition, enabling efficient approximation via majorisation-minimisation. We also obtain geometric decay rates governed by the absolute spectral gap of $P$. For the KL divergence, we establish convexity-based bounds showing that the divergence of $A_α$ is controlled by those of both $P$ and $G$, thereby reducing partition selection to the Gibbs component. Numerical experiments on the Curie-Weiss model demonstrate that suitable choice of both the partition and the parameter $α$ can significantly accelerate convergence in total variation distance. We observe a consistent trade-off between local exploration and global averaging, with intermediate values of $α$ achieving the best performance across regimes.
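The mixture $A_α = αP + (1-α)G$ can be sketched numerically on a small state space. The sketch below makes two assumptions that are not stated in the abstract: the baseline sampler $P$ is a Metropolis random walk on a four-state path, and the Gibbs kernel $G$ resamples the state within its current block in proportion to the stationary distribution. The names `gibbs_kernel` and `additive_mixture` are hypothetical.

```python
import numpy as np

def gibbs_kernel(pi, blocks):
    """Assumed definition: from any state, redraw within its block
    according to pi restricted to that block."""
    n = len(pi)
    G = np.zeros((n, n))
    for block in blocks:
        idx = np.array(block)
        # Every row of the block's submatrix is the normalized restriction of pi.
        G[np.ix_(idx, idx)] = pi[idx] / pi[idx].sum()
    return G

def additive_mixture(P, G, alpha):
    """A_alpha = alpha * P + (1 - alpha) * G."""
    return alpha * P + (1.0 - alpha) * G

# Illustrative target and baseline: Metropolis random walk on the path 0-1-2-3.
pi = np.array([0.1, 0.2, 0.3, 0.4])
n = len(pi)
P = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i + 1):
        if 0 <= j < n:
            P[i, j] = 0.5 * min(1.0, pi[j] / pi[i])
    P[i, i] = 1.0 - P[i].sum()  # rejected or out-of-range proposals stay put

G = gibbs_kernel(pi, blocks=[[0, 1], [2, 3]])
A = additive_mixture(P, G, alpha=0.5)

print(A.sum(axis=1))  # row sums of the mixture
print(pi @ A)         # pi is left invariant by A
```

Because both $P$ and $G$ leave `pi` invariant under these assumptions, so does every convex combination, which is why the choice of partition and of $α$ affects only the speed of convergence and not the limit.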

2026-04-14T06:16:57Z 32 pages, 5 figures Ryan J. Y. Lim Michael C. H. Choi

http://arxiv.org/abs/2607.14511v1 Custom-made Gauss quadrature: an introduction for statisticians 2026-07-16T03:02:21Z

An $n$-point Gauss quadrature rule approximates the weighted integral of a function by a weighted average of $n$ evaluations of this function and is exact for polynomials of degree at most $2n-1$. Such rules can be highly accurate with relatively few evaluations. For weight functions that are associated with classical orthogonal polynomials of a continuous variable (such as Legendre, Hermite and Laguerre), these rules are readily available. We suppose that this is not the case, so that these rules must be custom-made. The two most easily understood methods for the computation of these rules are (a) moment determinants and (b) the Stieltjes procedure. We implement them in the Julia package CustomGaussQuadrature, which uses type-generic numerical programming and adaptive high-precision arithmetic to assess the approximation error due to roundoff. We describe access from R via JuliaConnectoR.
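As a rough illustration of building a rule from moments, the sketch below solves a Hankel linear system for the monic orthogonal polynomial, takes its roots as nodes, and recovers the weights from the first $n$ moment equations. This is a plain double-precision relative of method (a), written in Python with NumPy; it is not the CustomGaussQuadrature implementation or its API, and the function name `gauss_rule_from_moments` is hypothetical.

```python
import numpy as np

def gauss_rule_from_moments(moments, n):
    """n-point Gauss rule from the moments m_0, ..., m_{2n-1} of the weight function.

    The monic polynomial p(x) = x^n + c_{n-1} x^{n-1} + ... + c_0 must be
    orthogonal to 1, x, ..., x^{n-1}, i.e. sum_j m_{i+j} c_j = -m_{i+n}.
    """
    m = np.asarray(moments, dtype=float)
    hankel = np.array([[m[i + j] for j in range(n)] for i in range(n)])
    c = np.linalg.solve(hankel, -m[n:2 * n])
    # np.roots expects coefficients ordered from the highest degree down.
    nodes = np.sort(np.roots(np.concatenate(([1.0], c[::-1]))).real)
    # Weights make the rule exact for 1, x, ..., x^{n-1}.
    vandermonde = np.vander(nodes, n, increasing=True).T
    weights = np.linalg.solve(vandermonde, m[:n])
    return nodes, weights

# Check against a classical case: weight 1 on [-1, 1] (Gauss-Legendre), n = 3.
n = 3
legendre_moments = [2.0 / (k + 1) if k % 2 == 0 else 0.0 for k in range(2 * n)]
nodes, weights = gauss_rule_from_moments(legendre_moments, n)
ref_nodes, ref_weights = np.polynomial.legendre.leggauss(n)
print(nodes, weights)
```

The Hankel system becomes badly conditioned as $n$ grows, so this float64 version is only trustworthy for small $n$; that sensitivity to roundoff is the concern the abstract addresses with adaptive high-precision arithmetic.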

2026-07-16T03:02:21Z Paul Kabaila

http://arxiv.org/abs/2505.05961v3 GEORCE: A Fast New Control Algorithm for Computing Geodesics 2026-07-15T20:34:56Z

Computing geodesics for Riemannian manifolds is a difficult task that often relies on numerical approximations. However, these approximations tend to be numerically unstable, converge slowly, or scale poorly with manifold dimension and number of grid points. We introduce a new algorithm called GEORCE that computes geodesics in a local chart via a transformation into a discrete control problem. We show that GEORCE has global convergence and quadratic local convergence. In addition, we show that it extends to Finsler manifolds. For both Finslerian and Riemannian manifolds, we thoroughly benchmark GEORCE against several alternative optimization algorithms and show empirically that it is much faster and more accurate for a variety of manifolds, including key manifolds from information theory and manifolds that are learned using generative models.

2025-05-09T11:21:51Z This updated version corrects an error in the proof of local quadratic convergence and establishes that GEORCE exhibits asymptotic local quadratic convergence with respect to the number of grid points Frederik Möbius Rygaard Søren Hauberg

http://arxiv.org/abs/2601.07944v2 Neural Architectures for Amortized Bayesian Inference: Statistical Foundations and Empirical Assessments 2026-07-15T18:55:12Z

Since the turn of the century, approximate Bayesian inference has steadily evolved as new computational techniques have been incorporated to handle increasingly complex, large-scale predictive problems. The recent success of deep neural networks and foundation models has now given rise to a new paradigm in statistical modeling, in which Bayesian inference can be amortized through large-scale learned predictors. In amortized inference, substantial computation is required at the beginning to train a neural network, but it can subsequently produce approximate posteriors or predictions at much lower computational cost across a wide range of tasks. While the typical Bayesian inference procedures are computationally expensive due to repeated likelihood calculations and Monte Carlo steps for each new dataset, amortized inference provides a much lower computational cost at deployment. Despite the growing popularity of amortized inference, its statistical interpretation and position within Bayesian inference remain poorly explored. In this paper, we present a statistical perspective on several major neural architectures, including feedforward networks, Deep Sets, and Transformers, and examine how they naturally support amortized Bayesian inference. We explore how these models perform structured approximation and also probabilistic reasoning in ways that yield controlled generalization error throughout a wide range of deployment scenarios, and how these properties can be harnessed for Bayesian computation. Via simulation studies, we evaluate the accuracy, robustness, and uncertainty quantification of amortized inference across varying sample sizes, varying noise distributional families, varying sparsity levels, and multimodality, highlighting its strengths and limitations.

2026-01-12T19:21:51Z 32 pages, 8 figures, 3 tables Roy Shivam Ram Shreshtth Arnab Hazra Gourab Mukherjee

http://arxiv.org/abs/2607.14274v1 Model Uncertainty under Non-Gaussian Errors: Bayesian Model Averaging and Selection in Stochastic Frontier Models 2026-07-15T18:30:49Z

The paper investigates Bayesian Model Averaging and Selection (BMA/S) under non-standard stochastic assumptions, focusing on stochastic frontier analysis (SFA). We propose fast, reliable procedures for inference in the normal-exponential stochastic frontier model and examine whether accounting for asymmetric disturbances affects model averaging and/or selection outcomes relative to the conventional Gaussian-error BMA/S. Particular attention is given to moderate-dimensional covariate selection problems typical in SFA applications. We demonstrate that, with appropriate search strategies and parallelization techniques, exhaustive model search can be computationally feasible and, in some cases, more practical than stochastic search alternatives. A Monte Carlo simulation study is used to compare the proposed SF-BMA/S procedure with standard Gaussian-error BMA/S under varying levels of inefficiency-to-noise ratio and signal strength with respect to the data generating process. The results show that accounting for stochastic frontier structures may affect posterior inference and model averaging outcomes, especially in scenarios where efficiency analysis is most sensible.

2026-07-15T18:30:49Z 23 pages, 6 tables, 2 figures, 1 appendix (2 tables) Kamil Makieła

http://arxiv.org/abs/2607.10735v2 GNet: A scalable and flexible Gaussian process network with nonparametric neurons 2026-07-15T16:53:27Z

We develop GNet, a scalable and flexible Gaussian process network with nonparametric activation functions modeled by Gaussian processes. To reduce computational and storage costs, we introduce the jointly inverse Kalman filter, a fast algorithm together with closed-form expressions of gradients for accelerating model training and predictions without the need to form covariance matrices. Using a unified optimization setting, GNet shows competitive performance across a diverse range of test problems, including predicting nonlinear functions, nonparametric regression of real-world data, and predicting one-body direct correlation functions with high-dimensional inputs in classical density functional theory. The strong performance of GNet, accelerated by the jointly inverse Kalman filter, suggests broad applicability to large-scale predictive modeling with substantially reduced computational and storage costs.

2026-07-12T12:40:55Z Mengyang Gu

http://arxiv.org/abs/2606.10593v2 Data compression for fast dimension reduction and clustering of high-dimensional discrete data 2026-07-15T16:08:14Z

High-dimensional discrete data are common in genomics, microbiomics, survey research, and digital behavioral analysis. Clustering such data is challenging because many existing methods are computationally expensive, sensitive to sparsity and discreteness, or designed for specific data types. We introduce a deterministic dimension-reduction framework for clustering high-dimensional discrete observations. The approach compresses observations into a low-dimensional continuous representation using weighted sums derived from a scaled positional encoding, yielding a numerically stable transformation applicable to both binary and count data. Several theoretical properties are established. The compression mapping is injective, ensuring that distinct observations remain distinguishable after transformation. Under mild regularity conditions, the compressed variables are approximately Gaussian, supporting the use of model-based clustering in the reduced space. We further show that separation between cluster centroids is preserved, indicating that location-based cluster structure remains identifiable following dimension reduction. Simulation studies demonstrate accurate cluster recovery across diverse settings, while achieving substantial computational savings compared with commonly used dimension-reduction techniques. Applications to microbiome data and United Nations roll call voting data highlight the method's practical utility. Overall, the framework offers a scalable, efficient, and broadly applicable solution for clustering high-dimensional discrete data.

2026-06-09T08:58:42Z Silvia D'Angelo Michael Fop

http://arxiv.org/abs/2607.13828v1 Online Random Sampling with Real Probabilities 2026-07-15T13:40:27Z

We develop an efficient online algorithm to sample a sequence of discrete random variables using an entropy source of i.i.d. fair coin flips, in a standard model of real computation where real-valued probabilities are represented by rational approximations. For any sequence $F_1, F_2, \dots$ of probability distributions, our sampler generates $n$ outputs $X_1 \sim F_1, \dots, X_n \sim F_n$ using at most $\mathbb{E}\left[H(F_1) +\dots + H(F_n)\right] + O(\log n)$ coin flips in expectation while carrying $O(\log n)$ bits of persistent space, where $H$ is the Shannon entropy. Under standard assumptions, we prove that the space used by our sampler to achieve this information-theoretically optimal entropy rate is asymptotically optimal. The key idea is to replace the global arithmetic-decoding sampling scheme of Han and Hoshi (1997) with a local discrete uniform state, yielding an exponential reduction in space for a given entropy loss. Our approach applies to distributions with irrational probabilities and countably infinite supports, generalizing recent randomness-recycling methods beyond finite rational distributions with bounded denominator.

2026-07-15T13:40:27Z Thomas L. Draper David G. Harris Feras A. Saad

http://arxiv.org/abs/2509.14028v2 Formalising Sample Size Calculations for the Development of Risk Prediction Models: The Importance of Accounting for Performance Variability 2026-07-15T11:00:12Z

Sample size calculations for developing prediction models typically aim to ensure that the expected value of a performance measure meets a prespecified target. For example, a key measure is the calibration slope (CS) which quantifies model overfitting; the sample size is often chosen so that the expected CS equals 0.9, close to the ideal value of 1. However, because of sampling variability, model performance can vary substantially across development samples of the recommended size. When variability is high, the probability of obtaining a model with performance close to the target may be unacceptably low. We propose a framework which enables sample size calculations that incorporate both the expected value and the variability for a given performance measure. The framework is illustrated for binary outcomes and logistic regression but applies to other outcomes and model types. To explicitly account for variability, we introduce the probability of acceptable performance (PrAP). For example, a model may be considered acceptable if the CS lies within a prespecified range (e.g., 0.85 to 1.15) and the sample size might be chosen to ensure that PrAP exceeds some target (e.g., 80%). Under existing approaches PrAP can be low, especially when the specified number of predictors is small, which can also translate into large variability for individual predicted probabilities. The use of shrinkage tends to improve PrAP. Our findings highlight the importance of accounting for performance variability to ensure robust model development.

2025-09-17T14:30:17Z Menelaos Pavlou Rumana Z. Omar Gareth Ambler

http://arxiv.org/abs/2603.17466v5 A Full-Density Approach to Simulating Random Iteration Equations with Applications 2026-07-15T09:01:09Z

The goal of this study is to introduce a unified framework for simulating random iteration equations (RIE), understood as iteration equations containing random variables. The main idea is to propagate approximations of the full state density from one iteration to the next, rather than estimating it from many repeated pathwise Monte Carlo simulations. The presentation of the RIE modeling framework is conceptually simple, builds on recent work on static random equations, and is designed to be accessible. The modeling requirements for RIEs allow for potentially nonsmooth nonlinearities and stochasticity in the transfer function. Additionally, the RIE computational strategy for full-density propagation is presented, based on iterative likelihood / posterior calculations. To demonstrate the breadth of the framework's utility, we present illustrative applications: simulations of nonlinear random and stochastic differential equations, a new full-density gradient descent method (FDGD) for global optimization under uncertainty, and examples of chaotic mappings. Overall, the presentation is exploratory and is intended to encourage new applications and theoretical studies.

2026-03-18T08:19:41Z Wolfgang Hoegele

http://arxiv.org/abs/2509.25753v3 Quasi-Monte Carlo methods for uncertainty quantification of tumor growth modeled by a parametric semi-linear parabolic reaction-diffusion equation 2026-07-14T21:12:41Z

We study the application of a quasi-Monte Carlo (QMC) method to a class of semi-linear parabolic reaction-diffusion partial differential equations used to model tumor growth. Mathematical models of tumor growth are largely phenomenological in nature, capturing infiltration of the tumor into surrounding healthy tissue, proliferation of the existing tumor, and patient response to therapies, such as chemotherapy and radiotherapy. Considerable inter-patient variability, inherent heterogeneity of the disease, sparse and noisy data collection, and model inadequacy all contribute to significant uncertainty in the model parameters. It is crucial that these uncertainties can be efficiently propagated through the model to compute quantities of interest (QoIs), which in turn may be used to inform clinical decisions. We show that QMC methods can be successful in computing expectations of meaningful QoIs. Well-posedness results are developed for the model and used to show a theoretical error bound for the case of uniform random fields. The theoretical linear error rate, which is superior to that of standard Monte Carlo, is verified numerically. Encouraging computational results are also provided for lognormal random fields, prompting further theoretical development.

2025-09-30T04:18:44Z Alexander D. Gilbert Frances Y. Kuo Dirk Nuyens Graham Pash Ian H. Sloan Karen E. Willcox

http://arxiv.org/abs/2607.12902v1 Accelerated Mixing Time of Randomized Hamiltonian Monte Carlo 2026-07-14T15:38:04Z

We show the Randomized Hamiltonian Monte Carlo (RHMC) algorithm has accelerated mixing time guarantees for sampling from log-concave probability distributions. RHMC proceeds by repeatedly simulating the continuous-time Hamiltonian dynamics for some random integration times, and resetting the velocity to be an independent Gaussian random variable between each simulation. We show that when the target distribution is log-concave and satisfies an $α$-Talagrand inequality (for example, if the target distribution is $α$-strongly log-concave), if we use a random integration time from either the triangular or the exponential distribution with mean $Θ(α^{-1/2})$, then RHMC converges exponentially fast in KL divergence, and the total integration time to reach error $\varepsilon$ in KL divergence scales as $O(α^{-1/2} \log(\varepsilon^{-1}))$. We also show that when the target distribution is log-concave, if we use a sequence of random integration times from the triangular distribution with exponentially increasing means, then the total integration time to reach error $\varepsilon$ in KL divergence scales as $O(\varepsilon^{-1/2})$. Our analysis relies on a bound on the average KL divergence along Hamiltonian dynamics, which is inspired by an analogous result on accelerated optimization methods based on Hamiltonian dynamics.
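The RHMC loop described above (simulate the exact Hamiltonian dynamics for a random time, then redraw the velocity) can be sketched on a target where the flow has a closed form. The sketch assumes a one-dimensional Gaussian target with potential $f(x) = αx^2/2$, which is $α$-strongly log-concave, and exponential integration times with mean $α^{-1/2}$ as one choice consistent with the $Θ(α^{-1/2})$ scaling. The function names and the value of $α$ are illustrative and are not taken from the paper.

```python
import numpy as np

def exact_flow(x, v, t, alpha):
    """Hamiltonian flow of H(x, v) = alpha * x**2 / 2 + v**2 / 2 for time t."""
    w = np.sqrt(alpha)
    c, s = np.cos(w * t), np.sin(w * t)
    return c * x + (s / w) * v, -w * s * x + c * v

def rhmc(n_iters, alpha, rng):
    """Randomized HMC with exponentially distributed integration times."""
    mean_time = alpha ** -0.5  # assumed constant in the Theta(alpha^{-1/2}) scaling
    x = 0.0
    samples = np.empty(n_iters)
    for i in range(n_iters):
        v = rng.standard_normal()        # reset velocity to a fresh Gaussian
        t = rng.exponential(mean_time)   # random integration time
        x, _ = exact_flow(x, v, t, alpha)
        samples[i] = x
    return samples

alpha = 4.0  # illustrative strong log-concavity parameter (assumption)
samples = rhmc(20000, alpha, np.random.default_rng(0))
print(samples.mean(), samples.var(), 1.0 / alpha)
```

Because the flow is exact there is no discretization bias, so the empirical variance should approach the target variance $1/α$; for general targets the flow must be approximated numerically, which this sketch does not cover.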

2026-07-14T15:38:04Z 74 pages Siddharth Mitra Vishwak Srinivasan Xiuyuan Wang Andre Wibisono