http://arxiv.org/abs/2601.20386v2 SCORE: A Unified Framework for Overshoot Refund in Online FDR Control 2026-05-29T15:32:17Z

We propose a unified framework to enhance the power of online multiple hypothesis testing procedures based on $e$-values. While $e$-value-based methods offer robust online False Discovery Rate (FDR) control under minimal assumptions, they often suffer from power loss by discarding evidence that exceeds the rejection threshold. We address this inefficiency via the Sequential Control with Overshoot Refund for E-values (SCORE) framework, which leverages the inequality $\mathbb{I}(y \ge 1) \le y - (y-1)_+$, valid for all $y\ge 0$, to reclaim this otherwise wasted evidence. This simple yet powerful insight yields a unified principle for improving a broad class of online testing algorithms. Building on this framework, we develop SCORE-enhanced versions of several state-of-the-art procedures, including SCORE-LOND, SCORE-LORD, and SCORE-SAFFRON, all of which strictly dominate their original counterparts while preserving valid finite-sample FDR control. Furthermore, under mild assumptions, SCORE permits retroactive updates of alpha-wealth by using the latest decision twice: first to determine its reward or loss, and then to refresh past wealth. Such a mechanism enables more aggressive testing strategies while maintaining valid FDR control, thereby further improving statistical power. The effectiveness of the proposed methods is validated through extensive simulation and real-data experiments.
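The inequality underlying SCORE can be checked numerically. The following is a minimal sketch, not the authors' code; the function names are hypothetical. Since $y - (y-1)_+ = \min(y, 1)$, the overshoot $(y-1)_+$ is exactly the slack in the cruder bound $\mathbb{I}(y \ge 1) \le y$ whenever $y > 1$.

```python
def indicator(y: float) -> float:
    """I(y >= 1): 1 when the e-value-type quantity crosses the threshold."""
    return 1.0 if y >= 1.0 else 0.0


def overshoot(y: float) -> float:
    """(y - 1)_+ : the evidence in excess of the rejection threshold."""
    return max(y - 1.0, 0.0)


def refunded_bound(y: float) -> float:
    """y - (y - 1)_+ , which equals min(y, 1)."""
    return y - overshoot(y)


# Check I(y >= 1) <= y - (y - 1)_+ <= y on a grid of nonnegative values.
# A tiny tolerance guards against floating-point rounding.
grid = [i / 100 for i in range(0, 501)]
assert all(
    indicator(y) <= refunded_bound(y) + 1e-12 and refunded_bound(y) <= y + 1e-12
    for y in grid
)
```

How the refunded amount is fed back into the alpha-wealth of LOND, LORD, or SAFFRON is specific to each procedure and is not reproduced here.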

2026-01-28T08:52:02Z Qi Kuang Bowen Gang Yin Xia http://arxiv.org/abs/2405.07836v5 Forecasting with Hyper-Trees 2026-05-29T15:23:42Z

We introduce Hyper-Trees as a novel framework for modeling time series data using gradient boosted trees. Unlike conventional tree-based approaches that forecast time series directly, Hyper-Trees learn the parameters of a target time series model, such as ARIMA or Exponential Smoothing, as functions of features. These parameters are then used by the target model to generate the final forecasts. Our framework combines the effectiveness of decision trees on tabular data with classical forecasting models, thereby inducing a time series inductive bias into tree-based models. To resolve the scaling limitations of boosted trees when estimating a high-dimensional set of target model parameters, we combine decision trees and neural networks within a unified framework. In this hybrid approach, the trees generate informative representations from the input features, which a shallow network then uses as input to learn the parameters of a time series model. With our research, we explore the effectiveness of Hyper-Trees across a range of forecasting tasks and extend tree-based modeling beyond its conventional use in time series analysis.

2024-05-13T15:22:15Z Gradient Boosted Trees, Hyper Models, Hybrid Models, Time Series Forecasting, Time-Varying Parameters Alexander März Kashif Rasul http://arxiv.org/abs/2605.29603v2 Learning study similarity to investigate heterogeneity in meta-analysis using LLMs and triplet loss 2026-05-29T15:05:17Z

Meta-analyses of observational studies often show substantial between-study heterogeneity, limiting the interpretability of pooled estimates. Meta-regression can be used to explore heterogeneity, but it is often underpowered to handle multiple effect modifiers. We propose a novel framework that integrates large language models (LLMs) with deep metric learning to infer study-level similarity prior to meta-analysis. Study-level clinical and methodological characteristics were processed by an LLM to generate study triplets (anchor, similar, dissimilar). These triplets were constructed by treating each study as an anchor and comparing it with pairs of other studies to identify, in each instance, the study most similar to the anchor. The triplets were then fed into an embedding model trained with triplet loss, a deep learning approach that learns an embedding space where clinically and methodologically similar studies are clustered together. We apply our framework to a meta-analysis dataset of 58 observational studies comparing cognitive outcomes between preterm- and term-born children. Subsequently, we fit meta-analysis models within the identified study clusters and compare the results with those of the overall analysis. Results suggested three clusters, two of which retained considerable between-study heterogeneity. The remaining cluster comprised the most homogeneous group of studies and exhibited a more extreme pooled effect estimate together with a narrower prediction interval compared with the overall analysis. This work presents a novel approach for exploring heterogeneity in meta-analysis by incorporating study characteristics prior to model fitting. By transforming study information into a similarity space, the framework identifies coherent subgroups and supports more precise inference in heterogeneous real-world evidence.
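For readers unfamiliar with triplet loss, a common form is $\max(0,\, d(a, p) - d(a, n) + m)$, where $a$, $p$, $n$ are the embeddings of the anchor, similar, and dissimilar study and $m$ is a margin. The sketch below is only an illustration of that formula: the Euclidean distance, the margin value, and the function names are assumptions, and the abstract does not state which variant the authors use.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: zero once the dissimilar study is at least
    `margin` farther from the anchor than the similar study."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Well-separated triplet: the similar study is close, the dissimilar one far.
good = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# Violating triplet: the roles of the two comparison studies are swapped.
bad = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Training an embedding model amounts to minimising the average of this loss over all LLM-generated triplets, which pushes similar studies together and dissimilar ones apart.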

2026-05-28T08:40:47Z 17 pages, 4 figures Kanella Panagiotopoulou Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center- University of Freiburg, Freiburg im Breisgau, Germany Harald Binder Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center- University of Freiburg, Freiburg im Breisgau, Germany Theodoros Evrenoglou Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center- University of Freiburg, Freiburg im Breisgau, Germany http://arxiv.org/abs/2605.31394v1 A Dynamic Latent Space Model for Healthcare Mobility Networks: the Italian National Health Service case 2026-05-29T14:58:50Z

Healthcare mobility -- patients seeking treatment outside their territory of residence -- represents a major source of inequality and financial imbalance in decentralised health systems. In Italy, persistent north-south asymmetries in patient flows among Local Health Authorities (ASLs) have reinforced existing disparities within the National Health Service; yet the structural organisation and temporal dynamics of these flows remain poorly understood at the sub-regional level. We propose a Bayesian dynamic latent space model for directed weighted networks with a hurdle negative binomial likelihood, and apply it to administrative discharge records on mobility for hip replacement procedures among 109 Italian ASLs over 2018-2024. The model jointly addresses excess zeros, overdispersion and network dependence, while capturing directional heterogeneity through multiplicative sender and receiver effects and controlling for differences in territorial size via an appropriate exposure term. Applied to Italian mobility data, the model reveals the evolving geometry of the healthcare system, quantifies the disruption induced by the COVID-19 pandemic, and uncovers structural asymmetries in outward propensity and ASL attractiveness. The framework provides a flexible tool for the statistical analysis of dynamic healthcare mobility networks with direct relevance to the monitoring and evaluation of territorial healthcare provision.
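To make the hurdle negative binomial likelihood concrete, the sketch below evaluates its log-pmf for a single edge count. It is an illustration under stated assumptions, not the authors' model: the mean/size parameterisation and the names `p_zero`, `mu`, and `r` are hypothetical, and the latent positions, sender and receiver effects, and exposure term that would drive these parameters are omitted.

```python
import math


def nb_log_pmf(y, mu, r):
    """Log-pmf of a negative binomial with mean mu and size (dispersion) r."""
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def hurdle_nb_log_pmf(y, p_zero, mu, r):
    """Hurdle model: a zero occurs with probability p_zero; positive counts
    follow a negative binomial truncated at zero."""
    if y == 0:
        return math.log(p_zero)
    # log(1 - NB(0)): normalising constant of the zero-truncated part
    log_trunc = math.log1p(-math.exp(nb_log_pmf(0, mu, r)))
    return math.log1p(-p_zero) + nb_log_pmf(y, mu, r) - log_trunc


# Sanity check: the probabilities sum to one (tail beyond 2000 is negligible here).
total = sum(math.exp(hurdle_nb_log_pmf(y, 0.6, 3.0, 1.5)) for y in range(2001))
```

Separating the zero probability from the positive counts is what lets such a model absorb excess zeros, while the size parameter `r` handles overdispersion.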

2026-05-29T14:58:50Z Cecilia Manente Marco Alfò Silvia D'Angelo http://arxiv.org/abs/2606.00181v1 Infinite-Dimensional Spherical Kernel ridge Regression 2026-05-29T14:46:07Z

We introduce a novel regression framework designed to model non-linear responses situated on a sphere $\mathbb{S}$ of finite or infinite dimension. Unlike traditional tangent-space regressions, which lift responses to a tangent space $T_o \mathbb{S}$ and thereby violate intrinsic spherical distances, our proposed method employs an intrinsic approach. We model the conditional mean through an intercept $o \in \mathbb{S}$ and a linear predictor function $f: \mathfrak{X} \to T_o \mathbb{S}$. This formulation transforms the estimation problem into finding a linear predictor within a function space, but utilizing a metric defined by spherical geometry rather than standard Euclidean distance. Leveraging vector-valued reproducing kernel Hilbert space theory, our approach reduces the infinite-dimensional estimation challenge to a manageable finite-dimensional problem via the representer theorem, leading to an efficient BFGS-based estimation algorithm. We establish convergence rates and analyze the finite-sample behavior of our estimator, concluding with a practical application to density regression. The full implementation is available in R.

2026-05-29T14:46:07Z Beatrice Matteo Almond Stoecker Shahin Tavakoli http://arxiv.org/abs/2605.31345v1 Log-Ratio Propagation on the Simplex: A Theory of Cellwise Contamination for Compositional Data 2026-05-29T14:22:23Z

Compositional data must be analysed through log-ratios: scale invariance, the defining axiom of the field, leaves no alternative. The centred log-ratio divides by the geometric mean of every part, so a single contaminated component shifts every centred-log-ratio coordinate at once, displacing the log-ratio vector by a fixed amount that no choice of coordinates can reduce. We develop a theory of cellwise contamination on the simplex around this observation. A scale-invariant contamination model built from multiplicative perturbation combines with a propagation theorem showing that corruption of a single raw part induces a rank-one shift of the log-ratio vector, with direction determined by the contrast matrix. The resulting perturbation pattern is not equivalent to any independent cellwise contamination model in log-ratio coordinates -- so standard Euclidean cellwise methods applied to log-ratios are ill-posed under the simplex contamination mechanism. For estimators whose Euclidean cellwise breakdown is witnessed by a column-concentrated configuration -- a class including MCD, $S$-, $\tau$-, and coordinate-wise $M$-estimators of location and scatter -- the cellwise breakdown value on the simplex is reduced by the factor $(D-1)/D$ relative to its Euclidean counterpart, a reduction that is tight and arises purely from the normalisation mismatch between $nD$ raw cells and $n(D-1)$ ilr cells. The cellwise influence function for the variation matrix carries a diagnostic fingerprint: contamination of a single part inflates exactly one row and column, identifying the responsible component. These results form the theoretical foundation for cellwise-robust methods on the simplex; a companion paper develops a cellwise-robust PCA estimator that exploits the propagation geometry and demonstrates it on simulated and geochemical data.
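The propagation effect described above can be seen in a few lines. In the sketch below (the function names and the example composition are illustrative assumptions), multiplying part $j$ of a $D$-part composition by a factor $c$ shifts the centred log-ratio vector by $\log(c)\,(e_j - \tfrac{1}{D}\mathbf{1})$, so every coordinate moves even though only one raw cell was corrupted. Because the clr transform is scale invariant, re-closing the contaminated composition to sum to one would not change the result.

```python
import math


def clr(x):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]


def contaminate(x, j, factor):
    """Multiply part j by `factor`, leaving all other raw parts untouched."""
    y = list(x)
    y[j] *= factor
    return y


x = [0.5, 0.3, 0.15, 0.05]   # hypothetical 4-part composition
j, factor = 2, 10.0          # corrupt the third part by a factor of 10
D = len(x)

# Observed shift of the clr vector caused by the single contaminated cell
shift = [a - b for a, b in zip(clr(contaminate(x, j, factor)), clr(x))]
# Closed-form shift: log(factor) * (e_j - 1/D)
expected = [math.log(factor) * ((1.0 if k == j else 0.0) - 1.0 / D) for k in range(D)]
```

The shift depends only on `j` and `factor`, not on the composition itself, which is why independent cellwise contamination models in log-ratio coordinates cannot reproduce it.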

2026-05-29T14:22:23Z 50 pages, no figures; 11-page supplement included as an ancillary file. A companion methods paper (cellPcaCoDa: cellwise-robust PCA for compositional data) is forthcoming Matthias Templ http://arxiv.org/abs/2605.31341v1 BEND: An R Package for the Bayesian Estimation of Nonlinear Longitudinal Data 2026-05-29T14:19:06Z

Longitudinal data are useful for capturing and analyzing patterns of change over time. Often, these patterns follow a nonlinear form. One useful and commonly applied nonlinear function is the piecewise function, which assumes growth occurs in distinct phases, each with its own functional form. Past literature has established that Bayesian inference is preferred over likelihood-based methods for estimating piecewise models. To make this approach accessible, we developed the R package BEND - Bayesian Estimation of Nonlinear Data (available on CRAN). The purpose of BEND is to provide user-friendly software for estimating nonlinear longitudinal models using a Bayesian inference approach. Given the flexibility and practicality of piecewise models, BEND includes several extensions of them to accommodate various types of complex longitudinal datasets and applications. Bayes_PREM() can empirically identify the number and location of random changepoints in a piecewise random effects model. This function can also model multiple latent classes with different longitudinal growth patterns and incorporate covariates to predict the outcome and latent class membership. Bayes_BPREM() can jointly model the longitudinal piecewise trajectories of two interrelated outcomes. Lastly, Bayes_CREM() can estimate the impact of group membership on longitudinal growth. This paper provides an overview of the functions included in BEND and empirical examples of how to apply these models in practice.

2026-05-29T14:19:06Z 38 pages, 7 figures Corissa T. Rohloff Rik Lamm Yadira Peralta Nidhi Kohli Eric F. Lock http://arxiv.org/abs/2504.19043v3 MiniMax Learning of Interpretable Factored Stochastic Policies from Conjoint Data, with Uncertainty Quantification 2026-05-29T14:17:38Z

We study offline policy optimization over exponentially large factorial action spaces from randomized preference data, showing how conjoint experiments can estimate interpretable stochastic policies with asymptotically valid uncertainty under regularity conditions. Conjoint analyses typically report Average Marginal Component Effects (AMCEs) by averaging over opponent attributes and thus ignore strategic interdependence. We instead learn stochastic interventions -- product-of-Categorical policies over factor levels -- that (i) optimize expected outcomes in an average-case setting and (ii) extend to a two-player minimax (adversarial) setting that realistically captures simultaneous strategic candidate selection. Methodologically, we derive a closed-form optimizer for a tractable two-way interaction regime with L2 variance regularization, and provide a general gradient-based procedure for richer model classes. Uncertainty from the outcome model propagates asymptotically to both the optimal policy and its value via a Delta method approximation. We further model institutional details (e.g., primaries) inside the minimax objective and introduce a data-driven measure of strategic divergence between parties. On synthetic data, we empirically characterize finite-sample error and coverage as dimensionality and $n$ vary. On a U.S. presidential conjoint, adversarially learned policies produce restricted-equilibrium vote shares that align with historical election ranges in our data, in stark contrast to non-adversarial (averaging) optimizers.

2025-04-26T22:35:58Z ICML 2026 Connor T. Jerzak Priyanshi Chandra Rishi Hazra http://arxiv.org/abs/2407.20819v2 Design and inference for multi-arm clinical trials with informational borrowing: the interacting urns design 2026-05-29T14:03:59Z

This paper deals with a new design methodology for stratified comparative experiments based on a system of interacting urns. The key idea is to model the interaction between urns for borrowing information across strata and to use it in the design phase in order to i) enhance the information exchange at the beginning of the study, when only a few subjects have been enrolled and the stratum-specific information on treatments' efficacy could be scarce, ii) let the information sharing adaptively evolve via an update mechanism based on the observed outcomes, skewing the allocations at each step towards the stratum-specific most promising treatment, and iii) make the contribution of the strata with different treatment efficacy vanish as the stratum information grows. In particular, we introduce the Interacting Urns Design, a new Covariate-Adjusted Response-Adaptive procedure that randomizes the treatment allocations according to the evolution of the urn system. The theoretical properties of this proposal are described and the corresponding asymptotic inference is provided. Moreover, by a functional central limit theorem, we obtain the asymptotic joint distribution of the Wald-type sequential test statistics, which allows the suggested design to be monitored sequentially in clinical practice.

2024-07-30T13:33:56Z Giacomo Aletti Alessandro Baldi Antognini Irene Crimaldi Rosamarie Frieri Andrea Ghiglietti http://arxiv.org/abs/2605.31306v1 Posterior and Likelihood Sensitivity in Bayesian Distributionally Robust Optimization 2026-05-29T13:39:12Z

We introduce the notions of worst-case posterior sensitivity and worst-case likelihood sensitivity. These measure, respectively, the sensitivity of the expected cost to worst-case perturbations of the posterior distribution and to worst-case perturbations of the likelihood of a Bayesian model. Each defines a quantitative measure of robustness. A decision maker concerned about the sensitivity of the out-of-sample expected cost to deviations from her assumptions will want a decision for which both sensitivities are small. We derive posterior and likelihood sensitivities for uncertainty sets defined in terms of deviation measures. Posterior sensitivity vanishes when the posterior variance shrinks to zero, which occurs when parameter uncertainty is eliminated through learning. Parameter learning does not eliminate likelihood sensitivity. A distributionally robust formulation of a Bayesian optimization problem makes a near-Pareto-optimal tradeoff between performance (expected cost) and robustness (posterior and likelihood sensitivity).

2026-05-29T13:39:12Z Jun-ya Gotoh Andrew E. B. Lim Michael Jong Kim http://arxiv.org/abs/2605.31305v1 Consensus-level substitution rates are distinct from the virion-level rate 2026-05-29T13:39:11Z

Estimating viral substitution rates is central to evolutionary epidemiology, and recent interest in within-host evolution has sharpened the question of what such rates measure. I distinguish two classes of evolutionary rate estimand that are rarely separated in phylogenetic analysis: the virion-level substitution rate (VLSR), a molecular quantity counting mutational events along lineages, and consensus-level substitution rates (CLSRs), population-summary quantities counting changes in the consensus sequences. CLSRs are indexed by the consensus-generation rule. The VLSR and CLSRs are both biologically meaningful, but not interchangeable. Because the consensus-generation rule defines a given CLSR, it should be a routine reporting requirement. This reflection should help analysts make more informed methodological choices when working with sets of virus sequences.

2026-05-29T13:39:11Z David J Pascall http://arxiv.org/abs/2604.02969v2 Inversion-Free Natural Gradient Descent on Riemannian Manifolds 2026-05-29T11:39:51Z

The natural gradient method is a central tool for statistical optimisation, but its broader application is hindered by the assumption of a Euclidean parameter space, the repeated estimation of the Fisher information matrix (FIM), and the computational cost of its subsequent inversion. This paper proposes an intrinsic, inversion-free natural gradient method for statistical models whose parameters lie on general Riemannian manifolds. Formulating statistical optimisation in this non-Euclidean setting allows for the natural enforcement of parameter constraints, the elimination of non-identifiable parameters, and the exploitation of geodesic convexity. Our algorithm is based on a moving approximation of the inverse FIM, which is maintained directly on the manifold. This approximation is efficiently updated with new score vectors using low-rank matrix identities. We prove almost-sure convergence rates of $O(\log s / s^{\alpha})$ for the sequence of iterates, and a similar rate for the approximate FIM. A limited-memory variant with sub-quadratic storage complexity is further proposed for large-scale applications. We demonstrate the efficacy of our method on variational Bayes within the Bures-Wasserstein manifold, normalising flows on the Stiefel manifold, and reduced-rank logistic regression.
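As an illustration of how a low-rank identity avoids re-inverting a matrix, the sketch below applies the Sherman-Morrison formula, $(A + g g^\top)^{-1} = A^{-1} - \frac{A^{-1} g\, g^\top A^{-1}}{1 + g^\top A^{-1} g}$, to fold one new score vector $g$ into a stored inverse. This is a plain Euclidean example chosen for illustration; the abstract does not say which identity the authors use, and the manifold transport, step sizes, and weighting of their algorithm are not reproduced. The function names are hypothetical.

```python
def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def sherman_morrison_update(A_inv, g):
    """Given A_inv = A^{-1} for a symmetric matrix A, return (A + g g^T)^{-1}
    in O(n^2) operations, without any explicit matrix inversion."""
    u = mat_vec(A_inv, g)                                # u = A^{-1} g
    denom = 1.0 + sum(gi * ui for gi, ui in zip(g, u))   # 1 + g^T A^{-1} g
    n = len(g)
    # Symmetry of A_inv gives g^T A^{-1} = u^T, so the correction is u u^T / denom.
    return [[A_inv[i][j] - u[i] * u[j] / denom for j in range(n)] for i in range(n)]


# Start from the identity and absorb one score vector.
A_inv = [[1.0, 0.0], [0.0, 1.0]]
updated = sherman_morrison_update(A_inv, [1.0, 2.0])
```

Each update costs a matrix-vector product and an outer product, compared with the cubic cost of inverting the updated matrix from scratch.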

2026-04-03T11:08:59Z 80 pages, 4 figures. Updated empirical examples Dario Draca Takuo Matsubara Minh-Ngoc Tran http://arxiv.org/abs/2605.10225v2 Increasing domain asymptotics for covariate-based nonparametric Bayesian intensity estimation with Gaussian and Besov-Laplace priors 2026-05-29T11:13:05Z

We study the problem of estimating the intensity function of a covariate-driven point process based on observations of the points and covariates over a large window. We consider the nonparametric Bayesian approach, and show that a wide class of Gaussian priors, combined with flexible link functions, achieves minimax-optimal posterior contraction rates in the increasing domain asymptotics and under the assumption that the covariates be ergodic. We also employ Besov-Laplace priors, which are popular in imaging and inverse problems due to their edge-preserving and sparsity-promoting properties. We prove that these yield optimal estimation of spatially inhomogeneous intensities belonging to Besov spaces with low integrability index. These results are based on a general concentration theorem that extends recent findings from the literature. To corroborate the theory, we provide extensive numerical simulations, implementing the considered procedures via suitable posterior sampling schemes. Further, we present two real data analyses motivated by applications in forestry and the environmental sciences.

2026-05-11T09:05:10Z 40 pages (Appendices included), 13 figures, 5 tables, to appear in the Special Issue of Journal of Multivariate Analysis "Statistical Methods for Functional Data Analysis" Patric Dolmeta Matteo Giordano http://arxiv.org/abs/2605.31130v1 Debiased inference for stochastic treatment interventions with survival outcomes 2026-05-29T10:40:26Z

Estimating the causal effect of a time-dependent treatment on time to death is challenging. In this paper, we formulate the problem using the illness-death model and focus on a stochastic intervention that modifies the hazard governing the transition from no treatment to treatment initiation. Such an intervention can only be implemented at the level of the observed data, whereas the causally valid intervention is defined at the level of the true data-generating process. We provide conditions under which the practically feasible intervention corresponds to the desired causal intervention in the specific setting. We first consider an intervention in which treatment is initiated at a fixed time point, which may subsequently be varied across the relevant time span. However, the resulting estimand is not pathwise differentiable, preventing the development of assumption-lean inference. To address this, we instead consider a smoothed intervention that assigns treatment within a time window around the target time point, thereby yielding a parameter amenable to semiparametric analysis. We derive the corresponding efficient influence function and propose a debiased one-step estimator with desirable robustness properties. We investigate its finite-sample performance in a simulation study and apply the method to the classical Stanford Heart Transplant data, as well as to data on treatment delay among couples with unexplained subfertility seeking intrauterine insemination.

2026-05-29T10:40:26Z Torben Martinussen Mark Bech Knudsen Helene Rytgaard http://arxiv.org/abs/2512.03116v3 Assessing Extrapolation of Peaks Over Thresholds with Martingale Testing 2026-05-29T09:20:05Z

We present the winning strategy for the EVA2025 Data Challenge, which aimed to estimate the probability of extreme precipitation events. These events occurred at most once in the dataset, making the challenge fundamentally one of extrapolating extreme values. Given the scarcity of extreme events, we argue that a simple, robust modeling approach is essential. We adopt univariate models instead of multivariate ones and model Peaks Over Thresholds using Extreme Value Theory. Specifically, we fit an exponential distribution to model exceedances of the target variable above a high quantile (after seasonal adjustment). The novelty of our approach lies in using martingale testing to evaluate the extrapolation power of the procedure and to agnostically select the level of the high quantile. While this method has several limitations, we believe that framing extrapolation as a game opens the door to other agnostic approaches in Extreme Value Analysis.
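The exponential Peaks Over Thresholds step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the crude empirical-quantile rule, and the default quantile level are assumptions, and both the seasonal adjustment and the martingale test used to select the quantile level are omitted. With threshold $u$, mean excess $\hat\sigma$, and exceedance fraction $\hat p_u$, the extrapolated tail is $P(X > x) \approx \hat p_u \exp(-(x - u)/\hat\sigma)$ for $x \ge u$.

```python
import math


def fit_exponential_pot(data, quantile=0.95):
    """Fit an exponential distribution to excesses over an empirical quantile.

    Returns the threshold u, the exponential scale (mean excess, which is the
    maximum-likelihood estimate), and the fraction of observations above u.
    Assumes at least one observation lies strictly above the threshold.
    """
    xs = sorted(data)
    u = xs[int(quantile * (len(xs) - 1))]      # crude empirical quantile
    excesses = [x - u for x in xs if x > u]
    scale = sum(excesses) / len(excesses)      # exponential MLE of the scale
    p_exceed = len(excesses) / len(xs)
    return u, scale, p_exceed


def tail_probability(x, u, scale, p_exceed):
    """Extrapolated P(X > x) for x >= u under the fitted exponential tail."""
    return p_exceed * math.exp(-(x - u) / scale)


# Toy data: the values 1, 2, ..., 100 with a threshold at the 90% level.
u, scale, p_exceed = fit_exponential_pot([float(i) for i in range(1, 101)], quantile=0.9)
```

The choice of quantile level trades bias against variance, which is the decision the abstract proposes to make agnostically through martingale testing.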

2025-12-02T10:38:25Z Joseph de Vilmarest LPSM Olivier Wintenberger LPSM