https://arxiv.org/api/OXCuWYTxYm32emfIaOP5m/yC9WQ
Feed updated: 2026-06-11T03:17:34Z

http://arxiv.org/abs/2601.20386v2
SCORE: A Unified Framework for Overshoot Refund in Online FDR Control
Updated: 2026-05-29T15:32:17Z
We propose a unified framework to enhance the power of online multiple hypothesis testing procedures based on $e$-values. While $e$-value-based methods offer robust online False Discovery Rate (FDR) control under minimal assumptions, they often suffer from power loss by discarding evidence that exceeds the rejection threshold. We address this inefficiency via the Sequential Control with Overshoot Refund for E-values (SCORE) framework, which leverages the inequality $\mathbb{I}(y \ge 1) \le y - (y-1)_+$, valid for all $y\ge 0$, to reclaim this otherwise wasted evidence. This simple yet powerful insight yields a unified principle for improving a broad class of online testing algorithms. Building on this framework, we develop SCORE-enhanced versions of several state-of-the-art procedures, including SCORE-LOND, SCORE-LORD, and SCORE-SAFFRON, all of which strictly dominate their original counterparts while preserving valid finite-sample FDR control. Furthermore, under mild assumptions, SCORE permits retroactive updates of alpha-wealth by using the latest decision twice: first to determine its reward or loss, and then to refresh past wealth. Such a mechanism enables more aggressive testing strategies while maintaining valid FDR control, thereby further improving statistical power. The effectiveness of the proposed methods is validated through extensive simulation and real-data experiments.
Published: 2026-01-28T08:52:02Z
Authors: Qi Kuang, Bowen Gang, Yin Xia

http://arxiv.org/abs/2405.07836v5
Forecasting with Hyper-Trees
Updated: 2026-05-29T15:23:42Z
We introduce Hyper-Trees as a novel framework for modeling time series data using gradient boosted trees. 
Unlike conventional tree-based approaches that forecast time series directly, Hyper-Trees learn the parameters of a target time series model, such as ARIMA or Exponential Smoothing, as functions of features. These parameters are then used by the target model to generate the final forecasts. Our framework combines the effectiveness of decision trees on tabular data with classical forecasting models, thereby inducing a time series inductive bias into tree-based models. To resolve the scaling limitations of boosted trees when estimating a high-dimensional set of target model parameters, we combine decision trees and neural networks within a unified framework. In this hybrid approach, the trees generate informative representations from the input features, which a shallow network then uses as input to learn the parameters of a time series model. With our research, we explore the effectiveness of Hyper-Trees across a range of forecasting tasks and extend tree-based modeling beyond its conventional use in time series analysis.
Published: 2024-05-13T15:22:15Z
Comment: Gradient Boosted Trees, Hyper Models, Hybrid Models, Time Series Forecasting, Time-Varying Parameters
Authors: Alexander März, Kashif Rasul

http://arxiv.org/abs/2605.29603v2
Learning study similarity to investigate heterogeneity in meta-analysis using LLMs and triplet loss
Updated: 2026-05-29T15:05:17Z
Meta-analyses of observational studies often show substantial between-study heterogeneity, limiting the interpretability of pooled estimates. Meta-regression can be used to explore heterogeneity, but it is often underpowered to handle multiple effect modifiers. We propose a novel framework that integrates large language models (LLMs) with deep metric learning to infer study-level similarity prior to meta-analysis. Study-level clinical and methodological characteristics were processed by an LLM to generate study triplets (anchor, similar, dissimilar). 
These triplets were constructed by treating each study as an anchor and comparing it with pairs of other studies to identify, in each instance, the study most similar to the anchor. Then, the triplets were fed into an embedding model trained with triplet loss, a deep learning approach that learns an embedding space where clinically and methodologically similar studies are clustered together. We apply our framework to a meta-analysis dataset of 58 observational studies comparing cognitive outcomes between preterm- and term-born children. Subsequently, we fit meta-analysis models within the identified study clusters and compare the results with those of the overall analysis. Results suggested three clusters, two of which retained considerable between-study heterogeneity. The remaining cluster comprised the most homogeneous group of studies and exhibited a more extreme pooled effect estimate together with a narrower prediction interval compared with the overall analysis. This work presents a novel approach for exploring heterogeneity in meta-analysis by incorporating study characteristics prior to model fitting. 
By transforming study information into a similarity space, the framework identifies coherent subgroups and supports more precise inference in heterogeneous real-world evidence.
Published: 2026-05-28T08:40:47Z
Comment: 17 pages, 4 figures
Authors:
Kanella Panagiotopoulou (Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Freiburg im Breisgau, Germany)
Harald Binder (Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Freiburg im Breisgau, Germany)
Theodoros Evrenoglou (Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Freiburg im Breisgau, Germany)

http://arxiv.org/abs/2605.31394v1
A Dynamic Latent Space Model for Healthcare Mobility Networks: the Italian National Health Service case
Updated: 2026-05-29T14:58:50Z
Healthcare mobility -- patients seeking treatment outside their territory of residence -- represents a major source of inequality and financial imbalance in decentralised health systems. In Italy, persistent north-south asymmetries in patient flows among Local Health Authorities (ASLs) have reinforced existing disparities within the National Health Service; yet the structural organisation and temporal dynamics of these flows remain poorly understood at the sub-regional level. We propose a Bayesian dynamic latent space model for directed weighted networks with a hurdle negative binomial likelihood, and apply it to administrative discharge records on mobility for hip replacement procedures among 109 Italian ASLs over 2018-2024. The model jointly addresses excess zeros, overdispersion and network dependence, while capturing directional heterogeneity through multiplicative sender and receiver effects and controlling for differences in territorial size via an appropriate exposure term. 
Applied to Italian mobility data, the model reveals the evolving geometry of the healthcare system, quantifies the disruption induced by the COVID-19 pandemic, and uncovers structural asymmetries in outward propensity and ASL attractiveness. The framework provides a flexible tool for the statistical analysis of dynamic healthcare mobility networks with direct relevance to the monitoring and evaluation of territorial healthcare provision.
Published: 2026-05-29T14:58:50Z
Authors: Cecilia Manente, Marco Alfò, Silvia D'Angelo

http://arxiv.org/abs/2606.00181v1
Infinite-Dimensional Spherical Kernel ridge Regression
Updated: 2026-05-29T14:46:07Z
We introduce a novel regression framework designed to model non-linear responses situated on a sphere $\mathbb{S}$ of finite or infinite dimension. Unlike traditional tangent-space regressions, which lift responses to a tangent space $T_o \mathbb{S}$ and thereby violate intrinsic spherical distances, our proposed method employs an intrinsic approach. We model the conditional mean through an intercept $o \in \mathbb{S}$ and a linear predictor function $f: \mathfrak{X} \to T_o \mathbb{S}$. This formulation transforms the estimation problem into finding a linear predictor within a function space, but utilizing a metric defined by spherical geometry rather than standard Euclidean distance. Leveraging vector-valued reproducing kernel Hilbert space theory, our approach reduces the infinite-dimensional estimation challenge to a manageable finite-dimensional problem via the representer theorem, leading to an efficient BFGS-based estimation algorithm. We establish convergence rates and analyze the finite-sample behavior of our estimator, concluding with a practical application to density regression. 
The full implementation is available in R.
Published: 2026-05-29T14:46:07Z
Authors: Beatrice Matteo, Almond Stoecker, Shahin Tavakoli

http://arxiv.org/abs/2605.31345v1
Log-Ratio Propagation on the Simplex: A Theory of Cellwise Contamination for Compositional Data
Updated: 2026-05-29T14:22:23Z
Compositional data must be analysed through log-ratios: scale invariance, the defining axiom of the field, leaves no alternative. The centred log-ratio divides by the geometric mean of every part, so a single contaminated component shifts every centred-log-ratio coordinate at once, displacing the log-ratio vector by a fixed amount that no choice of coordinates can reduce. We develop a theory of cellwise contamination on the simplex around this observation. A scale-invariant contamination model built from multiplicative perturbation combines with a propagation theorem showing that corruption of a single raw part induces a rank-one shift of the log-ratio vector, with direction determined by the contrast matrix. The resulting perturbation pattern is not equivalent to any independent cellwise contamination model in log-ratio coordinates -- so standard Euclidean cellwise methods applied to log-ratios are ill-posed under the simplex contamination mechanism. For estimators whose Euclidean cellwise breakdown is witnessed by a column-concentrated configuration -- a class including MCD, $S$-, $τ$-, and coordinate-wise $M$-estimators of location and scatter -- the cellwise breakdown value on the simplex is reduced by the factor $(D-1)/D$ relative to its Euclidean counterpart, a reduction that is tight and arises purely from the normalisation mismatch between $nD$ raw cells and $n(D-1)$ ilr cells. The cellwise influence function for the variation matrix carries a diagnostic fingerprint: contamination of a single part inflates exactly one row and column, identifying the responsible component. 
These results form the theoretical foundation for cellwise-robust methods on the simplex; a companion paper develops a cellwise-robust PCA estimator that exploits the propagation geometry and demonstrates it on simulated and geochemical data.
Published: 2026-05-29T14:22:23Z
Comment: 50 pages, no figures; 11-page supplement included as an ancillary file. A companion methods paper (cellPcaCoDa: cellwise-robust PCA for compositional data) is forthcoming
Authors: Matthias Templ

http://arxiv.org/abs/2605.31341v1
BEND: An R Package for the Bayesian Estimation of Nonlinear Longitudinal Data
Updated: 2026-05-29T14:19:06Z
Longitudinal data are useful for capturing and analyzing patterns of change over time. Often, these patterns follow a nonlinear form. One useful and commonly applied nonlinear function is the piecewise function, which assumes growth occurs in distinct phases, each with its own functional form. Past literature has established that Bayesian inference is preferred over likelihood-based methods for estimating piecewise models. To address this, we developed the R package BEND - Bayesian Estimation of Nonlinear Data (available on CRAN). The purpose of BEND is to provide user-friendly software for estimating nonlinear longitudinal models using a Bayesian inference approach. Given the flexibility and practicality of piecewise models, BEND includes several extensions of them to accommodate various types of complex longitudinal datasets and applications. Bayes_PREM() can empirically identify the number and location of random changepoints in a piecewise random effects model. This function can also model multiple latent classes with different longitudinal growth patterns and incorporate covariates to predict the outcome and latent class membership. Bayes_BPREM() can jointly model the longitudinal piecewise trajectories of two interrelated outcomes. Lastly, Bayes_CREM() can estimate the impact of group membership on longitudinal growth. 
This paper provides an overview of the functions included in BEND and empirical examples of how to apply these models in practice.
Published: 2026-05-29T14:19:06Z
Comment: 38 pages, 7 figures
Authors: Corissa T. Rohloff, Rik Lamm, Yadira Peralta, Nidhi Kohli, Eric F. Lock

http://arxiv.org/abs/2504.19043v3
MiniMax Learning of Interpretable Factored Stochastic Policies from Conjoint Data, with Uncertainty Quantification
Updated: 2026-05-29T14:17:38Z
We study offline policy optimization over exponentially large factorial action spaces from randomized preference data, showing how conjoint experiments can estimate interpretable stochastic policies with asymptotically valid uncertainty under regularity conditions. Conjoint analyses typically report Average Marginal Component Effects (AMCEs) by averaging over opponent attributes and thus ignore strategic interdependence. We instead learn stochastic interventions -- product-of-Categorical policies over factor levels -- that (i) optimize expected outcomes in an average-case setting and (ii) extend to a two-player minimax (adversarial) setting that realistically captures simultaneous strategic candidate selection. Methodologically, we derive a closed-form optimizer for a tractable two-way interaction regime with L2 variance regularization, and provide a general gradient-based procedure for richer model classes. Uncertainty from the outcome model propagates asymptotically to both the optimal policy and its value via a Delta method approximation. We further model institutional details (e.g., primaries) inside the minimax objective and introduce a data-driven measure of strategic divergence between parties. On synthetic data, we empirically characterize finite-sample error and coverage as dimensionality and $n$ vary. On a U.S. presidential conjoint, adversarially learned policies produce restricted-equilibrium vote shares that align with historical election ranges in our data, in stark contrast to non-adversarial (averaging) optimizers.
Published: 2025-04-26T22:35:58Z
Comment: ICML 2026
Authors: Connor T. 
Jerzak, Priyanshi Chandra, Rishi Hazra

http://arxiv.org/abs/2407.20819v2
Design and inference for multi-arm clinical trials with informational borrowing: the interacting urns design
Updated: 2026-05-29T14:03:59Z
This paper deals with a new design methodology for stratified comparative experiments based on a system of interacting urns. The key idea is to model the interaction between urns for borrowing information across strata and to use it in the design phase in order to i) enhance the information exchange at the beginning of the study, when only a few subjects have been enrolled and the stratum-specific information on treatments' efficacy could be scarce, ii) let the information sharing adaptively evolve via an update mechanism based on the observed outcomes, for skewing at each step the allocations towards the stratum-specific most promising treatment, and iii) make the contribution of strata with different treatment efficacy vanish as the stratum information grows. In particular, we introduce the Interacting Urns Design, namely a new Covariate-Adjusted Response-Adaptive procedure, that randomizes the treatment allocations according to the evolution of the urn system. The theoretical properties of this proposal are described and the corresponding asymptotic inference is provided. Moreover, by a functional central limit theorem, we obtain the asymptotic joint distribution of the Wald-type sequential test statistics, which allows sequential monitoring of the suggested design in clinical practice.
Published: 2024-07-30T13:33:56Z
Authors: Giacomo Aletti, Alessandro Baldi Antognini, Irene Crimaldi, Rosamarie Frieri, Andrea Ghiglietti

http://arxiv.org/abs/2605.31306v1
Posterior and Likelihood Sensitivity in Bayesian Distributionally Robust Optimization
Updated: 2026-05-29T13:39:12Z
We introduce the notions of worst-case posterior and worst-case likelihood sensitivity. 
These measure, respectively, the sensitivity of the expected cost to worst-case perturbations of the posterior distribution and worst-case perturbations of the likelihood of a Bayesian model. Each defines a quantitative measure of robustness. A decision maker concerned about the sensitivity of the out-of-sample expected cost to deviations from her assumptions will want a decision for which both sensitivities are small. We derive posterior and likelihood sensitivities for uncertainty sets defined in terms of deviation measures. Posterior sensitivity vanishes when the posterior variance shrinks to zero, which occurs when parameter uncertainty is eliminated from learning. Parameter learning does not eliminate likelihood sensitivity. A distributionally robust formulation of a Bayesian optimization problem makes a near-Pareto-optimal tradeoff between performance (expected cost) and robustness (posterior and likelihood sensitivity).
Published: 2026-05-29T13:39:12Z
Authors: Jun-ya Gotoh, Andrew E. B. Lim, Michael Jong Kim

http://arxiv.org/abs/2605.31305v1
Consensus-level substitution rates are distinct from the virion-level rate
Updated: 2026-05-29T13:39:11Z
Estimating viral substitution rates is central to evolutionary epidemiology, and recent interest in within-host evolution has sharpened the question of what such rates measure. I distinguish two classes of evolutionary rate estimand that are rarely separated in phylogenetic analysis: the virion-level substitution rate (VLSR), a molecular quantity counting mutational events along lineages, and consensus-level substitution rates (CLSRs), population-summary quantities counting changes in the consensus sequences. CLSRs are indexed by the consensus-generation rule. The VLSR and CLSRs are both biologically meaningful, but not interchangeable. Because the consensus-generation rule defines a given CLSR, it should be a routine reporting requirement. 
This reflection should help analysts make more informed methodological choices when working with sets of virus sequences.
Published: 2026-05-29T13:39:11Z
Authors: David J Pascall

http://arxiv.org/abs/2604.02969v2
Inversion-Free Natural Gradient Descent on Riemannian Manifolds
Updated: 2026-05-29T11:39:51Z
The natural gradient method is a central tool for statistical optimisation, but its broader application is hindered by the assumption of a Euclidean parameter space, the repeated estimation of the Fisher information matrix (FIM), and the computational cost of its subsequent inversion. This paper proposes an intrinsic, inversion-free natural gradient method for statistical models whose parameters lie on general Riemannian manifolds. Formulating statistical optimisation in this non-Euclidean setting allows for the natural enforcement of parameter constraints, the elimination of non-identifiable parameters, and the exploitation of geodesic convexity. Our algorithm is based on a moving approximation of the inverse FIM, which is maintained directly on the manifold. This approximation is efficiently updated with new score vectors using low-rank matrix identities. We prove almost-sure convergence rates of $O(\log s / s^α)$ for the sequence of iterates, and a similar rate for the approximate FIM. A limited-memory variant with sub-quadratic storage complexity is further proposed for large-scale applications. We demonstrate the efficacy of our method on variational Bayes within the Bures-Wasserstein manifold, normalising flows on the Stiefel manifold, and reduced-rank logistic regression.
Published: 2026-04-03T11:08:59Z
Comment: 80 pages, 4 figures. 
Updated empirical examples
Authors: Dario Draca, Takuo Matsubara, Minh-Ngoc Tran

http://arxiv.org/abs/2605.10225v2
Increasing domain asymptotics for covariate-based nonparametric Bayesian intensity estimation with Gaussian and Besov-Laplace priors
Updated: 2026-05-29T11:13:05Z
We study the problem of estimating the intensity function of a covariate-driven point process based on observations of the points and covariates over a large window. We consider the nonparametric Bayesian approach, and show that a wide class of Gaussian priors, combined with flexible link functions, achieves minimax-optimal posterior contraction rates in the increasing domain asymptotics and under the assumption that the covariates are ergodic. We also employ Besov-Laplace priors, which are popular in imaging and inverse problems due to their edge-preserving and sparsity-promoting properties. We prove that these yield optimal estimation of spatially inhomogeneous intensities belonging to Besov spaces with low integrability index. These results are based on a general concentration theorem that extends recent findings from the literature. To corroborate the theory, we provide extensive numerical simulations, implementing the considered procedures via suitable posterior sampling schemes. Further, we present two real data analyses motivated by applications in forestry and the environmental sciences.
Published: 2026-05-11T09:05:10Z
Comment: 40 pages (Appendices included), 13 figures, 5 tables, to appear in the Special Issue of Journal of Multivariate Analysis "Statistical Methods for Functional Data Analysis"
Authors: Patric Dolmeta, Matteo Giordano

http://arxiv.org/abs/2605.31130v1
Debiased inference for stochastic treatment interventions with survival outcomes
Updated: 2026-05-29T10:40:26Z
Estimating the causal effect of a time-dependent treatment on time to death is challenging. 
In this paper, we formulate the problem using the illness-death model and focus on a stochastic intervention that modifies the hazard governing the transition from no treatment to treatment initiation. Such an intervention can only be implemented at the level of the observed data, whereas the causally valid intervention is defined at the level of the true data-generating process. We provide conditions under which the practically feasible intervention corresponds to the desired causal intervention in the specific setting. We first consider an intervention in which treatment is initiated at a fixed time point, which may subsequently be varied across the relevant time span. However, the resulting estimand is not pathwise differentiable, preventing the development of assumption-lean inference. To address this, we instead consider a smoothed intervention that assigns treatment within a time window around the target time point, thereby yielding a parameter amenable to semiparametric analysis. We derive the corresponding efficient influence function and propose a debiased one-step estimator with desirable robustness properties. We investigate its finite-sample performance in a simulation study and apply the method to the classical Stanford Heart Transplant data, as well as to data on treatment delay among couples with unexplained subfertility seeking intrauterine insemination.
Published: 2026-05-29T10:40:26Z
Authors: Torben Martinussen, Mark Bech Knudsen, Helene Rytgaard

http://arxiv.org/abs/2512.03116v3
Assessing Extrapolation of Peaks Over Thresholds with Martingale Testing
Updated: 2026-05-29T09:20:05Z
We present the winning strategy for the EVA2025 Data Challenge, which aimed to estimate the probability of extreme precipitation events. These events occurred at most once in the dataset, making the challenge fundamentally one of extrapolating extreme values. Given the scarcity of extreme events, we argue that a simple, robust modeling approach is essential. 
We adopt univariate models instead of multivariate ones and model Peaks Over Thresholds using Extreme Value Theory. Specifically, we fit an exponential distribution to model exceedances of the target variable above a high quantile (after seasonal adjustment). The novelty of our approach lies in using martingale testing to evaluate the extrapolation power of the procedure and to agnostically select the level of the high quantile. While this method has several limitations, we believe that framing extrapolation as a game opens the door to other agnostic approaches in Extreme Value Analysis.
Published: 2025-12-02T10:38:25Z
Authors: Joseph de Vilmarest (LPSM), Olivier Wintenberger (LPSM)
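
The peaks-over-thresholds recipe described in the last abstract above (an exponential fit to exceedances over a high quantile, checked with a martingale test) can be illustrated with a minimal Python sketch. This is not the authors' procedure. The function names, the crude empirical quantile, the fixed betting fraction `lam`, the synthetic data, and the Bernoulli-calibration null hypothesis are all assumptions made here for illustration, and the seasonal adjustment mentioned in the abstract is omitted.

```python
import math
import random


def fit_pot_exponential(data, q):
    """Threshold at the empirical q-quantile, then fit an exponential
    distribution to the exceedances by maximum likelihood.
    Assumes 0 < q < 1 and at least one observation above the threshold."""
    xs = sorted(data)
    u = xs[int(q * len(xs))]                      # crude empirical quantile (assumption)
    exceedances = [x - u for x in data if x > u]
    scale = sum(exceedances) / len(exceedances)   # exponential MLE of the mean
    p_exceed = len(exceedances) / len(data)       # empirical P(X > u)
    return u, scale, p_exceed


def tail_prob(x, u, scale, p_exceed):
    """Extrapolated P(X > x) for x >= u under the fitted exponential tail."""
    return p_exceed * math.exp(-(x - u) / scale)


def betting_wealth(indicators, p, lam):
    """Wealth path of a fixed-fraction bet against H0: P(indicator = 1) = p.
    Under H0 the wealth is a nonnegative martingale started at 1, so by
    Ville's inequality it reaches 1/alpha with probability at most alpha."""
    if not (-1.0 / (1.0 - p) <= lam <= 1.0 / p):
        raise ValueError("lam must keep every betting factor nonnegative")
    wealth = 1.0
    path = []
    for b in indicators:
        wealth *= 1.0 + lam * (b - p)
        path.append(wealth)
    return path


# Synthetic illustration: both samples come from a unit-rate exponential.
rng = random.Random(0)
train = [rng.expovariate(1.0) for _ in range(5000)]
u, scale, p_exceed = fit_pot_exponential(train, q=0.9)

# Extrapolate beyond the threshold and test the prediction on fresh data.
level = u + 2.0 * scale
p_level = tail_prob(level, u, scale, p_exceed)
test = [rng.expovariate(1.0) for _ in range(5000)]
hits = [1 if x > level else 0 for x in test]
path = betting_wealth(hits, p_level, lam=1.0)     # lam = 1.0 is a hypothetical choice
reject = max(path) >= 1 / 0.05                    # evidence against the extrapolation
```

A positive `lam` bets that exceedances are more frequent than the model predicts, and a negative one bets the opposite. Repeating the check over several candidate values of `q` is one way such a test could inform the choice of quantile level, though the abstract does not specify how the authors do this.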