http://arxiv.org/abs/2605.23791v1
Joint Bayesian models for validating spatial health-event databases against a gold standard: separating global and local discrepancies
2026-05-22T15:52:51Z
The reuse of medico-administrative and synthetic spatial data may overcome some limitations of population-based registries, provided rigorous validation is performed. However, no tool exists to spatially validate a candidate-for-reuse database (CFRD) against a gold standard (GS). We propose a Bayesian framework for two-dimensional (global and local) map-to-map validation of spatial health-event databases. We consider an error-model family (random [REM] and structured [SEM]) in which the CFRD is modelled as a departure from the GS. Both are compared with a shared component model (SCM). Global disagreement is assessed using the database-specific intercept difference ($RR_{\mathrm{global}}$), while local disagreement is measured by the exceedance probability of the database-specific error term. Disturbance scenarios included null, uniform, clustered, and random perturbations in the CFRD. Sensitivity, specificity, false detection rate, and Matthews Correlation Coefficient assessed detection performance. $RR_{\mathrm{global}}$ accurately recovered map-wide shifts across all models and scenarios. REM and SEM were both sensitive and specific to local discrepancies. SCM was more conservative. Applied to Crohn's disease data from the EPIMAD registry and a CFRD, all models reached the same conclusion: the CFRD reproduced global and local spatial structures with an overall signal about 7\% lower. Extensions to other outcome distributions, spatio-temporal models and calibration constitute natural next steps.
\textit{Keywords:} data reuse; spatial database validation; Bayesian hierarchical models; disease mapping; shared component model.
2026-05-22T15:52:51Z
Mathias Brugel, Florine Kempf, Camille Ternynck, Marta Blangiardo, Michaël Génin

http://arxiv.org/abs/2605.23760v1
Global Sensitivity Analysis: a novel generation of mighty estimators based on rank statistics
2026-05-22T15:32:44Z
We propose a new statistical estimation framework for a large family of global sensitivity analysis indices. Our approach is based on rank statistics and uses an empirical correlation coefficient recently introduced by Chatterjee [9]. We show how to apply this approach to compute not only the Cramér-von-Mises indices, which are directly related to Chatterjee's notion of correlation, but also first-order Sobol indices, general metric space indices and higher-order moment indices. We establish consistency of the resulting estimators and demonstrate their numerical efficiency, especially for small sample sizes. In addition, we prove a central limit theorem for the estimators of the first-order Sobol indices.
2026-05-22T15:32:44Z
Erratum for Global Sensitivity Analysis: a novel generation of mighty estimators based on rank statistics. Fabrice Gamboa, Thierry Klein, Agnès Lagnoux, and Paul Rochet. arXiv admin note: substantial text overlap with arXiv:2003.01772
Bernoulli, 2022, 28 (4), pp.2345-2374
Fabrice Gamboa (IMT), Pierre Gremaud (NC State), Thierry Klein (IMT, ENAC), Agnès Lagnoux (IMT)

http://arxiv.org/abs/2605.23691v1
Joint Estimation of Marginal and Heterogeneous Treatment Effects
2026-05-22T14:47:47Z
Randomized clinical trials typically aim to estimate a marginal treatment effect. While covariate adjustment can improve precision, it may change the estimand in nonlinear models due to noncollapsibility, leading to conditional rather than marginal treatment effects.
At the same time, identifying prognostic and predictive covariates is important for understanding treatment effect heterogeneity and informing clinical decision-making. Keeping marginal interpretability while allowing efficiency gains and assessment of heterogeneity remains a methodological challenge.
In this work, we extend nonparanormal adjusted marginal inference to allow for heterogeneous treatment effects. The proposed framework embeds the marginal treatment effect directly in a joint model for the outcome and baseline covariates. This construction preserves marginal interpretability while adjusting for potentially prognostic and/or predictive covariates. The method applies to continuous, binary, ordinal, and time-to-event outcomes and allows explicit estimation and ranking of prognostic and predictive covariates on a common scale.
For continuous outcomes, we show that the asymptotic variance of the marginal treatment effect measured as Cohen's $d$ is never worse and often better under covariate adjustment than without adjustment. Efficiency gains are primarily driven by prognostic effects, with realistic predictive effects contributing little additional improvement. Simulation studies confirm these findings across outcome types and demonstrate unbiased and more efficient estimation of marginal effects for Cohen's $d$, log-odds ratios, and log-hazard ratios. Application to an acupuncture trial demonstrates that the method reproduces the original trial findings while improving efficiency and allowing ranking of prognostic and predictive covariates.
2026-05-22T14:47:47Z
Leticia Wuethrich, Torsten Hothorn

http://arxiv.org/abs/2605.21893v2
Sequential Sensitivity Analysis for Multiple Assumptions: A Framework for Understanding Racial Disparity in Police Use of Force
2026-05-22T14:22:02Z
Inferring racial discrimination in police use of force -- the average causal effect of civilian race on use of force -- requires two assumptions about policing prior to potential use of force: that officers do not discriminate in whom they would stop (no discrimination in stops) and that, conditional on patrol context, the probability that an encounter is with a minority rather than a white civilian does not vary across encounters (no bias in encounters). As Knox et al. (2020) show, violations of the first can mask racial disparity in force. Whether it reflects discrimination in force also depends on the second. Existing sensitivity analyses address one assumption at a time. We develop a framework that varies both sequentially and apply it to NYPD Stop, Question, and Frisk data (2003--2013). Under plausible levels of discrimination in stops, we find substantial racial disparity in force.
However, the conclusion that this disparity reflects discrimination is fragile to modest departures from no bias in encounters that census-based calibration suggests are demographically feasible. By jointly addressing both confounding channels, the framework reveals how they interact in ways that separate analyses cannot, contributing to understanding what generates racial disparities and how they might be addressed.
2026-05-21T02:04:38Z
Thomas Leavitt, Jake Bowers, Luke Miratrix

http://arxiv.org/abs/2605.23664v1
A note on closed-form solutions for estimating sample size when externally validating a binary prediction model based on $C$-statistic precision
2026-05-22T14:13:57Z
External validation of clinical prediction models is crucial for assessing whether they are fit for use. The $C$-statistic is a widely used measure of discriminative performance of such models predicting a binary outcome. A method for obtaining the minimum sample size required for the precise estimation of the $C$-statistic during validation, based on the rearrangement of Newcombe's formula for the standard error of the $C$-statistic [SE($C$)], was recently proposed and implemented in R and Stata software via an iterative computational approach. We present seven novel closed-form solutions, derived using different computer algebra systems and artificial intelligence models, to the algebraic rearrangement of Newcombe's formula. We present these distinct forms to demonstrate how different computational tools yield structurally distinct but mathematically equivalent solutions, and to evaluate their practical differences in computational performance. Our closed-form solutions yield identical sample size estimates to the iterative method when applied to illustrative examples.
In a benchmarking analysis, the closed-form solutions were on average 148,000 to 264,000 times faster in median execution time than the current iterative implementation, while also exhibiting minor efficiency differences among themselves. This work provides a validated, highly efficient computational tool applicable to sample size calculation for external validation studies. R code functions implementing the closed-form solutions are provided.
2026-05-22T14:13:57Z
8 pages, 2 figures
Denis A. Shah, Erick D. De Wolf, Pierce A. Paul, Laurence V. Madden

http://arxiv.org/abs/2605.16606v2
Beyond the Composite: Enhancing Trial Analysis through a Divide & Conquer Approach to 'Days Alive and at Home': Insights from the NOTACS trial
2026-05-22T14:10:43Z
"Days alive and at home" (DAH) is a recent patient-centered outcome measure for perioperative trials, defined as the number of days a patient spends at home during the follow-up period. DAH typically follows a zero-inflated, left-skewed, bi-modal distribution. Other increasingly used complex endpoints, such as days alive without a ventilator, share these statistical features arising from combining survival with another clinically relevant count outcome into a single, comprehensive measure. A key challenge for DAH and similar endpoints is the lack of a readily identifiable distributional form, which complicates the statistical design of trials using it as the primary endpoint, particularly regarding the robustness of sample size calculations and final analyses where the central limit theorem might not be suitable. Using 200 data points from the interim data of the NOTACS trial (ISRCTN14092678), whose primary endpoint was DAH, we developed a novel 'Divide & Conquer' model that breaks DAH into distinct parts modeled individually. To our knowledge, such a model has not been used before for DAH.
We demonstrate that our approach significantly improves model fit compared to existing alternatives, enabling more suitable DAH data generation that can be used for simulation-based sample size calculations and evaluation of operating characteristics of the statistical test(s). Beyond NOTACS, our work has large potential to inform the design and analysis of other trials using DAH or similar complex endpoints.
2026-05-15T20:14:17Z
35 pages, 8 figures, 2 tables
Letao Yuan, Sofía S. Villar, Dominique-Laurent Couturier

http://arxiv.org/abs/2411.15713v3
Bayesian High-dimensional Grouped-regression using Sparse Projection-posterior
2026-05-22T13:58:38Z
We present a novel Bayesian approach for high-dimensional grouped regression under sparsity. We leverage a sparse projection method that uses a sparsity-inducing map to derive an induced posterior on a lower-dimensional parameter space. Our method introduces three distinct projection maps based on popular penalty functions: the Group LASSO Projection Posterior, Group SCAD Projection Posterior, and Adaptive Group LASSO Projection Posterior. Each projection map is constructed to immerse dense posterior samples into a structured, sparse space, allowing for effective group selection and estimation in high-dimensional settings. We derive optimal posterior contraction rates for estimation and prediction, proving that the methods are model selection consistent. Additionally, we propose a Debiased Group LASSO Projection Map, which ensures exact coverage of credible sets. Our methodology is particularly suited for applications in nonparametric additive models, where we apply it with B-spline expansions to capture complex relationships between covariates and response. Extensive simulations validate our theoretical findings, demonstrating the robustness of our approach across different settings.
Finally, we illustrate the practical utility of our method with an application to brain MRI volume data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), where our model identifies key brain regions associated with Alzheimer's progression.
2024-11-24T04:50:02Z
Samhita Pal, Subhashis Ghosal
10.5705/ss.202025.0071

http://arxiv.org/abs/2605.23614v1
The frame problem in quantitative practice: ontological uncertainty and epistemic humility in an age of automated inference
2026-05-22T13:21:05Z
Quantitative practice across statistics, engineering, and machine learning has been transformed by the automation of inference. Predictions are produced, validated, and deployed at scale and speed that human-mediated reasoning could not match. This shift intersects with a structural limit of reasoning that no methodological refinement dissolves: every inference rests on a finite specification of conditions, and what falls outside the specification does not appear as a widened uncertainty band; it does not appear at all. The choice of specification (the frame) is upstream of the inference and cannot be audited from inside the system that uses it. This paper offers a synthetic, application-oriented review. We argue that three categories of uncertainty operate in quantitative practice, namely aleatory, epistemic, and frame (or ontological), and that the third, the residue of finite specification, is structurally invisible to formal analysis within the chosen frame and is the locus of most consequential failures. We trace why the limit applies equally to deductive and inductive reasoning, why no meta-level procedure dissolves the regress, and why current conditions of automated inference make epistemic humility, the practical disposition this argument supports, more, not less, important.
We articulate the argument's specific resonances for five typical figures of contemporary quantitative work (the engineer, the statistician, the mathematician, the machine-learning practitioner, and the non-specialist recipient of expert claims), showing how the structural argument bears on each practice's natural defenses. The argument is not against rigor or against quantification; it is for distinguishing rigor earned within a frame from rigor with respect to the frame.
2026-05-22T13:21:05Z
William Fauriat (DAM/DIF)

http://arxiv.org/abs/2604.19353v2
Asymptotic e-processes
2026-05-22T13:03:41Z
We investigate the concept of an asymptotic e-process, which is a doubly-indexed stochastic process $(E_{m,n})_{m,n\in\mathbb{N}}$ that possesses, asymptotically for an approximation index $m\to\infty$, the properties of an e-process along a monitoring time index $n$. This constitutes the first in-depth study of this recently introduced concept, which is relevant in asymptotic sequential anytime-valid inference. Our theory is motivated by practical applications in sequential hypothesis testing, in which e-variables and e-processes can only be constructed approximately from observations due to model misspecification or estimation errors. Technically, asymptotic e-processes satisfy an asymptotic version of Ville's inequality, which bounds excursion probabilities of $(E_{m,n})_{m,n\in\mathbb{N}}$ uniformly over $n$ up to a monitoring time horizon $r_m$. We show the necessity of allowing for finite values of $r_m$, recovering truly anytime-valid guarantees asymptotically if $r_m\to\infty$. We derive various properties of asymptotic e-processes, and study their connections to asymptotic supermartingales. We also investigate general methods for their construction such as calibration, the cumulative product of asymptotic e-variables, and the monitoring of an e-process that depends on an estimated parameter.
The latter construction constitutes a generalization of a recent approach within the context of asymptotic post-hoc inference.
2026-04-21T11:34:24Z
49 pages, 3 figures. Under review, may be subject to changes
Pierre-François Massiani, Sebastian Schulze, Mattes Mollenhauer

http://arxiv.org/abs/2510.27011v4
Refined thresholds for inconsistency: The effect of the graph associated with incomplete pairwise comparisons
2026-05-22T12:40:46Z
The inconsistency of pairwise comparisons remains difficult to interpret in the absence of acceptability thresholds. The popular 10% cut-off rule proposed by Saaty has recently been applied to incomplete pairwise comparison matrices, which contain some unknown comparisons. This paper refines these inconsistency thresholds: we uncover that they depend not only on the size of the matrix and the number of missing entries, but also on the undirected graph whose edges represent the known pairwise comparisons. Therefore, using our exact thresholds is especially important if the filling-in patterns coincide for a large number of matrices, as has been recommended in the literature. The strong association between the new threshold values and the spectral radius of the representing graph is also demonstrated. Our results can be integrated into software to continuously monitor inconsistency during the collection of pairwise comparisons and immediately detect potential errors.
2025-10-30T21:24:34Z
25 pages, 6 figures, 6 tables
Expert Systems with Applications, 328: 132938, 2026
Kolos Csaba Ágoston, László Csató
10.1016/j.eswa.2026.132938

http://arxiv.org/abs/2507.05064v4
Vecchia-Inducing-Points Full-Scale Approximations for Gaussian Processes
2026-05-22T08:53:32Z
Gaussian processes are flexible, probabilistic, non-parametric models widely used in machine learning and statistics. However, their scalability to large data sets is limited by computational constraints.
To overcome these challenges, we propose Vecchia-inducing-points full-scale (VIF) approximations combining the strengths of global inducing points and local Vecchia approximations. Vecchia approximations excel in settings with low-dimensional inputs and moderately smooth covariance functions, while inducing point methods are better suited to high-dimensional inputs and smoother covariance functions. Our VIF approach bridges these two regimes by using an efficient correlation-based neighbor-finding strategy for the Vecchia approximation of the residual process, implemented via a modified cover tree algorithm. We further extend our framework to non-Gaussian likelihoods by introducing iterative methods that reduce computational costs for training and prediction by several orders of magnitude compared to Cholesky-based computations when using a Laplace approximation. In particular, we propose and compare novel preconditioners and provide theoretical convergence results. Extensive numerical experiments on simulated and real-world data sets show that VIF approximations are computationally efficient and also more accurate and numerically stable than state-of-the-art alternatives. All methods are implemented in the open-source C++ library GPBoost with high-level Python and R interfaces.
2025-07-07T14:49:06Z
Tim Gyger, Reinhard Furrer, Fabio Sigrist

http://arxiv.org/abs/2502.07646v3
Causal Additive Models with Unobserved Causal Paths and Backdoor Paths
2026-05-22T07:57:31Z
Causal additive models provide a tractable yet expressive framework for causal discovery in the presence of hidden variables. When unobserved backdoor or causal paths exist between two variables, their causal relationship is often unidentifiable under existing theories. We establish sufficient conditions under which causal directions can be identified in many such cases.
These conditions rely on new characterizations of regression sets to determine independence among regression residuals and conditional independencies among observed variables. Building on these results, we introduce a search algorithm that incorporates these innovations and prove its soundness and completeness. Empirical evaluations demonstrate its competitive performance against state-of-the-art methods.
2025-02-11T15:35:15Z
23 pages
Proceedings of AISTATS 2026
Thong Pham, Takashi Nicholas Maeda, Shohei Shimizu

http://arxiv.org/abs/2605.23318v1
Generalized Rank Regression
2026-05-22T07:36:08Z
Rank regression offers robustness to outliers and heavy-tailed response distributions, invariance to monotonic transformations, and improved efficiency under non-Gaussian errors, making it a versatile tool for analyzing complex data. This paper introduces Generalized Rank Regression (GRR), an extension of classical rank-based methods that accommodates non-monotonic score functions. While aimed at enhancing the statistical efficiency of robust estimators, this generalization results in a potentially non-convex and non-smooth objective function, presenting challenges for both theoretical analysis and algorithmic implementation. We derive a non-asymptotic Bahadur representation of the proposed estimator and establish its asymptotic normality under mild conditions. To address the optimization challenges, we propose a new two-stage sub-gradient descent algorithm that enables efficient computation of GRR estimators with desirable statistical properties. Furthermore, we develop a multiplier bootstrap procedure for conducting statistical inference. A close connection between GRR and variants of quantile regression is uncovered, which demonstrates that GRR and composite quantile regression share asymptotically equivalent variances.
The advantages of GRR are illustrated through extensive simulation studies and a real data application.
2026-05-22T07:36:08Z
29 pages, 10 figures
Jiyuan Tu, Suqi Wu, Yichen Zhang, Wen-Xin Zhou

http://arxiv.org/abs/2601.20192v2
Online Change Point Detection for Multivariate Inhomogeneous Poisson Processes Time Series
2026-05-22T05:02:07Z
We study online change point detection for multivariate inhomogeneous Poisson point process time series. This setting arises commonly in applications such as earthquake seismology, climate monitoring, and epidemic surveillance, yet remains underexplored in the machine learning and statistics literature. We propose a method that uses low-rank matrices to represent the multivariate Poisson intensity functions, resulting in an adaptive nonparametric detection procedure. Our algorithm is single-pass and requires only constant computational cost per new observation, independent of the elapsed length of the time series. We provide theoretical guarantees to control the overall false alarm probability and characterize the detection delay under temporal dependence. We also develop a new Matrix Bernstein inequality for temporally dependent Poisson point process time series, which may be of independent interest.
Numerical experiments demonstrate that our method is both statistically robust and computationally efficient.
2026-01-28T02:42:33Z
Xiaokai Luo, Haotian Xu, Carlos Misael Madrid Padilla, Oscar Hernan Madrid Padilla

http://arxiv.org/abs/2605.23210v1
Fundamental Bounds and Efficient Estimation for Dead-Time-Constrained Event Detection, with Application to Single-Photon Lidar
2026-05-22T03:55:30Z
We develop an asymptotic statistical theory for parameter estimation from a class of non-i.i.d. periodic binary event-detection processes subject to nonparalyzable dead time and gating, which we call "dead-time event detection" (DED) processes. Such processes arise in single-photon lidar, fluorescence lifetime imaging, X-ray astronomy, and particle or radiation flux measurements in nuclear physics, where each detection renders the radiation/particle detector inactive for a recovery interval. Our theory quantifies how dead time and gating affect the fundamental lower bounds of estimation and identifies practical estimators that attain these bounds. First, we identify a sufficient statistic, showing in particular that activation counts can carry statistically useful information discarded by conventional histogramming hardware. We then prove local asymptotic normality and derive the corresponding Fisher-information rate, thereby obtaining fundamental lower bounds for estimation from DED processes. We prove that the maximum likelihood estimator (MLE), widely used in DED applications, attains these lower bounds. Since computing the MLE typically requires solving a nonconvex optimization problem, we also propose Le Cam one-step estimators, which attain the same asymptotic bounds with only a single local correction rather than iterative optimization. We illustrate the validity of our asymptotic theory and the practical usefulness of one-step estimators through the example of single-photon lidar in both simulations and real-data experiments.
2026-05-22T03:55:30Z
24 pages, 5 figures
Frederic J. N. Jorgensen, Steven G. Johnson