https://arxiv.org/api/Ly5kZ/xaDaPjK/d8Et1vMAFTbQU2026-06-18T14:34:10Z3629684015http://arxiv.org/abs/2605.24167v1Modified treatment policies that depend on the natural history of treatment2026-05-22T19:37:15ZLongitudinal modified treatment policies (LMTP) are a class of interventions that allow the definition, identification, and estimation of causal effects in general settings, such as with continuous or multivariate exposures or treatment regimens that require grace periods. Targeted machine learning estimators (i.e., double/debiased) have been formulated for LMTPs that assign the exposure at time $t$ as a function of the natural value of treatment at time $t$. However, important applications such as estimating the effect of a delay in the start of a treatment require formulating LMTPs that depend not only on the natural value of treatment at time $t$ but also on the \textit{history} of the natural value of treatment prior to time $t$. This paper develops targeted learning estimators for this general case. We discuss the definition of the effects, and propose estimators that use an augmented-data version of the sequential regression form of the longitudinal g-computation formula. Our estimators are based on the efficient influence function and provide $\sqrt{n}$ inference under standard doubly robust rate assumptions on the convergence of the outcome and treatment regressions. We apply the new estimators to assess the effect of delaying a risky pain treatment by one month on 12-month incidence of opioid use disorder.2026-05-22T19:37:15ZIván DíazNicholas T. WilliamsPaweł MorzywołekKara E. Rudolphhttp://arxiv.org/abs/2405.07026v4Selective Randomization Inference for Adaptive Experiments2026-05-22T18:51:47ZAdaptive experiments use preliminary analyses of the data to inform further course of action and are commonly used in many disciplines including medical and social sciences. 
Because the null hypothesis and experimental design are data-dependent, it has long been recognized that statistical inference for adaptive experiments is not straightforward. Most existing methods only apply to specific adaptive designs and rely on strong assumptions. In this work, we propose selective randomization inference as a general framework for analysing adaptive experiments. In a nutshell, our approach applies conditional post-selection inference to randomization tests. By using directed acyclic graphs to describe the data generating process, we derive a selective randomization p-value that controls the selective type-I error. As inference only relies on the randomness in the treatment assignment, no modelling assumptions or independent and identically distributed data are needed. We elaborate on conditions that render the proposed p-value computable and provide rejection sampling and MCMC algorithms to find a Monte Carlo approximation. Moreover, this article shows how to estimate and construct confidence intervals for a homogeneous treatment effect. Lastly, we demonstrate our method and compare it with other randomization tests using synthetic and real-world data.2024-05-11T14:56:27ZTobias FreidlingQingyuan ZhaoZijun Gaohttp://arxiv.org/abs/2605.24118v1PCA score regression: the art of losing power2026-05-22T18:24:46ZThe regression of principal component scores (RPCS) on covariates is a widely used analytic approach to detect and test for associations between functional measurements and study participant characteristics. Here we show that: (1) RPCS loses power relative to Function on Scalar Regression (FoSR); (2) the amount of power loss depends on the correlation between the PCs and the true effect; (3) if not corrected for multiplicity, RPCS has inflated $α$-level; and (4) current RPCS methods do not provide valid inference for the true effect. 
In contrast, we show that Function on Scalar Regression (FoSR) can avoid these problems using a particular combination of modeling tools. We validate these theoretical findings through extensive simulations and illustrate their practical implications using minute-level accelerometry data from the National Health and Nutrition Examination Survey (NHANES).2026-05-22T18:24:46ZYu LuNidhi PaiErjia CuiCiprian Crainiceanuhttp://arxiv.org/abs/2605.05428v3Parameter estimation for kappa distributions using the EM algorithm in the superstatistical framework2026-05-22T16:49:36ZKappa distributions are widely used in space plasma physics to model velocity distribution functions with heavy tails. Parameter estimation in these distributions is, however, complicated by the fact that the kappa distribution does not belong to the exponential family, so it admits no sufficient statistics and direct maximum likelihood requires numerical optimization without analytically closed-form update equations. Working within the Beck-Cohen superstatistics framework, where a gamma-distributed inverse temperature \(β\) generates the kappa distribution upon marginalization, we treat \(β\) as a latent variable. This hierarchical description restores the exponential family structure that the marginal kappa distribution lacks, and yields an analytically tractable implementation of the expectation-maximization (EM) algorithm whose E-step and M-step admit closed-form expressions in terms of sufficient statistics. Applied to synthetic data drawn from the model, the algorithm converges monotonically to a stationary point of the marginal kappa log-likelihood and recovers the generating parameters consistently across the explored range of \(κ\). 
EM thus offers a tractable and transparent route to inference in superstatistical systems with local temperature fluctuations.2026-05-06T20:37:35ZLeonardo Herrera-FuenzalidaSergio Davishttp://arxiv.org/abs/2605.11138v2Field Theory of Data: Anomaly Detection via the Functional Renormalization Group. The 2D Ising Model as a Benchmark2026-05-22T16:26:23ZWe establish a correspondence between anomaly detection in high-noise regimes and the renormalization group flow of non-equilibrium field theories. We provide a physical grounding for this framework by proving that the detection of phase transitions in interacting non-equilibrium systems maps to the study of an effective equilibrium field theory near its Gaussian fixed point, which we identify with the universal Marchenko-Pastur distribution. Applying the Functional Renormalization Group to the two-dimensional Model A, we demonstrate that the noise-to-signal ratio acts as a physical temperature, where the signal emerges as ordered domains within a thermalized background of fluctuations. Using the exact Onsager solution as a benchmark, we show that this approach identifies critical thresholds with an error below 4%, significantly outperforming standard information-theoretic metrics such as the Kullback-Leibler divergence. Our results provide a universal strategy for resolving structures in complex datasets near criticality, bridging the gap between statistical mechanics and statistical inference.2026-05-11T18:43:14Z15 pages, 2 appendixes; correction of typos and captions, improved clarityRiccardo FinotelloVincent LahocheParham RadpayDine Ousmane Samaryhttp://arxiv.org/abs/2605.23791v1Joint Bayesian models for validating spatial health-event databases against a gold standard: separating global and local discrepancies2026-05-22T15:52:51ZThe reuse of medico-administrative and synthetic spatial data may overcome some limitations of population-based registries, provided rigorous validation is performed. 
However, no tool exists to spatially validate a candidate-for-reuse database (CFRD) against a gold standard (GS). We propose a Bayesian framework for two-dimensional (global and local) map-to-map validation of spatial health-event databases. We consider an error-model family (random [REM] and structured [SEM]) in which the CFRD is modelled as a departure from the GS. Both are compared with a shared component model (SCM). Global disagreement is assessed using the database-specific intercept difference ($RR_{\mathrm{global}}$), while local disagreement is measured by the exceedance probability of the database-specific error term. Disturbance scenarios included null, uniform, clustered, and random perturbations in the CFRD. Sensitivity, specificity, false detection rate, and Matthews Correlation Coefficient were used to assess detection performance. $RR_{\mathrm{global}}$ accurately recovered map-wide shifts across all models and scenarios. REM and SEM were both sensitive and specific to local discrepancies. SCM was more conservative. Applied to Crohn's disease data from the EPIMAD registry and a CFRD, all models reached the same conclusion: the CFRD reproduced global and local spatial structures with an overall signal about 7\% lower. Extensions to other outcome distributions, spatio-temporal models and calibration constitute natural next steps.
\textit{Keywords:} data reuse; spatial database validation; Bayesian hierarchical models; disease mapping; shared component model.2026-05-22T15:52:51ZMathias BrugelFlorine KempfCamille TernynckMarta BlangiardoMichaël Géninhttp://arxiv.org/abs/2605.23760v1Global Sensitivity Analysis: a novel generation of mighty estimators based on rank statistics2026-05-22T15:32:44ZWe propose a new statistical estimation framework for a large family of global sensitivity analysis indices. Our approach is based on rank statistics and uses an empirical correlation coefficient recently introduced by Chatterjee [9]. We show how to apply this approach to compute not only the Cramér-von-Mises indices, which are directly related to Chatterjee's notion of correlation, but also first-order Sobol indices, general metric space indices and higher-order moment indices. We establish consistency of the resulting estimators and demonstrate their numerical efficiency, especially for small sample sizes. In addition, we prove a central limit theorem for the estimators of the first-order Sobol indices.2026-05-22T15:32:44ZErratum for Global Sensitivity Analysis: a novel generation of mighty estimators based on rank statistics. Fabrice Gamboa, Thierry Klein, Agnès Lagnoux, and Paul Rochet. arXiv admin note: substantial text overlap with arXiv:2003.01772Bernoulli, 2022, 28 (4), pp.2345-2374Fabrice GamboaIMTPierre GremaudNC StateThierry KleinIMT, ENACAgnès LagnouxIMThttp://arxiv.org/abs/2605.23691v1Joint Estimation of Marginal and Heterogeneous Treatment Effects2026-05-22T14:47:47ZRandomized clinical trials typically aim to estimate a marginal treatment effect. While covariate adjustment can improve precision, it may change the estimand in nonlinear models due to noncollapsibility, leading to conditional rather than marginal treatment effects. 
At the same time, identifying prognostic and predictive covariates is important for understanding treatment effect heterogeneity and informing clinical decision-making. Keeping marginal interpretability while allowing efficiency gains and assessment of heterogeneity remains a methodological challenge.
In this work, we extend nonparanormal adjusted marginal inference to allow for heterogeneous treatment effects. The proposed framework embeds the marginal treatment effect directly in a joint model for the outcome and baseline covariates. This construction preserves marginal interpretability while adjusting for potentially prognostic and/or predictive covariates. The method applies to continuous, binary, ordinal, and time-to-event outcomes and allows explicit estimation and ranking of prognostic and predictive covariates on a common scale.
For continuous outcomes, we show that the asymptotic variance of the marginal treatment effect measured as Cohen's $d$ is never worse and often better under covariate adjustment than without adjustment. Efficiency gains are primarily driven by prognostic effects, with realistic predictive effects contributing little additional improvement. Simulation studies confirm these findings across outcome types and demonstrate unbiased and more efficient estimation of marginal effects for Cohen's d, log-odds ratios, and log-hazard ratios. Application to an acupuncture trial demonstrates that the method reproduces the original trial findings while improving efficiency and allowing ranking of prognostic and predictive covariates.2026-05-22T14:47:47ZLeticia WuethrichTorsten Hothornhttp://arxiv.org/abs/2605.21893v2Sequential Sensitivity Analysis for Multiple Assumptions: A Framework for Understanding Racial Disparity in Police Use of Force2026-05-22T14:22:02ZInferring racial discrimination in police use of force -- the average causal effect of civilian race on use of force -- requires two assumptions about policing prior to potential use of force: that officers do not discriminate in whom they would stop (no discrimination in stops) and that, conditional on patrol context, the probability that an encounter is with a minority rather than a white civilian does not vary across encounters (no bias in encounters). As Knox et al. (2020) show, violations of the first can mask racial disparity in force. Whether it reflects discrimination in force also depends on the second. Existing sensitivity analyses address one assumption at a time. We develop a framework that varies both sequentially and apply it to NYPD Stop, Question, and Frisk data (2003--2013). Under plausible levels of discrimination in stops, we find substantial racial disparity in force. 
However, the conclusion that this disparity reflects discrimination is fragile to modest departures from no bias in encounters that census-based calibration suggests are demographically feasible. By jointly addressing both confounding channels, the framework reveals how they interact in ways that separate analyses cannot, contributing to understanding what generates racial disparities and how they might be addressed.2026-05-21T02:04:38ZThomas LeavittJake BowersLuke Miratrixhttp://arxiv.org/abs/2605.23664v1A note on closed-form solutions for estimating sample size when externally validating a binary prediction model based on $C$-statistic precision2026-05-22T14:13:57ZExternal validation of clinical prediction models is crucial for assessing whether they are fit for use. The $C$-statistic is a widely used measure of discriminative performance of such models predicting a binary outcome. A method for obtaining the minimum sample size required for the precise estimation of the $C$-statistic during validation, based on the rearrangement of Newcombe's formula for the standard error of the $C$-statistic {SE($C$)}, was recently proposed and implemented in R and Stata software via an iterative computational approach. We present seven novel closed-form solutions, derived using different computer algebra systems and artificial intelligence models, to the algebraic rearrangement of Newcombe's formula. We present these distinct forms to demonstrate how different computational tools yield structurally distinct but mathematically equivalent solutions, and to evaluate their practical differences in computational performance. Our closed-form solutions yield identical sample size estimates to the iterative method when applied to illustrative examples. 
In a benchmarking analysis, the closed-form solutions were on average 148,000 to 264,000 times faster in median execution time than the current iterative implementation, while also exhibiting minor efficiency differences among themselves. This work provides a validated, highly efficient computational tool applicable to sample size calculation for external validation studies. R code functions implementing the closed-form solutions are provided.2026-05-22T14:13:57Z8 pages, 2 figuresDenis A. ShahErick D. De WolfPierce A. PaulLaurence V. Maddenhttp://arxiv.org/abs/2605.16606v2Beyond the Composite: Enhancing Trial Analysis through a Divide & Conquer Approach to 'Days Alive and at Home': Insights from the NOTACS trial2026-05-22T14:10:43Z"Days alive and at home" (DAH) is a recent patient-centered outcome measure for perioperative trials, defined as the number of days a patient spends at home during the follow-up period. DAH typically follows a zero-inflated, left-skewed, bi-modal distribution. Other increasingly used complex endpoints, such as days alive without a ventilator, share these statistical features arising from combining survival with another clinically relevant count outcome into a single, comprehensive measure. A key challenge for DAH and similar endpoints is the lack of a readily identifiable distributional form, which complicates the statistical design of trials using it as the primary endpoint, particularly regarding the robustness of sample size calculations and final analyses where the central limit theorem might not be suitable. Using 200 data points from the interim data of the NOTACS trial (ISRCTN14092678), whose primary endpoint was DAH, we developed a novel 'Divide & Conquer' model that breaks DAH into distinct parts modeled individually. To our knowledge, such a model has not been used before for DAH. 
We demonstrate that our approach significantly improves model fit compared to existing alternatives, enabling more suitable DAH data generation that can be used for simulation-based sample size calculations and evaluation of operating characteristics of the statistical test(s). Beyond NOTACS, our work has large potential to inform the design and analysis of other trials using DAH or similar complex endpoints.2026-05-15T20:14:17Z35 pages, 8 figures, 2 tablesLetao YuanSofía S. VillarDominique-Laurent Couturierhttp://arxiv.org/abs/2411.15713v3Bayesian High-dimensional Grouped-regression using Sparse Projection-posterior2026-05-22T13:58:38ZWe present a novel Bayesian approach for high-dimensional grouped regression under sparsity. We leverage a sparse projection method that uses a sparsity-inducing map to derive an induced posterior on a lower-dimensional parameter space. Our method introduces three distinct projection maps based on popular penalty functions: the Group LASSO Projection Posterior, Group SCAD Projection Posterior, and Adaptive Group LASSO Projection Posterior. Each projection map is constructed to immerse dense posterior samples into a structured, sparse space, allowing for effective group selection and estimation in high-dimensional settings. We derive optimal posterior contraction rates for estimation and prediction, proving that the methods are model selection consistent. Additionally, we propose a Debiased Group LASSO Projection Map, which ensures exact coverage of credible sets. Our methodology is particularly suited for applications in nonparametric additive models, where we apply it with B-spline expansions to capture complex relationships between covariates and response. Extensive simulations validate our theoretical findings, demonstrating the robustness of our approach across different settings. 
Finally, we illustrate the practical utility of our method with an application to brain MRI volume data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), where our model identifies key brain regions associated with Alzheimer's progression.2024-11-24T04:50:02ZSamhita PalSubhashis Ghosal10.5705/ss.202025.0071http://arxiv.org/abs/2605.23614v1The frame problem in quantitative practice: ontological uncertainty and epistemic humility in an age of automated inference2026-05-22T13:21:05ZQuantitative practice across statistics, engineering, and machine learning has been transformed by the automation of inference. Predictions are produced, validated, and deployed at scale and speed that human-mediated reasoning could not match. This shift intersects with a structural limit of reasoning that no methodological refinement dissolves: every inference rests on a finite specification of conditions, and what falls outside the specification does not appear as a widened uncertainty band; it does not appear at all. The choice of specification (the frame) is upstream of the inference and cannot be audited from inside the system that uses it. This paper offers a synthetic, application-oriented review. We argue that three categories of uncertainty (aleatory, epistemic, and frame, or ontological) operate in quantitative practice, and that the third, the residue of finite specification, is structurally invisible to formal analysis within the chosen frame and is the locus of most consequential failures. We trace why the limit applies equally to deductive and inductive reasoning, why no meta-level procedure dissolves the regress, and why current conditions of automated inference make epistemic humility (the practical disposition this argument supports) more, not less, important. 
We articulate the argument's specific resonances for five typical figures of contemporary quantitative work (the engineer, the statistician, the mathematician, the machine-learning practitioner, and the non-specialist recipient of expert claims), showing how the structural argument bears on each practice's natural defenses. The argument is not against rigor or against quantification; it is for distinguishing rigor earned within a frame from rigor with respect to the frame.2026-05-22T13:21:05ZWilliam FauriatDAM/DIFhttp://arxiv.org/abs/2604.19353v2Asymptotic e-processes2026-05-22T13:03:41ZWe investigate the concept of an asymptotic e-process, which is a doubly-indexed stochastic process $(E_{m,n})_{m,n\in\mathbb{N}}$ that possesses, asymptotically for an approximation index $m\to\infty$, the properties of an e-process along a monitoring time index $n$. This constitutes the first in-depth study of this recently introduced concept, which is relevant in asymptotic sequential anytime-valid inference. Our theory is motivated by practical applications in sequential hypothesis testing, in which e-variables and e-processes can only be constructed approximately from observations due to model misspecification or estimation errors. Technically, asymptotic e-processes satisfy an asymptotic version of Ville's inequality, which bounds excursion probabilities of $(E_{m,n})_{m,n\in\mathbb{N}}$ uniformly over $n$ up to a monitoring time horizon $r_m$. We show the necessity of allowing for finite values of $r_m$, recovering truly anytime-valid guarantees asymptotically if $r_m\to\infty$. We derive various properties of asymptotic e-processes, and study their connections to asymptotic supermartingales. We also investigate general methods for their construction such as calibration, the cumulative product of asymptotic e-variables, and the monitoring of an e-process that depends on an estimated parameter. 
The latter construction constitutes a generalization of a recent approach within the context of asymptotic post-hoc inference.2026-04-21T11:34:24Z49 pages, 3 figures. Under review, may be subject to changesPierre-François MassianiSebastian SchulzeMattes Mollenhauerhttp://arxiv.org/abs/2510.27011v4Refined thresholds for inconsistency: The effect of the graph associated with incomplete pairwise comparisons2026-05-22T12:40:46ZThe inconsistency of pairwise comparisons remains difficult to interpret in the absence of acceptability thresholds. The popular 10% cut-off rule proposed by Saaty has recently been applied to incomplete pairwise comparison matrices, which contain some unknown comparisons. This paper refines these inconsistency thresholds: we uncover that they depend not only on the size of the matrix and the number of missing entries, but also on the undirected graph whose edges represent the known pairwise comparisons. Therefore, using our exact thresholds is especially important if the filling in patterns coincide for a large number of matrices, as has been recommended in the literature. The strong association between the new threshold values and the spectral radius of the representing graph is also demonstrated. Our results can be integrated into software to continuously monitor inconsistency during the collection of pairwise comparisons and immediately detect potential errors.2025-10-30T21:24:34Z25 pages, 6 figures, 6 tablesExpert Systems with Applications, 328: 132938, 2026Kolos Csaba ÁgostonLászló Csató10.1016/j.eswa.2026.132938