https://arxiv.org/api/e21IuD+QuC7zlUz+Dj+tq093Aeo2026-06-11T11:18:09Z3614652515http://arxiv.org/abs/2605.27794v1Learning to target with network interference2026-05-27T00:28:52ZThis paper studies adaptive targeting under network interference in a bandit setting, where treatments applied to one individual may affect others through spillover effects. We consider a linear model in a sparse regime, where each individual's outcome can be affected by at most a few others. We first establish a regret lower bound showing that ignoring the network structure and reducing the problem to a standard linear bandit inevitably leads to inefficient learning, particularly in large populations. To understand how structural information can be leveraged, we analyze regimes with varying levels of knowledge of the interference structure: (1) full support knowledge, (2) knowledge of the column support sizes, and (3) no prior knowledge. For each regime, we establish regret lower bounds characterizing the fundamental limits of learning, and develop algorithms that achieve near-optimal regret. Together, our results provide a unified view of how knowledge of the interference structure governs the efficiency of online learning under interference, and offer practical adaptive targeting algorithms in each setting. Numerical experiments on synthetic and real-world data demonstrate the practical benefits of our algorithms.2026-05-27T00:28:52ZXiaomeng WangHamsa BastaniOsbert BastaniZhimei Renhttp://arxiv.org/abs/2605.27718v1Robust Moment-Based Estimation via Spectral Gradient Reweighting2026-05-26T21:44:02ZMoment-based estimation is a theoretically attractive approach to parametric inference, especially when likelihood-based estimation is unavailable, misspecified, or computationally inconvenient. However, the moment equations involve sample averages, which makes moment-based estimation sensitive to outliers. 
We propose the SGR-GMM algorithm, a robust generalized method of moments (GMM) procedure that uses a spectral gradient reweighting (SGR) primitive to soft-reweight the per-observation gradients during the moment-matching optimization. Our analysis has three layers. First, for a fixed center, the SGR primitive is formulated as an entropy-regularized spectral game between a sample-weight player and a density-matrix player, which is analyzed using classical multiplicative-weights and matrix-multiplicative-weights regret bounds. Second, we establish an explicit convergence radius and a finite termination bound for the fixed-center updates in the SGR primitive. Third, we prove a local finite-sample parameter estimation error bound with explicit dependence on the contamination fraction, inlier gradient stability, local GMM identification strength, and optimization accuracy. We further specialize the SGR-GMM algorithm to obtain a robust diagonally-weighted GMM (DGMM) estimator for estimating heteroscedastic low-rank Gaussian mixtures observed under additive Gaussian noise and strong contamination. In the numerical experiments, the SGR primitive produces nearly-oracle gradient estimation and the robust DGMM specialization substantially improves over non-robust moment baselines. The code and data are available at https://github.com/liu-lzhang/sgr-gmm.2026-05-26T21:44:02ZLiu ZhangAmit Singerhttp://arxiv.org/abs/2605.27711v1Improving Power in Randomized Controlled Trials with Time-to-Event Endpoints: A Risk-Free Approach2026-05-26T21:36:00ZLeveraging external or historical data to improve the efficiency of randomized clinical trials without introducing bias or inflating the Type I error rate remains challenging. Recent work on externally trained prognostic scores, such as PROCOVA for continuous endpoints, has demonstrated a risk-free approach via covariate adjustment. 
However, extending this paradigm to time-to-event endpoints is nontrivial due to the non-collapsibility of the marginal hazard ratio (HR). In this paper, we address this challenge by proposing a unified framework for incorporating complex, high-dimensional prognostic information learned from external data into the primary analysis of RCTs with time-to-event endpoints, while targeting the marginal hazard ratio. The proposed procedure proceeds in two steps. First, a prognostic score is estimated from external or historical data by regressing martingale residuals on baseline covariates using flexible supervised learning methods. Second, the fitted score is included as an additional covariate in the nonparametric covariate-adjusted log-rank test and the associated marginal HR estimator of Ye et al. [2024]. The proposed method controls Type I error and provides asymptotically unbiased estimation of the marginal HR, irrespective of prognostic model misspecification or population heterogeneity between external/historical and trial data. We show that the variance reduction, and corresponding event count savings, are approximately equal to the squared correlation between the prognostic score and the martingale pseudo-outcome in the trial. Extensions to stratified randomization are straightforward. Simulation studies demonstrate satisfactory finite-sample performance and meaningful efficiency gains when historical prognostic information is informative.2026-05-26T21:36:00ZJunyi ZhouQing LiuMay MoAmy Xiahttp://arxiv.org/abs/2206.15475v3Causal Machine Learning: A Survey and Open Problems2026-05-26T21:14:13ZCausal Machine Learning (CausalML) is an umbrella term for machine learning methods that formalize the data-generation process as a structural causal model (SCM). This perspective enables us to reason about the effects of changes to this process (interventions) and what would have happened in hindsight (counterfactuals). 
We categorize work in CausalML into five groups according to the problems they address: (1) causal supervised learning, (2) causal generative modeling, (3) causal explanations, (4) causal fairness, and (5) causal reinforcement learning. We systematically compare the methods in each category and point out open problems. Further, we review data-modality-specific applications in computer vision, natural language processing, and graph representation learning. Finally, we provide an overview of causal benchmarks and a critical discussion of the state of this nascent field, including recommendations for future work.2022-06-30T17:59:15Zv03. Work in progress. Feedback and comments are highly appreciated!Jean KaddourAengus LynchQi LiuMatt J. KusnerRicardo Silvahttp://arxiv.org/abs/2605.27664v1BOOST: Power-Optimal Strong-FWER Testing for Block-Structured Multiplicity2026-05-26T20:30:36ZStructured multiple-testing problems (gatekeeping trials, dose-finding, multi-tissue eQTL mapping, bundled-challenger A/B experiments) organize hypotheses into design-imposed blocks and demand strong family-wise error rate (FWER) control for confirmatory claims. Practitioners currently use objective-agnostic stepwise rules (Bonferroni, Holm, Hochberg, Hommel), closed-testing and graphical extensions, or hierarchical and resampling methods; none is power-optimal within the block-separable class these designs induce. 
We introduce BOOST (Block-Optimal Objective-driven Strong-FWER Testing), the power-optimal strong-FWER procedure for block size three, with three guarantees: (i) finite-sample strong-FWER validity at $O(K)$ cost (versus $O(K^2)$ for general closed testing) without independence assumptions, with a strict Sidak improvement under cross-block independence; (ii) power-optimal allocation across heterogeneous blocks via an equalized-marginal KKT condition, solvable by bisection in $O(B\log(1/\varepsilon))$; and (iii) a sample-split plug-in variant for unknown alternative density $g$, attaining $α$-control up to $O(B_T \mathbb E\|g-\widehat g\|_\infty)$ inflation with per-hypothesis power deficit independent of $B_T$. Simulations across independent, equicorrelated, sparse, and mis-specified regimes show 1.4-1.7$\times$ power gains over the strongest existing baseline at calibrated FWER. On two published datasets (BLUEPRINT cross-lineage cis-eQTL and Upworthy bundled-challenger A/B experiments), BOOST certifies an order of magnitude more full-block discoveries than existing baselines at controlled FWER.2026-05-26T20:30:36ZPrasanjit DubeyXiaoming Huohttp://arxiv.org/abs/2606.07578v1MST-Direct at Scale: Multivariate and Conditional Geostatistical Simulation via Sinkhorn Optimal Transport2026-05-26T20:20:06ZThis paper extends MST-Direct, a Matching-via-Sinkhorn-Transport approach for multivariate geostatistical simulation, from the original bivariate, unconditional, small-grid formulation to multivariate, conditional, and large-grid settings. 
We address the three main limitations identified in the original work: (i) scalability beyond a few thousand nodes through a sparse, candidate-restricted Sinkhorn matcher with O(nC) memory complexity; (ii) extension to multiple variables by matching target value tuples onto an independent FFT-MA Gaussian backbone that reproduces a prescribed variogram; and (iii) hard-data conditioning by fixing observed data tuples at their spatial locations while conditioning the backbone through kriging. Because the transport plan remains a permutation of the target tuples, the multivariate joint distribution is preserved exactly.
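
The exactness claim above follows from the output being a permutation of the target tuples. A minimal Python sketch illustrates this, assuming a plain rank matching on the first variable as a stand-in for the sparse Sinkhorn matcher (which is not reproduced here); the function name `rank_match` and the toy data are hypothetical.

```python
from collections import Counter


def rank_match(target_tuples, backbone, key=lambda t: t[0]):
    """Place target tuples on grid nodes by matching ranks.

    Hypothetical stand-in for the Sinkhorn matcher: the node with the
    k-th smallest backbone value receives the tuple with the k-th
    smallest key. The result is a permutation of target_tuples.
    """
    if len(target_tuples) != len(backbone):
        raise ValueError("need exactly one target tuple per grid node")
    # Node indices ordered by increasing backbone value.
    node_order = sorted(range(len(backbone)), key=backbone.__getitem__)
    # Tuple indices ordered by increasing key (here: first variable).
    tuple_order = sorted(range(len(target_tuples)),
                         key=lambda i: key(target_tuples[i]))
    field = [None] * len(backbone)
    for node, idx in zip(node_order, tuple_order):
        field[node] = target_tuples[idx]
    return field


# Toy example: four nodes, bivariate target tuples.
backbone = [0.3, -1.2, 2.0, 0.1]
targets = [(5.0, "a"), (1.0, "b"), (3.0, "c"), (2.0, "d")]
field = rank_match(targets, backbone)

# Same multiset of tuples, so the empirical joint distribution is unchanged.
same_joint = Counter(field) == Counter(targets)
```

Whatever matcher produces the assignment, the multiset check holds whenever the assignment is a permutation; only the spatial arrangement differs.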
The method is validated using the same six-variate, heteroscedastic, strongly nonlinear reference distribution employed in Direct Multivariate Simulation (DMS), under both unconditional (200x200) and conditional (100x100, 200 hard-data samples) scenarios, and is benchmarked against the Projection Pursuit Multivariate Transform (PPMT). Results show that MST-Direct reproduces the joint distribution with zero histogram error, exactly honours hard data, and accurately reproduces the prescribed spatial correlation structure, whereas PPMT remains an approximation.
Index Terms-Optimal transport, Sinkhorn algorithm, geostatistical simulation, multivariate simulation.2026-05-26T20:20:06ZTcharlies Bachmann Schmitzhttp://arxiv.org/abs/2509.22446v2Rescuing double robustness: safe estimation under complete misspecification2026-05-26T20:15:50ZDouble robustness is a major selling point of semiparametric and missing data methodology. Its virtues lie in protection against partial nuisance misspecification and asymptotic semiparametric efficiency under correct nuisance specification. However, in many applications, complete nuisance misspecification should be regarded as the norm (or at the very least the expected default), and thus doubly robust estimators may behave fragilely. In fact, it has been amply verified empirically that these estimators can perform poorly when all nuisance functions are misspecified. Here, we first characterize this phenomenon of double fragility, and then propose a solution based on adaptive correction clipping (DR+ACC). We argue that our DR+ACC proposal is safe, in that it inherits the favorable properties of doubly robust estimators under correct nuisance specification, but its error is guaranteed to be bounded by a convex combination of the individual nuisance model errors, which prevents the instability caused by the compounding product of errors of doubly robust estimators. We also show that our proposal comes with no reduction in semiparametric efficiency compared to doubly robust estimators, and thus valid inference based on asymptotic normality can be conducted when nuisances are well-specified. 
We showcase the efficacy of our DR+ACC estimator both through extensive simulations and by applying it to the analysis of Alzheimer's disease proteomics data.2025-09-26T15:03:18Z23 pages, 4 figuresLorenzo TestaFrancesca ChiaromonteKathryn Roederhttp://arxiv.org/abs/2605.27655v1Implementing the principal stratum strategy for intercurrent events with survival outcomes: a tutorial2026-05-26T20:14:51ZThe International Council for Harmonization (ICH) E9 (R1) addendum provides the estimand framework to formulate treatment effects in a clinical trial. One of the attributes of an estimand the framework describes is intercurrent events. Among the five strategies for intercurrent events that the guidance lists, the principal stratum strategy is the most conceptually and technically challenging because it defines treatment effects on unobserved strata. Its application to survival outcomes is particularly inaccessible to practitioners. This tutorial reviews the methodology and implementation of the estimand framework with the principal stratum strategy to address intercurrent events with survival outcomes. We illustrate using a clinical trial in oncology and focus on a simple case with binary treatment and a single binary intercurrent event of discontinuation of the assigned treatment. We define the causal effects and review two main methods for estimating the effects: the mixture model method and the weighting method. For each method, we elaborate on the associated assumptions, models, sensitivity analyses, and software, and provide example R code. 
We conduct simulation studies that mimic the real study to assess the operating characteristics of these methods.2026-05-26T20:14:51ZXiaoxiao ZhouJoyce ChenPallavi Mishra-KalyaniXiaoxue LiYuan Li ShenShu WangSusan HalabiFan Lihttp://arxiv.org/abs/2605.27650v1Bayesian Imputation for Unplayed Games in Round-Robin Chess Tournaments: Application to Grand Chess Tour, Bucharest 20262026-05-26T20:10:05ZWhen a player withdraws mid-tournament from a round-robin chess event, organizers face a fundamental problem: how should scores be assigned for games that were never played? Current FIDE guidelines specify annulment if withdrawal occurs before 50% of games are completed, and forfeit (awarding unplayed opponents a full point) thereafter. This dichotomous rule creates arbitrary discontinuities and can substantially distort final standings. We develop a Bayesian framework based on best linear unbiased prediction (BLUP) that optimally combines pre-tournament ratings with observed performance, producing imputed scores that reflect both the withdrawn player's current form and the strength differentials among unplayed opponents. The estimator is consistent, point-conserving, and minimizes mean squared error among linear unbiased predictors. A Monte Carlo simulation study on 180,000 simulated tournaments demonstrates that Bayesian BLUP imputation reduces prediction error by 26% overall compared to FIDE's current rule, with improvements of 41% over forfeit and 12% over annulment. The largest gains occur when the withdrawn player is underperforming, the most common withdrawal scenario. We further show that annulment achieves 15-45% lower RMSE than forfeit across all scenarios. The methodology is applied to GM Alireza Firouzja's withdrawal at Grand Chess Tour, Bucharest 2026, where Bayesian imputation would have awarded unplayed opponents 0.55-0.70 points rather than the 1.0 awarded under forfeit rules. An open-source R Shiny application is provided for tournament organizers. 
We recommend that FIDE adopt Bayesian imputation for World Championship cycle events, or at minimum replace the current dichotomous rule with uniform annulment.2026-05-26T20:10:05ZRavi Varadhanhttp://arxiv.org/abs/2504.10092v2Bayesian optimal experimental design with Wasserstein information criteria2026-05-26T19:43:43ZBayesian optimal experimental design (OED) provides a principled framework for selecting observations or experiments. We introduce new Bayesian design criteria based on the expected Wasserstein-$p$ distance between the prior and posterior distributions, termed Wasserstein information criteria. These criteria have many parallels with the widely used expected information gain (EIG) criterion, which instead relies on the Kullback--Leibler divergence. We show that the Wasserstein-$2$ criterion admits a closed-form solution in the linear-Gaussian setting, a property which can be used for more general approximation schemes, and contrast this solution with classical notions of Bayesian alphabetic optimality. Then we develop a stability analysis of the Wasserstein-$1$ criterion, wherein we bound errors induced by perturbations of the prior or likelihood. We partially extend this analysis to the Wasserstein-$2$ criterion. In particular, these results yield error rates for empirical approximations of the prior. We then illustrate the computability of the Wasserstein-$2$ criterion and demonstrate our approximation rates through simulations.2025-04-14T10:56:42Z28 pages, 5 figuresTapio HelinYoussef MarzoukJose Rodrigo Rojo-Garciahttp://arxiv.org/abs/2605.27496v1Model--based clustering for spherical and hyper--spherical data using elliptically symmetric distributions2026-05-26T17:44:04ZModel--based clustering for directional data has attracted a lot of interest, but most methods utilize rotationally symmetric distributions. 
This paper suggests the use of elliptically symmetric distributions, namely the elliptically symmetric angular Gaussian and the spherical elliptically symmetric projected Cauchy distributions that were recently proposed in the literature for modelling spherical data. The expectation--maximization algorithm is employed and the inclusion of covariates is also examined. Simulation studies compare the two distributions in terms of choosing the optimal number of clusters and computational cost. We use the mixtures of these two distributions to cluster two datasets on the sphere (earthquake locations) and two hyper--spherical datasets.2026-05-26T17:44:04ZTheodoros PerdikisNader AlharbiMichail Tsagrishttp://arxiv.org/abs/2605.27330v1Two-Phase Sampling Designs and Analysis Approaches for Ordinal Outcomes2026-05-26T17:38:19ZModern clinical trials and cohort studies gather low-cost data on all participants but may have limited resources to assess expensive exposures such as biomarkers or genomic data. When interest lies in associations involving expensive exposures, two-phase designs provide a cost-effective framework by using information available on all participants to guide the targeted selection of a subset for additional measurements. We extend this framework to studies with ordinal outcomes, a common yet previously unexplored setting. We propose three outcome-informed phase 2 sampling designs -- outcome-dependent sampling (ODS), covariate-stratified ODS, and residual-dependent sampling -- that leverage phase 1 data to enrich phase 2 selection with informative subjects. We then develop analysis methods for valid and efficient estimation/inference, including conditional likelihood methods with ascertainment-corrected maximum likelihood estimation, multiple imputation, and a full likelihood method using sieve maximum likelihood estimation. 
Across a range of scenarios, simulation studies show that the proposed methods substantially improve efficiency over simple random sampling with standard maximum likelihood estimation. We further demonstrate their practical utility by examining the association between interleukin-6 and a four-level clinical status outcome -- discharged, hospitalized but not in the ICU, hospitalized in the ICU, and death -- 14 days after randomization into the Crystalloid Liberal or Vasopressors Early Resuscitation in Sepsis trial.2026-05-26T17:38:19ZYunbi NamNathan I. ShapiroEric P. SchmidtWesley H. SelfRan TaoJonathan S. Schildcrouthttp://arxiv.org/abs/2605.25852v2A Post-Processing Conformal Prediction Approach for Conditional Coverage via Pivotal Scores2026-05-26T17:02:36ZWhile Conformal Prediction (CP) has proven to be a powerful framework for uncertainty quantification, guaranteeing conditional coverage remains a central challenge. Although finite-sample, distribution-free conditional validity is known to be impossible without structural assumptions, we show that it is fundamentally equivalent to constructing a nonconformity score whose distribution is independent of the features. This theoretical characterization motivates PIT-CP, a new post-processing correction that maps any base nonconformity score to an approximately invariant one while preserving its geometry, interpretability, and marginal coverage. This perspective is particularly appealing in practice, since it may be neither economical nor time-effective to retrain a full generative model when a strong prediction-driven model already provides highly accurate point estimates. Our procedure reduces the problem to one-dimensional conditional density estimation on the induced score, rather than full conditional density estimation on the original outcome space. We show how to estimate this transform in practice and derive bounds on the conditional coverage gap, alongside volumetric and symmetric-difference bounds. 
We present known minimax-optimal conditional estimation techniques while also motivating the use of modern conditional density estimators, including Mixture Density Networks and Conditional Normalizing Flows. Finally, we empirically demonstrate on various datasets that our PIT-CP procedure matches or outperforms many state-of-the-art conformal prediction strategies with minimal effort and computational cost.2026-05-25T13:44:25Z33 pages, 4 figuresFélix Laplantehttp://arxiv.org/abs/2605.27272v1Causally-interpretable meta-analysis using aggregate data2026-05-26T16:48:01ZEvidence syntheses and meta-analyses are used to inform clinical practice guidelines and health economic evaluations. However, heterogeneity of treatment effects poses a significant challenge. Conventional meta-analysis addresses heterogeneity through random-effect assumptions, which are not supported by design and lead to estimates that may not apply to any real-world population. Causally-interpretable meta-analysis (CIMA) offers a rigorous framework for specification, identification, and estimation of causal effects when combining information from multiple randomized trials. Initial development of CIMA focused on using individual data from randomized trials, but such data are often unavailable in practice. Here, we propose a new version of CIMA that only requires aggregate data from trials, addressing the limitations of traditional meta-analysis methods while relying only on aggregate data. The method leverages the trials' reported estimates of marginal and one-at-a-time subgroup treatment effects and descriptive statistics for baseline covariates to build moment equations for identifying and estimating a parametric conditional average treatment effect (CATE) function. The average treatment effect in a new target population is obtained by marginalizing the CATE function over the individual covariate data that defines the target population. 
The method can also be used to obtain causally-interpretable indirect treatment comparisons in the target population. We establish the asymptotic properties of the method, assess its finite-sample performance in simulation studies, and illustrate the application of the method by re-analyzing a published meta-analysis for SGLT2 inhibitors in patients with heart failure.2026-05-26T16:48:01ZQingyang ShiWouter van AmsterdamSacha la Bastide-van GemertTalitha FeenstraIssa J. Dahabrehhttp://arxiv.org/abs/2605.27248v1Space-filling foldover designs for order-of-addition experiments under Kendall tau distance criteria2026-05-26T16:26:25ZOrder-of-addition experiments arise when the response depends on the order in which a set of components is added. Since the number of possible orders increases factorially with the number of components, full permutation designs are rarely feasible except for small problems. This paper studies space-filling fractional designs for order-of-addition experiments based on the Kendall tau distance, a natural metric for comparing permutations through pairwise ordering disagreements. We consider the maximin Kendall tau distance criterion and related dispersion criteria, and establish their connections with statistical optimality under the pairwise ordering model and a Gaussian process model with the Mallows kernel. To construct such designs, we propose an efficient foldover simulated annealing algorithm, denoted by FSA-KD, based on swap moves in the permutation space, together with foldover and incremental updating strategies. Numerical studies show that the resulting FSA-KD designs have large minimum pairwise Kendall tau distances, denoted by k_min(D), and stable pairwise distance distributions, and perform well in surrogate modeling and permutation-based optimization tasks.2026-05-26T16:26:25ZHui ShaoYaping WangQian Xiao
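
The Kendall tau distance and the minimum pairwise distance k_min(D) named in the last abstract can be computed directly from their definitions. The following Python sketch does so by brute force; it is not the FSA-KD algorithm, and the function names and the toy design are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(p, q):
    """Count the item pairs that permutations p and q order differently."""
    if sorted(p) != sorted(q):
        raise ValueError("p and q must be permutations of the same items")
    pos_q = {item: i for i, item in enumerate(q)}
    # combinations(p, 2) yields (a, b) with a before b in p;
    # the pair is a disagreement when q places a after b.
    return sum(1 for a, b in combinations(p, 2) if pos_q[a] > pos_q[b])


def k_min(design):
    """Minimum pairwise Kendall tau distance over all runs in a design."""
    return min(kendall_tau_distance(p, q) for p, q in combinations(design, 2))


# Toy design with m = 4 components and three runs (illustrative only).
design = [(1, 2, 3, 4), (4, 3, 2, 1), (2, 1, 4, 3)]
d_reverse = kendall_tau_distance(design[0], design[1])  # m(m-1)/2 = 6
d_min = k_min(design)
```

A run and its exact reverse disagree on every pair, which gives the maximum distance m(m-1)/2. The pairwise loop costs O(m^2) per distance and O(n^2 m^2) for k_min over n runs, which is adequate for small designs; a search procedure would typically rely on incremental updates instead.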