Bayesian optimal experimental design with Wasserstein information criteria

2026-05-26T19:43:43Z

Bayesian optimal experimental design (OED) provides a principled framework for selecting observations or experiments. We introduce new Bayesian design criteria based on the expected Wasserstein-$p$ distance between the prior and posterior distributions, termed Wasserstein information criteria. These criteria have many parallels with the widely used expected information gain (EIG) criterion, which instead relies on the Kullback--Leibler divergence. We show that the Wasserstein-$2$ criterion admits a closed-form solution in the linear-Gaussian setting, a property which can be used for more general approximation schemes, and contrast this solution with classical notions of Bayesian alphabetic optimality. Then we develop a stability analysis of the Wasserstein-$1$ criterion, wherein we bound errors induced by perturbations of the prior or likelihood. We partially extend this analysis to the Wasserstein-$2$ criterion. In particular, these results yield error rates for empirical approximations of the prior. We then illustrate the computability of the Wasserstein-$2$ criterion and demonstrate our approximation rates through simulations.
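
As an illustration of why the linear-Gaussian case is tractable, the squared Wasserstein-2 distance between two Gaussians has a well-known closed form, $W_2^2 = \|m_1 - m_2\|^2 + \mathrm{tr}\big(S_1 + S_2 - 2 (S_1^{1/2} S_2 S_1^{1/2})^{1/2}\big)$. The sketch below evaluates that standard formula only; it is not the design criterion derived in the paper, and the function names `sqrtm_psd` and `w2_squared_gaussian` are hypothetical.

```python
import numpy as np

def sqrtm_psd(a):
    """Square root of a symmetric positive semi-definite matrix (eigendecomposition)."""
    w, v = np.linalg.eigh((a + a.T) / 2.0)   # symmetrize against round-off
    w = np.clip(w, 0.0, None)                # clip tiny negative eigenvalues
    return (v * np.sqrt(w)) @ v.T

def w2_squared_gaussian(m1, s1, m2, s2):
    """Squared Wasserstein-2 distance between N(m1, s1) and N(m2, s2)."""
    root = sqrtm_psd(s1)
    cross = sqrtm_psd(root @ s2 @ root)
    mean_term = float(np.sum((np.asarray(m1) - np.asarray(m2)) ** 2))
    cov_term = float(np.trace(s1 + s2 - 2.0 * cross))
    return mean_term + cov_term

# One-dimensional check: N(0, 1) versus N(3, 4) gives 3^2 + (1 - 2)^2 = 10.
d = w2_squared_gaussian(np.array([0.0]), np.array([[1.0]]),
                        np.array([3.0]), np.array([[4.0]]))
```

In one dimension the covariance term reduces to the squared difference of the standard deviations, which gives a quick sanity check.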

Model--based clustering for spherical and hyper--spherical data using elliptically symmetric distributions

2026-05-26T17:44:04Z

Model--based clustering for directional data has attracted a lot of interest, but most methods utilize rotationally symmetric distributions. This paper suggests the use of elliptically symmetric distributions, namely the elliptically symmetric angular Gaussian and the spherical elliptically symmetric projected Cauchy distributions that were recently proposed in the literature for modelling spherical data. The expectation--maximization algorithm is employed and the inclusion of covariates is also examined. Simulation studies compare the two distributions in terms of choosing the optimal number of clusters and computational cost. We use the mixtures of these two distributions to cluster two datasets on the sphere (earthquake locations) and two hyper--spherical datasets.

Two-Phase Sampling Designs and Analysis Approaches for Ordinal Outcomes

2026-05-26T17:38:19Z

Modern clinical trials and cohort studies gather low-cost data on all participants but may have limited resources to assess expensive exposures such as biomarkers or genomic data. When interest lies in associations involving expensive exposures, two-phase designs provide a cost-effective framework by using information available on all participants to guide the targeted selection of a subset for additional measurements. We extend this framework to studies with ordinal outcomes, a common yet previously unexplored setting. We propose three outcome-informed phase 2 sampling designs -- outcome-dependent sampling (ODS), covariate-stratified ODS, and residual-dependent sampling -- that leverage phase 1 data to enrich phase 2 selection with informative subjects. We then develop analysis methods for valid and efficient estimation/inference, including conditional likelihood methods with ascertainment-corrected maximum likelihood estimation, multiple imputation, and a full likelihood method using sieve maximum likelihood estimation. Across a range of scenarios, simulation studies show that the proposed methods substantially improve efficiency over simple random sampling with standard maximum likelihood estimation. We further demonstrate their practical utility by examining the association between interleukin-6 and a four-level clinical status outcome -- discharged, hospitalized but not in the ICU, hospitalized in the ICU, and death -- 14 days after randomization into the Crystalloid Liberal or Vasopressors Early Resuscitation in Sepsis trial.

A Post-Processing Conformal Prediction Approach for Conditional Coverage via Pivotal Scores

2026-05-26T17:02:36Z

While Conformal Prediction (CP) has proven to be a powerful framework for uncertainty quantification, guaranteeing conditional coverage remains a central challenge. Although finite-sample, distribution-free conditional validity is known to be impossible without structural assumptions, we show that it is fundamentally equivalent to constructing a nonconformity score whose distribution is independent of the features. This theoretical characterization motivates PIT-CP, a new post-processing correction that maps any base nonconformity score to an approximately invariant one while preserving its geometry, interpretability, and marginal coverage. This perspective is particularly appealing in practice, since it may be neither economical nor time-effective to retrain a full generative model when a strong prediction-driven model already provides highly accurate point estimates. Our procedure reduces the problem to one-dimensional conditional density estimation on the induced score, rather than full conditional density estimation on the original outcome space. We show how to estimate this transform in practice and derive bounds on the conditional coverage gap, alongside volumetric and symmetric-difference bounds. We present known minimax-optimal conditional estimation techniques while also motivating the use of modern conditional density estimators, including Mixture Density Networks and Conditional Normalizing Flows. Finally, we empirically demonstrate on various datasets that our PIT-CP procedure matches or outperforms many state-of-the-art conformal prediction strategies with minimal effort and computational cost.

Causally-interpretable meta-analysis using aggregate data

2026-05-26T16:48:01Z

Evidence syntheses and meta-analyses are used to inform clinical practice guidelines and health economic evaluations. However, heterogeneity of treatment effects poses a significant challenge. Conventional meta-analysis addresses heterogeneity through random-effect assumptions, which are not supported by design and lead to estimates that may not apply to any real-world population. Causally-interpretable meta-analysis (CIMA) offers a rigorous framework for specification, identification, and estimation of causal effects when combining information from multiple randomized trials. Initial development of CIMA focused on using individual data from randomized trials, but such data are often unavailable in practice. Here, we propose a new version of CIMA that requires only aggregate data from trials, addressing the limitations of traditional meta-analysis methods. The method leverages the trials' reported estimates of marginal and one-at-a-time subgroup treatment effects and descriptive statistics for baseline covariates to build moment equations for identifying and estimating a parametric conditional average treatment effect (CATE) function. The average treatment effect in a new target population is obtained by marginalizing the CATE function over the individual covariate data that defines the target population. The method can also be used to obtain causally-interpretable indirect treatment comparisons in the target population. We establish the asymptotic properties of the method, assess its finite-sample performance in simulation studies, and illustrate the application of the method by re-analyzing a published meta-analysis for SGLT2 inhibitors in patients with heart failure.

Space-filling foldover designs for order-of-addition experiments under Kendall tau distance criteria

2026-05-26T16:26:25Z

Order-of-addition experiments arise when the response depends on the order in which a set of components is added. Since the number of possible orders increases factorially with the number of components, full permutation designs are rarely feasible except for small problems. This paper studies space-filling fractional designs for order-of-addition experiments based on the Kendall tau distance, a natural metric for comparing permutations through pairwise ordering disagreements. We consider the maximin Kendall tau distance criterion and related dispersion criteria, and establish their connections with statistical optimality under the pairwise ordering model and a Gaussian process model with the Mallows kernel. To construct such designs, we propose an efficient foldover simulated annealing algorithm, denoted by FSA-KD, based on swap moves in the permutation space, together with foldover and incremental updating strategies. Numerical studies show that the resulting FSA-KD designs have large minimum pairwise Kendall tau distances, denoted by k_min(D), and stable pairwise distance distributions, and perform well in surrogate modeling and permutation-based optimization tasks.
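
The Kendall tau distance and the minimum pairwise distance k_min(D) described above can be sketched in a few lines. This is a minimal illustration, not the FSA-KD algorithm; the helper names are hypothetical, and taking the foldover of a run to be its reversed order is an assumption made here for the example.

```python
from itertools import combinations

def kendall_tau_distance(p, q):
    """Number of component pairs that appear in opposite order in p and q."""
    pos_q = {c: i for i, c in enumerate(q)}
    mapped = [pos_q[c] for c in p]           # positions in q, read in p's order
    return sum(1 for i, j in combinations(range(len(mapped)), 2)
               if mapped[i] > mapped[j])

def foldover(p):
    """Assumed foldover of a run: the same components in reversed order."""
    return tuple(reversed(p))

def k_min(design):
    """Minimum pairwise Kendall tau distance over all runs of a design."""
    return min(kendall_tau_distance(a, b) for a, b in combinations(design, 2))

# A small design on four components: two runs plus their (assumed) foldovers.
half = [(1, 2, 3, 4), (2, 1, 4, 3)]
design = half + [foldover(p) for p in half]
```

For m components the distance between a run and its reversal is m(m-1)/2, the largest possible value, which is one reason a foldover structure is attractive under a maximin criterion.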

Posterior Quantification of Borrowing from Multiple Historical Control Data in Bayesian Dynamic Borrowing Methods: A Scoping Review

2026-05-26T15:36:28Z

Bayesian dynamic borrowing methods incorporate historical control data into current clinical trial analyses while allowing the degree of borrowing to depend on the compatibility between historical and current data. Although many methods have been proposed, the degree of borrowing is often difficult to interpret, especially when multiple historical control sources are available. This scoping review focuses on posterior quantification of borrowing from multiple historical controls. We discuss overall borrowing summaries based on effective historical sample size, together with method-specific source-level summaries of borrowing, information contribution, or compatibility arising from power priors, unit information priors, multisource exchangeability models, Dirichlet process mixture models, and potential bias models. We distinguish posterior borrowing measures from quantities describing prior information allocation or source-specific conflict. Two case studies, one with a binary endpoint and one with a continuous endpoint, illustrate that methods with broadly similar posterior treatment effect estimates may differ in both the overall amount and source-specific pattern of borrowing. These examples show that large overall borrowing may reflect selective borrowing from compatible historical sources rather than uniform borrowing from all sources. We recommend reporting treatment effect estimates together with overall and source-specific borrowing summaries, when available, to improve transparency in posterior inference.

Bernstein-von Mises Theorem for Sparse Generalized Linear Model

2026-05-26T15:06:11Z

We study spike-and-slab priors for generalized linear models with possible grouped sparsity. The main result is an oracle Bernstein--von Mises theorem for the fractional posterior under supportwise likelihood assumptions. The proof develops sparse local asymptotic normality and Laplace approximation around support-specific pseudo-true centers, and combines them with fixed-prior mass, support penalization, recovery geometry, and beta-min separation to obtain contraction, support recovery, Gaussian mixture approximation, and collapse to the oracle Gaussian law. Model-entry verifications are given for Gaussian regression and for logistic, Poisson, probit, Gamma log-link, and negative-binomial log-link regression under stated sufficient conditions. The ordinary posterior is treated only through restricted Gaussian and canonical-link extensions, with coverage under additional active-dimension and moment conditions.

Copula and spatial-regularized variational autoencoder for mapping disease comorbidity in West Africa

2026-05-26T14:55:51Z

Geospatial health disproportionality remains a critical public health concern, as communities face heterogeneous illness risks due to varying exposures to adverse socioeconomic and environmental conditions. While statistical models have been adopted to identify risk factors, studies that account for the complex, non-linear dependencies and spatial regularities inherent in comorbid disease patterns are underdeveloped. In this work, we propose a novel spatially regularized variational autoencoder (VAE) to characterize and map the geospatial disproportion of childhood comorbidity in West Africa, focusing on diarrhea, fever, and acute respiratory infection (ARI). To model dependence between these conditions, this study integrates a bivariate Gumbel copula into the VAE framework, enabling flexible modeling of asymmetric dependence and quantification of joint and conditional morbidity risks. Additionally, covariate effects within the framework were quantified to facilitate epidemiological interpretation of risk factors. The proposed method was benchmarked against commonly used methods and applied to characterize comorbidity in West Africa using the Demographic and Health Survey data. Findings reveal pronounced spatial heterogeneity in the likelihood of comorbidity among West African children, with the strongest co-occurrence observed between fever and ARI. Household wealth, maternal education, and access to improved water sources were associated with the likelihood of comorbidity. These patterns highlight high-risk areas and underscore the need for targeted, location-specific public health interventions.

Estimation and Inference for Win Measures with Multiple Ordinal Endpoints Subject to Missingness

2026-05-26T14:34:45Z

Win measures, including the win ratio (WR), win odds (WO), net benefit (NB), and desirability of outcome ranking (DOOR), are increasingly used in randomized clinical trials with multiple hierarchical ordinal endpoints. In practice, however, one or more component endpoints may have missing data. The standard pairwise-comparison approach, which treats pairs with missing outcomes as ties, can produce biased estimates, even if the data are missing completely at random (MCAR). Although inverse probability of censoring weighting (IPCW) methods have been developed for censored survival endpoints, corresponding methods for addressing missing hierarchical ordinal endpoints are not yet available. To address this gap, we develop inverse probability weighting (IPW) and augmented IPW (AIPW) estimators for win measures with hierarchical ordinal endpoints subject to missing data, allowing missingness to depend on treatment assignment and baseline covariates. The IPW estimator corrects bias by reweighting complete observed outcomes using joint non-missingness probabilities involved in estimating the joint cell probabilities that define the win measures. The AIPW estimator additionally incorporates outcome modeling, improving efficiency and achieving double robustness. For inference, we derive closed-form variance estimators for both methods based on influence functions. Simulation studies show that the standard approach can be substantially biased, whereas the proposed IPW and AIPW estimators remain consistent with near-nominal coverage. Furthermore, the AIPW estimator is generally more efficient than the IPW estimator. Applications to the SCOUT-CAP and ACTT-1 trials illustrate the practical utility of the proposed methods. An R package, WinMO, is provided for implementation.
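
For concreteness, the standard pairwise-comparison approach that the abstract criticizes can be sketched as follows. This is not the WinMO package or the proposed IPW/AIPW estimators. The function names are hypothetical, higher scores are assumed to be better, missing values are encoded as `None`, and declaring a tie at the first missing component is one possible reading of "treats pairs with missing outcomes as ties".

```python
def compare_pair(treated, control):
    """Return +1 (treated wins), -1 (treated loses) or 0 (tie) for one pair.

    Endpoints are compared in hierarchical order; higher is assumed better.
    """
    for a, b in zip(treated, control):
        if a is None or b is None:
            return 0                 # assumed rule: missing component -> tie
        if a != b:
            return 1 if a > b else -1
    return 0

def win_measures(treated_arm, control_arm):
    """Win ratio, win odds, net benefit and DOOR probability over all pairs."""
    wins = losses = ties = 0
    for t in treated_arm:
        for c in control_arm:
            r = compare_pair(t, c)
            wins += r == 1
            losses += r == -1
            ties += r == 0
    total = wins + losses + ties
    return {
        "WR": wins / losses if losses else float("inf"),
        "WO": (wins + 0.5 * ties) / (losses + 0.5 * ties),
        "NB": (wins - losses) / total,
        "DOOR": (wins + 0.5 * ties) / total,
    }

# Two endpoints per participant; the second control has a missing endpoint.
result = win_measures([(2, 1), (1, 0)], [(1, 1), (1, None)])
```

In this toy example the pair involving the missing value is counted as a tie even though the complete data might have produced a win or a loss, which is the mechanism behind the bias described above.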

Causal Representation Learning for Generalisable Recommendation

2026-05-26T13:58:36Z

Predictive models trained on observational data often fail to generalise to the distributions they encounter when deployed, especially when the training data is a product of the system being optimised. Recommender systems are a canonical example: they are trained on interaction logs confounded by the deployed policy, past user behaviour, and platform filtering. As a result, the training distribution differs substantially from the candidate distribution scored at serving time, a gap that makes offline metrics unreliable predictors of online performance. We address the distribution shift problem with a method motivated by causal representation learning (CRL). We propose an information-theoretic disentanglement criterion and prove that its optimum depends only on the causal components of the input. We then derive a tractable variational lower bound that makes the criterion optimisable from finite observational data alone. The scope of our method is narrower than that of much of the CRL literature, in that we target better generalisation under distribution shift, not full identification of all latent causal factors. This narrower target is what makes the method practical, requiring only the existing confounded logs, applying to any standard supervised model, and adding no inference-time cost. Our headline evaluation is an A/B test with millions of users on Spotify, applied to a production ranker for personalised playlist generation. A capacity-matched CRL variant performed on par offline but delivered substantial online gains in listener engagement. Complementary evidence on the public KuaiRand recommendation dataset and a synthetic benchmark with known causal structure shows the same pattern: offline parity with baseline, gains under distribution shift. Across all three settings, adding our causal disentanglement objective yields meaningfully better out-of-distribution generalisation.

Conformalized Large-Scale Selective Inference with Informative and Trustworthy Prediction Sets

2026-05-26T13:29:40Z

In large-scale prediction problems, exhaustively following up on all test units is often impractical and inefficient, motivating a selective reporting strategy that fulfills the dual requirements of informativeness and trustworthiness. Within the InfoFCR (Informative prediction with False Coverage Rate control) framework, we propose SCIP (Selective Conformal Inference for Informative Predictions), a procedure built on three key components: (i) an informative set constructor that tailors prediction sets to individual test units according to user-specified informativeness constraints; (ii) a trust score that provides a principled quantification of the trustworthiness of candidate informative sets; and (iii) generalized conformal p-values that are used to perform FCR analysis for selecting the most promising candidates. We establish that SCIP guarantees finite-sample FCR control and is asymptotically anti-conservative, achieving higher statistical power than existing methods. The framework is highly versatile, accommodating a wide range of error metrics across both regression and classification tasks. Extensive numerical experiments on simulated and real data demonstrate the effectiveness of our approach.

Towards Continuous-time Causal Foundation Models

2026-05-26T12:06:04Z

Extending discrete-time causal Prior-data Fitted Networks for time series to continuous time invites writing the mechanism as a stochastic differential equation (SDE) -- but if the SDE is integrated \emph{once per observation gap}, the trajectory law depends on when it is observed, and the prior remains a discrete-time Markov model in SDE clothing. We propose a precise continuity criterion -- trajectory-law invariance to the observation schedule -- together with a three-tier taxonomy (discrete; naive observation-grid integration; fine-grid integration with decoupled observation) and a construction realising the top tier on a random DAG with OU or small-MLP nonlinear drifts, irregular observation schedules, and hard / soft / time-varying interventions. A $2 \times 2$ encoder $\times$ integrator ablation, run independently on a linear and a nonlinear prior, finds fine-grid integration beats naive on 8/8 cells (sign-consistency $p < 1/256$) with the gap growing as the eval grid refines; the encoder axis is null with fine integration but time-aware-leading with naive. We release the prior and a preliminary zero-shot protocol on pharmacokinetic and physical-system data.
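
The schedule dependence described above can be seen on a scalar OU process $dX = -\theta X\,dt + \sigma\,dW$, whose Euler-Maruyama moments are available in closed form. The sketch below is an illustration under that assumption, not the released prior; the function names and parameter values are hypothetical.

```python
import math

def euler_ou_moments(x0, theta, sigma, gap, n_steps):
    """Mean and variance of X after `gap` under Euler-Maruyama with n_steps substeps."""
    h = gap / n_steps
    a = 1.0 - theta * h              # one-step Euler contraction factor
    mean, var = x0, 0.0
    for _ in range(n_steps):
        mean = a * mean
        var = a * a * var + sigma ** 2 * h
    return mean, var

def exact_ou_moments(x0, theta, sigma, gap):
    """Exact transition mean and variance of the OU process over `gap`."""
    mean = x0 * math.exp(-theta * gap)
    var = sigma ** 2 * (1.0 - math.exp(-2.0 * theta * gap)) / (2.0 * theta)
    return mean, var

x0, theta, sigma, gap = 1.0, 1.0, 1.0, 1.0
naive = euler_ou_moments(x0, theta, sigma, gap, n_steps=1)     # one step per gap
fine = euler_ou_moments(x0, theta, sigma, gap, n_steps=1000)   # fine grid
exact = exact_ou_moments(x0, theta, sigma, gap)
```

With one Euler step per gap the transition law depends on the gap length through the step size itself, whereas the fine-grid moments approach the exact OU transition regardless of how the observation times are placed.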

Robust ensemble Kalman filtering under observation noise misspecification via diffusion score matching

2026-05-26T11:42:55Z

We address the problem of observation noise misspecification in Bayesian filtering of dynamical systems via recent advances in generalised Bayesian inference. Mismatch in tail decay between the true data-generating process and an assumed observation model, often manifesting as frequent outliers, can strongly impact Bayesian updates and analysis in Kalman filtering. Existing approaches often employ detect-and-delete schemes or covariance inflation to avoid assimilation of influential instances of misspecification. In challenging settings where the analysis updates are barely sufficient to counteract the induced forecast uncertainty, these strategies may destabilize the filter or struggle to provide reliable uncertainty quantification. We consider a novel Kalman filter that adjusts information processing in the analysis step by employing diffusion score matching for inference to obtain robustness while maintaining well-quantified uncertainties. We provide theoretical properties of the diffusion score matching Kalman filter in linear Gaussian state space systems covering conjugacy and closed-form parameter updates in the analysis step, robustness, covariance stability, and tuning, as well as high-dimensional consistency. We derive ensemble approximations via stochastic and deterministic coupling and implement localization to obtain EnKF, ESRF and LETKF varieties. We evaluate the methods in simulation studies on target tracking, the chaotic Lorenz 63 system and the Lorenz 96 system in 40 dimensions. Our insights highlight a critical trade-off between robustness and stability in Bayesian filtering. Methods employing generalised Bayesian inference can navigate this balance and improve data assimilation in challenging environments combining non-linear dynamics and potentially non-Gaussian observation noise.
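
To make the sensitivity to outliers concrete, the sketch below implements the standard (non-robust) Kalman analysis step for a linear Gaussian observation model $y = Hx + \varepsilon$, $\varepsilon \sim N(0, R)$. It is the textbook update, not the diffusion score matching filter proposed here, and the function name `kalman_analysis` is hypothetical.

```python
import numpy as np

def kalman_analysis(m, P, y, H, R):
    """Standard Kalman analysis step: forecast N(m, P) updated with observation y."""
    S = H @ P @ H.T + R                      # innovation covariance
    K = np.linalg.solve(S, H @ P).T          # gain P H^T S^{-1} (P, S symmetric)
    m_post = m + K @ (y - H @ m)
    P_post = (np.eye(len(m)) - K @ H) @ P
    return m_post, P_post

m, P = np.array([0.0]), np.array([[1.0]])
H, R = np.array([[1.0]]), np.array([[1.0]])

# A typical observation and a gross outlier under the same assumed noise model.
m_typical, P_typical = kalman_analysis(m, P, np.array([2.0]), H, R)
m_outlier, P_outlier = kalman_analysis(m, P, np.array([100.0]), H, R)
```

The posterior mean is linear in the innovation, so a single outlier moves the analysis arbitrarily far while the posterior covariance is unchanged, which is the behaviour that robust updates aim to temper.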

A warning system for risk prediction of metabolic syndrome in a healthy population of blood donors

2026-05-26T10:56:30Z

Metabolic syndrome is a complex clinical condition characterized by the simultaneous presence of multiple metabolic risk factors and represents a major public health concern. The syndrome develops silently and may remain undiagnosed for long periods, highlighting the importance of investigating early metabolic alterations before overt disease onset. Longitudinal monitoring of predominantly healthy individuals may help identify metabolic risk early. The paper proposes a Bayesian statistical model to estimate the probability of metabolic syndrome among blood donors during pre-donation screening, incorporating information collected at previous visits. Using longitudinal data from one of the main blood donor associations in Italy, AVIS Milan, we analyze repeated clinical and lifestyle measurements from a predominantly healthy population of donors. In particular, we fit a Bayesian multivariate model that jointly represents the logarithm of the five diagnostic components of metabolic syndrome. The model accounts for within-donor dependence across repeated visits and provides probabilistic estimates of individual risk. Our framework aims to provide clinicians at AVIS Milan with an interpretable traffic-light warning system (low, intermediate, high risk) during pre-donation screening to facilitate the identification of individuals at risk of metabolic syndrome at future visits and to support targeted preventive interventions during routine donor assessment, ultimately contributing to a long-term reduction in healthcare costs for the Italian national healthcare system.