A Mixed Model Approach for Estimating Regional Functional Connectivity from Voxel-level BOLD Signals

2026-06-05T16:28:55Z

Resting-state brain functional connectivity quantifies the synchrony between activity patterns of different brain regions. In functional magnetic resonance imaging, each region comprises a set of spatially contiguous voxels at which blood-oxygen-level-dependent signals are acquired. The ubiquitous Correlation of Averages (CA) estimator, and other similar metrics, are computed from spatially aggregated signals within each region, and remain the quantifications of inter-regional connectivity most used by neuroscientists. Their popularity is primarily due to computational simplicity despite their demonstrable bias and lack of statistically principled justification. By leveraging linear mixed-effects models, both inter-regional and intra-regional correlation and measurement error can be explicitly modeled as signal variability sources. A novel computational pipeline, focused on subject-level inter-regional correlation parameters of interest, is developed to address the challenges of applying maximum likelihood estimation to such structured, high-dimensional spatiotemporal data. Simulation results confirm the superiority of the proposed estimator relative to CA in terms of both decreased bias and accurate confidence interval coverage across simulation settings. The proposed method is also applied to construct individual human brain networks for subjects from a Human Connectome Project test-retest database. Concordances between inter-regional correlation estimates demonstrate the potentially substantial scientific benefits of the proposed approach that reliably produces more consistent results than CA for test-retest scans of the same subject.
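
As a point of reference, the CA estimator described above can be sketched in a few lines: average the voxel time series within each region, then take the Pearson correlation of the two regional averages. This is only a minimal illustration of the baseline, not the proposed mixed-model pipeline; the function names and toy data are assumptions made for the example.

```python
from math import sqrt

def region_average(voxels):
    """Average a region's voxel time series (list of equal-length lists) over voxels."""
    n_vox = len(voxels)
    return [sum(col) / n_vox for col in zip(*voxels)]

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_of_averages(region_a, region_b):
    """CA estimator: correlation between spatially averaged regional signals."""
    return pearson(region_average(region_a), region_average(region_b))

# Toy data (hypothetical): two regions, two voxels each, four time points.
region_a = [[1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 4.0, 4.0]]
region_b = [[2.0, 4.0, 6.0, 8.0], [1.0, 3.0, 5.0, 9.0]]
ca = correlation_of_averages(region_a, region_b)
```

Because the averaging step discards voxel-level variability, intra-regional correlation and measurement error never enter this calculation, which is the gap the mixed-model formulation is meant to address.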

Deriving the Variance-Minimizing Design for Standard Addition via c-Optimality

2026-06-05T15:48:56Z

Knowledge about optimal designs for standard addition seems to be scattered across the literature, and part of it is available only in mathematical literature that is not readily accessible to readers unfamiliar with design optimality theory. This work therefore summarizes what is already available in the analytical literature and applies the relevant results from optimality theory, where needed, to the special case of standard addition. It is shown that, for measurement errors that are non-decreasing (e.g., constant, or increasing linearly or quadratically with analyte concentration), the optimal design in the case of a linear response is a two-point design irrespective of the particular behavior of the measurement error variance. In addition, it is demonstrated that the optimal allocation of measurements depends on the concrete setting, so the optimal distribution of measurements may deviate significantly from a 50:50 ratio. It is also investigated how the range, i.e., the largest added concentration, influences the result. Finally, the question of applying weighted regression is discussed, and it is shown that, in contrast to designs using more than two spiked concentrations, no weighting is necessary to achieve optimal results when a two-point design is used. While the focus lies on the precision of the concentration estimate, the implications for the bias are also investigated.
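
The dependence of the allocation on the setting can be illustrated with a small sketch. It rests on several assumptions that are not taken from the paper: a linear response, a two-point design at added concentrations 0 and x_max, known standard deviations sd0 and sd1 at those two points, and a first-order (delta-method) approximation in which the variance of the concentration estimate is proportional to the variance of the fitted line extrapolated to x = -c0. All names below are illustrative.

```python
def extrapolation_variance(w0, c0, x_max, sd0, sd1):
    """Variance factor of the line extrapolated to x = -c0 (per unit total sample size).

    w0 is the share of measurements at added concentration 0; the rest go to x_max.
    Assumes a two-point design and a first-order approximation (illustrative only).
    """
    l0 = (x_max + c0) / x_max   # extrapolation weight on the unspiked mean
    l1 = -c0 / x_max            # extrapolation weight on the spiked mean
    return l0 ** 2 * sd0 ** 2 / w0 + l1 ** 2 * sd1 ** 2 / (1.0 - w0)

def optimal_weight(c0, x_max, sd0, sd1):
    """Share at x = 0 that minimizes extrapolation_variance (closed form)."""
    a = (x_max + c0) * sd0
    b = c0 * sd1
    return a / (a + b)

# Constant error: 75% of measurements go to the unspiked sample, not 50%.
w_const = optimal_weight(c0=1.0, x_max=2.0, sd0=1.0, sd1=1.0)   # 0.75
# Error twice as large at the spiked level shifts the allocation.
w_incr = optimal_weight(c0=1.0, x_max=2.0, sd0=1.0, sd1=2.0)    # 0.6
```

Under these assumptions the optimal share depends on the unknown concentration c0 itself, so in practice a prior guess of c0 would be needed to plan the allocation.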

Robustly estimating heterogeneity in factorial data using Rashomon Partitions

2026-06-05T15:48:04Z

In both observational data and randomized control trials, researchers select statistical models to articulate how the outcome of interest varies with combinations of observable covariates. Choosing a model that is too simple can obfuscate important heterogeneity in outcomes between covariate groups, while too much complexity risks identifying spurious patterns. In this paper, we propose a novel Bayesian framework for model uncertainty called Rashomon Partition Sets (RPSs). The RPS consists of all models that have posterior density close to the maximum a posteriori (MAP) model. We construct the RPS by enumeration, rather than sampling, which ensures that we explore all models with high evidence in the data, even if they offer dramatically different substantive explanations. We use an $\ell_0$ prior, which allows us to capture complex heterogeneity without imposing strong assumptions about the associations between effects, and show that this prior is minimax optimal from an information-theoretic perspective. We characterize the approximation error of (functions of) parameters computed conditional on being in the RPS relative to the entire posterior. We propose an algorithm to enumerate the RPS from the class of models that are interpretable and unique, then provide bounds on the size of the RPS. We give simulation evidence along with three empirical examples: price effects on charitable giving, heterogeneity in chromosomal structure, and the introduction of microfinance.
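
The idea of keeping every model whose posterior is close to the MAP model can be sketched as a simple filter. The closeness rule used here (posterior at least a fixed fraction of the MAP posterior) and the brute-force scan over a pre-scored dictionary are assumptions for illustration; the paper's own definition of closeness and its enumeration algorithm are not reproduced.

```python
from math import log

def rashomon_set(log_posterior, ratio=0.5):
    """Return all models whose posterior is at least `ratio` times the MAP posterior.

    log_posterior maps a model label to its (unnormalized) log posterior.
    The ratio threshold is a hypothetical choice, not the paper's definition.
    """
    best = max(log_posterior.values())
    cutoff = best + log(ratio)
    return {model for model, lp in log_posterior.items() if lp >= cutoff}

# Toy scores (hypothetical) for three candidate partitions of the covariate space.
scores = {"pooled": -10.0, "split_by_dose": -10.3, "split_by_age": -14.0}
rps = rashomon_set(scores, ratio=0.5)
```

In this toy case two substantively different partitions survive the filter, which is the situation where reporting only the MAP model would hide a competing explanation.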

Learning Collapsed Patterns in Compositional Data: A Bayesian Heterogeneous Relative-Shift Approach

2026-06-05T15:17:57Z

Relative-shift regression provides a principled framework for modeling compositional covariates by quantifying how the response changes when mass is reallocated from one component to another. Yet many emerging compositional data problems extend beyond this classical setting, involving high-dimensional predictors and regression effects that vary across latent subpopulations. This complexity poses a dual challenge unmet by existing methods: recovering latent cluster structure while simultaneously achieving dimension reduction within each cluster. We propose a Bayesian heterogeneous relative-shift regression model that jointly learns latent clusters and parsimonious effect structures. Methodologically, we combine a projection-based shrinkage prior on identifiable contrasts, which induces exact coefficient ties within mixture components, with a mixture of finite mixtures prior that infers the number of clusters. Computationally, we develop a scalable hybrid MCMC algorithm that embeds a deterministic surrogate collapse operator within NUTS. Theoretically, we establish posterior consistency for both the latent partition and cluster-specific effect structures. Simulations confirm accurate recovery and strong predictive performance, and applications to cross-country macroeconomic data and spatial transcriptomics demonstrate the method's interpretability and practical utility.

Testing identification in mediation and dynamic treatment models

2026-06-05T14:07:20Z

We propose a test for the identification of causal effects in mediation and dynamic treatment models that is based on two sets of observed variables, namely covariates to be controlled for and suspected instruments, building on the test by Huber and Kueck (2022) for single treatment models. We consider models with a sequential assignment of a treatment and a mediator to assess the direct treatment effect (net of the mediator), the indirect treatment effect (via the mediator), or the joint effect of both treatment and mediator. We establish testable conditions for identifying such effects in observational data. These conditions jointly imply (1) the exogeneity of the treatment and the mediator conditional on covariates and (2) the validity of distinct instruments for the treatment and the mediator, meaning that the instruments do not directly affect the outcome (other than through the treatment or mediator) and are unconfounded given the covariates. Our framework extends to post-treatment sample selection or attrition problems when replacing the mediator by a selection indicator for observing the outcome, enabling joint testing of the selectivity of treatment and attrition. We propose a machine learning-based test to control for covariates in a data-driven manner and analyze its finite sample performance in a simulation study. Additionally, we apply our method to Slovak labor market data and find that our testable implications are not rejected for a sequence of training programs typically considered in dynamic treatment evaluations.

Principal Component Analysis for Multivariate Extremes

2026-06-05T12:25:56Z

This chapter explores ways to reduce the dimensionality of the data while preserving key information relevant to the analysis of multivariate extreme values.

One-step Outcome Imputation: An Alternative to Multiple Imputation

2026-06-05T11:41:15Z

Missing outcomes in randomized controlled trials are often handled by multiple imputation (MI). Rubin's rules are routinely used to estimate standard errors but can fail to provide valid standard error estimates for some commonly used procedures, such as reference-based imputation. We propose a one-step alternative by explicitly targeting the treatment effect implied by a given imputation model and constructing an efficient one-step estimator for that treatment effect via its influence function. Unlike Rubin's rules, this approach yields asymptotically valid inference. Moreover, the proposed method circumvents the stochastic component and computational burden of MI. We illustrate the approach with examples spanning a range of imputation models, including reference-based imputation and intercurrent-event-dependent imputation.

When can a posterior predictive check identify the learning rate? Exact degeneracy in Gaussian models and implications for Generalised Bayesian Inference

2026-06-05T11:38:48Z

Generalised Bayesian inference tempers the likelihood by a learning rate $η$ to mitigate model misspecification, and the choice of $η$ is consequential. Zafar and Nicholls (2024) proposed selecting $η$ by a posterior predictive check (PPC): one chooses the smallest $η$ at which a log-likelihood PPC $p$-value is not rejected. An exact, finite-sample analysis of this selector on the Gaussian linear model is given. With known variance and a flat prior, the PPC $p$-value equals $P(χ^2_n > \mathrm{RSS}/σ_0^2)$ for every $η$, so the selector is $η$-invariant; under variance misspecification it is two-sided non-identifying. With unknown variance and the reference prior, the $p$-value depends only on $(n,d,η)$ and not on the realised data or the data-generating process. Consequently the selector's output is fixed before any data are seen, typically collapsing to the smallest grid value, which over-tempers and inflates predictive intervals relative to held-out selection. The phenomenon is a pivotality property specific to the Gaussian scale--location family and the reference prior; it disappears under informative priors. These results delineate the selector's scope, identify a canonical class on which it cannot identify the learning rate, and motivate a cheap, data-free pre-screening diagnostic.

Detecting Model Misspecification in Bayesian Inverse Problems via Variational Gradient Descent

2026-06-05T09:58:48Z

Bayesian inference is optimal when the statistical model is well-specified, while outside this setting Bayesian inference can catastrophically fail; accordingly a wealth of post-Bayesian methodologies have been proposed. Predictively oriented (PrO) approaches lift the statistical model $P_θ$ to an (infinite) mixture model $\int P_θ\; \mathrm{d}Q(θ)$ and fit this predictive distribution via minimising an entropy-regularised objective functional. In the well-specified setting one expects the mixing distribution $Q$ to concentrate around the true data-generating parameter in the large data limit, while such singular concentration will typically not be observed if the model is misspecified. Our contribution is to demonstrate that one can empirically detect model misspecification by comparing the standard Bayesian posterior to the PrO `posterior' $Q$, providing a novel and widely-applicable diagnostic tool for the standard Bayesian workflow. To operationalise this, we present an efficient numerical algorithm based on variational gradient descent. A simulation study, and a more detailed case study involving a Bayesian inverse problem in seismology, confirm that model misspecification can be automatically detected using this framework.

A Regularised Latent-Class Item Response Model for Detecting Measurement Non-Invariance in Ordinal Response Scales

2026-06-05T09:51:42Z

Measurement non-invariance arises when the psychometric properties of a scale differ across subgroups, undermining the validity of group comparisons. At the item level, this manifests as differential item functioning (DIF), where item responses differ across groups after controlling for the latent trait. This paper develops a framework for detecting DIF in ordinal scales without requiring known group labels or anchor items. We formulate a proportional-odds latent-class item response model in which individuals are assigned probabilistically to latent classes. DIF is captured through class-specific intercept and slope shifts, allowing both uniform and non-uniform DIF. Identification is achieved through an $\ell_1$-penalised marginal likelihood under a sparsity assumption, with estimation implemented using a tailored EM algorithm. Because class-specific slopes leave both the location and scale of each latent class unidentified, sparsity anchors the latent metric while selecting DIF effects. Simulation studies demonstrate accurate recovery of item parameters and both types of DIF. An empirical application to a personality test reveals latent subgroups with distinct response patterns and identifies items displaying potential class-specific measurement non-invariance. The framework provides a flexible approach for assessing measurement invariance in ordinal scales when comparison groups are unobserved or poorly defined.

ARMA approximation of a Non-separable Spatio-Temporal Model with Fractional Smoothnesses in Space and Time

2026-06-05T09:21:07Z

The Matérn covariance model is ubiquitous in spatial modelling, but there is no default choice for spatio-temporal modelling. In this paper, we consider the recently proposed ``diffusion-based'' extension of the spatial Matérn covariance model to a spatio-temporal non-separable covariance model that allows fractional smoothnesses in space and in time. The model is described in terms of a space-time fractional stochastic partial differential equation, but currently proposed computational approaches have strong restrictions on the possible smoothnesses in time. We propose a discretization method based on rational approximations in time to handle arbitrary smoothnesses, which leads to a vector autoregressive moving average process (VARMA). We prove that the covariance function of the approximation converges pointwise, determine explicit convergence rates as a function of spatial and temporal resolutions and the accuracy of the rational approximation, and conduct numerical verification to demonstrate small pointwise error for low orders of the VARMA process. Through a simulation study, we demonstrate that the parameters can be recovered and that correctly specifying the temporal smoothness is especially important for forecasting. The approach is illustrated for three months of daily mean temperatures in mainland France.

Ising Models on Inhomogeneous Random Graphs: Inference, Local Asymptotic Minimaxity, and Limit of Experiments

2026-06-05T09:05:29Z

In this paper, we develop an inferential framework with sharp asymptotic optimality guarantees for Ising models on inhomogeneous random graphs in the subcritical parameter regime. We begin by characterizing the asymptotic distribution of the maximum likelihood (ML) estimate of the natural parameter, based on a single sample from the underlying model, covering both sparse and dense network regimes. Next, to overcome the computational intractability of the ML method, we propose a simple closed-form estimate obtained from a one-step approximation to the likelihood equation. We show that this estimate attains the same asymptotic distribution and variance as the ML estimate, thereby yielding a computationally efficient and asymptotically valid confidence interval for the natural parameter. We complement these inferential results by establishing a Hájek--Le Cam-type local asymptotic minimax theorem, showing that the proposed estimate achieves the smallest possible asymptotic maximum risk, both in rate and in leading constant, over shrinking neighborhoods of the true parameter. We also derive the corresponding limit of experiments. To the best of our knowledge, these are among the first sharp asymptotic optimality results for network-dependent data. Finally, we study goodness-of-fit testing for the natural parameter, deriving the local power of the likelihood ratio test and minimax detection rates. Our analysis relies on new fluctuation results for the sufficient statistic (Hamiltonian) and for the random partition function of Ising models on inhomogeneous random graphs, which are of independent interest.

Influence of continuous predictor modelling methods on prediction stability in clinical prediction model development: an empirical comparison using real clinical data

2026-06-05T08:53:48Z

Background and objective: Prediction stability is increasingly recognised as important for reliable clinical prediction model development, but the effect of continuous predictor modelling choices is unclear. This study examined how approaches to modelling continuous predictors influence prediction stability. Methods: We used a real clinical dataset of 19,418 emergency department patients to create five sample size scenarios ranging from 437 to 8,739 patients. Six methods were compared: dichotomisation at the median (DIC), tertile categorisation (TER), linear terms (LIN), quadratic terms (QUA), multivariable fractional polynomials (MFP), and extreme gradient boosting (XGB). Prediction stability was evaluated using a bootstrap-based framework. Optimism-corrected AUC and calibration were estimated through internal validation. A method was considered stable when at least 90% of individual predictions had a mean absolute prediction error (MAPE) <=5%. Results: Stability increased with sample size and varied by method. At n = 437, no method met the stability criterion; LIN was the most stable, followed by DIC. At n = 874, DIC and LIN achieved stable predictions with similar calibration, although DIC had lower AUC. At n = 1,748, QUA achieved stability, whereas MFP and XGB did not. At n = 3,496 and n = 8,739, all methods achieved stability. LIN, QUA, MFP, and XGB generally had higher AUCs than DIC and TER, while XGB showed the highest AUC but persistent miscalibration. Conclusion: Continuous predictor modelling methods appeared to influence prediction stability. LIN achieved stable predictions from the base sample size onwards, whereas QUA, MFP, and XGB required larger samples. Although XGB showed high discrimination, calibration concerns persisted. These findings suggest that, in smaller datasets, simpler approaches, particularly LIN, may provide more stable predictions.
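
The stability criterion stated in the Methods can be sketched as follows. The sketch assumes that an individual's MAPE is the mean absolute difference between that individual's prediction from the original model and the predictions from models refitted on bootstrap samples, and that 5% means 0.05 on the probability scale; both are assumptions for illustration, and all names and numbers are hypothetical.

```python
def individual_mape(original, bootstrap):
    """Mean absolute prediction error per individual across bootstrap models.

    original:  list of predicted risks from the model fitted on the full sample.
    bootstrap: list of lists; bootstrap[b][i] is individual i's predicted risk
               from the model refitted on bootstrap sample b.
    """
    n_boot = len(bootstrap)
    return [sum(abs(row[i] - p) for row in bootstrap) / n_boot
            for i, p in enumerate(original)]

def is_stable(original, bootstrap, mape_limit=0.05, min_share=0.90):
    """Apply the rule: stable if at least min_share of individuals have MAPE <= mape_limit."""
    mape = individual_mape(original, bootstrap)
    share = sum(m <= mape_limit for m in mape) / len(mape)
    return share >= min_share, share

# Toy example (hypothetical): three individuals, two bootstrap models.
original = [0.10, 0.50, 0.80]
bootstrap = [[0.12, 0.45, 0.80],
             [0.08, 0.62, 0.78]]
stable, share = is_stable(original, bootstrap)
```

Here the second individual's predictions vary too much across bootstrap models, so only two of three individuals meet the limit and the toy method would not count as stable.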

Causal inference of Plackett-Burman designs in applications

2026-06-05T06:40:51Z

Driven by four applications of Plackett-Burman (PB) designs, this paper proposes a causal inference framework based on potential outcomes. First, we define the causal effects of the PB designs under finite populations. The Neymanian estimator of causal effects is then obtained, including the estimated variance and covariance. Furthermore, we conduct a sharp null-hypothesis test and construct the Fisherian interval using an algorithm. Finally, the proposed methods are illustrated through these applications.
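
To make the setting concrete, the sketch below builds an 8-run PB design from a commonly tabulated generator row and computes a simple difference-in-means estimate for each factor. The estimator shown (mean outcome at the high level minus mean outcome at the low level) is an assumption chosen for illustration; the paper's exact definitions of the causal effects, variance estimators, and Fisherian intervals are not reproduced, and the outcomes are made up.

```python
# Commonly tabulated first row for the 8-run Plackett-Burman design (7 two-level factors).
GENERATOR = [1, 1, 1, -1, 1, -1, -1]

def plackett_burman_8():
    """Build the 8-run design: 7 cyclic shifts of the generator plus a row of -1s."""
    k = len(GENERATOR)
    rows = [GENERATOR[s:] + GENERATOR[:s] for s in range(k)]
    rows.append([-1] * k)
    return rows

def main_effects(design, outcomes):
    """Difference in mean outcome between high (+1) and low (-1) runs, per factor."""
    effects = []
    for j in range(len(design[0])):
        hi = [y for row, y in zip(design, outcomes) if row[j] == 1]
        lo = [y for row, y in zip(design, outcomes) if row[j] == -1]
        effects.append(sum(hi) / len(hi) - sum(lo) / len(lo))
    return effects

design = plackett_burman_8()
# Hypothetical outcomes in which only the first factor matters: y = 10 + 2 * x_1.
outcomes = [10 + 2 * row[0] for row in design]
effects = main_effects(design, outcomes)
```

Because the columns of the design are balanced and mutually orthogonal, the estimate for the first factor is 4 and the estimates for the other six factors are 0 in this noise-free toy example.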

Testing Equality of Conditional Distributions via Generative Models

2026-06-05T05:46:48Z

We study the problem of testing whether two conditional distributions are equal using generative models. The proposed method learns a conditional generator from each sample and uses it to create responses at covariate values observed in the other sample, allowing generated and observed responses to be compared directly. By aligning covariates through cross-generation, the approach avoids conditional density-ratio estimation and local smoothing over high-dimensional covariates. The population version of this construction yields a conditional discrepancy that characterizes equality of the two conditional distributions under suitable overlap conditions, while the sample version leads to a test statistic defined as the supremum of an RKHS-indexed empirical process with multiplier bootstrap calibration. A computationally efficient algorithm for evaluating the statistic and its bootstrap analogue is developed based on alternating maximization and the kernel trick. Theoretically, we derive the limiting distribution of the test statistic under both the null and alternative hypotheses, prove bootstrap validity and consistency of the resulting test, and show that the proposed procedure attains a double-robustness property with respect to conditional generator estimation errors. Simulations and real data applications suggest that the proposed method performs well for multivariate responses and high-dimensional covariates.