https://arxiv.org/api/sEVnJsHmwNFF7ovXP+QGa4Y+bCE 2026-04-06T11:42:26Z 34888 300 15 http://arxiv.org/abs/2511.09500v4 Distributional Shrinkage I: Universal Denoiser Beyond Tweedie's Formula 2026-03-24T22:28:38Z We study the problem of denoising when only the noise level is known, not the noise distribution. Independent noise $Z$ corrupts a signal $X$, yielding the observation $Y = X + σZ$ with known $σ\in (0,1)$. We propose \emph{universal} denoisers, agnostic to both signal and noise distributions, that recover the signal distribution $P_X$ from $P_Y$. When the focus is on distributional recovery of $P_X$ rather than on individual realizations of $X$, our denoisers achieve order-of-magnitude improvements over the Bayes-optimal denoiser derived from Tweedie's formula, which achieves $O(σ^2)$ accuracy. They shrink $P_Y$ toward $P_X$ with $O(σ^4)$ and $O(σ^6)$ accuracy in matching generalized moments and densities. Drawing on optimal transport theory, our denoisers approximate the Monge--Ampère equation with higher-order accuracy and can be implemented efficiently via score matching. Let $q$ denote the density of $P_Y$. For distributional denoising, we propose replacing the Bayes-optimal denoiser, $$\mathbf{T}^*(y) = y + σ^2 \nabla \log q(y),$$ with denoisers exhibiting less-aggressive distributional shrinkage, $$\mathbf{T}_1(y) = y + \frac{σ^2}{2} \nabla \log q(y),$$ $$\mathbf{T}_2(y) = y + \frac{σ^2}{2} \nabla \log q(y) - \frac{σ^4}{8} \nabla \!\left( \frac{1}{2} \| \nabla \log q(y) \|^2 + \nabla \cdot \nabla \log q(y) \right)\!.$$ 2025-11-12T17:20:42Z 27 pages, 5 figures Tengyuan Liang http://arxiv.org/abs/2603.23726v1 Inverse Probability Weighting of Count Exposures in the Presence of Missing Data: A Simulation Study 2026-03-24T21:22:52Z Inverse probability of treatment weighting (IPTW) is widely used to estimate causal effects, but guidance is limited for count exposures. 
It is also unclear how IPTW performs when combined with multiple imputation in this context. In this study, we evaluated five IPTW methods applied to count exposures: multinomial binning, parametric and non-parametric covariate balancing propensity scores (CBPS, npCBPS), generalised boosted models (GBM), and energy balancing. Our simulations were informed by an example using data from the 1970 British Cohort Study, aiming to estimate the effect of psychological distress, measured as a count of symptoms at age 34, on self-reported longstanding illness at age 42. We compared these approaches on bias, coverage, effective sample size, and other metrics under truncated negative binomial and Poisson exposure distributions. We also assessed the performance of Rubin's rules under different missingness mechanisms. Under complete data, multinomial, CBPS, GBM, and energy weights produced low bias and near-nominal coverage, whereas npCBPS resulted in bias and poor coverage due to extreme weights. When data were missing completely at random, similar performance patterns were observed for IPTW with multiple imputation. Under missing at random, bias increased with higher missingness, but this was present for both IPTW and covariate-adjusted regression, possibly reflecting a limitation of the imputation model rather than a failure of IPTW. Overall, these findings support the use of multinomial, CBPS, GBMs, and energy weights for count exposures in similar settings while highlighting trade-offs between these methods and the need for imputation models accommodating right-truncated overdispersed counts. 2026-03-24T21:22:52Z Martin N. Danka Jessica K. Bone George B. Ploubidis Richard J. 
Silverwood http://arxiv.org/abs/2409.16003v5 Easy Conditioning far beyond Gaussian 2026-03-24T20:45:59Z Multivariate Gaussian distributions enjoy Gaussian conditional distributions, which makes conditioning easy: conditioning boils down to implementing analytical formulae for conditional means and covariances. For more general distributions, however, conditional distributions may not be available in analytical form and require demanding and approximate numerical approaches. Primarily motivated by probabilistic imputation problems, we review and discuss families of multivariate distributions that do enjoy analytical conditioning, also providing a few counter-examples. Proving that trans-dimensional stability under conditioning extends to mixtures and transformations, we demonstrate that a broader class of multivariate distributions inherits easy conditioning properties. Building on this insight, we develop a generative method to estimate conditional distributions from data by first fitting a flexible joint distribution using copulas and then performing analytical conditioning in a latent space. In our applications, we specifically opt for Gaussian Mixture Copula Models (GMCM), comparing in turn various fitting strategies. Through simulations and real-world data experiments, we showcase the efficacy of our method in tasks involving conditional density estimation and data imputation. We also touch upon links to Gaussian process modelling and how stability under mixtures and transformations carries over to easy conditioning of non-Gaussian processes. 2024-09-24T12:04:28Z 36 pages, 13 figures Antoine Faul David Ginsbourger Ben Spycher http://arxiv.org/abs/2507.23743v2 Relative Bias Under Imperfect Identification in Observational Causal Inference 2026-03-24T20:23:47Z To conduct causal inference in observational settings, researchers must rely on certain identifying assumptions. In practice, these assumptions are unlikely to hold exactly. 
This paper considers the bias of selection-on-observables, instrumental variables, and proximal inference estimates under violations of their identifying assumptions. We develop bias expressions for IV and proximal inference that show how violations of their respective assumptions are amplified by any unmeasured confounding in the outcome variable. We propose a set of sensitivity tools that quantify the sensitivity of different identification strategies, and an augmented bias contour plot visualizes the relationship between these strategies. We argue that the act of choosing an identification strategy implicitly expresses a belief about the degree of violations that must be present in alternative identification strategies. Even when researchers intend to conduct an IV or proximal analysis, a sensitivity analysis comparing different identification strategies can help to better understand the implications of each set of assumptions. Throughout, we compare the different approaches on a re-analysis of the impact of state surveillance on the incidence of protest in Communist Poland. 2025-07-31T17:29:20Z 20 pages, 3 figures, plus references and appendices Melody Huang Cory McCartan http://arxiv.org/abs/2603.23688v1 Adaptive Gaussian Process Search for Simulation-Based Sample Size Estimation in Clinical Prediction Models: Validation of the pmsims R Package 2026-03-24T19:53:37Z Background: Determining an adequate sample size is essential for developing reliable and generalisable clinical prediction models, yet practical guidance on selecting appropriate methods remains limited. Existing analytical and simulation-based approaches often rely on restrictive assumptions and focus on mean-based criteria. We present and validate pmsims, an R package that uses Gaussian process surrogate modelling to provide a flexible and computationally efficient simulation-based framework for sample size determination across diverse prediction settings. 
Methods: We conducted a comprehensive simulation study with two aims. First, we compared three search engines implemented in pmsims: a Gaussian process-based adaptive method, a deterministic bisection method, and a hybrid approach, across binary, continuous, and survival outcomes. Second, we benchmarked the best-performing pmsims engine against existing analytical (pmsampsize) and simulation-based (samplesizedev) methods, evaluating recommended sample sizes, computational time, and achieved performance on large independent validation datasets. Results: The Gaussian process-based method consistently produced the most stable sample size estimates, particularly in low-signal, high-dimensional settings. In benchmarking, pmsims achieved performance close to prespecified targets across all outcome types, matching simulation-based approaches and outperforming analytical methods in more challenging scenarios. Conclusions: pmsims provides an efficient and flexible framework for principled sample size planning in clinical prediction modelling, requiring fewer model evaluations than non-adaptive simulation approaches. 2026-03-24T19:53:37Z 27 pages, 2 main-text figures, 16 supplementary figures, 9 tables, preprint Oyebayo Ridwan Olaniran Diana Shamsutdinova Sarah Markham Felix Zimmer Daniel Stahl Gordon Forbes Ewan Carr http://arxiv.org/abs/2002.12586v7 Nonparametric Empirical Bayes Estimation on Heterogeneous Data 2026-03-24T18:24:57Z The simultaneous estimation of many parameters based on data collected from corresponding studies is a key research problem that has received renewed attention in the high-dimensional setting. Many practical situations involve heterogeneous data where heterogeneity is captured by a nuisance parameter. Effectively pooling information across samples while correctly accounting for heterogeneity presents a significant challenge in large-scale estimation problems. 
We address this issue by introducing the ``Nonparametric Empirical Bayes Structural Tweedie'' (NEST) estimator, which efficiently estimates the unknown effect sizes and properly adjusts for heterogeneity via a generalized version of Tweedie's formula. For the normal means problem, NEST simultaneously handles the two main selection biases introduced by heterogeneity: the selection bias in the mean and the selection bias in the variance, the former of which cannot be effectively corrected without also correcting the latter. We develop theory to show that NEST is asymptotically as good as the optimal Bayes rule that uniquely minimizes a weighted squared error loss. In our simulation studies NEST outperforms competing methods, with substantial efficiency gains in many settings. The proposed method is demonstrated on estimating the batting averages of baseball players and Sharpe ratios of mutual fund returns. Extensions to other members of the two-parameter exponential family are discussed. 2020-02-28T07:48:39Z Proof of Theorem 1 revised Trambak Banerjee Luella J. Fu Gareth M. James Gourab Mukherjee Wenguang Sun http://arxiv.org/abs/2510.16673v2 Identification and estimation of causal mechanisms in cluster-randomized trials with post-treatment confounding using Bayesian nonparametrics 2026-03-24T17:32:17Z Causal mediation analysis in cluster-randomized trials (CRTs) is essential for explaining how cluster-level interventions affect individual outcomes, yet it is complicated by interference, post-treatment confounding, and hierarchical covariate adjustment. We develop a Bayesian nonparametric framework that simultaneously accommodates interference and a post-treatment confounder that precedes the mediator. Identification is achieved through a multivariate Gaussian copula that replaces cross-world independence with a single dependence parameter, yielding a built-in sensitivity analysis to residual post-treatment confounding. 
For estimation, we introduce a nested common atoms enriched Dirichlet process (CA-EDP) prior that integrates the Common Atoms Model (CAM) to share information across clusters while capturing between- and within-cluster heterogeneity, and an Enriched Dirichlet Process (EDP) structure delivering robust covariate adjustment without impacting the outcome model. We provide formal theoretical support for our prior by deriving the model's key distributional properties, including its partially exchangeable partition structure, and by establishing convergence guarantees for the practical truncation-based posterior inference strategy. We demonstrate the performance of the proposed methods in simulations and provide further illustration through a reanalysis of a completed CRT. 2025-10-19T00:31:43Z 78 pages Yuki Ohnishi Michael J. Daniels Lei Yang Fan Li http://arxiv.org/abs/2603.16146v2 Deep Adaptive Model-Based Design of Experiments 2026-03-24T16:20:48Z Model-based design of experiments (MBDOE) is essential for efficient parameter estimation in nonlinear dynamical systems. However, conventional adaptive MBDOE requires costly posterior inference and design optimization between each experimental step, precluding real-time applications. We address this by combining Deep Adaptive Design (DAD), which amortizes sequential design into a neural network policy trained offline, with differentiable mechanistic models. For dynamical systems with known governing equations but uncertain parameters, we extend sequential contrastive training objectives to handle nuisance parameters and propose a transformer-based policy architecture that respects the temporal structure of dynamical systems. We demonstrate the approach on four systems of increasing complexity: a fed-batch bioreactor with Monod kinetics, a Haldane bioreactor with uncertain substrate inhibition, a two-compartment pharmacokinetic model with nuisance clearance parameters, and a DC motor for real-time deployment. 
2026-03-17T05:53:09Z Arno Strouwen Sebastian Micluţa-Câmpeanu http://arxiv.org/abs/2602.17503v2 An extension to reversible jump Markov chain Monte Carlo for change point problems with heterogeneous temporal dynamics 2026-03-24T16:07:39Z Detecting brief changes in time-series data remains a major challenge in fields where short-lived states carry meaning. In single-molecule localisation microscopy, this problem is particularly acute as fluorescent molecules used to tag protein oligomers display heterogeneous photophysical behaviour that can complicate photobleach step analysis, a key step in resolving nanoscale protein organisation. Existing methods often require extensive filtering or prior calibration, and can fail to accurately account for blinking or reversible dark states that may contaminate downstream analysis. In this paper, an extension to RJMCMC is proposed for change point detection with heterogeneous temporal dynamics. This approach is applied to the problem of estimating per-frame active fluorophore counts from one-dimensional integrated intensity traces derived from Fluorescence Localisation Imaging with Photobleaching (FLImP), where compound change point pair moves are introduced to better account for short-lived events known as blinking and dark states. The approach is validated using simulated and experimental data, demonstrating improved accuracy and robustness when compared with current photobleach step analysis methods and with the existing analysis approach for FLImP data. This Compound RJMCMC (CRJMCMC) algorithm performs reliably across a wide range of fluorophore counts and signal-to-noise conditions, with signal-to-noise ratio (SNR) down to 0.001 and counts as high as nineteen fluorophores, while also effectively estimating low counts observed when studying EGFR oligomerisation. Beyond single molecule imaging, this work has applications for a variety of time series change point detection problems with heterogeneous state persistence. 
Examples include electrocorticography brain-state segmentation, fault detection in industrial process monitoring, and realised volatility analysis in financial time series. 2026-02-19T16:18:10Z Emily Gribbin Benjamin Davis Daniel Rolfe Hannah Mitchell http://arxiv.org/abs/2603.23374v1 Shape-Adaptive Conditional Calibration for Conformal Prediction via Minimax Optimization 2026-03-24T16:05:43Z Achieving valid conditional coverage in conformal prediction is challenging due to the theoretical difficulty of satisfying pointwise constraints in finite samples. Building upon the characterization of conditional coverage through marginal moment restrictions, we introduce Minimax Optimization Predictive Inference (MOPI), a framework that generalizes prior work by optimizing over a flexible class of set-valued mappings during the calibration phase, rather than simply calibrating a fixed sublevel set. This minimax formulation effectively circumvents the structural constraints of predefined score functions, achieving superior shape adaptivity while maintaining a principled connection to the minimization of mean squared coverage error. Theoretically, we provide non-asymptotic oracle inequalities and show that the convergence rate of the coverage error attains the optimal order under regularity conditions. MOPI also enables valid inference conditional on sensitive attributes that are available during calibration but unobserved at test time. Empirical results on complex, non-standard conditional distributions demonstrate that MOPI produces more efficient prediction sets than existing baselines. 
2026-03-24T16:05:43Z Yajie Bao Chuchen Zhang Zhaojun Wang Haojie Ren Changliang Zou http://arxiv.org/abs/2312.10618v2 Sparse Learning and Class Probability Estimation with Weighted Support Vector Machines 2026-03-24T16:03:19Z Classification and probability estimation are fundamental tasks with broad applications across modern machine learning and data science, spanning fields such as biology, medicine, engineering, and computer science. Recent development of weighted Support Vector Machines (wSVMs) has demonstrated considerable promise in robustly and accurately predicting class probabilities and performing classification across a variety of problems (Wang et al., 2008). However, the existing framework relies on an $\ell^2$-norm regularized binary wSVM optimization formulation, which is designed for dense features and exhibits limited performance in the presence of sparse features with redundant noise. Effective sparse learning thus requires prescreening of important variables for each binary wSVM to ensure accurate estimation of pairwise conditional probabilities. In this paper, we propose a novel class of wSVM frameworks that incorporate automatic variable selection with accurate probability estimation for sparse learning problems. We develop efficient algorithms for variable selection by solving either the $\ell^1$-norm or elastic net regularized wSVM optimization problems. Class probability is then estimated either via the $\ell^2$-norm regularized wSVM framework applied to the selected variables, or directly through elastic net regularized wSVMs. The two-step approach offers a strong advantage in simultaneous automatic variable selection and reliable probability estimation with competitive computational efficiency. The elastic net regularized wSVMs achieve superior performance in both variable selection and probability estimation, with the added benefit of variable grouping, at the cost of increased computation time in high-dimensional settings. 
The proposed wSVMs-based sparse learning methods are broadly applicable and can be naturally extended to $K$-class problems through ensemble learning. 2023-12-17T06:12:33Z Liyun Zeng Hao Helen Zhang http://arxiv.org/abs/2511.15427v2 Tractable Estimation of Nonlinear Panels with Interactive Fixed Effects 2026-03-24T15:21:16Z Interactive fixed effects are routinely controlled for in linear panel models. While an analogous fixed effects (FE) estimator for nonlinear models has been available in the literature (Chen, Fernandez-Val and Weidner, 2021), it sees much more limited use in applied research because its implementation involves solving a high-dimensional non-convex problem. In this paper, we complement the theoretical analysis of Chen, Fernandez-Val and Weidner (2021) by providing a new computationally efficient estimator that is asymptotically equivalent to their estimator. Unlike the previously proposed FE estimator, our estimator avoids solving a high-dimensional optimization problem and can be feasibly computed in large nonlinear panels. Our proposed method involves two steps. In the first step, we convexify the optimization problem using nuclear norm regularization (NNR) and obtain preliminary NNR estimators of the parameters, including the fixed effects. Then, we find the global solution of the original optimization problem using a standard gradient descent method initialized at these preliminary estimates. Thus, in practice, one can simply combine our computationally efficient estimator with the inferential theory provided in Chen, Fernandez-Val and Weidner (2021) to construct confidence intervals and perform hypothesis testing; we also provide an R package for empirical implementation. 
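As an illustration of the nuclear norm regularization mentioned in the first step above, the following is a minimal sketch of singular value soft-thresholding, the standard proximal operator of the nuclear norm. It is not the authors' estimator: the function name `svt`, the threshold `tau`, the simulated low-rank matrix, and the plain additive-noise setting are all assumptions chosen for illustration, and the nonlinear panel likelihood and the second gradient descent step are omitted.

```python
import numpy as np

def svt(M, tau):
    """Soft-threshold the singular values of M by tau (prox of tau * nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # shrink each singular value toward zero
    return (U * s_shrunk) @ Vt            # rebuild the matrix from shrunken values

# Hypothetical toy data: a rank-2 "interactive fixed effects" matrix plus noise.
rng = np.random.default_rng(0)
N, T, r = 30, 20, 2
L_true = rng.normal(size=(N, r)) @ rng.normal(size=(r, T))
noisy = L_true + 0.1 * rng.normal(size=(N, T))

L_hat = svt(noisy, tau=2.0)
rank_hat = np.linalg.matrix_rank(L_hat, tol=1e-8)
print("rank after thresholding:", rank_hat)
```

In this toy setting the threshold is set above the noise singular values and below the signal ones, so the thresholded matrix is low rank; in practice the threshold would have to be chosen by theory or tuning.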
2025-11-19T13:26:48Z Andrei Zeleneev Weisheng Zhang http://arxiv.org/abs/2603.23309v1 Tail-Calibrated Estimation of Extreme Quantile Treatment Effects 2026-03-24T15:13:44Z Extreme quantile treatment effects (eQTEs) measure the causal impact of a treatment on the tails of an outcome distribution and are central for studying rare, high-impact events. Standard QTE methods often fail in extreme regimes due to data sparsity, while existing eQTE methods rely on restrictive tail assumptions or on interior-quantile theory. We propose the Tail-Calibrated Inverse Estimating Equation (TIEE) framework, which combines information across quantile levels and anchors the tail using extreme value models within a unified estimating equation approach. We establish asymptotic properties of the resulting estimator and evaluate its performance through simulation under different tail behaviours and model misspecifications. An application to extreme precipitation in the Austrian Alps illustrates how TIEE enables observational causal attribution for very rare events under anthropogenic warming. More broadly, the proposed framework establishes a new foundation for causal inference on rare, high-impact outcomes, with relevance across environmental risk, economics, and public health. 2026-03-24T15:13:44Z Mengran Li Daniela Castro-Camilo http://arxiv.org/abs/2603.23294v1 Granger Causality in Expectiles: an M-vine copula test 2026-03-24T14:56:07Z A model-free measure of Granger causality in expectiles is proposed, generalizing the traditional mean-based measure to arbitrary positions of the conditional distribution. Expectiles are the only law-invariant risk measures that are both coherent and elicitable, making them particularly well-suited for studying distributional Granger causality where risk quantification and forecast evaluation are both relevant. 
Based on this measure, a test is developed using M-vine copula models that accounts for multivariate Granger causality with $d+1$ series under non-linear and non-Gaussian dependence, without imposing parametric assumptions on the joint distribution. Strong consistency of the test statistic is established under some regularity conditions. In finite samples, simulations show accurate size control and power increasing with sample size. A key advantage is the joint testing capability: causal relationships invisible to pairwise tests can be detected, as demonstrated both theoretically and empirically. Two applications to international stock market indices at the global and Asian regional level illustrate the practical relevance of the proposed framework. 2026-03-24T14:56:07Z Roberto Fuentes-Martínez Irene Crimaldi http://arxiv.org/abs/2603.23277v1 A reduced rank model for spatial categorical data with many classes 2026-03-24T14:40:46Z We develop an identifiable reduced-rank spatial multinomial model for categorical data with many classes. The model represents class-specific spatial effects through a low-dimensional set of shared latent factors, substantially reducing parameter dimension while preserving joint dependence across classes. Because standard conjugate and Pólya-Gamma methods fail under this factorization, we propose a Gibbs sampler using Laplace-approximation proposals within Metropolis-Hastings updates. Simulation studies examine dimension selection and the accuracy of the Laplace proposals. An application to dominant tree species mapping in the Blue Ridge Mountains demonstrates scalable inference and flexible joint predictions for individual classes, class unions, and area-level summaries. 2026-03-24T14:40:46Z Paul B May Andrew Simpson Semhar Michael
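The reduced-rank idea in the last entry above can be sketched in a few lines: class-specific effects are written as a product of a small number of shared latent factors and a loading matrix, and class probabilities follow from a softmax. This is a hedged sketch only. The variable names, the dimensions, the independent Gaussian draws standing in for spatially correlated factors, and the absence of any identifiability constraint are all assumptions made for illustration, not the authors' specification.

```python
import numpy as np

def softmax(eta):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    eta = eta - eta.max(axis=1, keepdims=True)
    p = np.exp(eta)
    return p / p.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
n_sites, n_classes, n_factors = 200, 40, 3

# Shared latent factors (independent draws here; a spatial prior is assumed away).
factors = rng.normal(size=(n_sites, n_factors))
loadings = rng.normal(size=(n_factors, n_classes))
intercepts = rng.normal(size=n_classes)

eta = intercepts + factors @ loadings      # class-specific effects, rank <= n_factors
probs = softmax(eta)                       # per-site class probabilities

# Probability of a union of classes is the sum of the member probabilities.
union_prob = probs[:, [0, 1, 2]].sum(axis=1)

full_params = n_sites * n_classes                              # one effect per site and class
reduced_params = n_sites * n_factors + n_factors * n_classes   # factors plus loadings
print(full_params, reduced_params)
```

With these illustrative dimensions the factorization replaces 8000 site-by-class effects with 720 factor and loading values, which is the kind of dimension reduction the abstract describes.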