https://arxiv.org/api/J73XwTeoAUxBd1rOXvLENH2edE4
2026-06-18T10:57:56Z
3629679515

http://arxiv.org/abs/2605.26288v1
Beyond Differences: Doubly Robust Meta-Learners for Ratio-Based Treatment Effects
2026-05-25T19:24:57Z
When treatment effects are naturally expressed as ratios -- as in medicine, pricing, and marketing -- the ratio-based CATE $τ(x) = E[Y|W=1,X=x] / E[Y|W=0,X=x]$ is the appropriate estimand. Yet existing estimators either impose a log-linear parametric structure or apply generic regression without robustness guarantees for this functional. We introduce the Q-Learner, which decomposes $τ(x)$ into a product of two odds ratios, reducing ratio-CATE estimation for binary outcomes to two propensity classification tasks. We further derive doubly robust augmentations for both S/T- and Q-style ratio learners and characterize their distinct robustness properties. In benchmarks on seven RCT datasets, the Q-Learner is the most consistently competitive method in low-conversion regimes, where its propensity-only construction sidesteps the imbalanced regression that hurts outcome-based estimators. On four observational datasets, where propensity must be estimated and confounding cannot be ruled out, the DR learners introduced here decisively come out on top, making them practitioners' natural default for confounded observational data.
2026-05-25T19:24:57Z
13+5 pages, 5 figures, 6 tables. Code: https://github.com/michaelfuchs90/ratiobasedcate
Michael Fuchs, Dominik Kreiss

http://arxiv.org/abs/2605.26253v1
Length-biased Birnbaum-Saunders quantile regression with application to water evaporation
2026-05-25T18:24:09Z
Length-biased distributions arise naturally in environmental, reliability, and economic studies where the sampling mechanism favors larger observational units. In this paper, we propose a quantile regression model based on the length-biased Birnbaum--Saunders (QLBS) distribution.
The model is constructed through a reparameterization of the length-biased Birnbaum--Saunders distribution in terms of its quantile function, thereby allowing direct interpretation of covariate effects on conditional quantiles of the response variable. We derive the log-likelihood function and the corresponding score equations, and obtain maximum likelihood estimators via numerical optimization. Asymptotic and bootstrap confidence intervals are considered. Two types of residuals are proposed for model assessment, namely the generalized Cox--Snell and randomized quantile residuals. An elaborate Monte Carlo simulation study is carried out to evaluate the performance of the maximum likelihood estimators for several sample sizes and quantile levels. The proposed methodology is illustrated with a real meteorological data set from Brazil.
2026-05-25T18:24:09Z
21 pages, 3 figures
Helton Saulo, Tailine Nonato, Roberto Vila

http://arxiv.org/abs/2510.07128v5
A General Framework for Joint Multi-State Models
2026-05-25T17:58:14Z
Conventional joint modeling approaches generally characterize the relationship between longitudinal biomarkers and discrete event occurrences within terminal, recurring or competing risk settings, thereby offering a limited representation of complex, multi-state trajectories.
We propose a general multi-state joint modeling framework that unifies longitudinal biomarker dynamics with multi-state time-to-event processes defined on arbitrary directed graphs. The proposed framework also accommodates nonlinear longitudinal submodels and scalable inference via stochastic gradient descent. This formulation encompasses both Markovian and semi-Markovian transition structures, allowing recurrent cycles and terminal absorptions to be naturally represented. The longitudinal and event processes are linked through shared latent structures within nonlinear mixed-effects models, extending classical joint modeling formulations.
We derive the complete likelihood, model selection criteria, and develop scalable inference procedures based on stochastic gradient descent to enable high-dimensional and large-scale applications. In addition, we formulate a dynamic prediction framework that provides individualized state-transition probabilities and personalized risk assessments along complex event trajectories.
Through simulation and application to the PAQUID cohort, we demonstrate accurate parameter recovery and individualized prediction.
2025-10-08T15:24:51Z
34 pages, 12 figures
Félix Laplante, Christophe Ambroise

http://arxiv.org/abs/2604.05639v3
Estimating Dynamic Marginal Policy Effects under Sequential Unconfoundedness
2026-05-25T17:21:41Z
We develop methods for estimating how infinitesimal policy changes affect long-term outcomes in dynamic systems. We show that dynamic marginal policy effects (MPEs) can be identified via tractable reduced-form expressions, and can be estimated under a general sequential unconfoundedness assumption. We also propose a doubly robust estimator for dynamic MPEs. Our approach does not require observing full dynamic state information (as is typically assumed for off-policy evaluation in Markov decision processes), and does not incur an exponential curse of horizon (as is typical in non-Markovian off-policy evaluation). We demonstrate practicality and robustness of our approach in a number of simulations, including one motivated by a dynamic pricing application where people use past prices to form a reference level for current prices.
2026-04-07T09:41:11Z
Fix typos
I-han Lai, Stefan Wager

http://arxiv.org/abs/2604.10845v2
Learning Preferences from Conjoint Data: A Structural Deep Learning Approach
2026-05-25T17:00:30Z
Conjoint experiments randomize multidimensional profiles, offering a powerful design for recovering structural preference parameters -- including marginal rates of substitution, willingness to pay, and the distribution of preferences across a population. Yet the dominant approach in political science has focused on nonparametric causal estimands that do not leverage this potential. We propose a structural approach that embeds a deep neural network within a random utility logit model, allowing preference parameters to vary as a fully flexible function of respondent characteristics.
The neural network addresses the concern that a parametric specification may not capture the true data generating process, while double/debiased machine learning provides valid inference on average preference parameters. We apply our method to three prominent conjoint studies and find rich preference heterogeneity masked by reduced-form averages: a near-zero gender effect coexists with 83% preferring female candidates, opposition to undemocratic behavior is near-universal but varies sharply in intensity, and progressive tax preferences cut across every partisan subgroup.
2026-04-12T22:35:04Z
Avidit Acharya, Jens Hainmueller, Yiqing Xu

http://arxiv.org/abs/2605.26023v1
Considering causality in the construction of molecular signatures of lifestyle exposures
2026-05-25T16:44:52Z
Molecular signatures derived from omics data are increasingly used in epidemiological studies to characterize lifestyle exposures, either as proxies of exposure or to provide insight into disease mechanisms. These signatures are typically constructed by regressing the exposure on high-dimensional omics features. In the literature, an initial univariate screening step has sometimes been applied prior to multivariate modelling, but the causal implications of this choice have not yet been considered. Focusing on settings where the exposure causally influences molecular features (and not the reverse), we use directed acyclic graphs (DAGs) and $d$-separation arguments to show that collider bias may arise when the screening step is ignored, leading to the inclusion of non-causal features in the signature. We further demonstrate that the screening step can mitigate this bias. Our simulation studies illustrate that screening reduces the inclusion of non-causal features, albeit at the cost of lower sensitivity and reduced correlation between the exposure and the resulting signature.
Overall, we recommend applying univariate screening prior to signature construction, particularly when the inclusion of non-causal features is undesirable, such as in mechanistic studies.
2026-05-25T16:44:52Z
28 pages, 10 figures
Diana Wu, Vivian Viallon

http://arxiv.org/abs/2605.26000v1
Statistical Inference for Stochastic Gradient Descent Beyond Finite Variance
2026-05-25T16:18:39Z
Stochastic gradient descent (SGD) is a foundational algorithm for large-scale statistical learning and stochastic optimization. However, statistical inference based on SGD iterates remains challenging when stochastic gradients have infinite variance, as the relevant limiting distributions depend on unknown nuisance parameters. In this paper, we develop an efficient, model-agnostic methodology for constructing confidence regions from SGD trajectories that applies in both finite- and infinite-variance regimes. The procedure is based on a joint weak convergence result for the Polyak-Ruppert averaged estimator and an empirical second-moment normalizer constructed from stochastic gradients along the SGD trajectory. This joint limit yields a self-normalized statistic in which the leading tail-dependent scaling terms cancel. We then use a subsampling calibration scheme to estimate the relevant critical values, avoiding explicit estimation of tail indices, slowly varying functions, or stable-law parameters. The resulting confidence regions are straightforward to implement and are asymptotically valid under both the finite- and infinite-second-moment regimes.
Simulation studies show reliable coverage in various settings, supporting the proposed method as a practical tool for uncertainty quantification in stochastic optimization.
2026-05-25T16:18:39Z
Jose Blanchet, Peter Glynn, Wenhao Yang

http://arxiv.org/abs/2606.07561v1
Boundary Variance Inflation Causes Acquisition Bias in Gaussian Processes
2026-05-25T15:59:40Z
Gaussian processes with stationary kernels on bounded domains exhibit inflated posterior variance near the boundary. Despite being a long-recognized artifact in geostatistics and a source of over-exploration in Bayesian optimization, the causes and effects of boundary-induced acquisition bias are underexplored. We trace the root cause to a simple geometric mechanism: the truncation of the kernel correlation neighborhood at the domain boundary creates an observation-independent distortion that worsens with dimensionality. We show how this distortion manifests across three acquisition classes: variance maximization concentrates selections at the corners, whereas negative integrated posterior variance and expected predictive information gain move selections inward to axis-aligned interior shells. These patterns arise without reference to any objective function, meaning that acquisition behavior can be dominated by kernel geometry rather than the desired task-specific uncertainty. To quantify this, we introduce a function-free selection-profile diagnostic for arbitrary acquisitions, kernels, and bounded-domain geometries.
2026-05-25T15:59:40Z
14 pages, 8 figures; appendices included
Maria Bånkestad, Sanna Jarl, Jens Sjölund

http://arxiv.org/abs/2510.26051v4
Estimation and Inference in Boundary Discontinuity Designs: Distance-Based Methods
2026-05-25T15:41:35Z
We study nonparametric distance-based (isotropic) local polynomial methods for estimating the boundary average treatment effect curve, a causal functional that captures treatment effect heterogeneity in boundary discontinuity designs.
We establish identification, estimation, and inference results both pointwise and uniformly along the treatment assignment boundary. We show that the geometric regularity of the boundary, a one-dimensional manifold, plays a central role in determining feasible convergence rates and valid inference procedures. Our theoretical contributions are threefold. First, we derive uniform lower and upper bounds on the convergence rate of the misspecification bias of isotropic local polynomial estimators. Second, we obtain uniform distributional approximations that justify boundary-robust inference. Third, we establish minimax lower bounds for a broad class of nonparametric isotropic regression estimators. These results yield practical guidance for empirical implementation, including new bandwidth selection rules that adapt to local irregularities of the treatment-assignment boundary. We illustrate the proposed methods using simulation evidence and an empirical application, and provide companion general-purpose software.
2025-10-30T01:03:57Z
Matias D. Cattaneo, Rocio Titiunik, Ruiqi Rae Yu

http://arxiv.org/abs/2601.09525v2
Sparse covariate-driven factorization of high-dimensional brain connectivity with application to site effect correction
2026-05-25T15:34:55Z
Large-scale neuroimaging studies often collect data from multiple scanners across different sites, where variations in scanners, scanning procedures, and other conditions across sites can introduce artificial site effects. These effects may bias brain connectivity measures, such as functional connectivity (FC), which quantify functional network organization derived from functional magnetic resonance imaging (fMRI). How to leverage high-dimensional network structures to effectively mitigate site effects has yet to be addressed.
In this paper, we propose SLACC (Sparse LAtent Covariate-driven Connectome) factorization, a multivariate method that explicitly parameterizes covariate effects in latent subject scores corresponding to sparse rank-1 latent patterns derived from brain connectivity. The proposed method identifies localized site-driven variability within and across brain networks, enabling targeted correction. We develop a penalized Expectation-Maximization (EM) algorithm for parameter estimation, incorporating the Bayesian Information Criterion (BIC) to guide optimization. Extensive simulations validate SLACC's robustness in recovering the true parameters and underlying connectivity patterns. Applied to the Autism Brain Imaging Data Exchange (ABIDE) dataset, SLACC demonstrates its ability to reduce site effects.
2026-01-14T14:48:13Z
Rongqian Zhang, Elena Tuzhilina, Jun Young Park

http://arxiv.org/abs/2605.25897v1
Nonparametric Estimation via Expected Order Statistics
2026-05-25T14:25:52Z
The empirical distribution function assigns mass $1/n$ to each of the $n$ observations in a sample. As these are highly variable, estimation error may be reduced by replacing them with estimated observations that are asymptotically less variable. Motivated by this idea, we introduce a nonparametric estimator obtained by assigning mass $1/m$ to $m$ estimated expected order statistics, with $m$ chosen arbitrarily. The estimator enjoys several finite-sample properties and yields a rich asymptotic theory. Its estimation error relative to its population counterpart is controlled by the $L^1$ error of the empirical distribution. Moreover, every $L$-functional of the new estimator corresponds to an $L$-functional of the empirical distribution with updated weights. We establish almost sure convergence in $L^p$ norm and Wasserstein distance as $n \to \infty$, and derive weak convergence of the associated empirical quantile process in $L^p(0,1)$, for $p\in[1,\infty)$ and $m$ fixed, and for $p=1,2$ as $n,m \to \infty$.
These results yield asymptotic distributions for distance-based functionals, including $L^p$ and Wasserstein metrics. Bootstrap validity is also established. Simulations show that the estimator often improves on the empirical distribution and remains competitive with kernel methods, with more stable performance across different distributional settings.
2026-05-25T14:25:52Z
Tommaso Lando, Lorenzo Tedesco

http://arxiv.org/abs/2605.25873v1
Bayesian perspectives on exponential random graph models
2026-05-25T14:00:56Z
Exponential random graph models (ERGMs) are a widely used framework for network data, enabling hypothesis testing on the structural mechanisms underlying observed networks. Bayesian ERGMs provide principled uncertainty quantification and enable the incorporation of prior knowledge through fully probabilistic modelling. However, computation remains challenging because the posterior is doubly intractable, with a likelihood normalising constant that depends on unknown parameters. This paper reviews Bayesian approaches to ERGM inference, categorising inference methods into three broad classes: auxiliary variable MCMC methods, adjusted pseudo-likelihood approaches, and variational methods, alongside dedicated treatment of model selection. We also discuss modelling extensions for missing data, longitudinal dynamics, populations of networks, and weighted networks, highlighting applications across various scientific disciplines.
2026-05-25T14:00:56Z
16 pages
Alberto Caimo, Isabella Gollini

http://arxiv.org/abs/2602.07704v2
Correcting for Nonignorable Nonresponse Bias in Ordinal Observational Survey Data
2026-05-25T13:12:17Z
Many political surveys rely on post-stratification, raking, or related weighting adjustments to align respondents with the target population. But when respondents differ from nonrespondents on the outcome itself (nonignorable nonresponse), these adjustments can fail, introducing bias even into basic descriptives.
We provide a practical method that corrects for nonignorable nonresponse by leveraging response-propensity proxies (e.g., interviewer-coded cooperativeness) observed among respondents to extrapolate toward nonrespondents, while directly integrating observable covariates and retaining the benefits of post-stratification with known population shares. The method generalizes the variable-response-propensity (VRP) framework of Peress (2010) from binary to ordinal outcomes, which are widely used to measure trust, satisfaction, and policy attitudes. The resulting estimator is computed by maximum likelihood and implemented in a compact R routine that handles both ordinal and binary outcomes. Using the 2024 American National Election Study (ANES), we show that accounting for nonignorable nonresponse produces substantively meaningful shifts for life satisfaction (estimated latent correlation $ρ\approx 0.53$), while yielding negligible changes for retrospective economic evaluations ($ρ\approx 0$), highlighting when nonignorable nonresponse substantively affects survey estimates.
2026-02-07T21:15:33Z
17 pages
Lukáš Lafférs, Jozef Michal Mintal, Ivan Sutóris

http://arxiv.org/abs/2605.25811v1
Geometry Adaptive Counterfactual Distribution Learning with Diffusion-Guided Smoothing
2026-05-25T13:02:56Z
We study counterfactual distribution learning for high-dimensional outcomes whose counterfactual law may concentrate near lower-dimensional structure. Standard isotropic smoothing treats all ambient directions equally, leading to unfavorable scaling and unstable local inference. We propose two diffusion-guided estimators based on semiparametric debiasing: diffusion-informed smoothing for counterfactual densities and diffusion-informed score smoothing for counterfactual scores. The estimators combine causal nuisance adjustment with geometry-adaptive localization driven by diffusion score information, removing first-order nuisance bias while aligning smoothing with local outcome geometry.
We establish asymptotic expansions, risk bounds, and inference procedures for smoothed density and score-based targets, with ambient density inference obtained under additional approximation conditions. Under structural geometry conditions, the leading stochastic error is governed by an effective dimension induced by the diffusion-guided kernel, rather than by the ambient dimension. Semi-synthetic experiments based on CelebA show steeper error decay for geometry-adaptive methods, supporting the proposed effective-dimension theory.
2026-05-25T13:02:56Z
Kwangho Kim

http://arxiv.org/abs/2602.05938v2
DiPPER: A Bayesian approach to differential prevalence analysis with applications in microbiome studies
2026-05-25T12:31:56Z
Recent evidence suggests that analyzing the presence/absence of taxonomic features can offer a compelling alternative to differential abundance analysis in microbiome studies. However, standard approaches to differential prevalence analysis face challenges with boundary cases and multiple testing. To address these limitations, we developed DiPPER (Differential Prevalence via Probabilistic Estimation in R), a method based on Bayesian hierarchical modeling. We benchmarked our method against existing differential prevalence methods, along with two differential abundance tools, using publicly available data from 57 human gut microbiome studies. We observed considerable variation in performance across the evaluated methods. Importantly, DiPPER demonstrated high sensitivity to detect potentially differentially prevalent features while maintaining a well-calibrated family-wise error rate under the global null hypothesis. Most notably, it outperformed the alternatives in the replication of findings across independent studies. Furthermore, DiPPER provides differential prevalence estimates and uncertainty intervals that are inherently adjusted for multiple testing.
2026-02-05T17:49:08Z
Source code and datasets: https://github.com/jepelt/differential-prevalence.
R package: https://github.com/jepelt/DiPPER
Juho Pelto, Kari Auranen, Janne V. Kujala, Leo Lahti