https://arxiv.org/api/LCzw0jhr2iUd+OOLPvTa1c6bVng 2026-06-13T16:16:47Z 36171 600 15 http://arxiv.org/abs/2605.26515v1 Learning a directed acyclic graph with additive heteroscedastic errors 2026-05-26T03:57:58Z

This paper studies causal discovery for a directed acyclic graph under a structural equation model with additive heteroscedastic errors. We first establish new identifiability results for location-scale noise models, showing that heteroscedasticity can be leveraged to recover causal directions. Based on these insights, we propose a novel iterative procedure, Residual Simultaneous Quantile Estimation (RESQUE), where each iteration consists of a residual-construction stage and a composite quantile regression stage, enabling recursive identification of sink nodes via the invariance of conditional scale coefficients across quantiles. We then establish its theoretical guarantees for recovering topological order and graph structure, even when the number of variables diverges with the sample size. Simulation studies and applications to benchmark datasets show that RESQUE performs favorably compared with existing methods, especially when causal information is partly encoded in the variance component. These results highlight the value of exploiting structured variance signals for causal discovery and provide a principled framework for multivariate causal discovery beyond mean-based modeling.
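
As a rough illustration of why quantiles are informative in this setting, one common way to write a location-scale noise model is sketched here. The symbols $f_j$, $g_j$, $\varepsilon_j$ and $q_\tau$ are notation assumed for this sketch and are not taken from the paper.

$$
X_j = f_j\big(X_{\mathrm{pa}(j)}\big) + g_j\big(X_{\mathrm{pa}(j)}\big)\,\varepsilon_j,
\qquad \varepsilon_j \perp X_{\mathrm{pa}(j)},\quad g_j > 0 .
$$

Under this form, the conditional quantile at level $\tau$ is

$$
Q_\tau\big(X_j \mid X_{\mathrm{pa}(j)}\big) = f_j\big(X_{\mathrm{pa}(j)}\big) + g_j\big(X_{\mathrm{pa}(j)}\big)\, q_\tau(\varepsilon_j),
$$

so the parents enter every quantile through the same scale function $g_j$, multiplied only by the scalar $q_\tau(\varepsilon_j)$. This shared structure across quantile levels is plausibly the kind of invariance the abstract refers to, although the exact condition used by RESQUE may differ.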

2026-05-26T03:57:58Z 33 pages, 4 figures Xintao Xia Li Chen Yue Hu Chunlin Li http://arxiv.org/abs/2605.26507v1 Improving inverse probability of censoring weighting for win statistics with composite survival outcomes 2026-05-26T03:47:36Z

Win statistics, including the win ratio, net benefit, and win odds, summarize treatment effects on hierarchical composite endpoints by sequentially comparing patient pairs on component outcomes ordered by clinical importance, proceeding to lower-priority components only when higher-priority ones are tied. Restricting comparisons to a pre-specified clinical horizon yields well-defined estimands separated from the censoring mechanism, and it is critically important to address right censoring during estimation. Existing inverse probability of censoring weighting methods discard indeterminate pairs entirely, incurring avoidable efficiency loss that grows with censoring and restriction horizon length. We propose a novel estimator that replaces the confirmation of higher-priority ties with a conditional tie probability given the observed data, recovering partially observed pairs as fractional contributions. Large sample theory is developed based on two-sample U-statistics with estimated nuisance functions, and closed-form sandwich variance estimators are obtained for the win ratio, net benefit, and win odds under our new weighting scheme. Simulations demonstrate sizable efficiency gains for our new estimator that grow sharply as censoring moves from light to heavy, and we further apply our estimator to reanalyze a completed randomized clinical trial.
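
The pairwise comparison described above can be sketched for the simplest case of fully observed outcomes. This is a minimal illustration of the three summary measures only, not the proposed estimator: it involves no censoring and no weighting. The function names are hypothetical, and each outcome is assumed to be a tuple ordered by clinical priority with larger values being better.

```python
import math
from itertools import product


def compare(a, b):
    """Hierarchical comparison: 1 if a wins, -1 if b wins, 0 if tied.

    Components are checked in priority order; a lower-priority component
    is consulted only when all higher-priority ones are tied.
    """
    for x, y in zip(a, b):
        if x > y:
            return 1
        if x < y:
            return -1
    return 0


def win_statistics(treated, control):
    """Win ratio, net benefit and win odds over all treated-control pairs."""
    wins = losses = ties = 0
    for t, c in product(treated, control):
        result = compare(t, c)
        if result > 0:
            wins += 1
        elif result < 0:
            losses += 1
        else:
            ties += 1
    n_pairs = len(treated) * len(control)
    p_win, p_loss, p_tie = wins / n_pairs, losses / n_pairs, ties / n_pairs
    return {
        "win_ratio": p_win / p_loss if p_loss > 0 else math.inf,
        "net_benefit": p_win - p_loss,
        "win_odds": (p_win + 0.5 * p_tie) / (p_loss + 0.5 * p_tie),
    }


if __name__ == "__main__":
    # Toy data: (higher-priority outcome, lower-priority outcome).
    treated = [(5, 2), (3, 1)]
    control = [(5, 2), (2, 4)]
    print(win_statistics(treated, control))
```

With right censoring, some pairs cannot be classified as a win, loss, or tie at a given level of the hierarchy, which is the situation the weighting methods in the abstract are designed to handle.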

2026-05-26T03:47:36Z Xi Fang Fan Li http://arxiv.org/abs/2508.00223v4 Structural Causal Models for Extremes: an Approach Based on Exponent Measures 2026-05-26T01:47:39Z

We introduce a new formulation of structural causal models for extremes, called the extremal structural causal model (eSCM). Unlike conventional structural causal models, where randomness is governed by a probability distribution, eSCMs use an exponent measure, an infinite-mass law that naturally arises in the analysis of multivariate extremes. Central to this framework are activation variables, which abstract the single-big-jump principle, along with additional randomization that enriches the class of eSCM laws. This formulation encompasses all possible laws of directed graphical models under the recently introduced notion of extremal conditional independence. We also identify an inherent asymmetry in eSCMs under natural assumptions, enabling the identifiability of causal directions, a central challenge in causal inference. Finally, we propose a method that utilizes this causal asymmetry and demonstrate its effectiveness in both simulated and real datasets.

2025-08-01T00:01:23Z Updated the statement of Theorem 5 Shuyang Bai Fei Fang Tiandong Wang http://arxiv.org/abs/2605.26429v1 Structure-Adaptive Conformal Inference for Large-Scale Out-of-Distribution Testing 2026-05-26T01:28:16Z

This paper addresses structured out-of-distribution (OOD) testing in high-stakes machine learning applications. Traditional conformal methods rely on joint exchangeability, making it difficult to incorporate auxiliary information such as spatiotemporal or grouping structures. To overcome this limitation, we propose the structure-adaptive conformal q-value (SCQ), a significance index that integrates individual test evidence with structural patterns. We also develop pseudo-score-guided transductive automated model selection (P-TAMS), which adapts conformalized model selection to structured OOD testing across a toolbox of candidate models. Together, SCQ and P-TAMS form a unified framework under pairwise exchangeability, providing finite-sample error-rate control, improved power, and enhanced interpretability. Experiments on simulated and real data demonstrate that the proposed approach controls the false discovery rate and performs well across diverse settings.
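
For context, the traditional conformal baseline that the abstract contrasts with can be sketched as follows: conformal p-values computed from an inlier calibration set, followed by the Benjamini-Hochberg procedure. This is not SCQ or P-TAMS. The function names are hypothetical, and the scores are assumed to be nonconformity scores where larger means more atypical.

```python
def conformal_p_values(calib_scores, test_scores):
    """Conformal p-value for each test score against inlier calibration scores.

    p_j = (1 + #{calibration scores >= test score j}) / (n + 1).
    """
    n = len(calib_scores)
    return [
        (1 + sum(s >= t for s in calib_scores)) / (n + 1)
        for t in test_scores
    ]


def benjamini_hochberg(p_values, alpha=0.1):
    """Return the sorted indices rejected by the Benjamini-Hochberg procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value falls under the BH line
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


if __name__ == "__main__":
    calib = [i / 10 for i in range(1, 10)]   # toy inlier scores
    test = [5.0, 0.15, 6.0]                  # two clearly atypical points
    p = conformal_p_values(calib, test)
    print(p, benjamini_hochberg(p, alpha=0.2))
```

This baseline treats every test point identically and uses no spatiotemporal or grouping information, which is the limitation the abstract says SCQ is meant to address.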

2026-05-26T01:28:16Z Rongyi Sun Wenguang Sun Zinan Zhao http://arxiv.org/abs/2605.26413v1 Confounder Detection via Treatment Intent: A New Observational Study Design 2026-05-26T00:41:47Z

Understanding the effects of interventions is central to scientific progress, with randomized controlled trials (RCTs) regarded as the gold standard for causal inference in many applied fields. However, RCTs are costly, time-consuming, and often constrained by ethical or practical limitations, motivating the need for causal methods able to draw conclusions from observational data. While such data is collected at ever larger scale, its use for causal inference is often hindered by the fact that not all variables affecting treatment allocation and the outcome are observed: an issue known as unobserved confounding. In this paper, we introduce a new study design called confounder detection via treatment intent. The idea is to query a human expert who makes treatment decisions, and ask them to compare pairs of units proposed by a principled matching strategy, with the goal of eliciting unobserved variables that explain why treatment decisions differ. We provide a theoretical basis for such a procedure, ascertaining conditions under which such a study design may elicit unobserved confounders. Building on these newly established foundations, we study treatment effects of interventions in the intensive care unit (ICU). First, we show empirical evidence strongly indicating that electronic health records (EHRs) collected in ICUs are subject to unobserved confounding. By using clinical text notes as a proxy for physicians' knowledge and leveraging natural language processing, we provide a proof of concept for our methodology in a semi-synthetic environment with a known ground truth.

2026-05-26T00:41:47Z Drago Plecko Patrik Okanovic Torsten Hoefler Elias Bareinboim http://arxiv.org/abs/2605.27463v1 When prompt perturbations break your A/B test: A valid statistical test for generative surveying 2026-05-26T00:35:58Z

Generative surveying -- where collections of LLM-based personas provide feedback on messages -- has emerged as a cheap and scalable alternative to traditional market research. However, LLMs are sensitive to small variations in prompt design, and conclusions drawn from generative surveys may depend on arbitrary phrasing choices. Controlling for this sensitivity requires including semantically equivalent perturbations in the analysis. In this paper, we show that standard hypothesis tests, including the sign test and Wilcoxon signed-rank test, are invalid under a statistical model for generative surveying that includes realistic perturbation structure. We propose a permutation test that is valid under this model and formally characterize the conditions under which standard tests fail. Applying our framework to a simple generative surveying problem, we estimate relevant parameters, characterize the power of the permutation test under realistic conditions, and provide practical guidance on budget allocation across personas, perturbations, and replicates. Finally, we show that both the magnitude and direction of the estimated effect are sensitive to the choice of model, even within the same model family.

2026-05-26T00:35:58Z Hayden Helm Carey Priebe http://arxiv.org/abs/2605.26401v1 Small-Area Precipitation Forecasting and Drought--Flood Early Warning with Reverse-Martingale Regularized Recurrent Networks 2026-05-26T00:18:05Z

Small-area precipitation forecasts support real-time decisions for reservoir operation, irrigation planning, drought monitoring, and flash-flood response. Operational value depends not only on point accuracy, but also on calibrated exceedance probabilities and warning rules that remain stable when local weather regimes depart from the training climatology. We evaluate a reverse-martingale regularized recurrent neural network (RMRNN) for probabilistic precipitation forecasting and sequential early warning. A backward-coherence penalty is added to the recurrent hidden state; the resulting residual process drives a Shiryaev--Roberts (SR) detector, so the same latent trajectory that produces the forecast also supplies a continuously updated drought or flood-regime indicator. The framework is tested on the Taiwan CWA dense rain-gauge network, CHIRPS v2 daily gridded precipitation over Taiwan and the Horn of Africa, and NOAA GHCN-Daily stations over the Texas Hill Country. Across 1,000 replications, RMRNN matches or slightly improves on the GRU baseline in RMSE, MAE, and CRPS at 1 h--72 h lead while substantially improving alarm characteristics. The SR detector reduces false-alarm ratios by a factor of three to five at matched detection power. In the 2020--2021 Taiwan drought, onset is flagged 8--12 days earlier than SPI-3 thresholding; in the 2023 Typhoon Haikui flood, flash-flood risk is signalled 4 h before the CWA operational alert.
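
The Shiryaev--Roberts detector mentioned above follows a simple recursion, sketched here in a textbook setting. The sketch assumes independent unit-variance Gaussian residuals whose mean shifts from 0 to a known value; the function name, the `shift` and `threshold` parameters, and the toy residuals are all hypothetical, and the residual process produced by the network in the paper is not modelled.

```python
import math


def shiryaev_roberts(residuals, shift=1.0, threshold=50.0):
    """Run the Shiryaev-Roberts recursion R_n = (1 + R_{n-1}) * LR_n.

    LR_n is the likelihood ratio of N(shift, 1) against N(0, 1) at the
    n-th residual. Returns (alarm_index, path), where alarm_index is the
    0-based index of the first residual at which R_n >= threshold, or
    None if no alarm is raised.
    """
    r = 0.0
    path = []
    for n, x in enumerate(residuals):
        likelihood_ratio = math.exp(shift * x - 0.5 * shift ** 2)
        r = (1.0 + r) * likelihood_ratio
        path.append(r)
        if r >= threshold:
            return n, path
    return None, path


if __name__ == "__main__":
    # Five in-regime residuals followed by a sustained upward shift.
    residuals = [0.0] * 5 + [2.0] * 10
    alarm, path = shiryaev_roberts(residuals)
    print(alarm, [round(v, 2) for v in path])
```

Raising `threshold` lowers the false-alarm rate at the cost of a longer detection delay, which is the trade-off behind comparisons made at matched detection power.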

2026-05-26T00:18:05Z 4 figures Foo Hui-Mean Yuan-chin Ivan Chang http://arxiv.org/abs/2605.26335v1 Unobserved Heterogeneity in Threshold Regression Based on the Hitting Times of a Reflected Brownian Motion for Recurrent Hypoglycemia 2026-05-25T21:13:54Z

Analyses of recurrent hypoglycemia are critical for effective treatment management in diabetic patients. Typically, within-subject dependency in such analyses is captured through subject-level frailty. Recent research has modeled recurrent hypoglycemia using the first hitting times of a reflected Brownian motion. A close examination of this approach reveals that it does not adequately account for varying frailties among individuals, which indicate notable heterogeneity. To address this gap, we propose a finite mixture model of the first hitting time distribution of the reflected Brownian motion. This model allows for component-specific regression coefficients and frailty parameters, providing nuanced insights into how risk factors differently affect patient subgroups. We employ a Bayesian framework for inference, utilizing Markov chain Monte Carlo for estimation. Model selection is conducted using the Deviance Information Criterion and the Logarithm of the Pseudo-Marginal Likelihood. The effectiveness of these criteria is assessed through simulation studies. Application to recurrent hypoglycemia modeling reveals two subgroups with different risk profiles, as reflected in their volatilities. Bayesian model comparison criteria favor the model with component-specific regression coefficients for volatilities. The subgroup with lower volatility exhibits a larger variance and, hence, a greater level of heterogeneity.

2026-05-25T21:13:54Z Yingfa Xie Haoda Fu Yuan Huang Jun Yan http://arxiv.org/abs/2605.26312v1 Cross-modal dependence analysis with asynchronous longitudinal multimodal data 2026-05-25T20:09:47Z

We propose a Bayesian latent variable model to estimate covariate-assisted dependence structures across multiple modalities of multivariate data that may be observed asynchronously. This setting commonly arises in longitudinal biomedical research, especially in observational and clinical studies of complex diseases, where dynamic and heterogeneous dependence across biomarker modalities can be pathologically and clinically informative. For example, the biological diagnosis and staging of Alzheimer's disease require integrated evaluation of multimodal biomarkers, including imaging and biofluid biomarkers, and the Alzheimer's Disease Neuroimaging Initiative (ADNI) study has collected biomarker data longitudinally for over two decades. However, quantitative analysis is often challenged by asynchronous collection of multimodal profiles due to study design and data collection constraints. Common analytic strategies that restrict inference to complete observations or analyze each modality separately can lose information and introduce bias. Therefore, we aim to jointly model all available data and estimate the population-level cross-modal dependence structure that evolves over time and varies across demographic or clinical groups, where the cross-covariance matrices for modality pairs serve as the primary quantities of interest. The proposed model uses modality-specific low-rank loading structures with shared latent variables to borrow information across modalities, visits, and subjects while accounting for repeated measurements. The application to ADNI data reveals clinically meaningful patterns in longitudinal cross-modal biomarker dependence, and the simulation study shows improved recovery under limited modality synchrony.

2026-05-25T20:09:47Z Kun Qian Hyung G. Park http://arxiv.org/abs/2605.26288v1 Beyond Differences: Doubly Robust Meta-Learners for Ratio-Based Treatment Effects 2026-05-25T19:24:57Z

When treatment effects are naturally expressed as ratios -- as in medicine, pricing, and marketing -- the ratio-based CATE $τ(x) = E[Y|W=1,X=x] / E[Y|W=0,X=x]$ is the appropriate estimand. Yet existing estimators either impose a log-linear parametric structure or apply generic regression without robustness guarantees for this functional. We introduce the Q-Learner, which decomposes $τ(x)$ into a product of two odds ratios, reducing ratio-CATE estimation for binary outcomes to two propensity classification tasks. We further derive doubly robust augmentations for both S/T- and Q-style ratio learners and characterize their distinct robustness properties. In benchmarks on seven RCT datasets, the Q-Learner is the most consistently competitive method in low-conversion regimes, where its propensity-only construction sidesteps the imbalanced regression that hurts outcome-based estimators. On four observational datasets, where propensity must be estimated and confounding cannot be ruled out, the DR learners introduced here decisively come out on top, making them practitioners' natural default for confounded observational data.
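
To see how a ratio of outcome probabilities can be rewritten in terms of treatment-classification quantities, a Bayes-rule identity for binary $Y$ is sketched here. The notation $e(x)$ and $e_1(x)$ is assumed for this sketch, and the exact factorization used by the Q-Learner may differ.

$$
e(x) = P(W=1 \mid X=x), \qquad e_1(x) = P(W=1 \mid Y=1, X=x).
$$

Applying Bayes' rule to $P(Y=1 \mid W=w, X=x)$ for $w \in \{0,1\}$, the common factor $P(Y=1 \mid X=x)$ cancels in the ratio, which gives

$$
\tau(x) = \frac{P(Y=1 \mid W=1, X=x)}{P(Y=1 \mid W=0, X=x)}
= \frac{e_1(x)}{1 - e_1(x)} \cdot \frac{1 - e(x)}{e(x)},
$$

assuming $0 < e(x) < 1$, $0 < e_1(x) < 1$, and $P(Y=1 \mid X=x) > 0$. Both factors depend only on models that classify treatment assignment $W$: one fitted on units with $Y=1$ and one fitted on all units. No regression of the rare outcome $Y$ is needed, which is consistent with the abstract's remark about low-conversion regimes.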

2026-05-25T19:24:57Z 13+5 pages, 5 figures, 6 tables. Code: https://github.com/michaelfuchs90/ratiobasedcate Michael Fuchs Dominik Kreiss http://arxiv.org/abs/2605.26253v1 Length-biased Birnbaum-Saunders quantile regression with application to water evaporation 2026-05-25T18:24:09Z

Length-biased distributions arise naturally in environmental, reliability, and economic studies where the sampling mechanism favors larger observational units. In this paper, we propose a quantile regression model based on the length-biased Birnbaum--Saunders (QLBS) distribution. The model is constructed through a reparameterization of the length-biased Birnbaum--Saunders distribution in terms of its quantile function, thereby allowing direct interpretation of covariate effects on conditional quantiles of the response variable. We derive the log-likelihood function and the corresponding score equations, and obtain maximum likelihood estimators via numerical optimization. Asymptotic and bootstrap confidence intervals are considered. Two types of residuals are proposed for model assessment, namely the generalized Cox--Snell and randomized quantile residuals. An elaborate Monte Carlo simulation study is carried out to evaluate the performance of the maximum likelihood estimators for several sample sizes and quantile levels. The proposed methodology is illustrated with a real meteorological data set from Brazil.

2026-05-25T18:24:09Z 21 pages, 3 figures Helton Saulo Tailine Nonato Roberto Vila http://arxiv.org/abs/2510.07128v5 A General Framework for Joint Multi-State Models 2026-05-25T17:58:14Z

Conventional joint modeling approaches generally characterize the relationship between longitudinal biomarkers and discrete event occurrences within terminal, recurring or competing risk settings, thereby offering a limited representation of complex, multi-state trajectories. We propose a general multi-state joint modeling framework that unifies longitudinal biomarker dynamics with multi-state time-to-event processes defined on arbitrary directed graphs. The proposed framework also accommodates nonlinear longitudinal submodels and scalable inference via stochastic gradient descent. This formulation encompasses both Markovian and semi-Markovian transition structures, allowing recurrent cycles and terminal absorptions to be naturally represented. The longitudinal and event processes are linked through shared latent structures within nonlinear mixed-effects models, extending classical joint modeling formulations. We derive the complete likelihood, model selection criteria, and develop scalable inference procedures based on stochastic gradient descent to enable high-dimensional and large-scale applications. In addition, we formulate a dynamic prediction framework that provides individualized state-transition probabilities and personalized risk assessments along complex event trajectories. Through simulation and application to the PAQUID cohort, we demonstrate accurate parameter recovery and individualized prediction.

2025-10-08T15:24:51Z 34 pages, 12 figures Félix Laplante Christophe Ambroise http://arxiv.org/abs/2604.05639v3 Estimating Dynamic Marginal Policy Effects under Sequential Unconfoundedness 2026-05-25T17:21:41Z

We develop methods for estimating how infinitesimal policy changes affect long-term outcomes in dynamic systems. We show that dynamic marginal policy effects (MPEs) can be identified via tractable reduced-form expressions, and can be estimated under a general sequential unconfoundedness assumption. We also propose a doubly robust estimator for dynamic MPEs. Our approach does not require observing full dynamic state information (as is typically assumed for off-policy evaluation in Markov decision processes), and does not incur an exponential curse of horizon (as is typical in non-Markovian off-policy evaluation). We demonstrate practicality and robustness of our approach in a number of simulations, including one motivated by a dynamic pricing application where people use past prices to form a reference level for current prices.

2026-04-07T09:41:11Z Fix typos I-han Lai Stefan Wager http://arxiv.org/abs/2604.10845v2 Learning Preferences from Conjoint Data: A Structural Deep Learning Approach 2026-05-25T17:00:30Z

Conjoint experiments randomize multidimensional profiles, offering a powerful design for recovering structural preference parameters -- including marginal rates of substitution, willingness to pay, and the distribution of preferences across a population. Yet the dominant approach in political science has focused on nonparametric causal estimands that do not leverage this potential. We propose a structural approach that embeds a deep neural network within a random utility logit model, allowing preference parameters to vary as a fully flexible function of respondent characteristics. The neural network addresses the concern that a parametric specification may not capture the true data generating process, while double/debiased machine learning provides valid inference on average preference parameters. We apply our method to three prominent conjoint studies and find rich preference heterogeneity masked by reduced-form averages: a near-zero gender effect coexists with 83% preferring female candidates, opposition to undemocratic behavior is near-universal but varies sharply in intensity, and progressive tax preferences cut across every partisan subgroup.
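
The random utility logit model underlying this approach can be sketched for a single forced choice between two profiles. The sketch uses a fixed, hypothetical coefficient vector `beta`; in the abstract these parameters are instead the output of a neural network that takes respondent characteristics as input. The function names are also hypothetical.

```python
import math


def utility(beta, profile):
    """Linear systematic utility of a profile under preference weights beta."""
    return sum(b * x for b, x in zip(beta, profile))


def choice_probability(beta, profile_a, profile_b):
    """Binary logit probability that profile A is chosen over profile B."""
    diff = utility(beta, profile_a) - utility(beta, profile_b)
    return 1.0 / (1.0 + math.exp(-diff))


def marginal_rate_of_substitution(beta, k, l):
    """Ratio beta[k] / beta[l]: a one-unit change in attribute k has the
    same effect on utility as this many units of attribute l."""
    return beta[k] / beta[l]


if __name__ == "__main__":
    beta = [0.8, -0.4]  # hypothetical weights for two attributes
    print(choice_probability(beta, (1, 0), (0, 0)))
    print(marginal_rate_of_substitution(beta, 0, 1))
```

Because the structural parameters enter only through utility differences, quantities such as marginal rates of substitution are ratios of coefficients, which is why they can be read off once `beta` is estimated.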

2026-04-12T22:35:04Z Avidit Acharya Jens Hainmueller Yiqing Xu http://arxiv.org/abs/2605.26023v1 Considering causality in the construction of molecular signatures of lifestyle exposures 2026-05-25T16:44:52Z

Molecular signatures derived from omics data are increasingly used in epidemiological studies to characterize lifestyle exposures, either as proxies of exposure or to provide insight into disease mechanisms. These signatures are typically constructed by regressing the exposure on high-dimensional omics features. In the literature, an initial univariate screening step has sometimes been applied prior to multivariate modelling, but the causal implications of this choice have not yet been considered. Focusing on settings where the exposure causally influences molecular features (and not the reverse), we use directed acyclic graphs (DAGs) and $d$-separation arguments to show that collider bias may arise when the screening step is ignored, leading to the inclusion of non-causal features in the signature. We further demonstrate that the screening step can mitigate this bias. Our simulation studies illustrate that screening reduces the inclusion of non-causal features, albeit at the cost of lower sensitivity and reduced correlation between the exposure and the resulting signature. Overall, we recommend applying univariate screening prior to signature construction, particularly when the inclusion of non-causal features is undesirable, such as in mechanistic studies.
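
The collider argument can be checked numerically on a toy example. This is a minimal simulation under a hypothetical DAG, E -> M1 <- U -> M2, which is not taken from the paper: the exposure E causes feature M1, an unobserved factor U affects both M1 and M2, and M2 is not caused by E. All names and effect sizes are assumptions made for illustration.

```python
import random


def simulate(n=5000, seed=0):
    """Draw (E, M1, M2) from the hypothetical DAG E -> M1 <- U -> M2."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        e = rng.gauss(0, 1)              # exposure
        u = rng.gauss(0, 1)              # unobserved common cause of M1, M2
        m1 = e + u + rng.gauss(0, 0.5)   # causal feature, collider of E and U
        m2 = u + rng.gauss(0, 0.5)       # non-causal feature
        rows.append((e, m1, m2))
    return rows


def cov(x, y):
    """Sample covariance with divisor n."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def analyse(rows):
    """Return (marginal corr of E and M2, OLS coefficients of E on M1 and M2)."""
    e, m1, m2 = (list(col) for col in zip(*rows))
    corr_e_m2 = cov(e, m2) / (cov(e, e) * cov(m2, m2)) ** 0.5
    # Two-regressor OLS with intercept via the 2x2 normal equations.
    s11, s12, s22 = cov(m1, m1), cov(m1, m2), cov(m2, m2)
    c1, c2 = cov(e, m1), cov(e, m2)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * c1 - s12 * c2) / det
    b2 = (s11 * c2 - s12 * c1) / det
    return corr_e_m2, b1, b2


if __name__ == "__main__":
    corr, b1, b2 = analyse(simulate())
    print(f"marginal corr(E, M2) = {corr:.3f}")
    print(f"multivariate coefficients: M1 = {b1:.3f}, M2 = {b2:.3f}")
```

In this setup M2 is marginally unrelated to E, so a univariate screen would tend to drop it, whereas the joint regression conditions on the collider M1 and assigns M2 a clearly nonzero coefficient.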

2026-05-25T16:44:52Z 28 pages, 10 figures Diana Wu Vivian Viallon