http://arxiv.org/abs/2605.15639v1 Leveraging heterogeneity for identifiability: Bayesian order-based learning of multiple DAGs 2026-05-15T05:43:59Z

We propose a joint order-based scoring framework for causal structure learning of directed acyclic graph (DAG) models under heterogeneous data settings. We show that leveraging heterogeneity improves the accuracy of causal ordering estimation. In the most favorable case, the causal ordering is identifiable up to two permutations. Building on this framework, we propose an order-based Bayesian method for Gaussian DAG models and establish its theoretical properties in the high-dimensional regime. For posterior inference over the space of orderings, we introduce a random-to-random (R2R) proposal neighborhood for the Metropolis-Hastings algorithm, which is theoretically motivated and exhibits efficient mixing behavior. Simulation studies confirm the strong empirical performance of the proposed method, and an application to single-nucleus RNA sequencing data from major depressive disorder demonstrates practical utility.
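The random-to-random move mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming the classical random-to-random shuffle (remove one uniformly chosen node and reinsert it at a uniformly chosen position); the paper's exact neighborhood may differ. The names `r2r_proposal`, `mh_step`, and `toy_log_score` are hypothetical, and the toy score merely stands in for an order-based posterior score.

```python
import math
import random


def r2r_proposal(order, rng):
    """Move one uniformly chosen element to a uniformly chosen position."""
    proposal = list(order)
    node = proposal.pop(rng.randrange(len(proposal)))
    # After the pop there are len(proposal) + 1 possible insertion slots.
    proposal.insert(rng.randrange(len(proposal) + 1), node)
    return proposal


def mh_step(order, log_score, rng):
    """One Metropolis-Hastings step over orderings.

    The move above is symmetric (moving a node from slot i to slot j is
    undone by moving it from j to i with the same probability), so the
    acceptance ratio reduces to the ratio of scores.
    """
    proposal = r2r_proposal(order, rng)
    log_ratio = log_score(proposal) - log_score(order)
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return proposal
    return list(order)


def toy_log_score(order):
    """Hypothetical stand-in score: penalize inversions relative to 0, 1, 2, ..."""
    n = len(order)
    inversions = sum(
        1 for a in range(n) for b in range(a + 1, n) if order[a] > order[b]
    )
    return -2.0 * inversions


if __name__ == "__main__":
    rng = random.Random(0)
    state = [3, 1, 4, 0, 2]
    for _ in range(500):
        state = mh_step(state, toy_log_score, rng)
    print(state)
```

In a real application `log_score` would be the log posterior score of an ordering under the chosen DAG model; only the proposal mechanics are shown here.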

2026-05-15T05:43:59Z Hyunwoong Chang Fariha Taskin http://arxiv.org/abs/2605.15633v1 Structured Transfer Learning for Survival Risk Stratification in Data-Sparse Clinical Cohorts 2026-05-15T05:34:52Z

Background: Survival prediction models are often less reliable in clinical groups with limited sample sizes or few outcome events. Target-only models may be unstable, whereas models from larger cohorts may transfer poorly when risk-factor effects differ across populations. We evaluated whether structured transfer learning can improve survival risk stratification in data-sparse cohorts while allowing cohort-specific adaptation. Methods: We developed the COhort-shared Rank-rEduced Cox model (CORE-Cox), a two-stage framework for multi-outcome survival prediction. CORE-Cox learns shared risk-factor patterns across related outcomes in a larger source cohort via a low-rank Cox coefficient structure, then adapts these patterns to a smaller target cohort through regularized residual correction. We evaluated CORE-Cox in UK Biobank (White source, n=150,093; Asian target, n=2,534) and MIMIC-IV (White ICU source, n=15,997; Asian ICU target, n=672), comparing against target-only Cox, penalized Cox, low-rank multi-task, naive pooling, direct transfer, and single-outcome residual transfer under repeated nested cross-validation. Results: CORE-Cox achieved best or near-best discrimination across most outcomes. Mean C-index improved from 0.733 to 0.766 in UK Biobank and from 0.628 to 0.658 in MIMIC-IV, with gains in eight of nine outcomes. CORE-Cox also improved top-15% risk enrichment, with hazard-ratio estimates typically intermediate between source-only and target-only models. Discussion: CORE-Cox offers an interpretable transfer-learning framework for survival risk stratification in data-sparse cohorts, combining shared cross-outcome structure with cohort-specific adaptation. Further validation is needed before use in calibrated absolute-risk prediction or clinical decision-making.
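For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the paper's evaluation code; the function name `concordance_index` and the handling of ties (tied risks count one half, tied times are skipped) are assumptions made for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] is truthy) strictly before subject j's follow-up time.
    The pair is concordant when the earlier failure has the higher risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    times = [1, 2, 3, 4]
    events = [1, 1, 0, 1]
    print(concordance_index(times, events, [4, 3, 2, 1]))
```

A value of 0.5 corresponds to uninformative risk scores and 1.0 to a perfect ranking, which gives context for the reported improvements.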

2026-05-15T05:34:52Z Junhan Yu Yurui Chen Juan Delgado-SanMartin Dennis Wang Hong Pan Doudou Zhou http://arxiv.org/abs/2404.04775v3 Bipartite causal inference with interference, time series data, and a random network 2026-05-15T04:22:08Z

In bipartite causal inference with interference, interventional units might receive treatment or control, and they might affect the outcome of outcome units through their connections on a bipartite network. We study bipartite causal inference with interference based on observational data across time and a changing bipartite network. Under an exposure mapping framework, we define the immediate and carryover causal effects for each outcome unit, representing contrasts of potential outcomes under different values of the immediately preceding and past exposures, respectively, averaged over time. We establish unconfoundedness of the exposure received by outcome units based on unconfoundedness assumptions on the interventional units' treatment assignment and the random network, hence respecting the bipartite structure of the problem. Our results hold for binary, continuous, and multivariate exposure mappings. In the special case of binary exposure and carryover mappings, we propose algorithms for the immediate and carryover causal effects that combine matching and covariate balancing. We show that the bias of the resulting estimators is bounded. In our motivating study, we find some evidence that smoke from wildfires has an immediate impact on reducing transportation by bicycle in San Francisco.

2024-04-07T01:34:27Z Zhaoyan Song Georgia Papadogeorgou http://arxiv.org/abs/2605.15596v1 Tail postcoloring in long-run variance estimation of time series 2026-05-15T04:07:08Z

Prewhitening is a common approach to dealing with strong autocorrelation. Motivated by it, we propose a new approach called tail postcoloring. It uses parametric models to project, or color back, the neglected tail autocovariances in nonparametric estimators onto the final estimator. This approach bridges the nonparametric variance estimator and the parametric coloring model through a scaling factor. It automatically switches between these two arms using a bandwidth parameter, without the need to transform the entire dataset into residuals, as in the standard prewhitening approach. When the coloring model is well-specified, a parametric rate can be achieved. In finite samples, it is also more robust to misspecification of the coloring model than the standard approach is to misspecification of the whitening model. It also avoids the potentially severe variance inflation or power reduction caused by the recoloring factor in the standard approach. We show that multiple parametric models can be used to construct a multiply robust tail postcolored estimator. It also naturally works for multivariate time series. A real-data example in Markov chain Monte Carlo output analysis is provided.
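To make the "neglected tail autocovariances" concrete, here is a minimal sketch of a standard Bartlett-kernel long-run variance estimator, which uses only sample autocovariances up to a bandwidth and ignores the rest. This is the kind of nonparametric baseline the abstract refers to, not the tail postcoloring estimator itself; the function names are hypothetical.

```python
def autocovariance(x, lag):
    """Sample autocovariance at a given lag (divisor n, mean-centered)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n


def bartlett_lrv(x, bandwidth):
    """Bartlett-kernel estimate of the long-run variance.

    Autocovariances at lags above `bandwidth` receive weight zero; these
    are the tail terms that a parametric model could be used to fill in.
    """
    lrv = autocovariance(x, 0)
    for k in range(1, bandwidth + 1):
        weight = 1.0 - k / (bandwidth + 1)
        lrv += 2.0 * weight * autocovariance(x, k)
    return lrv


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 2.0]
    print(bartlett_lrv(series, 2))
```

With `bandwidth=0` the estimate reduces to the sample variance, which shows how strongly the truncation can matter for autocorrelated series.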

2026-05-15T04:07:08Z Xu Liu Kin Wai Chan http://arxiv.org/abs/2503.00326v2 A Bayesian Additive Regression Tree Model for Learning Conditional Average Treatment Effects in Regression Discontinuity Designs 2026-05-15T01:10:42Z

This paper develops a performant Bayesian approach to conditional average treatment effect (CATE) estimation in regression discontinuity designs (RDD), an increasingly prevalent form of quasi-experiment that facilitates causal inference. Earlier Bayesian approaches do not easily accommodate CATE estimation while recent frequentist approaches to this problem assume a known basis expansion, a steep model specification requirement that our approach avoids. The new model is a variant of a Bayesian additive regression tree (BART) model with linear leaf-level regressions on the running variable and a treatment dummy (and their interaction). The model adaptively partitions covariate space into regions where the slope on the running variable appreciably differs, providing interpretable Bayesian inference on conditional average treatment effects near the cutoff.

2025-03-01T03:23:10Z Rafael Alcantara P. Richard Hahn Hedibert F. Lopes http://arxiv.org/abs/2605.15483v1 Improving the Efficiency of Subgroup Analysis in Randomized Controlled Trials with TMLE 2026-05-14T23:54:07Z

Subgroup analyses within randomized controlled trials are often underpowered due to limited sample sizes. We address this challenge by leveraging trial participants outside the subgroup of interest to augment estimation within the subgroup. Specifically, we study two Targeted Maximum Likelihood Estimators (TMLEs) that borrow information from non-subgroup participants within the same trial: a TMLE with pooled regression (TMLE-PR) and an Adaptive Targeted Maximum Likelihood Estimator (A-TMLE). Both estimators enable information sharing without relying on any external real-world data, thereby capitalizing on key strengths of the trial: most importantly, the protection against bias afforded by the randomized treatment, but also harmonized data collection, and consistent treatment and outcome definitions. The general strategy proposed here directly advances the priorities of key regulatory agencies, including the FDA, by improving the precision of subgroup-specific treatment effect estimates without introducing external sources of bias, thereby facilitating rigorous inference to support equitable labeling, access, and post-market evaluation. In a case study based on analysis of data from a cardiovascular outcome trial (LEADER, NCT01179048), we estimate the risk reduction of major adverse cardiac events (MACE) under liraglutide treatment among Black and Asian subgroups -- each comprising less than 10% of the trial population -- using the proposed estimators that borrow information from the remainder of the trial. Using A-TMLE, in particular, we find estimated absolute MACE risk reductions of 1.6, 1.5, and 1.5 percentage points among Asian participants and 2.1, 2.0, and 2.1 percentage points among Black participants at 365, 540, and 730 days, respectively, with 95% confidence intervals excluding the null at each time point.

2026-05-14T23:54:07Z Sky Qiu Nerissa Nance Rachael Phillips Jens Tarp Maya Petersen Mark van der Laan http://arxiv.org/abs/2605.15469v1 Tree-aggregated regression for compositional data with measurement errors 2026-05-14T23:16:34Z

High-dimensional compositional covariates, often derived from count data, are subject to measurement error and are frequently analyzed after aggregation along a prespecified tree to improve interpretability in applications such as microbiome studies. Existing approaches typically handle either tree-guided compositional regression or errors-in-variables correction, but they do not account for the hierarchical contamination induced by their interaction. We show that tree aggregation turns leaf-level measurement error into level-dependent, correlated contamination across aggregated nodes, which inflates bias, weakens concentration rates for corrected estimating quantities, and leads to unstable variable selection for naive approaches. We propose Tree-Aggregated Regression with Correction for Observation Error (TARCO), which integrates bias-corrected estimating quantities with a tree-aware positive semidefinite stabilization and sparse regularization, with tuning selected by cross-validation based on the corrected objective. The resulting convex program can be solved with scalable algorithms. We establish finite-sample bounds for prediction and estimation errors and prove sign consistency under conditions that explicitly reflect tree heterogeneity. The guarantees persist when the measurement-error covariance is replaced by a consistent estimator. Simulations across multiple tree depths and a microbiome application demonstrate improved estimation accuracy, support recovery, and aggregation-level interpretability compared with methods that ignore the interaction between tree aggregation and measurement error.

2026-05-14T23:16:34Z Zhenghan Li Tianying Wang http://arxiv.org/abs/2605.15428v1 Modeling Misclassification in Spousal Violence Reporting: Evidence from Bayesian Quantile Regression 2026-05-14T21:23:02Z

Quantile regression extends regression analysis beyond the conditional mean, providing a richer characterization of covariate effects across the outcome distribution. For sensitive binary outcomes, however, misclassification due to underreporting can substantially bias inference. We propose a Bayesian quantile regression framework for misclassified binary outcomes that introduces a latent true response and explicitly models false negative and false positive reporting errors. Estimation is performed through a novel Markov chain Monte Carlo (MCMC) algorithm. Simulation studies under varying prior specifications and misclassification rates demonstrate improved performance over models that ignore misclassification. We apply the method to self-reported spousal violence data, examining associations with employment status and household wealth while adjusting for socio-demographic factors. The results indicate that underreporting exceeds overreporting across quantiles and that accounting for misclassification can change substantive conclusions.
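The role of the two reporting errors can be illustrated with the usual mixture identity for a misclassified binary outcome. The sketch below is an assumption-laden illustration, not the paper's model: `p_true` is the probability that the latent true response is 1, and `false_negative` and `false_positive` are hypothetical names for the two misreporting probabilities.

```python
def observed_positive_probability(p_true, false_negative, false_positive):
    """P(reported = 1) when the true binary response is misreported.

    A true positive is reported with probability 1 - false_negative, and
    a true negative is wrongly reported as positive with probability
    false_positive.
    """
    for value in (p_true, false_negative, false_positive):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return (1.0 - false_negative) * p_true + false_positive * (1.0 - p_true)


if __name__ == "__main__":
    # With heavy underreporting, the reported prevalence falls well
    # below the true prevalence of 0.4.
    print(observed_positive_probability(0.4, 0.5, 0.0))
```

When underreporting dominates, a model fitted to the reported outcome alone understates prevalence, which is the bias the latent-response formulation is meant to address.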

2026-05-14T21:23:02Z Joon Jin Song Mohammad Arshad Rahman Yoo-Mi Chin James Stamey http://arxiv.org/abs/2605.15405v1 Estimating Social Norm Complementarities 2026-05-14T20:42:24Z

We develop a model of choice over social norms that allows for complementarities along two dimensions: technological, analogous to complementarities between consumption goods, and social, capturing returns from conformity. Together, these determine whether two norms are complements, substitutes, or independent, as defined by how the equilibrium prevalence of one norm responds to a marginal shift in the utility of another. We estimate the model using repeated cross-sections from Sierra Leone and Nigeria, focusing on female genital cutting, polygyny, and child marriage. Social returns are significant across all specifications. For female genital cutting and child marriage, we find evidence of complementarities, especially strong in Sierra Leone. For polygyny and child marriage, we find evidence of social substitutability, particularly in Nigeria. We interpret these differences using insights from anthropology. Finally, we iterate the model forward to study policy counterfactuals, assessing the potential effects of legal reforms and social interventions.

2026-05-14T20:42:24Z Eliana La Ferrara Cheaheon Lim Davide Viviano http://arxiv.org/abs/2510.20741v2 A comparison of methods for designing hybrid type 2 cluster-randomized trials with continuous effectiveness and implementation endpoints 2026-05-14T20:27:52Z

Hybrid type 2 studies are gaining popularity for their ability to assess both implementation and health outcomes as co-primary endpoints. These studies are often conducted as cluster-randomized trials (CRTs), and five design methods can validly power them: p-value adjustment methods, the combined outcomes approach, the single weighted 1-DF test, the disjunctive 2-DF test, and the conjunctive test. We compared these methods theoretically and numerically. Theoretical comparisons of power equations allowed us to identify when one method had more or less power than another globally. We showed that p-value adjustment methods are always less powerful than both the combined outcomes approach and the single 1-DF test, and identified conditions where the disjunctive 2-DF test is less powerful than the single 1-DF test. To further investigate when power advantages shift, we conducted a large-scale numerical study using our novel crt2power R package, which calculates power or sample size for CRTs with two continuous co-primary endpoints using these methods. Across 45,000 input scenarios, we found specific patterns: when treatment effects are unequal, the disjunctive 2-DF test tends to be most powerful; when treatment effects are equal, the single 1-DF test tends to dominate. Together, these comparisons offer practical guidance for powering hybrid type 2 studies.

2025-10-23T17:00:15Z Melody Owen Fan Li Ruyi Liu Donna Spiegelman http://arxiv.org/abs/2605.15373v1 Nonparametric inference for sublevel-set probabilities of conditional average treatment effect functions 2026-05-14T20:01:21Z

The average treatment effect can obscure important heterogeneity when individuals respond differently to a treatment. While the conditional average treatment effect (CATE) function captures such heterogeneity, it is difficult to communicate when it depends on many covariates. Sublevel sets of a multivariate CATE function are equally complicated objects, but the probability of a sublevel set of a CATE function is a single number with a simple interpretation as the proportion of individuals whose expected treatment effect does not exceed a prespecified threshold. Varying the threshold yields a univariate monotone curve that can be used to visualize the overall type and degree of heterogeneity in a population. We formalize this curve as a target parameter and show that it is not pathwise differentiable under a nonparametric model. To address this nonstandard estimation problem, we leverage recent advances in monotone function estimation and develop a Grenander-type estimator that incorporates machine learning. We also show that the best piecewise linear approximation to the curve of interest is a pathwise differentiable parameter, and we develop a debiased machine learning estimator of this approximation. We investigate our proposed estimators' finite sample performance in a sequence of numerical studies based on data synthesized from a randomized trial. The methods are illustrated with data from a randomized trial on diabetes medication.
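The curve described in this abstract can be illustrated with a naive plug-in computation: given estimated CATE values for a sample, take the share that does not exceed each threshold. This is only a sketch of the target quantity, not the Grenander-type or debiased estimators the paper develops; `cate_values` is a hypothetical list of per-individual CATE estimates from some fitted model.

```python
def sublevel_probability(cate_values, threshold):
    """Share of individuals whose estimated CATE is at most `threshold`."""
    if not cate_values:
        raise ValueError("cate_values must be non-empty")
    return sum(1 for value in cate_values if value <= threshold) / len(cate_values)


def heterogeneity_curve(cate_values, thresholds):
    """Evaluate the sublevel-set probability over a grid of thresholds.

    The result is nondecreasing in the threshold, like a distribution
    function of the CATE.
    """
    return [(t, sublevel_probability(cate_values, t)) for t in sorted(thresholds)]


if __name__ == "__main__":
    estimates = [-1.0, 0.0, 0.5, 2.0]
    for threshold, share in heterogeneity_curve(estimates, [-2.0, 0.0, 1.0, 3.0]):
        print(threshold, share)
```

A curve that jumps from 0 to 1 at a single threshold would indicate a homogeneous effect, while a gradual rise indicates heterogeneity.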

2026-05-14T20:01:21Z Anders Munch Thomas A. Gerds http://arxiv.org/abs/2605.15303v1 Functional Cox model for interval-censored data 2026-05-14T18:17:08Z

Interval-censored data arise frequently in scientific studies, where the event of interest is known only to occur within a specific time interval. In such studies, functional covariates taking the form of continuous curves or spatial profiles are increasingly encountered, and it is of substantial scientific relevance to investigate how the trajectory of a functional covariate affects the event time. We formulate the effects of both scalar and functional covariates on the interval-censored event time through a functional Cox model. We consider penalized maximum likelihood estimation for this model and devise an EM algorithm to stably compute the parameter estimators. The resulting estimators for the regression parameters and linear functionals of the coefficient function are shown to be consistent and asymptotically normal, with limiting covariance matrices that attain the semiparametric efficiency bound and can be readily estimated through the profile likelihood method. Building upon these results, we construct a global test for the overall effect of the functional covariate. Finally, we assess the performance of the proposed methods through extensive simulation studies and present an application to data from the Alzheimer's Disease Neuroimaging Initiative.

2026-05-14T18:17:08Z Yangjianchen Xu Peijun Sang http://arxiv.org/abs/2605.15142v1 Creating treatment and component hierarchies in component network meta-analysis 2026-05-14T17:46:51Z

Component network meta-analysis (CNMA) is a statistical methodology that enables estimation of relative effects for multi-component treatments, such as combinations of antidepressants, and individual components, such as single antidepressants, by synthesizing data from multiple studies. A commonly desired output of a systematic review and meta-analysis is a hierarchy of the treatments in terms of a certain performance metric. Methods have been established for standard network meta-analysis (NMA), but have not yet been extended to CNMA. In particular, CNMA presents unique challenges because the set of relative effects that can be uniquely estimated is more complex to determine compared to standard NMA, and a hierarchy involving relative effects that are not uniquely estimable is misleading. We present a step-by-step workflow for answering treatment hierarchy questions in both frequentist and Bayesian CNMA, including explicitly identifying the uniquely estimable relative effects. We illustrate the workflow by posing multiple treatment hierarchy questions in two distinct networks, one concerning primary care of depression and one disconnected network investigating treatment for chronic lymphocytic leukemia.

2026-05-14T17:46:51Z Augustine Wigle Audrey Béliveau Adriani Nikolakopoulou Lifeng Lin http://arxiv.org/abs/2605.15115v1 A Practical Guide to Instrumental Variables Methods with Heterogeneous Treatment Effects 2026-05-14T17:29:26Z

Instrumental variables (IV) methods are central to applied microeconomics. While classical approaches assume linear models with constant effects, recent literature has shifted toward the local average treatment effect (LATE) framework to accommodate heterogeneous treatment effects. This paper provides a practical guide to aligning empirical practice with recent theory. We first examine how different specifications with covariates lead to distinct weighted averages of covariate-specific LATEs. We then discuss how parametric misspecification can undermine the causal interpretation of these estimands and suggest flexible specifications as essential robustness checks. Finally, we review formal tests for LATE assumptions and methods robust to monotonicity violations. We provide a guide to software implementations to help researchers apply the methods in practice.
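As a point of reference for the LATE framework, the simplest case (binary instrument, binary treatment, no covariates) uses the Wald ratio: the instrument's effect on the outcome divided by its effect on treatment uptake. The sketch below assumes that setting only and does not reproduce the covariate-adjusted specifications the paper discusses; `wald_late` is a hypothetical name.

```python
def wald_late(y, d, z):
    """Wald estimate: reduced-form difference over first-stage difference.

    y: outcomes, d: binary treatment indicators, z: binary instrument.
    Under the standard LATE assumptions this ratio estimates the average
    effect among compliers.
    """
    def group_mean(values, group):
        selected = [v for v, zi in zip(values, z) if zi == group]
        if not selected:
            raise ValueError("both instrument groups must be non-empty")
        return sum(selected) / len(selected)

    reduced_form = group_mean(y, 1) - group_mean(y, 0)
    first_stage = group_mean(d, 1) - group_mean(d, 0)
    if first_stage == 0:
        raise ValueError("instrument does not shift treatment uptake")
    return reduced_form / first_stage


if __name__ == "__main__":
    z = [0, 0, 1, 1]
    d = [0, 0, 1, 0]
    y = [1.0, 1.0, 3.0, 1.0]
    print(wald_late(y, d, z))
```

Once covariates enter, different specifications weight the covariate-specific versions of this ratio differently, which is the issue the guide examines.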

2026-05-14T17:29:26Z Tymon Słoczyński Liyang Sun S. Derya Uysal http://arxiv.org/abs/2605.15085v1 From Data to Action: Accelerating Refinery Optimization with AI 2026-05-14T17:07:41Z

Refinery optimization now relies on vast amounts of data, which modern Linear Programming (LP) software can handle, but interpreting and applying the results remains challenging. Large petrochemical companies use massive models, with hundreds of thousands of input matrix elements. The LP solution is mathematically correct, but simplifications are made in the model, and data supply errors may occur. Therefore, further insight is needed to trust the results. The LP solver does not have a memory, so additional understanding could be gained by analyzing historical data and comparing it to the current plan. To this end, machine learning approaches are suggested to support decision making based on the LP solution. Among these, Anomaly Detection tools are proposed for use in tandem with the LP output. A transformed version of the popular ECOD methodology is applied. New methods are proposed to handle high-dimensional data by choosing the most informative pairs. These are then used alongside two 2D Anomaly Detection algorithms, revealing several business opportunities and data supply errors in the MOL refinery scheduling and planning architecture.

2026-05-14T17:07:41Z 34 pages, 17 figures Dániel Pfeifer Ábrahám Papp Tibor Bernáth Tamás Zoltán Varga Márk Czifra Botond Szilágyi Edith Alice Kovács