https://arxiv.org/api/Ntzc1N2WdjXx48dTa4eZBVibpu4
Updated: 2026-06-21T16:19:17Z

http://arxiv.org/abs/2605.16828v1
Prediction-Intervention Games and Invariant Sets
Updated: 2026-05-16T06:15:41Z
We consider the following two-player game: using observational data, the leader chooses a prediction function for a response variable $Y$ from given covariates. The follower then reacts with an intervention on some covariates in the underlying structural causal model to maximize their own objective. The leader knows the intervention targets, but may have limited knowledge of the follower's objective. We call this setup a prediction-intervention game, a special case of a Stackelberg game. Finding an optimal strategy for the leader is generally difficult. To avoid severe performance loss, the leader may base their prediction on the causal parents of $Y$, or more generally on an invariant subset of covariates. We prove, for two common classes of follower objectives, that predictors based on the stable blanket, a specific invariant subset, are always better or as good as those based on the causal parents. We further upper bound the leader's post-intervention risk by a worst-case risk over allowed interventions and strengthen existing distribution generalization results to analyze this bound: we give sufficient conditions under which stable-blanket predictors are worst-case optimal, and show by examples that these conditions cannot in general be dropped. Finally, we discuss practical strategies for settings with known and unknown graph, and test them on simulated and real-world data.
Published: 2026-05-16T06:15:41Z
Authors: Linus Kühne, Felix Schur, Jonas Peters

http://arxiv.org/abs/2603.20904v5
Weak-Form Recovery of Stochastic Generators and Dynamical Invariants
Updated: 2026-05-16T05:54:31Z
Spectral gaps, Kramers escape rates, and position-dependent relaxation timescales are dynamical invariants encoded in the infinitesimal generator $\Lop$ of a stochastic flow.
We show that weak projection of the governing Itô SDE onto temporal test functions produces an endogeneity bias of order $O(T\,\dt^{3/2})$ that grows with the observation window and cannot be eliminated by additional data. Projecting instead onto spatial Gaussian kernels removes the bias exactly: $\mathcal{F}_{t_n}$-measurability and the tower property guarantee unbiased regression rows at every step. The resulting framework jointly identifies the drift $b(x)$ and diffusion $a(x)$ from a single sparse regression, producing an explicit symbolic generator amenable to spectral analysis. Validation on three benchmark systems yields coefficient errors below 5%, stationary-density total-variation distances below 0.01, and autocorrelation functions that faithfully reproduce true relaxation timescales.
Published: 2026-03-21T18:28:10Z
Comment: 21 pages, 5 figures
Authors: Eshwar R A, Gajanan V. Honnavar

http://arxiv.org/abs/2501.02475v2
Tactics for Improving Least Squares Estimation
Updated: 2026-05-16T04:13:38Z
This paper deals with tactics for fast computation in least squares regression in high dimensions. These tactics include: (a) the majorization-minimization (MM) principle, (b) smoothing by Moreau envelopes, and (c) the proximal distance principle for constrained estimation. In iteratively reweighted least squares, the MM principle can create a surrogate function that trades case weights for adjusted responses. Reduction to ordinary least squares then permits the reuse of the Gram matrix and its Cholesky decomposition across iterations. This tactic is pertinent to estimation in L2E regression and generalized linear models. For problems such as quantile regression, non-smooth terms of an objective function can be replaced by their Moreau envelope approximations and majorized by spherical quadratics. Finally, penalized regression with distance-to-set penalties also benefits from this perspective. Our numerical experiments validate the speed and utility of deweighting and Moreau envelope approximations.
Julia software implementing these experiments is available on our web page.
Published: 2025-01-05T08:18:58Z
Authors: Qiang Heng, Hua Zhou, Kenneth Lange

http://arxiv.org/abs/2605.16757v1
NeuroMAS: Multi-Agent Systems as Neural Networks with Joint Reinforcement Learning
Updated: 2026-05-16T02:11:34Z
Multi-agent language systems are often built as hand-designed workflows, where agents are assigned semantic roles and communication protocols are specified in advance. We propose NeuroMAS, a method that first treats a multi-agent language system as a trainable and scalable neural-network-like architecture with LLM agents as nodes and intermediate textual signals as edges. In NeuroMAS, agent nodes are role-free but structure-aware: the topology only determines how information can flow in general, while reinforcement learning training determines how nodes communicate, specialize, and coordinate. This formulation shifts multi-agent design from workflow engineering toward architecture design, where depth, width, connectivity, and growth protocol become scalable sources of capability. Further, we provide a theoretical perspective showing why such modular textual computation is more parameter-efficient when tasks admit hierarchical decompositions. Experiments show that NeuroMAS improves significantly over both inference-time and trained multi-agent baselines. We further find that organizational scaling is path-dependent: larger systems can be challenging to train from scratch, but become feasible when grown progressively from smaller trained systems. These results suggest that learned neural multi-agent systems are a promising scaling axis for LLMs.
Published: 2026-05-16T02:11:34Z
Authors: Haoran Lu, Luyang Fang, Wenxuan Zhong, Ping Ma

http://arxiv.org/abs/2605.16742v1
Diffeomorphic Cortical Alignment via Direct Warping of Streamline Endpoints
Updated: 2026-05-16T01:46:32Z
Cortical surface registration is often driven by local geometric descriptors (e.g., sulcal depth and curvature).
While this approach achieves geometric correspondence, it neglects the long-range wiring constraints imposed by white-matter anatomy. Diffusion MRI tractography offers these crucial constraints; however, prior connectivity-informed pipelines typically align precomputed connectivity matrices, making the optimization highly sensitive to connectivity estimation and its resolution. In this paper, we introduce a novel connectivity-based surface registration method that aligns cortical surfaces by operating directly on white-matter fiber-tract endpoints. We model tract endpoints as a point cloud on the product manifold $Ω\times Ω$, where $Ω$ represents the spherical domain of the inflated cortical hemispheres. Our alignment method iteratively (i) computes a small diffeomorphic warp for $Ω$ by minimizing connectivity mismatch, and (ii) updates the endpoints based on this warp. The method relies on a geometric framework that ensures output warps are diffeomorphisms and has a final goal that optimizes the matching of well-known fiber bundles. Experiments on Human Connectome Project (HCP) data demonstrate improved tract-level correspondence, achieving higher connectivity-level overlap coefficients on major fiber bundles and stronger robustness across grid resolutions for $Ω$ compared to state-of-the-art methods such as ENCORE and MSMAll.
Published: 2026-05-16T01:46:32Z
Authors: Yang Xiang, Martin Cole, Zhengwu Zhang

http://arxiv.org/abs/2509.15480v2
A tree-based kernel for densities and its applications in clustering DNase-seq profiles
Updated: 2026-05-15T22:44:29Z
Modeling multiple sampling densities within a hierarchical framework enables borrowing of information across samples. These density random effects can act as kernels in latent variable models to represent exchangeable subgroups or clusters. A key feature of these kernels is the (functional) covariance they induce, which determines how densities are grouped in mixture models.
Our motivating problem is clustering chromatin accessibility profiles from high-throughput DNase-seq experiments to detect transcription factor (TF) binding. TF binding typically produces footprint profiles with spatial patterns, creating long-range dependency across genomic locations. Existing nonparametric hierarchical models impose restrictive covariance assumptions and cannot accommodate such dependencies, often leading to biologically uninformative clusters. We propose a nonparametric density kernel flexible enough to capture diverse covariance structures and adaptive to various spatial patterns of TF footprints. The kernel specifies dyadic tree splitting probabilities via a multivariate logit-normal model with a sparse precision matrix. Bayesian inference for latent variable models using this kernel is implemented through Gibbs sampling with Polya-Gamma augmentation. Extensive simulations show that our kernel substantially improves clustering accuracy. We apply the proposed mixture model to DNase-seq data from the ENCODE project, which results in biologically meaningful clusters corresponding to binding events of two common TFs.
Published: 2025-09-18T22:56:02Z
Authors: Yuliang Xu, Kaixuan Luo, Li Ma

http://arxiv.org/abs/2605.16652v1
Semiparametric Regression for Misclassified Competing Risks Data
Updated: 2026-05-15T21:45:07Z
The analysis of competing risks data is often complicated by misclassification of the cause of failure. This issue can lead to seriously biased estimates and invalid conclusions. One way to deal with such misclassification is to use a gold-standard cause of failure ascertainment procedure in a subset of the non-right-censored participants (internal validation sample) along with methods for missing data to deal with the missing gold-standard ascertainments. However, this approach can be costly and time-consuming and, therefore, cannot be implemented in many studies.
In this work, we propose a semiparametric regression analysis methodology for the case where no internal validation sample exists. Our approach leverages estimates of the misclassification probabilities from an external validation study to adjust for misclassification in the study at hand. These probabilities are incorporated in a B-spline-based sieve pseudo-likelihood function, which is maximized to jointly estimate models for all event types. Using empirical process theory, we show that the proposed estimator is consistent. Extensive simulation experiments demonstrate that the method performs well with realistic sample sizes and provides substantially more efficient estimates compared to previously proposed approaches. The methodology is applied to competing risks data from a large HIV observational study in sub-Saharan Africa, where event type is misclassified due to significant death under-reporting.
Published: 2026-05-15T21:45:07Z
Comment: Original Article, Biostatistics - Survival Analysis, 2 figures
Authors: Theofanis Balanos, Constantin T. Yiannoutsos, Felix M. Pabon-Rodriguez, Hongmei Nan, Giorgos Bakoyannis

http://arxiv.org/abs/2504.15879v2
Multivariate Poisson intensity estimation via low-rank tensor decomposition
Updated: 2026-05-15T19:59:24Z
In this work, we propose new matrix- and tensor-based methodologies for estimating multivariate intensity functions of inhomogeneous point processes. By viewing multivariate intensity functions as infinite-dimensional matrices or tensors within function spaces, our algorithms attain the optimal bias-variance trade-off, yielding rate-optimal estimation error, with model complexity governed by matrix or tensor ranks. They substantially improve estimation accuracy, while simultaneously reducing computational cost. To illustrate the adaptivity of the proposed framework, we show that many fundamental classes of multivariate functions, including additive and mean-field models, admit finite-rank tensor representations. We apply our method to a four-dimensional U.S.
Geological Survey earthquake dataset, comprising features such as latitude, longitude, depth, and magnitude. Our tensor estimator recovers localized seismicity patterns (California, Oklahoma, Pacific Northwest, north-central U.S.), whereas the kernel baseline oversmooths them.
Published: 2025-04-22T13:25:17Z
Authors: Haotian Xu, Carlos Misael Madrid Padilla, Oscar Hernan Madrid Padilla, Daren Wang

http://arxiv.org/abs/2411.18510v2
A subgroup-aware scoring approach to the study of effect modification in observational studies
Updated: 2026-05-15T19:00:28Z
Effect modification means the size of a treatment effect varies with an observed covariate. Generally speaking, a larger treatment effect with more stable error terms is less sensitive to bias. Thus, we might be able to conclude that a study is less sensitive to unmeasured bias by using these subgroups experiencing larger treatment effects. Lee et al. (2018) proposed the submax method that leverages the joint distribution of test statistics from subgroups to draw a firmer conclusion if effect modification occurs. However, one version of the submax method uses M-statistics as the test statistics and is implemented in the R package submax (Rosenbaum, 2017). The scaling factor in the M-statistics is computed using all observations combined across subgroups. We show that this combining can confuse effect modification with outliers. We propose a novel group M-statistic that scores the matched pairs in each subgroup to tackle the issue. We examine our novel scoring strategy in extensive settings to show the superior performance. The proposed method is applied to an observational study of the effect of a malaria prevention treatment in West Africa.
Published: 2024-11-27T16:58:59Z
Authors: Yijun Fan, Dylan S. Small

http://arxiv.org/abs/2605.16246v1
FRESH: Information-Geometric Calibration of Patient-Level Models to Aggregate Evidence
Updated: 2026-05-15T17:50:11Z
This note introduces FRESH (Fusion of Recent Evidence and Subject Histories), a method for incorporating population-level summary results -- published clinical trials, registry summaries,
prior natural-history studies, and peer-reviewed indirect comparisons -- into predictive models trained on patient-level data. This method provides a principled means of combining both
patient-level and aggregate-level data types into a unified data-efficient model for clinical decision making.
FRESH assumes access to a generative model trained on patient-level data sources (e.g. clinical trial or real-world data). The method produces patient-level predictions from a re-calibrated
model that matches a set of specified aggregate statistics for a target population. This can be understood as a patient-level recapitulation of the aggregate source -- with the key property
that the recalibration is a minimal perturbation of the original joint distribution in a specific information-geometric sense. The resulting samples can be analyzed directly or combined into a
post-training procedure to update the original generative model.
This approach enables several applications where rigorously incorporating patient-level data with summary information is valuable, including (i) contextualizing single-arm trial results with
respect to recent standard-of-care, (ii) clinical-trial simulations for design and probability-of-technical-success estimation, and (iii) comparative-effectiveness analyses of on-market
therapies.
Published: 2026-05-15T17:50:11Z
Authors: Franklin Fuller, Daniele Bertolini, Samantha Liang, Jason Christopher, Aaron M. Smith

http://arxiv.org/abs/2605.16221v1
Why Empirical p-Values Are Not Uniform: Reference Samples, Dependence, and PIT Backtesting
Updated: 2026-05-15T17:32:13Z
Probability integral transforms (PITs) and empirical $p$-values are widely used to assess the calibration of predictive distributions. While exact PIT values are uniformly distributed under correct model specification, practical implementations rely on empirical estimates constructed from finite samples. We show that this estimation step fundamentally alters the statistical structure of the problem. In particular, common-sample and rolling-window implementations introduce dependence and variance distortions that invalidate classical one-sample uniformity tests. When empirical percentiles are conditioned on a shared reference sample, the resulting statistics converge towards a two-sample Kolmogorov--Smirnov regime, while rolling windows induce autocorrelation and variance suppression. Our findings indicate that treating empirical percentiles as independent uniform draws can distort statistical inference and that backtesting procedures based on PITs require revised calibration methods accounting for the underlying two-stage sampling structure.
Published: 2026-05-15T17:32:13Z
Comment: 16 pages, 5 figures
Authors: Jakub Lis

http://arxiv.org/abs/2510.18903v3
Centered-Innovation MA for Bayesian Dirichlet ARMA: Theoretical Equivalence and an Application to Bank-Asset Shares
Updated: 2026-05-15T17:28:44Z
We study a minimal change to an observation-driven Bayesian Dirichlet ARMA (B--DARMA) for compositional time series: replace the raw additive log-ratio (ALR) residual in the moving-average block with a centered innovation that subtracts the Dirichlet conditional ALR mean, available in closed form via digamma identities.
We prove a recursion-level first-order equivalence (in $1/φ$) between the centered specification and a digamma-link DARMA at fixed parameters, under explicit interior and lag-stability conditions. The result clarifies why the two specifications should be predictively indistinguishable in the high-precision regime but does not by itself govern the geometry of the Bayesian posteriors that re-estimation produces. On weekly Federal Reserve H.8 bank-asset shares (October~2015 through October~2025, $T=522$ weeks), predictive performance is statistically indistinguishable across $104$ rolling weekly origins on every accuracy metric examined, while Hamiltonian Monte Carlo divergent transitions are approximately an order of magnitude more frequent under the raw specification, driven by isolated rolling fits at which the raw posterior exhibits localized pathologies. A four-reference sensitivity analysis confirms that predictive equivalence is reference-invariant and that the geometric advantage of centering is preserved across references but varies with the prevalence of pathological raw fits, from a substantial reduction at the loans reference to parity at the cash reference. The practical implication is operational rather than predictive: centering avoids the catastrophic raw-MA divergence spikes that occur at isolated rolling origins, which matters for production workflows in which posterior simulation feeds downstream stress tests. The adjustment is analytic and plug-in, and requires only a local change to the MA innovation calculation.
Published: 2025-10-20T22:13:35Z
Authors: Harrison Katz

http://arxiv.org/abs/2508.14690v3
Nesting a Target Study within a Target Trial: A Framework for Evaluating Intervention Effects on Disparities
Updated: 2026-05-15T16:30:13Z
We present a novel framework (TS+TT) to nest a Target Study (TS) within a Target Trial (TT) for evaluating the effects of interventions on disparities.
The TS component grounds the measurement of disparity in ethical assumptions, based on the concept of allowability, and anchors it to an explicit population within calendar time. It specifies an enrollment plan of stratified sampling of eligible persons to yield a sample where social groups are distributionally similar on covariates deemed allowable for measuring disparity. Within this enrolled sample, the TT component specifies randomization of intervention strategies within each social group. Because social groups are similarly situated on allowable covariates at baseline, and because assigned intervention arms are exchangeable within social groups, TS+TT reflects a meaningful causal estimand for evaluating how interventions impact disparity. We describe the framework's key components, its emulation, and demonstrate its application to evaluate how hypothetical interventions on pulse oximeter bias affect disparities in treatment receipt in clinical care. We also extend semiparametric G-computation to accommodate continuous stochastic interventions and estimate counterfactual disparities in time-to-event outcomes. The TS+TT framework offers a versatile and policy-relevant approach for generating ethically informed causal evidence to reduce disparities and avoid exacerbating disparities.
Published: 2025-08-20T13:14:32Z
Comment: Main text: 23 pages, 4 tables; Appendix: 45 pages
Authors: Xinyi Sun, Theodore J. Iwashyna, Emmanuel F. Drabo, Deidra C. Crews, Kadija Ferryman, John W. Jackson

http://arxiv.org/abs/2605.15108v2
Logging Policy Design for Off-Policy Evaluation
Updated: 2026-05-15T16:30:03Z
Off-policy evaluation (OPE) estimates the value of a target treatment policy (e.g., a recommender system) using data collected by a different logging policy. It enables high-stakes experimentation without live deployment, yet in practice accuracy depends heavily on the logging policy used to collect data for computing the estimate. We study how to design logging policies that minimize OPE error for given target policies.
We characterize a fundamental reward-coverage tradeoff: concentrating probability mass on high-reward actions reduces variance but risks missing signal on actions the target policy may take. We propose a unifying framework for logging policy design and derive optimal policies in canonical informational regimes where the target policy and reward distribution are (i) known, (ii) unknown, and (iii) partially known through priors or noisy estimates at logging time. Our results provide actionable guidance for firms choosing among multiple candidate recommendation systems. We demonstrate the importance of treatment selection when gathering data for OPE, and describe theoretically optimal approaches when this is a firm's primary objective. We also distill practical design principles for selecting logging policies when operational constraints prevent implementing the theoretical optimum.
Published: 2026-05-14T17:25:19Z
Authors: Connor Douglas, Joel Persson, Foster Provost

http://arxiv.org/abs/2605.12830v2
Linking COPD Prevalence with Income Distribution: A Spatial Heterogeneous Compositional Regression via Geographically Weighted Penalized Approach
Updated: 2026-05-15T16:07:51Z
Income inequality is a major contributor to health disparities, yet its effects often vary by geography and are commonly represented as compositional distributions (e.g., proportions of households across income brackets). Existing spatial regression methods struggle in this setting: they typically assume smooth spatial variation, cannot accommodate abrupt spatial heterogeneity, and lack principled treatment of compositional covariates. We propose a geographically weighted penalized compositional regression model that addresses these challenges simultaneously. Our method adopts a pairwise fusion penalty that enables detection of both contiguous and noncontiguous regional clusters with shared regression effects, thereby relaxing strong assumptions of spatial smoothness and geographic contiguity.
This allows regions with similar underlying socioeconomic structures to be identified even when they are not geographically adjacent. By incorporating nonconvex penalties, such as the minimax concave penalty (MCP), the approach achieves improved estimation accuracy, interpretability, and scalability in high-dimensional spatial settings. We illustrate the method through an analysis linking U.S. income composition to chronic obstructive pulmonary disease (COPD) prevalence, revealing spatially heterogeneous associations that are obscured by conventional models. The proposed framework provides a flexible and robust tool for spatial data analysis involving compositional predictors and region-specific heterogeneity.
Published: 2026-05-12T23:54:29Z
Comment: 39 pages, 7 figures, appendix included
Authors: Jingwen Deng, Shujie Ma, Sergio J. Rey, Guanyu Hu