https://arxiv.org/api/WQxlRi7DCU7QNWkZBbnLIrunFjU
2026-06-18T09:50:13Z

http://arxiv.org/abs/2605.15902v2
Tweedie's Formula and Score-Driven Updating
2026-05-26T09:55:03Z
Score-driven models update time-varying parameters using conditional likelihood scores. This paper develops a Bayesian interpretation of such updates through Tweedie's formula, which connects posterior mean corrections with marginal scores. In Gaussian signal extraction, this gives an exact posterior-correction identity. For natural exponential families, related identities characterize posterior means in natural- and expectation-parameter spaces. Building on these identities, we show that conjugate Bayesian filtering in expectation space coincides exactly with an inverse-Fisher-scaled conditional score update under local precision discounting. For general conditional densities, the exact Bayesian correction involves a generally unavailable predictive-marginal score. A local Gaussian approximation shows that the conditional likelihood score provides the leading approximation to this posterior correction; under local precision discounting, the predictive covariance becomes proportional to inverse Fisher information, yielding the familiar inverse-Fisher-scaled score recursion. The results clarify when score-driven updates are exact Bayesian filters and when they should instead be viewed as tractable local approximations.
2026-05-15T12:41:05Z
Peter Reinhard Hansen, Chen Tong

http://arxiv.org/abs/2605.26723v1
Marginal likelihoods for finite-support Huber contamination
2026-05-26T09:03:09Z
For Huber contamination on a known finite sample space, the unrestricted contaminating law is a probability vector on the support atoms, and domination over all measurable subsets reduces to atomwise inequalities.
Placing a Dirichlet prior on this probability vector and a Beta prior on the contamination proportion gives an exact marginal likelihood for the structural parameter after analytic integration of both nuisance quantities. The likelihood is a finite weighted sum over allocations of the observed counts between the structural and contaminating components. For fixed support size, this sum and its score can be evaluated by a dynamic program with quadratic cost in the sample size, enabling gradient-based posterior sampling.
2026-05-26T09:03:09Z
16 pages, 3 figures
Jaehoan Kim

http://arxiv.org/abs/2605.26608v1
Statistical Inference and Stability Boundaries of Multi-cellular Interaction Hypergraphs from Asynchronous Event Streams
2026-05-26T06:45:04Z
We introduce the Hyperedge-triggered Hawkes (HTH) process for inferring higher-order interaction structure in multi-cellular systems from asynchronous event-time data. Beyond standard pairwise excitation, the HTH intensity includes a term activated by the simultaneous co-firing of a cell group within a temporal window. We derive a closed-form Expectation-Maximisation algorithm whose key ingredient is a piecewise compensator that eliminates the systematic bias present in the naive integral formulation. A CP tensor decomposition reduces the hyperedge parameter count from O(N^K) to O(NR). Across eleven synthetic experiments the framework achieves pairwise recovery error below 5%, while revealing a systematic -22% bias on hyperedge weights that is non-monotonic in the kernel decay rate, ruling out a simple temporal-overlap explanation and motivating adaptive kernel methods. On multi-electrode recordings of mouse retinal ganglion cells, the model yields a +20.6 nat likelihood gain over the pairwise baseline, providing suggestive but not decisive evidence for higher-order interactions.
Code and all experiments are publicly available at https://github.com/Hanii0210/hypergraph-hawkes.
2026-05-26T06:45:04Z
8 pages, 5 figures, 1 table
Zihan Xu

http://arxiv.org/abs/2605.26572v1
Using Transcripts for Nonparametric Monitoring of Serial Dependence
2026-05-26T05:44:37Z
Control charts for process monitoring are widely used in practice. Most control charts require the monitored (residuals) process to be serially independent (and to satisfy specified distributional assumptions), whereas undetected dependence (or violations of distributional assumptions) may severely affect the charts' performance. Therefore, (distribution-free) control charts for monitoring serial dependence are of utmost relevance for practice. Recently, various nonparametric control charts based on ordinal patterns have been proposed for this purpose, and they have shown appealing performance in detecting different types of serial dependence. In this research, we progress further in this direction and develop novel nonparametric control charts based on transcripts and algebraic distances (as derived from ordinal patterns). The performance of the newly proposed control charts is evaluated in a simulation study, and their application in practice is illustrated with a real-world data example from the chemical industry.
2026-05-26T05:44:37Z
Christian H. Weiß, José M. Amigó

http://arxiv.org/abs/2605.26568v1
Target-Oriented Statistical Compression: Sufficiency, Reverse Martingales, and Sequential Monitoring
2026-05-26T05:37:10Z
Statistical procedures rarely retain all features of the observed data. A sufficient statistic removes information irrelevant to a parameter; a maximum likelihood estimate compresses an empirical objective into an optimizing point; and a hidden state in a sequential model compresses past observations into a learned representation.
This article develops these practices under the unified notion of \emph{target-oriented statistical compression}: a useful summary preserves what matters for an inferential, predictive, or decision-relevant target, rather than every detail of the realized data path.
The central object is the conditional target process \(M_n=\E(Z\given\G_n)\), where \(Z\) is the target and \(\G_n=σ(T_n)\) is the information retained by the compression map \(T_n\). When \((\G_n)\) is a decreasing filtration, \((M_n)\) is a reverse martingale with limit \(M_\infty=\E(Z\given\G_\infty)\). Exact sufficiency corresponds to lossless compression, while approximate summaries such as penalized estimators, principal components, and neural-network hidden states produce reverse quasi-martingale defects measuring coherence loss across compression levels. The diagnostic \(r_n=|M_n-M_{n-1}|\) is treated as an observable stability proxy, not as an unbiased estimator of the theoretical defect.
Boundary degeneracy in sequential binary problems is developed as a central application. Practical boundary claims require joint assessment of boundary closeness, uncertainty control, and trajectory stability. The companion paper \citet{chang2025rm} develops the corresponding stopping procedures, finite-sample bounds, and numerical evidence; the present paper provides the broader theoretical infrastructure and extends the framework to Gaussian, Poisson, and quasi-martingale monitoring problems.
2026-05-26T05:37:10Z
28 pages, 9 figures
Yuan-chin Ivan Chang

http://arxiv.org/abs/2605.26532v1
Global Average Treatment Effects for Individualized Randomization Experiments with Aggregate Data
2026-05-26T04:23:53Z
Individualized randomized experiments are central to online platforms for optimizing personalized decisions in complex environments. In two-sided markets, however, standard treatment effect estimation is often invalid due to strong temporal and cross-unit interference, a challenge compounded when only aggregated data are available because of privacy or system constraints. To address these issues, we identify the Global Average Treatment Effect (GATE) using only group-level data from treatment and control groups. We first establish identification conditions based on aggregated observations, and then propose the Individualized Randomized Experiment Varying Coefficient Decision Process (IRE-VCDP) model, which accounts for interference through supply-demand dynamics. Building on this framework, we develop a complete procedure for estimation and statistical inference of the GATE, along with theoretical guarantees for the proposed test.
Extensive simulations and real-world experiments using data from a leading ridesharing platform demonstrate the effectiveness of our approach.
2026-05-26T04:23:53Z
Shuguang Yu, Ting Li, Yuchen Lu, Chengchun Shi, Fan Zhou, Zhichao Zou, Peng Zhen, Hongtu Zhu

http://arxiv.org/abs/2605.26515v1
Learning a directed acyclic graph with additive heteroscedastic errors
2026-05-26T03:57:58Z
This paper studies causal discovery for a directed acyclic graph under a structural equation model with additive heteroscedastic errors. We first establish new identifiability results for location-scale noise models, showing that heteroscedasticity can be leveraged to recover causal directions. Based on these insights, we propose a novel iterative procedure, Residual Simultaneous Quantile Estimation (RESQUE), where each iteration consists of a residual-construction stage and a composite quantile regression stage, enabling recursive identification of sink nodes via the invariance of conditional scale coefficients across quantiles. We then establish its theoretical guarantees for recovering topological order and graph structure, even when the number of variables diverges with the sample size. Simulation studies and application to benchmark datasets show that RESQUE performs favorably compared with existing methods, especially when causal information is partly encoded in the variance component.
These results highlight exploiting structured variance signals for causal discovery and provide a principled framework for multivariate causal discovery beyond mean-based modeling.
2026-05-26T03:57:58Z
33 pages, 4 figures
Xintao Xia, Li Chen, Yue Hu, Chunlin Li

http://arxiv.org/abs/2605.26507v1
Improving inverse probability of censoring weighting for win statistics with composite survival outcomes
2026-05-26T03:47:36Z
Win statistics, including the win ratio, net benefit, and win odds, summarize treatment effects on hierarchical composite endpoints by sequentially comparing patient pairs on component outcomes ordered by clinical importance, proceeding to lower priority components only when higher priority ones are tied. Restricting comparisons to a pre-specified clinical horizon yields well defined estimands separated from the censoring mechanism, and it is critically important to address right censoring during estimation. Existing inverse probability of censoring weighting methods discard indeterminate pairs entirely, incurring avoidable efficiency loss that grows with censoring and restriction horizon length. We propose a novel estimator that replaces the confirmation of higher priority ties with a conditional tie probability given the observed data, recovering partially observed pairs as fractional contributions. Large sample theory is developed based on two-sample U-statistics with estimated nuisance functions, and closed-form sandwich variance estimators are obtained for the win ratio, net benefit, and win odds under our new weighting scheme.
Simulations demonstrate sizable efficiency gains from our new estimator, growing sharply as censoring moves from light to heavy, and we further apply our estimator to reanalyze a completed randomized clinical trial.
2026-05-26T03:47:36Z
Xi Fang, Fan Li

http://arxiv.org/abs/2508.00223v4
Structural Causal Models for Extremes: an Approach Based on Exponent Measures
2026-05-26T01:47:39Z
We introduce a new formulation of structural causal models for extremes, called the extremal structural causal model (eSCM). Unlike conventional structural causal models, where randomness is governed by a probability distribution, eSCMs use an exponent measure, an infinite-mass law that naturally arises in the analysis of multivariate extremes. Central to this framework are activation variables, which abstract the single-big-jump principle, along with additional randomization that enriches the class of eSCM laws. This formulation encompasses all possible laws of directed graphical models under the recently introduced notion of extremal conditional independence. We also identify an inherent asymmetry in eSCMs under natural assumptions, enabling the identifiability of causal directions, a central challenge in causal inference. Finally, we propose a method that utilizes this causal asymmetry and demonstrate its effectiveness in both simulated and real datasets.
2025-08-01T00:01:23Z
Updated the statement of Theorem 5
Shuyang Bai, Fei Fang, Tiandong Wang

http://arxiv.org/abs/2605.26429v1
Structure-Adaptive Conformal Inference for Large-Scale Out-of-Distribution Testing
2026-05-26T01:28:16Z
This paper addresses structured out-of-distribution (OOD) testing in high-stakes machine learning applications. Traditional conformal methods rely on joint exchangeability, making it difficult to incorporate auxiliary information such as spatiotemporal or grouping structures.
To overcome this limitation, we propose the structure-adaptive conformal q-value (SCQ), a significance index that integrates individual test evidence with structural patterns. We also develop pseudo-score-guided transductive automated model selection (P-TAMS), which adapts conformalized model selection to structured OOD testing across a toolbox of candidate models. Together, SCQ and P-TAMS form a unified framework under pairwise exchangeability, providing finite-sample error-rate control, improved power, and enhanced interpretability. Experiments on simulated and real data demonstrate that the proposed approach controls the false discovery rate and performs well across diverse settings.
2026-05-26T01:28:16Z
Rongyi Sun, Wenguang Sun, Zinan Zhao

http://arxiv.org/abs/2605.26413v1
Confounder Detection via Treatment Intent: A New Observational Study Design
2026-05-26T00:41:47Z
Understanding the effects of interventions is central to scientific progress, with randomized controlled trials (RCTs) regarded as the gold standard for causal inference in many applied fields. However, RCTs are costly, time-consuming, and often constrained by ethical or practical limitations, motivating the need for causal methods able to draw conclusions from observational data. While such data is collected at ever larger scale, its use for causal inference is often hindered by the fact that not all variables affecting treatment allocation and the outcome are observed: an issue known as unobserved confounding. In this paper, we introduce a new study design called confounder detection via treatment intent. The idea is to query a human expert who makes treatment decisions, and ask them to compare pairs of units proposed by a principled matching strategy, with the goal of eliciting unobserved variables that explain why treatment decisions differ. We provide a theoretical basis for such a procedure, ascertaining conditions under which such a study design may elicit unobserved confounders.
Building on these newly established foundations, we study treatment effects of interventions in the intensive care unit (ICU). First, we show empirical evidence strongly indicating that electronic health records (EHRs) collected in ICUs are subject to unobserved confounding. By using clinical text notes as a proxy for physicians' knowledge and leveraging natural language processing, we provide a proof of concept for our methodology in a semi-synthetic environment with a known ground truth.
2026-05-26T00:41:47Z
Drago Plecko, Patrik Okanovic, Torsten Hoefler, Elias Bareinboim

http://arxiv.org/abs/2605.27463v1
When prompt perturbations break your A/B test: A valid statistical test for generative surveying
2026-05-26T00:35:58Z
Generative surveying -- where collections of LLM-based personas provide feedback on messages -- has emerged as a cheap and scalable alternative to traditional market research. However, LLMs are sensitive to small variations in prompt design and conclusions drawn from generative surveys may depend on arbitrary phrasing choices. Controlling for this sensitivity requires including semantically equivalent perturbations in the analysis. In this paper, we show that standard hypothesis tests, including the sign test and Wilcoxon signed-rank test, are invalid under a statistical model for generative surveying that includes realistic perturbation structure. We propose a permutation test that is valid under this model and formally characterize the conditions under which standard tests fail. Applying our framework to a simple generative surveying problem, we estimate relevant parameters, characterize the power of the permutation test under realistic conditions, and provide practical guidance on budget allocation across personas, perturbations, and replicates.
Finally, we show that both the magnitude and direction of the estimated effect are sensitive to the choice of model, even within the same model family.
2026-05-26T00:35:58Z
Hayden Helm, Carey Priebe

http://arxiv.org/abs/2605.26401v1
Small-Area Precipitation Forecasting and Drought--Flood Early Warning with Reverse-Martingale Regularized Recurrent Networks
2026-05-26T00:18:05Z
Small-area precipitation forecasts support real-time decisions for reservoir operation, irrigation planning, drought monitoring, and flash-flood response. Operational value depends not only on point accuracy, but also on calibrated exceedance probabilities and warning rules that remain stable when local weather regimes depart from the training climatology. We evaluate a reverse-martingale regularized recurrent neural network (\RMRNN) for probabilistic precipitation forecasting and sequential early warning. A backward-coherence penalty is added to the recurrent hidden state; the resulting residual process drives a Shiryaev--Roberts (SR) detector, so the same latent trajectory that produces the forecast also supplies a continuously updated drought or flood-regime indicator. The framework is tested on the Taiwan CWA dense rain-gauge network, CHIRPS v2 daily gridded precipitation over Taiwan and the Horn of Africa, and NOAA GHCN-Daily stations over the Texas Hill Country. Across 1{,}000 replications, \RMRNN{} matches or slightly improves the GRU baseline in RMSE, MAE, and CRPS at 1~h--72~h lead while substantially improving alarm characteristics. The SR detector reduces false-alarm ratios by a factor of three to five at matched detection power.
In the 2020--2021 Taiwan drought, onset is flagged 8--12 days earlier than SPI-3 thresholding; in the 2023 Typhoon Haikui flood, flash-flood risk is signalled 4~h before the CWA operational alert.
2026-05-26T00:18:05Z
4 figures
Foo Hui-Mean, Yuan-chin Ivan Chang

http://arxiv.org/abs/2605.26335v1
Unobserved Heterogeneity in Threshold Regression Based on the Hitting Times of a Reflected Brownian Motion for Recurrent Hypoglycemia
2026-05-25T21:13:54Z
Analyses of recurrent hypoglycemia are critical for effective treatment management in diabetic patients. Typically, within-subject dependency in such analyses is captured through subject-level frailty. Recent research has modeled recurrent hypoglycemia using the first hitting times of a reflected Brownian motion. A close examination of this approach reveals that it does not adequately account for varying frailties among individuals, which indicate notable heterogeneity. To address this gap, we propose a finite mixture model of the first hitting time distribution of the reflected Brownian motion. This model allows for component-specific regression coefficients and frailty parameters, providing nuanced insights into how risk factors differently affect patient subgroups. We employ a Bayesian framework for inference, utilizing Markov chain Monte Carlo for estimation. Model selection is conducted using the Deviance Information Criterion and the Logarithm of the Pseudo-Marginal Likelihood. The effectiveness of these criteria is assessed through simulation studies. Application to recurrent hypoglycemia modeling revealed two subgroups with different risk profiles, as reflected in their volatilities. Bayesian model comparison criteria favor the model with component specific regression coefficients for volatilities.
The subgroup with lower volatility exhibits a larger variance and, hence, a greater level of heterogeneity.
2026-05-25T21:13:54Z
Yingfa Xie, Haoda Fu, Yuan Huang, Jun Yan

http://arxiv.org/abs/2605.26312v1
Cross-modal dependence analysis with asynchronous longitudinal multimodal data
2026-05-25T20:09:47Z
We propose a Bayesian latent variable model to estimate covariate-assisted dependence structures across multiple modalities of multivariate data that may be observed asynchronously. This setting commonly arises in longitudinal biomedical research, especially in observational and clinical studies of complex diseases, where dynamic and heterogeneous dependence across biomarker modalities can be pathologically and clinically informative. For example, the biological diagnosis and staging of Alzheimer's disease require integrated evaluation of multimodal biomarkers, including imaging and biofluid biomarkers, and the Alzheimer's Disease Neuroimaging Initiative (ADNI) study has collected biomarker data longitudinally for over two decades. However, quantitative analysis is often challenged by asynchronous collection of multimodal profiles due to study design and data collection constraints. Common analytic strategies that restrict inference to complete observations or analyze each modality separately can lose information and introduce bias. Therefore, we aim to jointly model all available data and estimate the population-level cross-modal dependence structure that evolves over time and varies across demographic or clinical groups, where the cross-covariance matrices for modality pairs serve as the primary quantities of interest. The proposed model uses modality-specific low-rank loading structures with shared latent variables to borrow information across modalities, visits, and subjects while accounting for repeated measurements.
The application to ADNI data reveals clinically meaningful patterns in longitudinal cross-modal biomarker dependence, and the simulation study shows improved recovery under limited modality synchrony.
2026-05-25T20:09:47Z
Kun Qian, Hyung G. Park