https://arxiv.org/api/1R1/wbCqIlRFIJLtdeTNK9x0ofE
Updated: 2026-06-21T10:16:21Z

http://arxiv.org/abs/2601.18178v2
Asymptotic properties of the multivariate Szász-Mirakyan estimator for cumulative distribution functions on the nonnegative orthant
Updated: 2026-05-19T02:02:50Z
The asymptotic properties of multivariate Szász-Mirakyan estimators for cumulative distribution functions (cdf) supported on the nonnegative orthant are investigated. Explicit bias and variance expansions are derived on compact subsets of the interior, yielding sharp mean squared error characterizations and optimal smoothing rates. The analysis shows that the proposed Poisson smoothing yields a non-negligible variance reduction relative to the empirical cdf, leading to asymptotic efficiency gains that can be quantified through local and global deficiency measures. The behavior of the estimator near the boundary of its support is examined separately. Under a boundary-layer scaling that preserves nondegenerate Poisson smoothing as the evaluation point approaches the boundary of $[0,\infty)^d$, bias and variance expansions are obtained that differ fundamentally from those in the interior region. In particular, the variance reduction mechanism disappears at leading order, implying that no asymptotically optimal smoothing parameter exists in the boundary regime. Central limit theorems and almost sure uniform consistency are also established. Together, these results provide a unified asymptotic theory for multivariate Szász-Mirakyan cdf estimation and clarify the distinct roles of smoothing in the interior and boundary regions.
Published: 2026-01-26T05:56:30Z
Comment: 40 pages, 3 figures, 3 tables
Authors: Guanjie Lyu, Frédéric Ouimet, Cindy Feng

http://arxiv.org/abs/2603.07018v2
TEA-Time: Transporting Effects Across Time
Updated: 2026-05-19T00:17:24Z
Treatment effects estimated from a randomized controlled trial are local not only to the study population but also to the time at which the trial was conducted.
The literature on generalizing experimental findings to new populations is extensive, yet transporting effects across time has received far less attention, and even defining the target estimand is nonobvious. We formalize the transported average treatment effect under a separable temporal effects assumption, derive two identification strategies (replicated trials and common arm), and develop doubly robust, semiparametrically efficient estimators for each. Applied to a large archive of headline A/B tests, the common arm strategy is substantially more precise but exhibits systematic bias when the temporal factor depends on the gap between intervention and measurement rather than on measurement time alone, while the replicated trials strategy, which allows this dependence, tracks the ground truth more faithfully. Simulation studies investigate when each strategy is reliable and when it silently fails.
Published: 2026-03-07T03:34:13Z
Authors: Harsh Parikh, Gabriel Levin-Konigsberg, Dominique Perrault-Joncas, Alexander Volfovsky

http://arxiv.org/abs/2605.19189v1
Inference Functionals and Observation Operators for Distributional Statistical Models
Updated: 2026-05-18T23:28:36Z
This paper generalises inference functions (Godambe, 1960) to distributional statistical models, in which each probability measure is represented by a distribution--kernel pair $(T_θ, \varphi) \in \mathcal S'(\mathbb R) \times \mathcal S(\mathbb R)$. The generalisation is strategically motivated: the key properties of maximum likelihood estimation -- consistency and asymptotic normality -- derive not from maximising the likelihood but from the MLE being the root of a regular inference function. Extending inference functions to the distributional setting provides an optimality theory for models lacking classical densities or finite moments.
The extension requires enlarging the notion of observation. We introduce observation operators $\mathcal O : \mathcal S'(\mathbb R) \to \mathcal Y$ mapping distributional models to an observation space, and define inference functionals as estimating equations composed with these operators. The framework encompasses classical point observations, interval-censored data, convolutional measurements, and transform-based statistics.
We establish asymptotic theory (consistency, asymptotic normality, Godambe optimality) under mild conditions and derive a hierarchy of information bounds -- classical Fisher information dominates the information available through the observation operator, which in turn dominates the information captured by any inference functional -- via the Hájek--Le Cam convolution theorem. The two gaps quantify distinct sources of information loss: the observation mechanism and the choice of inference functional. Examples include sinusoidal inference functions for heavy-tailed distributions, interval-censored location inference, elliptically contoured models, and nuisance parameters via the Bhapkar--Godambe projection.
Published: 2026-05-18T23:28:36Z
Comment: 40 pages, one figure and one table
Authors: R. Labouriau

http://arxiv.org/abs/2605.19164v1
The Spatial Cramér--von Mises Test of Independence under $β$-Mixing: Asymptotic Theory and Python Implementation
Updated: 2026-05-18T22:32:52Z
We derive the asymptotic distribution of the spatial Cramér--von Mises statistic for testing bivariate independence in stationary random fields on $\mathbb{R}^2$ under polynomial $β$-mixing dependence, and document the Python implementation that reproduces all simulation results. The classical test assumes i.i.d. observations; we extend it to spatially dependent data by combining three ingredients: (i) a Davydov-type covariance bound yielding integrability of the spatial covariance kernel under $θ > 2(2+δ)/δ$; (ii) a reformulation of the inner-form test statistic as a degenerate U-statistic of order 2 with product kernel $Q = G_1 \otimes G_2$, following De Wet (1980); and (iii) an extension of Gregory's (1977) U-statistic limit theorem to $β$-mixing sequences via Yoshihara (1976). The limit distribution is a weighted sum of correlated $χ^2_1$ variables whose eigenvalues factor as products of marginal eigenvalues; in the small-bandwidth limit the correlation vanishes and the limit reduces to the classical i.i.d. form.
Explicit eigenvalue formulas are given for three weight functions (uniform, optimal normal, Anderson--Darling), producing computable critical values. The software generates Matérn random fields by circulant embedding, computes the test statistic via the inner-form kernel decomposition, evaluates asymptotic critical values by Monte Carlo, and runs permutation-based alternatives. Simulation experiments show that the Anderson--Darling weight achieves the best power, while the Mantel and cross-$K$ tests have no power against cross-dependence in spatially correlated fields.
Published: 2026-05-18T22:32:52Z
Comment: 34 pages
Authors: Marco Mandap

http://arxiv.org/abs/2506.17036v2
Bayesian Joint Model of Multi-Sensor and Failure Event Data for Multi-Mode Failure Prediction
Updated: 2026-05-18T21:58:27Z
Modern industrial systems are often subject to multiple failure modes, and their conditions are monitored by multiple sensors, generating multiple time-series signals. Additionally, time-to-failure data are commonly available. Accurately predicting a system's remaining useful life (RUL) requires effectively leveraging multi-sensor time-series data alongside multi-mode failure event data. In most existing models, failure modes and RUL prediction are performed independently, ignoring the inherent relationship between these two tasks. Some models integrate multiple failure modes and event prediction using black-box machine learning approaches, which lack statistical rigor and cannot characterize the inherent uncertainty in the model and data. This paper introduces a unified approach to jointly model the multi-sensor time-series data and failure time concerning multiple failure modes. The proposed model integrates a Cox proportional hazards model, a Convolved Multi-output Gaussian Process, and multinomial failure mode distributions in a hierarchical Bayesian framework with corresponding priors, enabling accurate prediction with robust uncertainty quantification.
Posterior distributions are effectively obtained by Variational Bayes, and prediction is performed with Monte Carlo sampling. The advantages of the proposed model are validated through extensive numerical and case studies with a jet-engine dataset.
Published: 2025-06-20T14:44:15Z
Authors: Sina Aghaee Dabaghan Fard, Minhee Kim, Akash Deep, Jaesung Lee

http://arxiv.org/abs/2507.11719v2
Barycentric model aggregation in the Wasserstein space of distributions and a variational approach to consistency
Updated: 2026-05-18T21:15:53Z
We study the problem of model aggregation within the Wasserstein space for probability measures on the real line. Given a fixed finite collection of candidate probability models, we consider the associated class of Wasserstein barycenters and develop a data-driven calibration framework in which the aggregation weights are statistically learned from empirical information associated with a target distribution. From a variational perspective based on $Γ$-convergence, we establish consistency of the resulting aggregation scheme, showing that empirical minimizers converge to the minimizers of the actual problem, along with the associated barycentric estimators, under mild conditions. The performance of the proposed method is evaluated through synthetic experiments and illustrated on a real dataset from a temperature monitoring network of sensors.
Published: 2025-07-15T20:41:57Z
Comment: 25 pages, 5 figures
Authors: Emmanouil Androulakis, Georgios I. Papayiannis, Athanasios N. Yannacopoulos

http://arxiv.org/abs/2504.08220v2
Feature aware covariance estimation, with application to mixtures of chemical exposures
Updated: 2026-05-18T20:59:11Z
This article aims to improve inferences on the covariation in environmental exposures, motivated by data from a study of Toddlers Exposure to SVOCs in Indoor Environments (TESIE). The challenge is that the sample size is limited, so the empirical covariance provides a poor estimate.
In related applications, Bayesian factor models have been popular; these approaches express the covariance as low rank plus diagonal and can infer the number of factors adaptively. However, they have the disadvantage of shrinking towards a diagonal covariance, often underestimating important covariation patterns in the data. Alternatively, the dimensionality problem is addressed by collapsing the detailed exposure data within chemical classes, potentially obscuring important information. We apply a feature aware covariance regression extension of Bayesian factor analysis, which improves performance by including information from features summarizing properties of the different exposures. This approach enables shrinkage to more flexible covariance structures, reducing the over-shrinkage problem, as we illustrate in the TESIE data using various chemical features.
Published: 2025-04-11T03:00:12Z
Comment: 25 pages, 6 figures
Authors: Elizabeth Bersson, Kate Hoffman, Heather M. Stapleton, David B. Dunson

http://arxiv.org/abs/2605.19113v1
Learning Interpretable Point-Based Clinical Risk Scores via Direct Optimization
Updated: 2026-05-18T20:58:39Z
Many clinical risk scores are deployed as additive rules with nonnegative integer points assigned to relevant binary predictive features. These integer weights not only make the score easier to use in practice but also promote sparsity in the resulting prediction model. Such risk scores are often derived by first fitting a regression model and then rounding the estimated coefficients to the nearest integer after appropriate scaling. This approach is computationally fast but does not guarantee optimality of the resulting score. Alternatively, one may search over all possible integer weights to directly optimize a value function by posing the problem as an integer programming task. However, the associated computational burden can be substantial, especially when the value function is nonconcave or even discontinuous.
In this paper, we develop new machine learning algorithms that employ a flexible greedy optimization strategy to learn such additive scoring directly under explicit and sensible optimality objectives. We apply the proposed method to a large electronic health record (EHR) cohort in Epic Cosmos to construct an integer-weighted comorbidity score for measuring the risk of post-discharge mortality. We also conduct a simulation study to examine the finite-sample operating characteristics.
Published: 2026-05-18T20:58:39Z
Comment: 23 pages, 4 figures
Authors: Ying Cui, Albert M Li, Vivek Charu, Yeon-Mi Hwang, Tina Hernandez-Boussard, Lu Tian

http://arxiv.org/abs/2506.20058v2
Causal mediation analysis for longitudinal and survival data in continuous time using Bayesian non-parametric joint models
Updated: 2026-05-18T20:51:54Z
Observational cohort data is an important source of information for understanding the causal effects of treatments on survival and the degree to which these effects are mediated through changes in disease-related risk factors. However, these analyses are often complicated by irregular data collection intervals and the presence of longitudinal confounders and mediators. We propose a causal mediation framework that jointly models longitudinal exposures, confounders, mediators, and time-to-event outcomes as continuous functions of age. This framework for longitudinal covariate trajectories enables statistical inference even at ages where the subject's covariate measurements are unavailable. The observed data distribution in our framework is modeled using an enriched Dirichlet process mixture (EDPM) model. Using data from the Atherosclerosis Risk in Communities cohort study, we apply our methods to assess how medication -- prescribed to target cardiovascular disease (CVD) risk factors -- affects the time-to-CVD death.
Published: 2025-06-24T23:43:36Z
Authors: Saurabh Bhandari, Michael J.
Daniels, Juned Siddique

http://arxiv.org/abs/2605.19100v1
ldmppr: Location Dependent Marked Point Processes in R
Updated: 2026-05-18T20:40:43Z
In this article, we present $\textbf{ldmppr}$, an R package for estimating, evaluating, simulating from, and visualizing location-dependent marked spatial point processes. To date, it has commonly been assumed that the marks associated with a point process are independent of the locations. However, when dealing with many point processes, such as those arising in forestry applications, the independence assumption proves unreasonable. We introduce a practical framework for generating marked point processes with dependence between the marks and locations. We provide a brief discussion of the theory underpinning our modeling approach and outline the use of the package in a typical scenario involving real data. We highlight the functionality of the package for both generating from and assessing the goodness-of-fit of a given model, enabling users to generate realistic point patterns given a reference pattern or parameter values of interest.
Published: 2026-05-18T20:40:43Z
Comment: 31 pages, 5 figures
Authors: Lane Drew, Andee Kaplan

http://arxiv.org/abs/2205.02726v3
Asymptotic Efficiency Bounds for a Class of Experimental Designs
Updated: 2026-05-18T20:14:02Z
We consider an experimental design setting in which units are assigned to treatment after being sampled sequentially from an infinite population. We derive asymptotic efficiency bounds that apply to data from any experiment that assigns treatment as a (possibly randomized) function of covariates and past outcome data, including stratification on covariates and adaptive designs. For estimating the average treatment effect of a binary treatment, our results show that no further first order asymptotic efficiency improvement is possible relative to an estimator that achieves the Hahn (1998) bound in an experimental design where the propensity score is chosen to minimize this bound.
Our results also apply to settings with multiple treatments with possible constraints on treatment, as well as covariate based sampling of a single outcome.
Published: 2022-05-05T15:57:06Z
Authors: Timothy B. Armstrong

http://arxiv.org/abs/2512.05650v4
Efficient sequential Bayesian inference for state-space epidemic models using ensemble data assimilation
Updated: 2026-05-18T19:22:01Z
Estimating latent epidemic states and model parameters from partially observed, noisy data remains a major challenge in infectious disease modeling. State-space formulations provide a coherent probabilistic framework for such inference, yet fully Bayesian estimation is often computationally prohibitive because evaluating the observed-data likelihood requires integration over a latent trajectory. The Sequential Monte Carlo squared (SMC$^2$) algorithm offers a principled approach for joint state and parameter inference, combining an outer SMC sampler over parameters with an inner particle filter that estimates the likelihood up to the current time point. Despite its theoretical appeal, this nested particle filter imposes substantial computational cost, limiting routine use in near-real-time outbreak response. We propose Ensemble SMC$^2$ (eSMC$^2$), a computationally efficient variant that replaces the inner particle filter with an Ensemble Kalman Filter (EnKF) to approximate the incremental likelihood at each observation time. While this substitution introduces bias via a Gaussian approximation, we mitigate finite-sample effects using an unbiased Gaussian density estimator and adapt the EnKF for epidemic data through state-dependent observation variance. This makes our approach particularly suitable for overdispersed incidence data commonly encountered in infectious disease surveillance. Simulation experiments with known ground truth and an application to 2022 United States (U.S.)
monkeypox incidence data demonstrate that eSMC$^2$ achieves substantial computational gains while producing posterior estimates comparable to SMC$^2$. The method accurately reconstructs epidemic trajectories and estimates key epidemiological parameters, providing an efficient framework for sequential Bayesian inference from imperfect surveillance data.
Published: 2025-12-05T11:51:55Z
Authors: Dhorasso Temfack, Jason Wyse

http://arxiv.org/abs/2605.19034v1
Sparse Latent Class Analysis: Post-Estimation Refinement via Item-level Pseudo-Likelihood
Updated: 2026-05-18T18:57:52Z
Latent Class Analysis (LCA) is widely used to identify unobserved subgroups in social and behavioural sciences. A long-standing challenge for LCA is the interpretability of the latent classes, due to the high complexity of the estimated item response probability matrix. To address this, we propose a computationally efficient post-estimation refinement procedure that enhances model interpretability by a sparse model estimate. The method begins by estimating a classical, unrestricted, latent class model and determining the number of classes using the Bayesian information criterion (BIC). It is followed by a refinement step that further performs model selection on the item-specific response probabilities based on the initial estimate. This refinement penalises the number of distinct response probability levels per item, collapsing redundant levels to yield a sparse matrix that is significantly easier to interpret than those produced by classical LCA. We provide asymptotic theory showing that the proposed procedure consistently recovers the sparse pattern of the item response probabilities for each item, and further validate its performance through extensive simulations. The practical power of the proposed method is further illustrated via an application to survey data on social role performance, where it provides a parsimonious and clear characterisation of the resulting latent classes.
The code for implementing the proposed method is publicly available at https://github.com/florence07/Sparse-LCA-Refinement.
Published: 2026-05-18T18:57:52Z
Comment: 32 pages, 4 figures
Authors: Yuxuan Xu, Lea Kaufmann, Yunxiao Chen, Maria Kateri, Irini Moustaki

http://arxiv.org/abs/2605.19024v1
Conformal Prediction via Transported Beta Laws
Updated: 2026-05-18T18:46:39Z
Split conformal prediction provides finite-sample marginal coverage under exchangeability, but this guarantee averages over the random calibration sample. We study instead the law of the calibration-conditional coverage induced by a realized conformal threshold. In the continuous i.i.d. setting this law is exactly $\mathrm{Beta}(k, n+1-k)$, so the usual marginal guarantee corresponds to its mean. We take this beta law as a finite-sample reference object and quantify departures from it using Wasserstein distances on $[0,1]$. The framework yields direct bounds on marginal coverage gaps and on bad-calibration probabilities, and separates different sources of non-i.i.d. behavior according to how they deform the beta reference: test-side shift acts through a transport map on the coverage scale, while calibration dependence changes the order-statistic law itself. We instantiate the framework in scale-shift, clustered, and stationary mixing settings, where the induced deformations can be characterized explicitly or through Berry-Esseen approximations. Simulations on dependent processes confirm that the first-order approximation tracks the empirical Wasserstein distance even at moderate sample sizes.
Published: 2026-05-18T18:46:39Z
Authors: Thiago R. Ramos, Helton Graziadei, Luben M. C. Cabezas

http://arxiv.org/abs/2605.19006v1
Causal Inference with Categorical Unobserved Confounder via Mixture Learning
Updated: 2026-05-18T18:28:16Z
Unobserved confounding is a fundamental challenge for estimating causal effects. To address unobserved confounding, recent literature has turned to two different approaches -- proxy variables and the use of multiple treatments.
The first approach, commonly referred to as proximal causal inference, requires proxies to be assigned to specific asymmetric roles: treatment-inducing proxies (negative control exposures), variables that act as common causes of the treatment and outcome, and outcome-inducing proxies (negative control outcomes). In practice, however, identifying variables that satisfy these asymmetric roles can be difficult depending on the application domain. The second approach, commonly referred to as the "Deconfounder," deals with multiple conditionally independent treatments. There has been limited progress towards developing a consistent estimation method for this setting. As the primary contribution of this work, we establish that causal effects are identifiable in both settings when the unobserved confounder is categorical under suitable conditions. Our approach builds on a mixture learning perspective: we show that the underlying confounding structure can be recovered by identifying the corresponding mixture distribution. We propose an estimation procedure based on tensor decomposition, which allows consistent recovery of the latent structure and comes with non-asymptotic guarantees. Simulation studies and real data experiments demonstrate that the proposed method performs well even with limited data.
Published: 2026-05-18T18:28:16Z
Authors: Aytijhya Saha, Stephen Bates, Devavrat Shah
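
Illustrative note on the entry "Conformal Prediction via Transported Beta Laws" listed above: its abstract states that, in the continuous i.i.d. setting, the calibration-conditional coverage follows a Beta(k, n+1-k) law. The sketch below is one way to check that statement numerically and is not taken from the paper. It assumes uniform scores (so the coverage of a threshold equals the threshold itself) and the usual split conformal rank k = ceil((n+1)(1-alpha)), which the abstract does not spell out. The function names and the values of n, alpha and reps are hypothetical choices for illustration.

```python
import math
import random
import statistics


def conditional_coverages(n, alpha, reps, seed=0):
    """Simulate calibration-conditional coverage of a split conformal threshold.

    Assumption: scores are i.i.d. Uniform(0, 1), so the coverage given the
    calibration sample is just the value of the threshold.
    """
    rng = random.Random(seed)
    # Assumed rank of the conformal threshold among the n calibration scores.
    k = math.ceil((n + 1) * (1 - alpha))
    coverages = []
    for _ in range(reps):
        scores = sorted(rng.random() for _ in range(n))
        coverages.append(scores[k - 1])  # k-th order statistic
    return k, coverages


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


n, alpha = 100, 0.1
k, cov = conditional_coverages(n, alpha, reps=20000)
mean, var = beta_mean_var(k, n + 1 - k)
print(f"k = {k}")
print(f"simulated mean {statistics.fmean(cov):.4f} vs Beta mean {mean:.4f}")
print(f"simulated var  {statistics.pvariance(cov):.6f} vs Beta var  {var:.6f}")
```

The Beta mean k/(n+1) is the marginal coverage level, which is the sense in which the abstract says the usual marginal guarantee corresponds to the mean of the beta law.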