http://arxiv.org/abs/2406.06980v2 Sensitivity Analysis for the Test-Negative Design 2026-05-14T15:50:28Z

The test-negative design has become popular for evaluating the effectiveness of post-licensure vaccines using observational data. In addition to its logistical convenience in data collection, the design is also believed to control for differential health-care-seeking behavior between vaccinated and unvaccinated individuals, an important yet often unmeasured confounder of the relationship between vaccination and infection. Hence, the design has been employed routinely to monitor seasonal flu vaccines and, more recently, to measure COVID-19 vaccine effectiveness. Despite its popularity, the design has been questioned, in particular regarding its ability to fully control for unmeasured confounding. In this paper, we explore deviations from a perfect test-negative design and propose various sensitivity analysis methods for estimating the effect of vaccination, measured by the causal odds ratio, on the subpopulation of individuals with good health-care-seeking behavior. We start with point identification of the causal odds ratio under a test-negative design, comparing different forms of identification assumptions and their corresponding estimands. We then propose two approaches for conducting sensitivity analysis, which address the influence of unmeasured confounding in two different ways. Specifically, one approach investigates partial control for unmeasured confounding in the test-negative design, while the other examines the impact of unmeasured confounding on both vaccination and infection. Furthermore, we combine these approaches to provide narrower bounds on the true causal odds ratio, and further sharpen the bounds by restricting the treatment effect heterogeneity. Finally, we apply the proposed methods to evaluate the effectiveness of COVID-19 vaccines using observational data from test-negative designs.
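
For orientation, the conventional crude estimate in a test-negative design can be sketched as follows: the odds of vaccination among test-positives are divided by the odds among test-negatives, and vaccine effectiveness is reported as one minus that odds ratio. This is only the standard unadjusted calculation, not the sensitivity analysis proposed in the paper; the function names and all cell counts are hypothetical.

```python
def tnd_odds_ratio(vacc_pos, unvacc_pos, vacc_neg, unvacc_neg):
    """Crude odds ratio of vaccination, test-positives vs. test-negatives."""
    if min(vacc_pos, unvacc_pos, vacc_neg, unvacc_neg) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_among_positives = vacc_pos / unvacc_pos
    odds_among_negatives = vacc_neg / unvacc_neg
    return odds_among_positives / odds_among_negatives


def vaccine_effectiveness(odds_ratio):
    """Conventional summary: VE = 1 - OR."""
    return 1.0 - odds_ratio


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
or_hat = tnd_odds_ratio(vacc_pos=100, unvacc_pos=400, vacc_neg=300, unvacc_neg=300)
ve_hat = vaccine_effectiveness(or_hat)
print(or_hat, ve_hat)
```

With unmeasured confounding, this crude odds ratio need not equal the causal odds ratio, which is the gap a sensitivity analysis is meant to bound.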

2024-06-11T06:23:29Z Soumyabrata Kundu Peng Ding Jingshu Wang Xinran Li http://arxiv.org/abs/2605.14976v1 Multi-regime Markov-switching models with time-varying transition probabilities: An application to U.S. Treasury yields 2026-05-14T15:40:04Z

This paper studies Markov-switching (MS) models with time-varying transition probabilities (TVTP) under various specifications of the transition probability matrix. In particular, we extend the two-regime, common-variance setting of the Generalized Autoregressive Score (GAS) model of Bazzi et al. (2017) to the general $K$-regime case with regime-specific means and variances. Our study includes comprehensive Monte Carlo simulations, and we develop an open-source R package, \texttt{multiregimeTVTP}, for data simulation and parameter estimation. We find that the regime means, variances, and transition probabilities are reliably recovered, whereas the TVTP driving coefficients are harder to identify. We also find that the GAS score coefficient appears to be statistically non-identifiable, owing to a ridge in the joint likelihood surface over $(\sigma^2, A)$. In addition, we find that one-step point forecasts are remarkably robust to TVTP misspecification, but filtered regime probabilities are not, so correct specification matters more for characterizing regime dynamics than for short-horizon forecasting. An empirical application to U.S. Treasury zero-coupon yield changes at four maturities (1961-2024) shows that an exogenous specification driven by the lagged yield level dominates the constant and lagged-change models in fit, while the GAS specification fails to converge, with $\hat{A}$ collapsing to zero, reflecting the same identifiability issue observed in simulation.
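
To make the idea of time-varying transition probabilities concrete, here is a minimal two-regime sketch in which each regime's probability of staying put is a logistic function of an exogenous driver. The logistic link, the function names, and the coefficient values are assumptions for illustration; they are not taken from the paper or from the \texttt{multiregimeTVTP} package, which is written in R and handles the general $K$-regime case.

```python
import math


def logistic(z):
    """Map a real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def transition_matrix(x, a, b):
    """2x2 transition matrix whose diagonal depends on the driver x.

    p_ii = logistic(a[i] + b[i] * x) is the probability of staying in
    regime i; the off-diagonal entry is its complement, so rows sum to 1.
    """
    p00 = logistic(a[0] + b[0] * x)
    p11 = logistic(a[1] + b[1] * x)
    return [[p00, 1.0 - p00], [1.0 - p11, p11]]


# Hypothetical intercepts and slopes; x could be, e.g., a lagged yield level.
P = transition_matrix(x=0.5, a=(2.0, 1.0), b=(-1.0, 0.5))
for row in P:
    print(row)
```

Setting both slopes to zero recovers the constant-transition-probability model as a special case.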

2026-05-14T15:40:04Z 15 pages, 1 figure Samuel Modée Yushu Li Sjur Westgaard Stein Andreas Bethuelsen http://arxiv.org/abs/2605.14952v1 Generalizing conditional average treatment effects from nested randomized trials to all trial-eligible individuals 2026-05-14T15:22:03Z

Randomized controlled trials often enroll participants whose characteristics differ from those of a target population, which can limit the generalizability of the estimated treatment effects when effect modifiers differ across populations. While existing generalizability methods primarily focus on estimating the average treatment effect (ATE) in the target population, such summaries may obscure important heterogeneity that is relevant for clinical and policy decision-making. In this work, we illustrate an approach for estimating the conditional average treatment effect (CATE) in a target population of trial-eligible individuals as a function of prespecified effect modifiers within a nested trial setting. Our approach combines semiparametric theory with flexible estimation: we first estimate nuisance functions using data-adaptive methods and construct pseudo-outcomes from conditional influence functions, then estimate the CATE function via local linear (kernel) regression. Sample splitting and cross-fitting are used to reduce overfitting bias and ensure asymptotically valid inference. Finite-sample performance is assessed via simulations and illustrated in the Coronary Artery Surgery Study (CASS).

2026-05-14T15:22:03Z Lan Wen Issa J. Dahabreh Yu-Han Chiu http://arxiv.org/abs/2605.14936v1 Relaxation of Projected Prior with Continuous Gap Shrinkage 2026-05-14T15:11:06Z

Projected priors were originally introduced to accommodate parameter constraints, but have recently regained popularity due to their ability to assign probability mass to low-dimensional parameter sets, such as the spaces of sparse vectors, directed acyclic graphs, or transport plans. When employed as a transformation of random variables, projection is especially useful, since its contraction property not only preserves probability concentration, but also often preserves differentiability for gradient-based posterior computation. On the other hand, unless the projection can be obtained by some non-iterative algorithm, posterior computation can be expensive because it requires nesting an iterative optimization routine within each Markov chain Monte Carlo iteration. In this article, inspired by the success of continuous shrinkage models as replacements for discrete spike-and-slab priors, we propose a continuous relaxation of projected priors. The key idea is to quantify the duality gap between the primal projection loss and the dual objective, and impose a probabilistic prior that shrinks this gap toward zero. The resulting gap-shrinkage prior has a tractable form, does not require running an optimization subroutine inside each posterior update, and puts probability mass near the exact projection. We demonstrate useful properties of gap-shrinkage priors, including connections to global-local shrinkage priors, broad applicability to generalized projection functions, and competitive performance in posterior contraction. We apply the gap-shrinkage model to a marketing data analysis aimed at identifying important predictor effects on multivariate grocery-shopping decisions.

2026-05-14T15:11:06Z Leo L Duan Sunghyun Cho Mingzhang Yin http://arxiv.org/abs/2410.24003v6 On testing for independence between generalized error models of several time series 2026-05-14T14:41:11Z

We define generalized innovations associated with generalized error models having arbitrary distributions, that is, distributions that can be mixtures of continuous and discrete distributions. These models include stochastic volatility models and regime-switching models. We also propose statistics for testing independence between the generalized errors of these models, extending previous results of Duchesne, Ghoudi and Remillard (2012) obtained for stochastic volatility models. We define families of empirical processes constructed from lagged generalized errors, and we show that their joint asymptotic distributions are Gaussian and independent of the estimated parameters of the individual time series. Moebius transformations of the empirical processes are used to obtain tractable covariances. Several test statistics are then proposed, based on Cramer-von Mises statistics and dependence measures, as well as graphical methods to visualize the dependence. In addition, numerical experiments are performed to assess the power of the proposed tests. Finally, to show the usefulness of our methodologies, examples of applications to financial data and crime data are given to cover both discrete and continuous cases. All developed methodologies are implemented in the CRAN package IndGenErrors.

2024-10-31T15:02:35Z Kilani Ghoudi Bouchra R. Nasri Bruno N. Remillard http://arxiv.org/abs/2505.05670v3 Estimation and Inference in Boundary Discontinuity Designs: Location-Based Methods 2026-05-14T14:08:17Z

Boundary discontinuity designs are used to learn about causal treatment effects along a continuous assignment boundary that splits units into control and treatment groups according to a bivariate location score. We analyze location-based local polynomial treatment effect estimators that directly employ the bivariate score of each unit. We develop pointwise and uniform estimation and inference methods for the \textit{Boundary Average Treatment Effect Curve} (BATEC), as well as for two aggregated causal parameters: the \textit{Weighted Boundary Average Treatment Effect} (WBATE) and the \textit{Largest Boundary Average Treatment Effect} (LBATE). Our results cover both sharp and fuzzy (imperfect compliance) designs. We illustrate the methods with an empirical application, and provide companion general-purpose software. The supplemental appendix includes additional substantive theoretical results, methodological details, and simulation evidence.

2025-05-08T21:59:05Z Matias D. Cattaneo Rocio Titiunik Ruiqi Rae Yu http://arxiv.org/abs/2506.12296v3 Finite-sample bias-variance tradeoff with variables related to trial participation inserted into causal forest models for ensuring generalizability 2026-05-14T13:59:31Z

Estimating conditional average treatment effects (CATE) from randomized controlled trials (RCTs) and generalizing them to broader populations is essential for personalizing treatment rules, but it is complicated by selection bias due to trial participation and by potentially high-dimensional covariates. We evaluated the finite-sample bias-variance tradeoff of causal-forest-based CATE estimation strategies for addressing this selection bias. Identification theory suggests that unbiased CATE estimation is possible when covariates related to trial participation are included in CATE-estimating models. However, simulation studies demonstrated that, under realistic RCT sample sizes, variance inflation from high-dimensional covariates often outweighed the modest bias reduction. In our data-generating processes, which define the individual treatment effect (ITE) in the source population and in selected trial samples, including more than 3 covariates related to participation in the causal forest substantially degraded precision unless sample sizes were large. In contrast, inverse probability weighting (IPW) based methods consistently improved performance across scenarios. An application to an RCT of omega-3 fatty acids and coronary heart disease illustrated how IPW shifts CATE estimates toward source-population effects and refines heterogeneity assessments. Our findings highlight that including trial-selection variables in CATE-estimating models may inflate estimator variance and reduce ITE prediction performance in applications using medical RCTs. Addressing selection bias separately (e.g., through IPW) would be a reasonable strategy.

2025-06-14T01:17:59Z 4 figures Rikuta Hamaya Etsuji Suzuki Konan Hara http://arxiv.org/abs/2605.14828v1 K-Models: a Flexible and Interpretable Method for Ordinal Clustering with Application to Antigen-Antibody Interaction Profiles 2026-05-14T13:35:44Z

Existing clustering methods for functional data often prioritize partitioning accuracy over interpretability, making it challenging to extract meaningful insights when the data-generating process follows a specific underlying structure and an ordinal relationship among clusters is suspected. This work introduces K-Models, a novel framework that integrates ordinal constraints and estimates key underlying elements of the random process generating the observed functional profiles, improving both interpretability and structure identification. The proposed method is evaluated through simulations and real-world applications. In particular, it is tested on Region of Interest (ROI) curves, which represent reaction profiles from a reflectometric sensor monitoring biomolecular interactions, such as antigen-antibody binding. These curves represent changes in reflected light intensity over time at multiple measurement spots with immobilized antigens during analyte exposure, capturing the binding dynamics of the system. The goal is to identify intrinsic signal patterns solely from the observed dynamics, making this dataset an ideal benchmark for assessing the added interpretability of the proposed approach. By incorporating structural assumptions into the clustering process, K-Models enhances interpretability while maintaining performance comparable to state-of-the-art techniques, providing a valuable tool for analyzing functional data with an underlying ordinal structure.

2026-05-14T13:35:44Z Giulia Patanè Alessandra Menafoglio Alexander Krauth Peter Fechner Luca Dede' Bianca Maria Colosimo Federica Nicolussi http://arxiv.org/abs/2605.14796v1 A Class of Higher-Order INAR Random Fields for Poisson Counts and Beyond 2026-05-14T13:06:16Z

Existing integer-valued autoregressive (INAR) models for count random fields suffer from difficulties in characterizing the stationary marginal distribution and in computing conditional probabilities (as required for likelihood inference). To overcome these drawbacks, the novel class of combined INAR (CINAR) models is proposed, which both exhibits the classical autoregressive dependence structure and allows the marginal distribution to be specified within the wide class of discrete self-decomposable distributions. In particular, CINAR random fields can be equipped with a Poisson or negative-binomial marginal distribution. The CINAR's key stochastic properties are derived (including a simple expression for conditional probabilities), and special cases as well as possible extensions are discussed. Approaches for parameter estimation are developed and investigated, and the practical relevance of the novel CINAR family is demonstrated by an agricultural data application.
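
As background only, the classical one-dimensional Poisson INAR(1) recursion can be sketched as below: each count is a binomial thinning of the previous count plus an independent Poisson innovation, and choosing the innovation mean as $\lambda(1-\alpha)$ gives a Poisson($\lambda$) stationary marginal. This is the textbook time-series case, not the CINAR random-field construction of the paper; the function names, parameter values, and seed are illustrative assumptions.

```python
import math
import random


def poisson_draw(rng, lam):
    """Poisson sample via Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def thin(rng, count, alpha):
    """Binomial thinning: keep each of `count` units independently with prob. alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def simulate_inar1(n, alpha, lam, seed=0):
    """Simulate X_t = alpha o X_{t-1} + eps_t with eps_t ~ Poisson(lam * (1 - alpha))."""
    rng = random.Random(seed)
    x = poisson_draw(rng, lam)  # start from the stationary Poisson(lam) marginal
    path = []
    for _ in range(n):
        x = thin(rng, x, alpha) + poisson_draw(rng, lam * (1.0 - alpha))
        path.append(x)
    return path


path = simulate_inar1(n=20000, alpha=0.5, lam=3.0)
print(sum(path) / len(path))  # sample mean; should be close to lam
```

The sample mean of a long path should be close to `lam`, which gives a quick sanity check on the thinning and innovation steps.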

2026-05-14T13:06:16Z Christian H. Weiß Angelika Silbernagel http://arxiv.org/abs/2510.11177v2 Policy Robustness & Uncertainty in Model-based Decision Support for the Energy Transition 2026-05-14T12:52:14Z

Climate policy modelling is a key tool for assessing mitigation strategies in complex systems, where uncertainty is inherent and unavoidable. We present a general methodology for extensive uncertainty analysis in this field. While other studies have performed uncertainty analyses, few apply methods from the field of Uncertainty Quantification, which are commonly used in other modelling disciplines. We show how emulators can identify key uncertainties in modelling frameworks and demonstrate a novel policy analysis previously restricted by computational cost and limited representation of uncertainty. We apply this methodology to FTT:Power to explore uncertainties in the electricity system transition both globally and in India to assess the robustness of mitigation strategies to a wide range of policy and techno-economic scenarios. This approach results in much larger uncertainties in transition outcomes than commonly represented, but policy design can be shaped to mitigate this. Globally, our results indicate transition uncertainty is dominated by average rates of renewables cannibalisation, construction times and grid connection lead times, outweighing regional price policies, including policy reversals in the US. Solar PV appears most resilient due to low costs, though still sensitive to infrastructure constraints and cannibalisation. Onshore wind is more exposed to a range of uncertainties. In India, we find evidence that policy packages including partial phase-out instruments have greater robustness to key uncertainties, although longer lead times still hinder policy goals. Our results suggest that enabling policy and regulating fossil fuels are critical for robust power sector transitions.

2025-10-13T09:09:47Z Ian J. Burton Femke J. M. M. Nijsse James M. Salter http://arxiv.org/abs/2605.14762v1 Differentially private inference framework of Riemannian manifold data 2026-05-14T12:24:28Z

We propose a novel and systematic differentially private (DP) inference framework for non-Euclidean data. First, we design two types of DP mechanisms for the Fréchet mean and variance with i.i.d. Riemannian manifold-valued data, tailored to different geometric structures and accompanied by analytic privacy budgets calibrated to the geometry of the underlying manifold. Second, we establish the consistency and central limit theorems (CLTs) of the proposed DP estimators, enabling a suite of statistical inference procedures under privacy protection. Furthermore, we provide comprehensive implementation guidelines and feasible procedures, including consistent DP estimators of the asymptotic variance in the CLTs. Extensive numerical experiments support the proposed methodologies. Finally, we demonstrate the effectiveness of our approach on real-world medical image and sociological datasets lying on two representative manifolds.

2026-05-14T12:24:28Z Yangdi Jiang Xiaotian Chang Qirui Hu http://arxiv.org/abs/2507.11922v2 Enhancing Signal Proportion Estimation Through Leveraging Arbitrary Covariance Structures 2026-05-14T10:58:17Z

Accurately estimating the proportion of true signals among a large number of variables is crucial for enhancing the precision and reliability of scientific research. Traditional signal proportion estimators often assume independence among variables and specific signal sparsity conditions, limiting their applicability in real-world scenarios where such assumptions may not hold. This paper introduces a novel signal proportion estimator that leverages arbitrary covariance dependence information among variables, thereby improving performance across a wide range of sparsity levels and dependence structures. Building on previous work that provides lower confidence bounds for signal proportions, we extend this approach by incorporating the principal factor approximation procedure to account for variable dependence. Our theoretical insights offer a deeper understanding of how signal sparsity, signal intensity, and covariance dependence interact. By comparing the conditions for estimation consistency before and after dependence adjustment, we highlight the advantages of integrating dependence information across different contexts. This theoretical foundation not only validates the effectiveness of the new estimator but also guides its practical application, ensuring reliable use in diverse scenarios. Through extensive simulations, we demonstrate that our method outperforms state-of-the-art estimators in both estimation accuracy and the detection of weaker signals that might otherwise go undetected.

2025-07-16T05:37:42Z Revised technical details in Section 4 Jingtian Bai Xinge Jessie Jeng http://arxiv.org/abs/2605.14647v1 Multiscale Topological Inference for Marked Point Processes via Euler Characteristic Envelopes 2026-05-14T10:05:25Z

The statistical analysis of marked point processes requires disentangling complex spatial arrangements from attribute-dependent interactions. While classical summary statistics are effective for second-order dependencies, they frequently fail to capture higher-order topological structures and non-linear interactions between marks and space. In this work, we propose a novel multiscale topological inference framework for marked point processes by integrating mark-weighted filtrations with Euler Characteristic envelopes. We redefine the underlying metric space using an exponential mark-weighted distance, which modulates connectivity based on attribute similarity, effectively accelerating the merger of connected components among homophilic neighbors. To ensure rigorous statistical inference, we apply non-parametric global envelope tests to the resulting Euler Characteristic Curves, allowing for formal hypothesis testing against the null model of random labeling. Furthermore, we introduce a local decomposition of the topological signal via Z-scores at the critical filtration scale to identify and localize structural hubs and topological barriers. Systematic simulations across various scenarios demonstrate the framework's high specificity and sensitivity to attribute-space dependencies while remaining robust against purely geometric effects. This methodology provides a comprehensive and interpretable toolkit for identifying, quantifying, and localizing complex structural dependencies in marked spatial data, bridging the gap between topological data analysis and classical point process statistics.

2026-05-14T10:05:25Z Matthias Eckardt Mehdi Moradi http://arxiv.org/abs/2410.09504v5 Bayesian Transfer Learning for Artificially Intelligent Geospatial Systems: A Predictive Stacking Approach 2026-05-14T08:59:19Z

Building artificially intelligent geospatial systems requires rapid delivery of spatial data analysis on massive scales with minimal human intervention. Depending upon its intended use, data analysis can also involve model assessment and uncertainty quantification. This article devises transfer learning frameworks for deployment in artificially intelligent systems, where a massive data set is split into smaller data sets that stream into the analytical framework to propagate learning and assimilate inference for the entire data set. Specifically, we introduce Bayesian predictive stacking for multivariate spatial data and demonstrate rapid and automated analysis of massive data sets. Furthermore, inference is delivered without human intervention and without excessively demanding hardware settings. We illustrate the effectiveness of our approach through extensive simulation experiments and by producing inference from a massive data set on vegetation index that is indistinguishable from that of traditional (and more expensive) statistical approaches.

2024-10-12T11:45:14Z Luca Presicce Sudipto Banerjee http://arxiv.org/abs/2605.14575v1 The Asset Price Channel of Monetary Policy: Evidence from Regional Stock-Market Developments in the Successor States of Former Yugoslavia 2026-05-14T08:48:02Z

The aim of this study is to empirically investigate the existence of a sectoral asset price channel of monetary policy in the region of the six republics of former Yugoslavia. The study constructs sectoral indices for the entire region, building on the idea that one regional stock exchange may provide more efficiency for the listed companies in the region, while the relevance of monetary policy for it may be sector-specific. We employ a panel vector autoregressive model to observe impulse responses of sectoral indices to innovations in monetary policy, and then disentangle the long-run from the short-run relationships for each index through Pooled Mean Group estimation. Overall, we document the presence of the asset price channel in the finance and telecom sectors, likely driven by the established multinational corporate networks fostering sub-market regionalization. Yet this is not the case for the manufacturing and electricity sectors, which may imply that local stock markets are still too fragmented and that room for a more efficient regional stock market certainly exists, either in the true sense of the word or, more realistically, through enhanced regional cooperation among the stock exchanges.

2026-05-14T08:48:02Z Stefan Tanevski