http://arxiv.org/abs/2602.20954v1 Hierarchical Aggregation Clustering Algorithms Derived from the Bi-partial Objective Function 2026-02-24T14:36:35Z The paper outlines the principles of construction of a broad class of hierarchical aggregation algorithms of cluster analysis, essentially based on minimum distance mergers, which are derived from the general bi-partial objective function. It is shown how the algorithms arise from the bi-partial objective function, their affinity with the classical hierarchical aggregation algorithms is demonstrated, and the examples of such algorithms for the concrete forms of the bi-partial objective function are provided. This amounts to the first explicit and, at the same time, quite general, connection between optimization in clustering and the hierarchical aggregation algorithms. Thereby, the respective hierarchical algorithms gain a deeper justification, the means for evaluating the quality of clustering is provided, along with the criterion of stopping the cluster mergers. 2026-02-24T14:36:35Z An original paper, not yet submitted anywhere Jan W. Owsiński http://arxiv.org/abs/2405.09797v3 Extrapolating Single-Treatment Effects Out of Factorial Experiments 2026-02-23T15:47:06Z Despite their cost, randomized controlled trials (RCTs) are widely regarded as gold-standard evidence in disciplines ranging from social science to medicine. In recent decades, researchers have increasingly sought to reduce the resource burden of repeated RCTs with factorial designs that simultaneously test multiple hypotheses, e.g. experiments that evaluate the effects of many medications or products simultaneously. Here I show that when multiple interventions are randomized in experiments, the effect any single intervention would have outside the experimental setting is not identified absent heroic assumptions, even if otherwise perfectly realistic conditions are achieved. 
This happens because single-treatment effects involve a counterfactual world with a single focal intervention, allowing other variables to take their natural values (which may be confounded or modified by the focal intervention). In contrast, observational studies and factorial experiments provide information about potential-outcome distributions with zero and multiple interventions, respectively. In this paper, I formalize sufficient conditions for the identifiability of those isolated quantities. I show that researchers who rely on this type of design have to justify either linearity of functional forms or -- in the nonparametric case -- specify with Directed Acyclic Graphs how variables are related in the real world. Finally, I develop nonparametric sharp bounds -- i.e., maximally informative best-/worst-case estimates consistent with limited RCT data -- that show when extrapolations about effect signs are empirically justified. These new results are illustrated with simulated data. 2024-05-16T04:01:53Z Guilherme Duarte http://arxiv.org/abs/2602.18242v1 Reflections on the Future of Statistics Education in a Technological Era 2026-02-20T14:26:15Z Keeping pace with rapidly evolving technology is a key challenge in teaching statistics. To equip students with essential skills for the modern workplace, educators must integrate relevant technologies into the statistical curriculum where possible. University-level statistics education has experienced substantial technological change, particularly in the tools and practices that underpin teaching and learning. Statistical programming has become central to many courses, with R widely used and Python increasingly incorporated into statistics and data analytics programmes. Additionally, coding practices, database management, and machine learning now feature within some statistics curricula. 
Looking ahead, we anticipate a growing emphasis on artificial intelligence (AI), particularly the pedagogical implications of generative AI tools such as ChatGPT. In this article, we explore these technological developments and discuss strategies for their integration into contemporary statistics education. 2026-02-20T14:26:15Z Craig Alexander Jennifer Gaskell Vinny Davies http://arxiv.org/abs/2602.17896v1 Central limit theorem for the global clustering coefficient of random geometric graphs 2026-02-19T23:27:39Z The global clustering coefficient serves as a powerful metric for the structural analysis and comparison of complex networks. Random geometric graphs offer a realistic framework for representing the spatial constraints and geometry often found in real-world network datasets. In this paper, we establish a central limit theorem for the global clustering coefficient of random geometric graphs. Our main result identifies the centering and scaling sequences required for convergence in law to the standard normal distribution. Our approach varies by regime: in the dense case, we employ the Lyapunov CLT; in the intermediate case, we utilize the asymptotic theory of $U$-statistics with sample-size-dependent kernels; and in the sparse regime, we use the method of moments to derive the asymptotic distribution. Notably, the convergence rates for non-uniform and uniform random geometric graphs diverge in the dense regime, yet they coincide in the sparse regime. In addition, we find that the global clustering coefficient for both uniform and non-uniform RGGs is asymptotically equal to $3/4$. 2026-02-19T23:27:39Z Mingao Yuan Md. Niamul Islam Sium http://arxiv.org/abs/2602.16283v1 Orthogonal parametrisations of Extreme-Value distributions 2026-02-18T09:06:26Z Extreme value distributions are routinely employed to assess risks connected to extreme events in a large number of applications. 
They are typically two- or three-parameter distributions, for which inference can be unstable; this is particularly problematic because these distributions are often fitted to small samples. Furthermore, the distribution's parameters are generally not directly interpretable and not the key aim of the estimation. We present several orthogonal reparametrisations of the main extreme-value distributions, key in the modelling of rare events. In particular, we apply the theory developed in Cox and Reid (1987) to the Generalised Extreme-Value, Generalised Pareto, and Gumbel distributions. We illustrate the principal advantage of these reparametrisations in a simulation study. 2026-02-18T09:06:26Z Nathan Huet Ilaria Prosdocimi http://arxiv.org/abs/2507.12257v3 Robust Causal Discovery in Real-World Time Series with Power-Laws 2026-02-17T23:35:05Z Exploring causal relationships in stochastic time series is a challenging yet crucial task with a vast range of applications, including finance, economics, neuroscience, and climate science. Many algorithms for Causal Discovery (CD) have been proposed; however, they often exhibit a high sensitivity to noise, resulting in spurious causal inferences in real data. In this paper, we observe that the frequency spectra of many real-world time series follow a power-law distribution, notably due to an inherent self-organizing behavior. Leveraging this insight, we build a robust CD method based on the extraction of power-law spectral features that amplify genuine causal signals. Our method consistently outperforms state-of-the-art alternatives on both synthetic benchmarks and real-world datasets with known causal structures, demonstrating its robustness and practical relevance. 2025-07-16T14:02:21Z Matteo Tusoni Giuseppe Masi Andrea Coletta Aldo Glielmo Viviana Arrigoni Novella Bartolini http://arxiv.org/abs/2603.00098v1 Profiling vs. 
Case-specific Evidence: A Probabilistic Analysis 2026-02-17T13:43:02Z The use of profiling evidence in criminal trials is a longstanding controversy in legal epistemology and evidence law theory. Many scholars, even when they oppose its use at trial, still assume that profiling evidence can be probative of guilt. We reject that assumption. Profiling evidence may support a generic hypothesis, but is not evidence that the defendant is guilty of the specific crime of which they are accused. We contrast profiling evidence with case-specific evidence, which speaks more directly to the facts of the case. Our critique departs from others by grounding the argument in a probabilistic analysis of evidentiary value. We also explore the implications of our account for debates about stereotyping. 2026-02-17T13:43:02Z 16 pages Marcello Di Bello Nicolò Cangiotti Michele Loi http://arxiv.org/abs/2602.14284v1 Benchmarking AI Performance on End-to-End Data Science Projects 2026-02-15T19:16:04Z Data science is an integrated workflow of technical, analytical, communication, and ethical skills, but current AI benchmarks focus mostly on constituent parts. We test whether AI models can generate end-to-end data science projects. To do this we create a benchmark of 40 end-to-end data science projects with associated rubric evaluations. We use these to build an automated grading pipeline that systematically evaluates the data science projects produced by generative AI models. We find the extent to which generative AI models can complete end-to-end data science projects varies considerably by model. Most recent models did well on structured tasks, but there were considerable differences on tasks that needed judgment. These findings suggest that while AI models could approximate entry-level data scientists on routine tasks, they require verification. 
2026-02-15T19:16:04Z Evelyn Hughes Rohan Alexander http://arxiv.org/abs/2602.13565v1 An Improved Milstein Method for the Numerical Solution of Multidimensional Stochastic Differential Equations 2026-02-14T02:54:38Z Stochastic differential equations (SDEs) offer powerful and accessible mathematical models for capturing both deterministic and probabilistic aspects of dynamic behavior across a wide range of physical, financial, and social systems. However, analytical solutions for many SDEs are often unavailable, necessitating the use of numerical approximation methods. The rate of convergence of such numerical methods is of great importance, as it directly influences both computational efficiency and accuracy. This paper presents a proposed theorem, along with its proof, that facilitates the numerical evaluation of the strong (and weak) order of convergence of a numerical scheme for an SDE when the analytical solution is unavailable. Additionally, we address the challenge of numerically computing the multiple stochastic integrals required by the Milstein method to achieve improved convergence rates for multidimensional SDEs. In this context, two newly proposed numerical techniques for computing these multiple stochastic integrals are introduced and compared with existing approaches in terms of efficiency and effectiveness. The methodologies are further illustrated through simulation studies and applications to widely used financial models. 2026-02-14T02:54:38Z Paromita Banerjee Anirban Mondal http://arxiv.org/abs/2602.12216v1 Bayesian inference for the automultinomial model with an application to landcover data 2026-02-12T17:54:02Z Multicategory lattice data arise in a wide variety of disciplines such as image analysis, biology, and forestry. 
We consider modeling such data with the automultinomial model, which can be viewed as a natural extension of the autologistic model to multicategory responses, or equivalently as an extension of the Potts model that incorporates covariate information into a pure-intercept model. The automultinomial model has the advantage of having a unique parameter that controls the spatial correlation. However, the model's likelihood involves an intractable normalizing function of the model parameters that poses serious computational problems for likelihood-based inference. We address this difficulty by performing Bayesian inference through the Double-Metropolis Hastings algorithm, and implement diagnostics to assess the convergence to the target posterior distribution. Through simulation studies and an application to land cover data, we find that the automultinomial model is flexible across a wide range of spatial correlations while maintaining a relatively simple specification. For large data sets we find it also has advantages over spatial generalized linear mixed models. To make this model practical for scientists, we provide recommendations for its specification and computational implementation. 2026-02-12T17:54:02Z Maria Paula Duenas-Herrera Stephen Berg Murali Haran http://arxiv.org/abs/2504.08263v2 A roadmap for systematic identification and analysis of multiple biases in causal inference 2026-02-10T21:49:53Z Observational studies examining causal effects rely on unverifiable assumptions, the violation of which can induce multiple biases. Quantitative bias analysis (QBA) methods examine the sensitivity of findings to such violations, generally, by producing estimates under alternative assumptions, incorporating external information. Although substantial guidance exists for implementing QBA, there is limited guidance on how to systematically determine the assumptions underlying a primary causal analysis and the potential violations that should guide bias analysis. 
Consequently, many assumptions remain implicit, leading to selective and therefore misleading QBA. To address this gap, we propose a roadmap for systematically identifying and analysing multiple biases. Briefly, this consists of (1) articulating the assumptions underlying the primary analysis through specification and emulation of the ideal trial that defines the causal estimand and depicting these assumptions using a causal diagram; (2) extending the diagram to depict alternative assumptions under which biases may arise; (3) obtaining a single estimate that simultaneously corrects for all potential biases. We illustrate the roadmap using an investigation of the effect of breastfeeding on risk of childhood asthma, and through simulations illustrate the need for analysing multiple biases jointly rather than one at a time. 2025-04-11T05:30:32Z 12 Pages, 4 Figures Rushani Wijesuriya Rachael A. Hughes John B. Carlin Rachel L. Peters Jennifer J. Koplin Margarita Moreno-Betancur http://arxiv.org/abs/2506.05776v3 Analyzing the retraining frequency of global forecasting models: towards more stable forecasting systems 2026-02-10T14:56:53Z Forecast stability, that is, the consistency of predictions over time, is essential in business settings where sudden shifts in forecasts can disrupt planning and erode trust in predictive systems. Despite its importance, stability is often overlooked in favor of accuracy. In this study, we evaluate the stability of point and probabilistic forecasts across several retraining scenarios using three large forecasting datasets and ten different global forecasting models. To analyze stability in the probabilistic setting, we propose a new model-agnostic, distribution-free, and scale-free metric that measures probabilistic stability: the Scaled Multi-Quantile Change (SMQC). The results show that less frequent retraining not only preserves but often improves forecast stability, challenging the need for frequent retraining. 
Moreover, the study shows that accuracy and stability are not necessarily conflicting objectives when adopting a global modeling approach. The study promotes a shift toward stability-aware forecasting practices, proposing a new metric to evaluate forecast stability effectively in probabilistic settings, and offering practical guidelines for building more stable and sustainable forecasting systems. 2025-06-06T06:13:29Z Marco Zanotti http://arxiv.org/abs/2509.14218v2 Adaptive Off-Policy Inference for M-Estimators Under Model Misspecification 2026-02-08T13:47:37Z When data are collected adaptively, such as in bandit algorithms, classical statistical approaches such as ordinary least squares and $M$-estimation will often fail to achieve asymptotic normality. Although recent lines of work have modified the classical approaches to ensure valid inference on adaptively collected data, most of these works assume that the model is correctly specified. The misspecified setting poses unique challenges because the parameter of interest itself may not be well-defined over a non-stationary distribution of rewards. We therefore tackle the problem of \emph{off-policy} inference in adaptive settings, where we uniquely define a projected solution over a stationary evaluation policy. Our method provides valid inference for $M$-estimators that use adaptively collected bandit data with a possibly misspecified working model. A key ingredient in our approach is the use of flexible approaches to stabilize the variance induced by adaptive data collection. A major novelty is that the procedure enables the construction of valid confidence sets even in settings where treatment policies are unstable and non-converging, such as when there is no unique optimal arm and standard bandit algorithms are used. 
Empirical results on semi-synthetic datasets constructed from the Osteoarthritis Initiative demonstrate that the method maintains type I error control, while existing methods for inference in adaptive settings do not cover in the misspecified case. 2025-09-17T17:51:40Z 43 pages, 6 figures James Leiner Robin Dunn Aaditya Ramdas http://arxiv.org/abs/2502.11510v3 Here Be Dragons: Bimodal posteriors arise from numerical integration error in longitudinal models 2026-02-07T00:39:44Z Longitudinal models with dynamics governed by differential equations may require numerical integration alongside parameter estimation. We have identified a situation where the numerical integration introduces error in such a way that it becomes a novel source of non-uniqueness in estimation. We obtain two very different sets of parameters, one of which is a good estimate of the true values and the other a very poor one. The two estimates have forward numerical projections statistically indistinguishable from each other because of numerical error. In such cases, the posterior distribution for parameters is bimodal, with a dominant mode closer to the true parameter value, and a second cluster around the errant value. We demonstrate that multi-modality exists both theoretically and empirically for an affine first order differential equation, that a simulation workflow can test for evidence of the issue more generally, and that Markov Chain Monte Carlo sampling with a suitable solution can avoid bimodality. The issue of multi-modal posteriors arising from numerical error has consequences for Bayesian inverse methods that rely on numerical integration more broadly. 2025-02-17T07:26:15Z 33 pages, 7 figures, 2 tables Tess O'Brien Matthew T. 
Moores David Warton Daniel Falster http://arxiv.org/abs/2407.11518v2 Ensemble Transport Filter via Optimized Maximum Mean Discrepancy 2026-02-06T17:48:27Z In this paper, we present a new ensemble-based filter method by reconstructing the analysis step of the particle filter through a transport map, which directly transports prior particles to posterior particles. The transport map is constructed through an optimization problem described by the Maximum Mean Discrepancy loss function, which matches the expectation information of the approximated posterior and reference posterior. The proposed method inherits the accurate estimation of the posterior distribution from particle filtering while extending to high-dimensional assimilation problems. To improve the robustness of Maximum Mean Discrepancy, a variance penalty term is used to guide the optimization. It prioritizes minimizing the discrepancy between the expectations of highly informative statistics for the reference posteriors. The penalty term significantly enhances the robustness of the proposed method and leads to a better approximation of the posterior. A few numerical examples are presented to illustrate the advantage of the proposed method over the ensemble Kalman filter. 2024-07-16T08:54:12Z 27 pages, 14 figures Dengfei Zeng Lijian Jiang