https://arxiv.org/api/MsfUF9pXNHQRYmj6n93FoxEUI2w2026-03-26T08:25:26Z163413515http://arxiv.org/abs/2510.03512v1Comparison of Parametric versus Machine-learning Multiple Imputation in Clinical Trials with Missing Continuous Outcomes2025-10-03T20:51:50ZThe use of flexible machine-learning (ML) models to generate imputations of missing data within the framework of Multiple Imputation (MI) has recently gained traction, particularly in observational settings. For randomised controlled trials (RCTs), it is unclear whether ML approaches to MI provide valid inference, and whether they outperform parametric MI approaches under complex data generating mechanisms. We conducted two simulations in RCT settings that have incomplete continuous outcomes but fully observed covariates. We compared Complete Cases, standard MI (MI-norm), MI with predictive mean matching (MI-PMM) and ML-based approaches to MI, including classification and regression trees (MI-CART), Random Forests (MI-RF) and SuperLearner when outcomes are missing completely at random or missing at random conditional on treatment/covariate. The first simulation explored non-linear covariate-outcome relationships in the presence/absence of covariate-treatment interactions. The second simulation explored skewed repeated measures, motivated by a trial with digital outcomes. In the absence of interactions, we found that Complete Cases yields reliable inference; MI-norm performs similarly, except when missingness depends on the covariate. ML approaches can lead to smaller mean squared error than Complete Cases and MI-norm in specific non-linear settings, but provide unreliable inference for others. MI-PMM can lead to unreliable inference in several settings. 
In the presence of complex treatment-covariate interactions, performing MI separately by arm, either with MI-norm, MI-RF or MI-CART, provides inference with properties comparable to or better than those of Complete Cases when the analysis model omits the interaction. For ML approaches, we observed unreliable inference in terms of bias in the estimated effect and/or its standard error when Rubin's Rules are implemented.2025-10-03T20:51:50ZMia S. TackneyJonathan W. BartlettElizabeth WilliamsonKim May Leehttp://arxiv.org/abs/2510.06238v1Uncertainty Quantification In Surface Landmines and UXO Classification Using MC Dropout2025-10-03T03:01:22ZDetecting surface landmines and unexploded ordnances (UXOs) using deep learning has shown promise in humanitarian demining. However, deterministic neural networks can be vulnerable to noisy conditions and adversarial attacks, leading to missed detection or misclassification. This study introduces the idea of uncertainty quantification through Monte Carlo (MC) Dropout, integrated into a fine-tuned ResNet-50 architecture for surface landmine and UXO classification, which was tested on a simulated dataset. Integrating the MC Dropout approach helps quantify epistemic uncertainty, providing an additional metric for prediction reliability, which could be helpful to make more informed decisions in demining operations. Experimental results on clean, adversarially perturbed, and noisy test images demonstrate the model's ability to flag unreliable predictions under challenging conditions. This proof-of-concept study highlights the need for uncertainty quantification in demining, raises awareness about the vulnerability of existing neural networks in demining to adversarial threats, and emphasizes the importance of developing more robust and reliable models for practical applications.2025-10-03T03:01:22ZThis work has been accepted and presented at IGARSS 2025 and will appear in the IEEE IGARSS 2025 proceedingsSagar LekhakEmmett J. 
IentilucciDimah DeraSusmita Ghoshhttp://arxiv.org/abs/2510.00900v1How can the use of different modes of survey data collection introduce bias? A simple introduction to mode effects using directed acyclic graphs (DAGs)2025-10-01T13:44:00ZSurvey data are self-reported data collected directly from respondents by a questionnaire or an interview and are commonly used in epidemiology. Such data are traditionally collected via a single mode (e.g. face-to-face interview alone), but use of mixed-mode designs (e.g. offering face-to-face interview or online survey) has become more common. This introduces two key challenges. First, individuals may respond differently to the same question depending on the mode; these differences due to measurement are known as 'mode effects'. Second, different individuals may participate via different modes; these differences in sample composition between modes are known as 'mode selection'. Where recognised, mode effects are often handled by straightforward approaches such as conditioning on survey mode. However, while reducing mode effects, this and other equivalent approaches may introduce collider bias in the presence of mode selection. The existence of mode effects and the consequences of naïve conditioning may be underappreciated in epidemiology. This paper offers a simple introduction to these challenges using directed acyclic graphs by exploring a range of possible data structures. 
We discuss the potential implications of using conditioning- or imputation-based approaches and outline the advantages of quantitative bias analyses for dealing with mode effects.2025-10-01T13:44:00ZGeorgia D TomovaRichard J SilverwoodPeter WG TennantLiam Wright10.1093/aje/kwag017http://arxiv.org/abs/2510.00503v1Higher-order spacings in the superposed spectra of random matrices with comparison to spacing ratios and application to complex systems2025-10-01T04:34:11ZThe connection between random matrices and the spectral fluctuations of complex quantum systems in a suitable limit can be explained by using the setup of random matrix theory. Higher-order spacing statistics in the $m$ superposed spectra of circular random matrices are studied numerically. We tabulated the modified Dyson index $β'$ for a given $m$, $k$, and $β$, for which the nearest neighbor spacing distribution is the same as that of the $k$-th order spacing distribution corresponding to the $β$ and $m$. Here, we conjecture that for given $m(k)$ and $β$, the obtained sequence of $β'$ as a function of $k(m)$ is unique. This result can be used as a tool for the characterization of the system and to determine the symmetry structure of the system without desymmetrization of the spectra. We verify the results of the $m=2$ case of COE with the quantum kicked top model corresponding to various Hilbert space dimensions. From the comparative study of the higher-order spacings and ratios in both $m=1$ and $m=2$ cases of COE and GOE by varying dimension, keeping the number of realizations constant and vice-versa, we find that both COE and GOE have the same asymptotic behavior in terms of a given higher-order statistics. But, we found from our numerical study that within a given ensemble of COE or GOE, the results of spacings and ratios agree with each other only up to some lower $k$, and beyond that, they start deviating from each other. 
It is observed that for the $k=1$ case, the convergence towards the Poisson distribution is faster in the case of ratios than the corresponding spacings as we increase $m$ for a given $β$. Further, the spectral fluctuations of the intermediate map of various dimensions are studied. There, we find that the effect of random numbers used to generate the matrix corresponding to the map is reflected in the higher-order statistics.2025-10-01T04:34:11Z26 pages (two-column) + 6 pages (one-column) + 59 figures. Comments are welcomeSashmita RoutUdaysinh T. Bhosalehttp://arxiv.org/abs/2509.26141v1CLT for LES of real valued random centrosymmetric matrices2025-09-30T11:58:04ZWe study the fluctuations of the eigenvalues of real valued large centrosymmetric random matrices via its linear eigenvalue statistic. This is essentially a central limit theorem (CLT) for sums of dependent random variables. The dependence among them leads to behavior that differs from the classical CLT. The main contribution of this article is finding the expression of the variance of the limiting Gaussian distribution. The crux of the proof lies in combinatorial arguments that involve counting overlapping loops in complete undirected weighted graphs with growing degrees.2025-09-30T11:58:04ZIndrajit JanaSunita Ranihttp://arxiv.org/abs/2510.03266v1Variational Autoencoders-based Detection of Extremes in Plant Productivity in an Earth System Model2025-09-26T22:03:20ZClimate anomalies significantly impact terrestrial carbon cycle dynamics, necessitating robust methods for detecting and analyzing anomalous behavior in plant productivity. This study presents a novel application of variational autoencoders (VAE) for identifying extreme events in gross primary productivity (GPP) from Community Earth System Model version 2 simulations across four AR6 regions in the Continental United States. 
We compare VAE-based anomaly detection with traditional singular spectral analysis (SSA) methods across three time periods: 1850-80, 1950-80, and 2050-80 under the SSP585 scenario. The VAE architecture employs three dense layers and a latent space with an input sequence length of 12 months, trained on a normalized GPP time series to reconstruct the GPP and identify anomalies based on reconstruction errors. Extreme events are defined using 5th percentile thresholds applied to both VAE and SSA anomalies. Results demonstrate strong regional agreement between VAE and SSA methods in spatial patterns of extreme event frequencies, despite VAE producing higher threshold values (179-756 GgC for VAE vs. 100-784 GgC for SSA across regions and periods). Both methods reveal increasing magnitudes and frequencies of negative carbon cycle extremes toward 2050-80, particularly in Western and Central North America. The VAE approach shows comparable performance to established SSA techniques, while offering computational advantages and enhanced capability for capturing non-linear temporal dependencies in carbon cycle variability. Unlike SSA, the VAE method does not require one to define the periodicity of the signals in the data; it discovers them from the data.2025-09-26T22:03:20ZBharat SharmaJitendra Kumarhttp://arxiv.org/abs/2502.11820v3A Diagnostic to Find and Help Combat Stochastic Positivity Issues -- with a Focus on Continuous Treatments2025-09-25T13:50:13ZThe positivity assumption is central in the identification of a causal effect, and especially the stochastic variant is an issue many applied researchers face, yet is rarely discussed, especially in conjunction with continuous treatments or Modified Treatment Policies. One common recommendation for dealing with a violation is to change the estimand. 
However, an applied researcher is faced with two problems: First, how can she tell whether there is a stochastic positivity violation given her estimand of interest, preferably without having to estimate a model first? Second, if she finds a problem with stochastic positivity, how should she change her estimand in order to arrive at an estimand which does not face the same issues? We suggest a novel diagnostic which allows the researcher to answer both questions by providing insights into how well an estimation for a certain estimand can be made for each observation using the data at hand. We provide a simulation study on the general behaviour of different Modified Treatment Policies (MTPs) at different levels of stochastic positivity violations and show how the diagnostic helps understand where bias is to be expected. We illustrate the application of our proposed diagnostic in a pharmacoepidemiological study based on data from CHAPAS-3, a trial comparing different treatment regimens for children living with HIV.2025-02-17T14:13:09Z33 pages (24 without appendix), 12 figures (7 without appendix)Katharina RingMichael Schomaker10.1515/jci-2025-0007http://arxiv.org/abs/2509.19511v1A direct approach for full-field state-parameter estimation from fusion of noncollocated multi-rate sensor data using UKF-based algorithms2025-09-23T19:27:37ZHeterogeneous sensor setups may entail measurements recorded at varying sampling frequencies, commonly known as multi-rate data. For system identification and state estimation with such data, existing studies mostly focus on data fusion algorithms that utilize acceleration measurements, with collocated measurements of other types at lower sampling frequencies, to estimate the displacement at the collocated location with the sampling frequency of the acceleration measurements. The obtained displacements, along with the available acceleration measurements, are then utilized for system identification. 
This paper introduces a direct and straightforward methodology aimed at estimating the states (i.e., displacements and velocities) along with the unknown structural parameters from fused multi-rate data through Unscented Kalman Filter (UKF) based algorithms with a modification during measurement update. By utilizing all available measurements at any time instant, which can differ due to the multi-rate nature, and by modifying the non-linear measurement equation of the system accordingly at the considered time instant, the UKF framework is suitably tailored for direct applications with multi-rate measurements. The approach is demonstrated with a variety of numerical and laboratory-scale experiments, including fusion of higher sampling frequency acceleration data with lower sampling frequency displacement, axial strain, or bending strain data. The results show that the approach is successful in accurately estimating full-field states and parameters. The state estimates compare well with those obtained using existing data fusion algorithms. The advantages of the approach lie in not requiring collocated sensing, in its generalizability for different types of measurements, in its simplicity and ease of implementation, and in achieving both the state and parameter estimates simultaneously.2025-09-23T19:27:37Z14 pages, 11 figuresDhiraj GhoshAdrita KunduSuparno Mukhopadhyayhttp://arxiv.org/abs/2509.19123v1George Udny Yule and the Interpretation of Regression Betas2025-09-23T15:10:13ZInitially applied in astronomy and geodesy, the linear regression model aimed to find the best estimates for parameters with predefined meanings, e.g., orbital elements and geodetic constants. As its use expanded to other disciplines, often to summarize data without an underlying theoretical model, the need for a general interpretation of regression betas arose. Early attempts by Galton and Karl Pearson met with mixed success. G. U. 
Yule was the first to develop a general statistical interpretation, the culmination of efforts begun in 1896. Yule's interpretation is based on the partial regression theorem, which he proved in 1907.2025-09-23T15:10:13ZFrancesco Coriellihttp://arxiv.org/abs/2509.17122v1Insensitivity-induced potential non-uniqueness in system identification of Bouc-Wen models2025-09-21T15:26:50ZDuring system identification of a structural system with Bouc-Wen (BW) restoring force mechanisms, the estimated BW parameters may be different for different sets of input-output measurements, indicating potential non-uniqueness in the parameter estimates. Nonetheless, the non-unique and incorrectly estimated BW parameters may result in dynamic responses and hysteretic behaviours which are very similar to those obtained for the correct system. In this work, the existence of alternate sets of BW parameters, which result in hysteretic restoring force behaviour similar to the true system, is studied analytically. Approximate expressions for the rate of change of the hysteretic force with deformation are derived and analyzed in detail. It is shown that alternate sets of BW parameters with significant deviations from a set of "true" BW parameters may exist, which result in the rate of change of the hysteretic force, and consequently, the restoring force behaviour itself, to remain very similar to that obtained with the "true" BW parameters. The existence of these alternate parameters results in potential non-uniqueness of the BW parameter estimates, despite satisfying analytical identifiability requirements. Furthermore, the deviations of the alternate BW parameters depend on the magnitudes of the "true" BW parameters as well as the extent of the hysteretic action being developed by the input excitation. The results are illustrated using different inputs: sinusoidal, El Centro motion, and a suite of ground motions compatible with the Kanai-Tajimi spectrum. 
The results of this work help in a better understanding of the potential non-uniqueness issues associated with the estimation of the BW parameters from measured responses using any system identification technique, which is caused by the insensitivity of these parameters towards the dynamic responses of the structure.2025-09-21T15:26:50Z23 pages, 18 figuresAdrita KunduSuparno Mukhopadhyayhttp://arxiv.org/abs/2503.21719v3The Principle of Redundant Reflection2025-09-18T15:45:04ZThe fact that redundant information does not update a rational belief implies that rational beliefs are updated using Bayes rule. In the framework of Hild (1998a), this is true under mild conditions for discrete, continuous, and arbitrary measure spaces. We prove this result and illustrate it with two examples.2025-03-27T17:31:22Z11 pages, 0 figuresMartin MetodievMaarten MarsmanLourens WaldorpQuentin F. GronauEric-Jan Wagenmakershttp://arxiv.org/abs/2506.22236v3A Plea for History and Philosophy of Statistics and Machine Learning2025-09-18T10:12:59ZThe integration of the history and philosophy of statistics was initiated at least by Hacking (1975) and advanced by Hacking (1990), Mayo (1996), and Zabell (2005), but it has not received sustained follow-up. Yet such integration is more urgent than ever, as the recent success of artificial intelligence has been driven largely by machine learning -- a field historically developed alongside statistics. Today, the boundary between statistics and machine learning is increasingly blurred. What we now need is integration, twice over: of history and philosophy, and of two fields they engage -- statistics and machine learning. I present a case study of a philosophical idea in machine learning (and in formal epistemology) whose root can be traced back to an often under-appreciated insight in Neyman and Pearson's 1936 work (a follow-up to their 1933 classic). 
This leads to the articulation of an epistemological principle -- largely implicit in, but shared by, the practices of frequentist statistics and machine learning -- which I call achievabilism: the thesis that the correct standard for assessing non-deductive inference methods should not be fixed, but should instead be sensitive to what is achievable in specific problem contexts. Another integration also emerges at the level of methodology, combining two ends of the philosophy of science spectrum: history and philosophy of science on the one hand, and formal epistemology on the other hand.2025-06-27T13:59:08ZHanti Linhttp://arxiv.org/abs/2001.10488v4Statistical Consequences of Fat Tails: Real World Preasymptotics, Epistemology, and Applications2025-09-17T07:38:53Z(The third edition corrects minor typos and adds 3 chapters synthesized from published papers plus an appendix on maximum entropy distributions.) The monograph investigates the misapplication of conventional statistical techniques to fat tailed distributions and looks for remedies, when possible.
Switching from thin tailed to fat tailed distributions requires more than "changing the color of the dress". Traditional asymptotics deal mainly with either $n=1$ or $n=\infty$, and the real world is in between, under the "laws of the medium numbers", which vary widely across specific distributions. Both the law of large numbers and the generalized central limit mechanisms operate in highly idiosyncratic ways outside the standard Gaussian or Levy-Stable basins of convergence.
A few examples:
+ The sample mean is rarely in line with the population mean, with effect on "naive empiricism", but can sometimes be estimated via parametric methods.
+ The "empirical distribution" is rarely empirical.
+ Parameter uncertainty has compounding effects on statistical metrics.
+ Dimension reduction (principal components) fails.
+ Inequality estimators (GINI or quantile contributions) are not additive and produce wrong results.
+ Many "biases" found in psychology become entirely rational under more sophisticated probability distributions.
+ Most of the failures of financial economics, econometrics, and behavioral economics can be attributed to using the wrong distributions.
This book, the first volume of the Technical Incerto, weaves a narrative around published journal articles.2020-01-24T14:45:55ZThird Revised Edition, 2025Nassim Nicholas Talebhttp://arxiv.org/abs/2510.01210v1Minimum Sample Size Calculation for Multivariable Regression of Continuous Outcomes in Chemometrics for Astrobiology and Planetary Science2025-09-17T01:35:14ZOver the last few decades, prediction models have become a fundamental tool in statistics, chemometrics, and related fields. However, to ensure that such models have high value, the inferences that they generate must be reliable. In this regard, the internal validity of a prediction model might be threatened if it is not calibrated with a sufficiently large sample size, as problems such as overfitting may occur. Such situations would be highly problematic in many fields, including space science, as the resulting inferences from prediction models often inform scientific inquiry about planetary bodies such as Mars. Therefore, to better inform the development of prediction models, we applied theory-based guidance from the biomedical domain for establishing what the minimum sample size is under a range of conditions for continuous outcomes. This study aims to disseminate existing research criteria in biomedical research to a broader audience, specifically focusing on their potential applicability and utility within the field of chemometrics. As such, the paper emphasizes the importance of interdisciplinarity, bridging the gap between the medical domain and chemometrics. Lastly, we provide several examples of work in the context of space science. This work will be the foundation for more evidence-based model development and ensure rigorous predictive modelling in the search for life and possible habitable environments.2025-09-17T01:35:14Z19 pagesM. KonstantinidisE. A. LallaS. J. GonzalezJ. ManriqueG. Lopez-ReyesA. BarlowE. SawyersB. BarriosM. G. 
Dalyhttp://arxiv.org/abs/2412.12233v2Russian roulette: The need for stochastic potential outcomes when utilities depend on counterfactuals2025-09-16T16:23:51ZIt has been proposed in medical decision analysis to express the ``first do no harm'' principle as an asymmetric utility function in which the loss from killing a patient would count more than the gain from saving a life. Such a utility depends on unrealized potential outcomes, and we show how this yields a paradoxical decision recommendation in a simple hypothetical example involving games of Russian roulette. The problem is resolved if we abandon the stable unit treatment value assumption (SUTVA) and allow the potential outcomes to be random variables. This leads us to conclude that, if you are interested in this sort of asymmetric utility function, you need to move to the stochastic potential outcome framework. We discuss the implications of the choice of parameterization in this setting.2024-12-16T15:29:18ZAndrew GelmanJonas M. Mikhaeil10.1093/biomet/asaf062