Strong Likelihood Principle: Strengthening a Principle or Misunderstanding the Likelihood Function

2026-06-08T03:20:49Z

The strong likelihood principle (SLP) is conventionally derived from the sufficiency principle and a conditionality principle in an argument due to Birnbaum, and much of the literature contests whether the derivation is sound. We take a different approach. We ask what the SLP says when its terms are read carefully, and argue that the principle as ordinarily stated reflects a confusion about the domain of the likelihood function. The likelihood is naturally defined as a function on a family of distributions $M$, not on a parameter space, and once it is so defined the SLP collapses into its weak counterpart, the weak likelihood principle. The diagnosis is illustrated by analogy with monetary value, developed concretely through a comparison of the binomial and negative binomial families that share a parameter, and connected to the geometric structure of $M$ through the Fisher information metric. The same standardization emerges from a statistical argument about comparing measurements across populations and from a geometric argument about manifold distance; this convergence supplies the positive content of the weak likelihood principle.
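
The binomial and negative binomial comparison can be illustrated with a minimal sketch. The numbers used here (3 successes in 12 trials, or sampling until the 3rd success, which arrives on trial 12) are the usual textbook setup and are an assumption, not taken from the paper. The sketch shows only that the two likelihoods, viewed as functions of the shared parameter, differ by a constant factor.

```python
from math import comb

def binomial_likelihood(theta, n=12, x=3):
    # Probability of x successes in a fixed number n of trials.
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def negative_binomial_likelihood(theta, r=3, n=12):
    # Probability that the r-th success occurs exactly on trial n.
    return comb(n - 1, r - 1) * theta**r * (1 - theta)**(n - r)

def likelihood_ratios(thetas):
    # Ratio of the two likelihoods at each value of the shared parameter.
    return [binomial_likelihood(t) / negative_binomial_likelihood(t) for t in thetas]

ratios = likelihood_ratios([0.1, 0.25, 0.5, 0.9])
print(ratios)  # every entry equals comb(12, 3) / comb(11, 2), up to rounding
```

The ratio does not depend on the parameter value, which is the situation the SLP addresses; whether the two functions should count as the same likelihood is the question the abstract raises.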

Active Learning with Bayesian Reasoning: A POGIL-Based Pedagogy in Introductory Statistics

2026-06-07T15:02:43Z

We introduce a Process Oriented Guided Inquiry Learning (POGIL)-style activity for teaching Bayesian reasoning in introductory statistics through conditional probability, Bayes' theorem, and belief updating. The activity is self-contained, uses hand-computable probabilities organized in two-way tables, and engages students in structured team roles. We evaluated the activity in four sections of an undergraduate introductory statistics course using a quasi-experimental comparison of POGIL-style and lecture-based instruction for a Bayes' theorem unit. Outcomes included student performance on Bayes' theorem final exam questions and satisfaction with instruction. We used a Bayesian bivariate generalized linear model to compare the two approaches while accounting for major type, gender, and race. The results indicated similar exam performance and similar probabilities of high satisfaction across instructional styles and demographic groups, with considerable uncertainty and no clear evidence of meaningful differences. These findings suggest that the POGIL-style activity performed comparably to lecture-based instruction for this unit while offering an active and classroom-ready way to introduce Bayesian reasoning without requiring difficult computation or simulation. We provide adaptable instructional materials and a reproducible Bayesian analytic framework for evaluating active learning innovations in introductory statistics. Our study supports the feasible inclusion of Bayesian reasoning in introductory courses and may help instructors considering active learning.
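
As a rough illustration of the hand-computable, two-way-table style of reasoning described above, the following sketch updates a prior using counts. The screening scenario and all counts are hypothetical and are not taken from the activity's materials.

```python
from fractions import Fraction

# Hypothetical two-way table of counts for 10,000 people:
# 1% prevalence, 90% sensitivity, 5% false-positive rate.
table = {
    ("disease", "positive"): 90,
    ("disease", "negative"): 10,
    ("healthy", "positive"): 495,
    ("healthy", "negative"): 9405,
}

def prior(table, state):
    # Row total divided by the grand total.
    row_total = sum(n for (s, _), n in table.items() if s == state)
    return Fraction(row_total, sum(table.values()))

def posterior(table, state, result):
    # Bayes' theorem read off the table: cell count divided by column total.
    column_total = sum(n for (_, r), n in table.items() if r == result)
    return Fraction(table[(state, result)], column_total)

p_prior = prior(table, "disease")
p_post = posterior(table, "disease", "positive")
print(p_prior, p_post)
```

Working with counts keeps every step an exact fraction, so the update from prior to posterior can be checked by hand.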

Probabilistic Win Ratio Method For Hierarchical Composite Endpoints With Coarsened Outcomes

2026-06-05T18:22:42Z

The win ratio is increasingly used to analyze prioritized composite endpoints in clinical trials, but standard implementations rely on deterministic pairwise comparisons and can perform poorly in the presence of censoring and endpoint-specific missingness. In such settings, unresolved comparisons are often treated as ties, leading to loss of efficiency and potentially biased inference, particularly when lower-priority outcomes are incompletely observed. We propose the probabilistic win ratio (PWR), a framework for estimating the classical win ratio under coarsened observation. The PWR replaces deterministic pairwise decisions with conditional probabilities of win, loss, or tie given the observed data, allowing partially observed comparisons to contribute fractionally while being explicitly penalized according to their uncertainty. Comparisons with greater coarsening receive smaller effective weight, whereas fully observed comparisons contribute as in the classical analysis, preserving the clinical priority structure. When outcomes are fully observed, the PWR reduces exactly to the standard win ratio estimator. Simulation studies show that the PWR maintains low bias and mean squared error across a range of censoring and missingness scenarios. Two clinical trial case studies illustrate complementary data regimes, demonstrating calibration in near-complete data and stability under substantial right censoring.
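
The reduction to the classical estimator can be sketched as follows. This is only an illustration of fractional win/loss/tie contributions under assumed inputs: the abstract does not specify how the conditional probabilities are estimated or how the uncertainty penalty is applied, so neither is modelled here. The function names and the toy data are hypothetical.

```python
def compare(treated, control):
    """Deterministic hierarchical comparison of two fully observed patients.

    Each patient is a tuple of outcomes in priority order, larger is better.
    Returns (p_win, p_loss, p_tie) for the treated patient.
    """
    for t, c in zip(treated, control):
        if t > c:
            return (1.0, 0.0, 0.0)
        if t < c:
            return (0.0, 1.0, 0.0)
    return (0.0, 0.0, 1.0)

def win_ratio(pair_probabilities):
    # Sum fractional wins and losses over all treated-control pairs.
    wins = sum(p[0] for p in pair_probabilities)
    losses = sum(p[1] for p in pair_probabilities)
    return wins / losses

# Toy, fully observed data (hypothetical): two prioritized outcomes per patient.
treated_arm = [(5, 2), (3, 1), (4, 4)]
control_arm = [(4, 3), (3, 1), (5, 0)]

pairs = [compare(t, c) for t in treated_arm for c in control_arm]
classical = win_ratio(pairs)

# A partially observed pair would enter with fractional probabilities instead,
# e.g. (0.6, 0.3, 0.1); this triple is an assumed input, not an estimate.
fractional = win_ratio(pairs[:-1] + [(0.6, 0.3, 0.1)])
print(classical, fractional)
```

With 0/1 probabilities the aggregate is the usual ratio of wins to losses, which mirrors the stated property that the PWR coincides with the standard estimator when outcomes are fully observed.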

Hilbert's Sixth Problem and Soft Logic

2026-06-05T11:58:28Z

Hilbert's sixth problem calls for the axiomatization of physics, particularly the derivation of macroscopic statistical laws from microscopic mechanical principles. A conceptual difficulty arises in classical probability theory: in continuous spaces every individual microstate has probability zero. In this paper, we introduce a probabilistic framework based on Soft Logic and Soft Numbers in which point events possess infinitesimal Soft probabilities rather than the classical zero. We show that Soft probability can be interpreted as an infinitesimal refinement of classical probability and discuss its implications for statistical mechanics and Hilbert's sixth problem. In addition, we show rigorously how to construct a Möbius strip based on Soft Numbers, and we discuss how this Möbius strip representation allows for a deeper understanding of the nature of Hilbert's sixth problem. Motivated by the collapse of classical probability to zero, we suggest adding an axiom for an Infinitesimal Probability to the list of Kolmogorov's five Probability axioms. Furthermore, we suggest a probabilistic framework based on Soft Numbers for assigning values to the probabilities of impossible events of a discrete random variable, that is, realizations outside its support (which, in ordinary probability, collapse to zero). This assignment of Soft Number values is based on an extension of the Pascal triangle, using factorials of negative numbers, that places soft zeros outside the regular Pascal triangle (which has real values).

Closing the Gap: Can Novice Statistics and Data Science Students Collaborate as Effectively as an Expert?

2026-06-04T03:44:29Z

The ASCCR (Attitude-Structure-Content-Communication-Relationship) framework was recently developed to teach collaboration skills to statisticians and data scientists. However, its effectiveness in real-world settings has not yet been systematically evaluated. To assess this, we evaluated novice undergraduate and graduate students' performances in initial collaboration meetings with real domain experts and compared them to an expert collaborator. Using video recordings, rubric scores, and domain expert feedback surveys, we found that novices performed surprisingly well compared to the expert. Specifically, novices scored nearly as well as the expert on the Attitude, Structure, and Relationship components of the ASCCR framework. Although novices did not initially perform as well on the Content or Communication aspects, they were able to close the gap. By the end of the collaboration projects, the novices had higher overall domain expert feedback scores than the expert. The primary implication of our study is that novices can become effective collaborators in a very short time. We discuss our findings' practical implications and provide recommendations for integrating the ASCCR framework into statistics and data science collaboration, consulting, and capstone courses.

Robust Causal Discovery in Real-World Time Series with Power-Laws

2026-06-03T21:53:44Z

Exploring causal relationships in stochastic time series is a challenging yet crucial task with a vast range of applications, including finance, economics, neuroscience, and climate science. Many algorithms for Causal Discovery (CD) have been proposed; however, they often exhibit a high sensitivity to noise, resulting in spurious causal inferences in real data. In this paper, we observe that the frequency spectra of many real-world time series follow a power-law distribution, notably due to an inherent self-organizing behavior. Leveraging this insight, we build a robust CD method based on the extraction of power-law spectral features that amplify genuine causal signals. Our method consistently outperforms state-of-the-art alternatives on both synthetic benchmarks and real-world datasets with known causal structures, demonstrating its robustness and practical relevance.

Pivoting the paradigm: the role of spreadsheets in K-12 data science

2026-06-03T21:53:03Z

Spreadsheet tools are widely accessible to and commonly used by K-12 students and teachers. While spreadsheets are not ideal for many types of statistical analysis, they have an important role in data collection and organization. From a pedagogical standpoint, spreadsheets make data visible and easy to interact with, facilitating student engagement in data exploration, analysis, and computation. Though not suitable for all tasks, spreadsheets can facilitate learning and practicing data and computing skills for K-12 students. This paper 1) demonstrates the potential utility of spreadsheets in K-12; 2) reviews prior frameworks and standards that are relevant for K-12 data tools; and 3) proposes data-driven data skills to help develop data acumen and computational fluency. We provide some example activities, identify challenges and barriers to adoption, suggest pedagogical approaches to ease the learning curve for instructors and students, and discuss the need for professional development to facilitate deeper use of spreadsheets for data science and STEM disciplines.

Higher-order spacings in the superposed spectra of random matrices with comparison to spacing ratios and application to complex systems

2026-06-02T09:59:22Z

Higher-order spacing statistics in the $m$ superposed spectra of circular random matrices of the same class are studied numerically. We conjecture that for given $m$ (or order $k$) and $β$, the sequence of modified Dyson indices $β'(k)$ (or $β'(m)$), obtained by minimizing the sum of absolute differences between the cumulative distribution functions (denoted as $D(β')$), is unique. Also, for a given $k$, the distribution tends to the corresponding $k$-th order Poisson statistics in the limit $m\rightarrow \infty$. The quantum chaotic kicked top model is studied for various Hilbert space dimensions and is found to satisfy our conjecture. This involves the numerical verification of the $m=2$ case of the COE results. Our result can be used as a tool to characterize a system and to determine its symmetry structure without desymmetrization of the spectra. Additionally, a comparative study of the higher-order spacing and ratio distributions in both the $m=1$ and $m=2$ cases of COE as well as GOE is performed numerically, within and across these ensembles, using the $D(β')$ method. This study is carried out both by varying the dimension while keeping the number of realizations constant, and vice versa. The same asymptotic higher-order statistics are observed across COE and GOE in terms of a given spectral fluctuation measure. However, within a given ensemble (COE or GOE), the results for higher-order spacing and ratio distributions agree with each other only up to some low order $k$, beyond which they start to deviate. Further, the spectral fluctuations of the intermediate map are studied for various dimensions. Observations from our extensive numerical computations are presented and discussed.

Tackling the 6/49 Lottery and Debunking Common Myths with Probabilistic Methods and Combinatorial Designs

2026-05-31T11:57:39Z

In the end, the house always wins! This simple truth holds for all public games of chance. Nevertheless, for as long as lotteries have existed, people have tried everything to give luck a helping hand. This article compares two objective scientific approaches to the 6/49 lottery: probabilistic methods and combinatorial designs. The mathematical models developed herein can be modified and applied to other lotteries. A newly constructed (49, 6, 5) covering design, which meets the Schönheim bound, is introduced. For lottery designs and for covering designs, a benchmark based on probabilistic methods is presented. It is demonstrated that common attempts to outwit the odds amount to restricting the numbers played to subsets, which disproportionately reduces the chances of winning.
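
For orientation, the baseline odds of a single 6/49 ticket follow from the hypergeometric distribution. The sketch below is a generic calculation, not the article's benchmark or its covering design; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def match_probability(k, pool=49, drawn=6, picked=6):
    # Probability that exactly k of the picked numbers are among the drawn ones.
    favourable = comb(drawn, k) * comb(pool - drawn, picked - k)
    return Fraction(favourable, comb(pool, picked))

jackpot = match_probability(6)
distribution = {k: match_probability(k) for k in range(7)}

print(jackpot)                  # 1/13983816
for k, p in distribution.items():
    print(k, float(p))
```

Exact fractions avoid rounding, so the seven probabilities can be checked to sum to one.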

Extrinsic Analysis on BHV4

2026-05-30T01:12:35Z

This paper investigates extrinsic statistical analysis on the Billera-Holmes-Vogtmann tree space with four leaves (T4 or BHV4), based on its recently proposed representation (see [1]), the Spiky Projective Excavated Dodecahedron (SPED). Owing to the symmetry of the SPED, the Veronese-Whitney (VW) embedding considered here produces a natural extrinsic metric for statistical analysis on BHV4. The exact solution for the VW extrinsic mean is derived, and the method is applied to a yeast genome dataset to study the phylogenetic trees of four distinct yeast clades.

Phase-Type Variational Autoencoders for Heavy-Tailed Data

2026-05-26T13:08:34Z

Heavy-tailed distributions are ubiquitous in real-world data, where rare but extreme events dominate risk and variability. However, standard Variational Autoencoders (VAEs) employ simple decoder distributions, such as Gaussian distributions, that fail to capture heavy-tailed behavior, while existing heavy-tail-aware extensions remain restricted to predefined parametric families whose tail behavior is fixed a priori. We propose the Phase-Type Variational Autoencoder (PH-VAE), whose decoder distribution is a latent-conditioned Phase-Type (PH) distribution, defined as the absorption time of a continuous-time Markov chain (CTMC). This formulation composes multiple exponential time scales, yielding a flexible and analytically tractable decoder that adapts its finite-range tail behavior directly from the observed data. Experiments on synthetic and real-world benchmarks demonstrate that PH-VAE accurately approximates diverse heavy-tailed distributions, significantly outperforming Gaussian, Student-t, and extreme-value-based VAE decoders in modeling observed tail behavior and extreme quantiles. In multivariate settings, PH-VAE captures realistic cross-dimensional tail dependence through its shared latent representation. To our knowledge, this is the first work to integrate Phase-Type distributions into deep generative modeling, bridging applied probability and representation learning.
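
For readers unfamiliar with Phase-Type distributions, the standard definition can be written out as below. The notation is an assumption introduced here and not taken from the abstract: $\boldsymbol{\alpha}$ is the initial distribution over the transient phases (with no initial mass on the absorbing state), $\mathbf{S}$ is the sub-generator matrix of the CTMC restricted to those phases, and $\tau$ is the absorption time. How PH-VAE maps the latent variable to these quantities is not stated in the abstract and is not shown.

```latex
\[
F(t) = \Pr(\tau \le t) = 1 - \boldsymbol{\alpha}\, e^{\mathbf{S} t}\, \mathbf{1},
\qquad
f(t) = \boldsymbol{\alpha}\, e^{\mathbf{S} t}\, \mathbf{s},
\qquad
\mathbf{s} = -\mathbf{S}\,\mathbf{1},
\qquad t \ge 0 .
\]
```

Because $e^{\mathbf{S} t}$ is a matrix exponential, the density is a combination of exponential terms at several time scales, which is the sense in which the decoder composes multiple exponential time scales.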

Contested Temporalities in Critical Minerals and Resource Extraction for Electric Vehicles

2026-05-23T02:35:08Z

The global push for electric vehicles (EVs) has sharply increased demand for critical minerals such as cobalt and lithium, creating a tension between rapid industrial growth and long-term sustainability. Extraction is concentrated in a few regions -- notably the Democratic Republic of Congo (DRC), Chile, and Argentina -- where it has produced serious socio-environmental harms, including ecosystem degradation, labour exploitation, and the displacement of Indigenous communities. In the DRC, cobalt mining is frequently linked to child labour and hazardous working conditions; in Chile, lithium extraction intensifies water scarcity and threatens local agriculture and biodiversity. Policy instruments such as the U.S. Inflation Reduction Act (IRA) seek to promote ethical sourcing, but an extraction-driven model continues to deepen global inequalities. This chapter examines the contested temporalities of the transition, in which the short-term economic incentives of extraction conflict with longer-term environmental and social goals. It argues for a place-based framework built on community-centred governance, sustainable mining practices, and circular-economy strategies, including recycling and material substitution, to align resource security with equity and ensure that the shift to EVs does not reproduce the injustices it aims to address.

p-Hacking Inflates Type I Error Rates in the Error Statistical Approach but not in the Formal Inference Approach

2026-05-21T13:40:43Z

p-hacking occurs when researchers conduct multiple significance tests (e.g., p1;H0,1 and p2;H0,2) and then selectively report tests that yield desirable (usually significant) results (e.g., p2 < 0.05;H0,2) without correcting for multiple testing (e.g., 0.05/2 = 0.025). In the present article, I consider p-hacking in the context of two philosophies of significance testing: the error statistical approach and the formal inference approach. I argue that although p-hacking inflates Type I error rates in the error statistical approach, it does not inflate them in the formal inference approach. Specifically, in the error statistical approach, the "actual" familywise error rate (e.g., 1 - [1 - 0.05]^2 ≈ 0.098 for two independent tests) is relevant because it covers both the reported and unreported tests in the "actual" test procedure (i.e., p1;H0,1 and p2;H0,2). In this approach, Type I error rate inflation occurs because the "actual" error rate (0.098) is higher than the nominal error rate (0.05). In contrast, in the formal inference approach, the "actual" familywise error rate is irrelevant because (a) the researcher does not report a statistical inference about the corresponding intersection null hypothesis (i.e., H0,1 & H0,2), and (b) the "actual" familywise error rate does not license inferences about the reported individual hypotheses (i.e., H0,2). Instead, in the formal inference approach, only the nominal error rate is relevant, and a comparison with the "actual" error rate is inappropriate. Implications for conceptualizing, demonstrating, and reducing p-hacking are discussed.
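
The two numerical examples in the abstract (the familywise error rate for two independent tests and the corrected per-test threshold) can be reproduced with a short sketch. The function names are hypothetical, and the correction shown is the simple division of the nominal level by the number of tests.

```python
def familywise_error_rate(alpha, m):
    # Probability of at least one Type I error across m independent tests,
    # each conducted at level alpha, when all null hypotheses are true.
    return 1 - (1 - alpha) ** m

def corrected_threshold(alpha, m):
    # Per-test level obtained by dividing the nominal level by the number of tests.
    return alpha / m

actual = familywise_error_rate(0.05, 2)
per_test = corrected_threshold(0.05, 2)
print(actual, per_test)
```

The first value is 0.0975 up to floating-point rounding, which the abstract reports as 0.098; the second is the 0.025 threshold.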

A critical comparison of handling zeros in high-dimensional compositional count data

2026-05-21T08:52:24Z

The growing use of high-throughput sequencing (HTS) has enabled the large-scale production of compositional count data, driving progress in microbiome research. However, such count data are often high-dimensional, over-dispersed, and heavily zero-inflated, and they conflict with the continuity assumptions underlying log-ratio-based compositional data analysis (CoDA), creating substantial methodological challenges. This review provides an overview of zero-handling strategies in compositional data, covering zero-tolerant transformations, imputation approaches for rounded zeros, and statistical models for essential zeros. We specifically highlight the problems that arise when applying the log-ratio framework to sequencing-derived compositional count data, where violations of continuity can induce numerical instabilities and biased statistical inferences. Motivated by these issues, we systematically examine how existing imputation strategies behave when adapted to discrete, zero-inflated count data, including an evaluation of how the discrete, lattice-valued nature of the data affects imputation performance. Overall, this review consolidates scattered methodological developments, clarifies appropriate use cases, and identifies open challenges that motivate future zero-handling frameworks capable of jointly accommodating compositional constraints, zero inflation, and the lattice nature of count data, while also providing a detailed discussion of the comparison results.
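
The conflict between zeros and the log-ratio framework can be seen in a minimal sketch of the centered log-ratio (CLR) transform. The pseudocount replacement used here is only one simple zero-handling strategy, chosen for illustration; it is an assumption and not a method the review is known to recommend. The counts are hypothetical.

```python
from math import log

def clr(composition):
    # Centered log-ratio: log of each part minus the mean log of all parts.
    # math.log raises ValueError if any part is zero.
    logs = [log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def add_pseudocount(counts, pseudocount=0.5):
    # Shift every count by a small constant, then renormalize to proportions.
    shifted = [c + pseudocount for c in counts]
    total = sum(shifted)
    return [s / total for s in shifted]

counts = [12, 0, 30, 8]  # hypothetical sequencing counts with one zero

try:
    clr(counts)
    zero_failed = False
except ValueError:
    zero_failed = True   # the zero count has no finite logarithm

transformed = clr(add_pseudocount(counts))
print(zero_failed, transformed)
```

The transformed values depend on the chosen pseudocount, which is one reason the choice of zero-handling strategy affects downstream inference.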

An Introduction to Copulas: a Complement

2026-05-20T11:24:08Z

For many years I have taught an advanced statistical inference course for master's students using the text of Casella and Berger (2002). The book gives a comprehensive treatment of the core topics at a level that avoids measure theory while remaining mathematically precise, but it does not cover the increasingly important concept of copulas. The present notes are intended to complement the book by adding two sections on copulas in a style that is as close as possible to that of the original text. Numbering of definitions, theorems, examples, and exercises is consistent with Casella and Berger (2002), but the material may also be read as a brief, stand-alone introduction to copula theory.