http://arxiv.org/abs/2310.04153v4
Fair coins tend to land on the same side they started: Evidence from 350,757 flips
Updated: 2025-04-17T19:58:14Z
Many people have flipped coins but few have stopped to ponder the statistical and physical intricacies of the process. We collected $350{,}757$ coin flips to test the counterintuitive prediction from a physics model of human coin tossing developed by Diaconis, Holmes, and Montgomery (DHM; 2007). The model asserts that when people flip an ordinary coin, it tends to land on the same side it started -- DHM estimated the probability of a same-side outcome to be about 51\%. Our data lend strong support to this precise prediction: the coins landed on the same side more often than not, $\text{Pr}(\text{same side}) = 0.508$, 95\% credible interval (CI) [$0.506$, $0.509$], $\text{BF}_{\text{same-side bias}} = 2359$. Furthermore, the data revealed considerable between-people variation in the degree of this same-side bias. Our data also confirmed the generic prediction that when people flip an ordinary coin -- with the initial side-up randomly determined -- it is equally likely to land heads or tails: $\text{Pr}(\text{heads}) = 0.500$, 95\% CI [$0.498$, $0.502$], $\text{BF}_{\text{heads-tails bias}} = 0.182$. Furthermore, this lack of heads-tails bias does not appear to vary across coins. Additional analyses revealed that the within-people same-side bias decreased as more coins were flipped, an effect that is consistent with the possibility that practice makes people flip coins in a less wobbly fashion. Our data therefore provide strong evidence that when some (but not all) people flip a fair coin, it tends to land on the same side it started.
Published: 2023-10-06T11:00:15Z
Authors: František Bartoš, Alexandra Sarafoglou, Henrik R. Godmann, Amir Sahrani, David Klein Leunk, Pierre Y. Gui, David Voss, Kaleem Ullah, Malte J. Zoubek, Franziska Nippold, Frederik Aust, Felipe F. Vieira, Chris-Gabriel Islam, Anton J. Zoubek, Sara Shabani, Jonas Petter, Ingeborg B. Roos, Adam Finnemann, Aaron B. Lob, Madlen F. Hoffstadt, Jason Nak, Jill de Ron, Koen Derks, Karoline Huth, Sjoerd Terpstra, Thomas Bastelica, Magda Matetovici, Vincent L. Ott, Andreea S. Zetea, Katharina Karnbach, Michelle C. Donzallaz, Arne John, Roy M. Moore, Franziska Assion, Riet van Bork, Theresa E. Leidinger, Xiaochang Zhao, Adrian Karami Motaghi, Ting Pan, Hannah Armstrong, Tianqi Peng, Mara Bialas, Joyce Y. -C. Pang, Bohan Fu, Shujun Yang, Xiaoyi Lin, Dana Sleiffer, Miklos Bognar, Balazs Aczel, Eric-Jan Wagenmakers
DOI: 10.1080/01621459.2025.2516210

http://arxiv.org/abs/2504.12481v1
Understanding and Evaluating Engineering Creativity: Development and Validation of the Engineering Creativity Assessment Tool (ECAT)
Updated: 2025-04-16T20:37:53Z
Creativity is essential in engineering education, enabling students to develop innovative and practical solutions. However, assessing creativity remains challenging due to a lack of reliable, domain-specific tools. Traditional assessments like the Torrance Tests of Creative Thinking (TTCT) may not fully capture the complexity of engineering creativity. This study introduces and validates the Engineering Creativity Assessment Tool (ECAT), designed specifically for engineering contexts. ECAT was tested with 199 undergraduate students who completed a hands-on design task. Five trained raters evaluated the products using the ECAT rubric. Exploratory and confirmatory factor analyses supported a four-factor structure: fluency, originality, cognitive flexibility, and creative strengths. Reliability was high. Convergent and discriminant validity were examined using TTCT scores, revealing moderate correlations that support ECAT's domain specificity. ECAT offers a reliable, valid framework for assessing creativity in engineering education and provides actionable feedback to educators. Future work should examine its broader applicability across disciplines and instructional settings.
Published: 2025-04-16T20:37:53Z
Comment: 29 pages, 3 figures.
This work will be presented at 2025 ASEE Annual Conference
Authors: Zeynep G Akdemir-Beveridge, Arash Zaghi, Connie Syharat

http://arxiv.org/abs/2407.17076v3
The Analytic Stockwell Transform and its Zeros
Updated: 2025-04-15T10:37:59Z
A recent original line of research in time--frequency analysis has shifted the interest from energy maxima toward zeros. Initially motivated by the intriguing uniform spread of the zeros of the spectrogram of white noise, it has led to fruitful theoretical developments combining probability theory, complex analysis and signal processing. In this vein, the present work proposes a characterization of the zeros of the Stockwell Transform of white noise, which consists in a hybrid time--frequency multiresolution representation. First of all, an analytic version of the Stockwell Transform is designed. Then, analyticity is leveraged to establish a connection with the hyperbolic Gaussian Analytic Function, whose zero set is invariant under the isometries of the Poincaré disk. Finally, the theoretical spatial statistics of the zeros of the hyperbolic Gaussian Analytic Function and the empirical statistics of the zeros of the Analytic Stockwell Transform of white noise are compared through intensive Monte Carlo simulations, supporting the established connection. A publicly available documented Python toolbox accompanies this work.
Published: 2024-07-24T08:06:15Z
Comment: Accepted for publication in Chaos, Solitons and Fractals Elsevier
Authors: Ali Moukadem, Barbara Pascal, Jean-Baptiste Courbot, Nicolas Juillet

http://arxiv.org/abs/2504.06787v1
Communicating complex statistical models to a public health audience: translating science into action with the FARSI approach
Updated: 2025-04-09T11:20:12Z
Background. Effectively communicating complex statistical model outputs is a major challenge in public health.
This study introduces the FARSI approach (Fast, Accessible, Reliable, Secure, Informative) as a framework to enhance the translation of intricate statistical findings into actionable insights for policymakers and stakeholders. We apply this framework in a real-world case study on chronic disease monitoring in Italy.
Methods. The FARSI framework outlines key principles for developing user-friendly tools that improve the translation of statistical results. We applied these principles to create an open-access web application using R Shiny, designed to communicate chronic disease prevalence estimates from a Bayesian spatio-temporal logistic model. The case study highlights the importance of an intuitive design for fast accessibility, validated data and expert feedback for reliability, aggregated data for security, and insights into prevalence in population subgroups, which were previously unobservable, for informativeness.
Results. The web application enables stakeholders to explore disease prevalence across populations and geographical areas through dynamic visualizations. It facilitates public health monitoring by, for instance, identifying disparities at the local level and assessing risk factors such as smoking. Its user-friendly interface enhances accessibility, making statistical findings more actionable.
Conclusions. The FARSI framework provides a structured approach to improving the communication of complex research findings. By making statistical models more accessible and interpretable, it supports evidence-based decision-making in public health and increases the societal impact of research.
Published: 2025-04-09T11:20:12Z
Authors: Mattia Stival, Lorenzo Schiavon, Gaia Bertarelli, Stefano Campostrini

http://arxiv.org/abs/2504.06507v1
The Software Behind the Stats: A Student Exploration of Software Trends Across Disciplines
Updated: 2025-04-09T01:01:30Z
This paper presents a student-led activity designed to explore the use of statistical software in academic research across economics, political science, and statistics. Students reviewed replication files from major journals and repositories, gaining hands-on experience with reproducible workflows while contributing to cross-disciplinary datasets. Web-scraped metadata and student data collection, together covering more than 10,000 papers, reveal clear disciplinary patterns: Stata remains dominant in economics, while R is increasingly popular in political science and is the standard in statistics. Within the social sciences, a growing number of articles also use multiple software platforms within a single manuscript. Students reported increased understanding of academic workflows and greater awareness of software diversity in quantitative research. The activity is easy to adapt across course levels and disciplines, and we offer suggestions for follow-up assignments that reinforce key concepts in reproducibility and data fluency.
The resulting insights into current software practices are also valuable for instructors seeking to align their teaching with evolving trends in research.
Published: 2025-04-09T01:01:30Z
Authors: Elizabeth Upton, Xizhen Cai, Pamela Jakiela, Owen Ozier, Shyam Raman

http://arxiv.org/abs/2309.11739v3
Classroom Community amid Covid-19: A Mixed-Methods Study of Undergraduate Students in Introductory Mathematics and Statistics
Updated: 2025-04-08T00:11:09Z
A strong sense of classroom community is associated with many positive learning outcomes and is a critical contributor to undergraduate students' persistence in STEM, particularly for women and students of color. This chapter describes a mixed-methods investigation into the relationship between classroom community and course attributes in introductory undergraduate mathematics and statistics courses, mediated by student demographics. The project was motivated by and conducted amid the Covid-19 pandemic: data were collected from online courses in the 2020-21 academic year and from hybrid and in-person courses in the 2021-22 academic year. Quantitative data were gathered from both students and instructors and analyzed using structural equation modeling. The primary instrument was the validated Classroom Community Scale - Short Form. These quantitative results are complemented and contextualized by thematic and textual analyses of focus group data, gathered using a newly developed protocol piloted during the 2021-22 academic year. All data come from a highly selective private university in the United States.
Preliminary practical implications of the study include the value of synchronous participation in fostering connectedness and the importance of attending to students' personal identities in understanding their experiences of belonging.
Published: 2023-09-21T02:27:25Z
Authors: Shira Viel, Maria Tackett, Sarwari Das, Joseph Choo

http://arxiv.org/abs/2504.05102v1
Underreporting of Intimate Partner Violence in Brazil
Updated: 2025-04-07T14:09:35Z
According to WHO (2013), in general, 30% of all women worldwide who have been in a relationship have experienced physical and/or sexual violence by their intimate partner. However, only a small percentage of intimate partner violence (IPV) victims report it to the police. This phenomenon of under-reporting is known as the ``dark figure''. This paper aims to investigate the factors associated with the reporting decision of IPV victims to the police in Brazil using the third wave of the ``Pesquisa de Condições Socioeconômicas e Violência Doméstica e Familiar contra a Mulher ($PCSVDF^{Mulher}$)''. Using a bivariate probit regression model with sample selection, we found that older white women, those who do not tolerate domestic violence, and women who have experienced physical violence are more likely to report IPV to the police. In contrast, married women, those with partners who abuse alcohol, and those who witnessed or knew that their mothers had experienced IPV are less likely to report it to law enforcement.
Published: 2025-04-07T14:09:35Z
Comment: 25 pages, 6 tables
Authors: Diego de Maria André, José Raimundo Carvalho

http://arxiv.org/abs/2412.10296v3
My Statistics is Better than Yours
Updated: 2025-04-05T16:10:57Z
Statistical schools, such as Bayesianism and Frequentism, are often presented as competing frameworks, each claiming technical rigour and superiority. Frequentism emphasizes objective inferences through repeated sampling, while Bayesianism incorporates prior beliefs and updates them with new evidence.
Despite their strengths, neither school proves universally applicable, and the pursuit of a single "correct" statistical framework is ultimately misguided. Instead, this essay advocates for a context-dependent approach to statistical norms, drawing on Douglas's (2004) concept of "operational objectivity". The idea is that by aligning the context of the research question with the value judgments inherent to its field, a certain statistical paradigm is warranted. This essay explores the decision-theoretic foundations of Bayesianism, examines its descriptive limitations as highlighted by the Ellsberg paradox, and addresses the challenges of comparing different normative systems.
Published: 2024-12-13T17:31:50Z
Authors: Simon Benhaïem

http://arxiv.org/abs/2504.02960v1
An Anytime Valid Test for Complete Spatial Randomness
Updated: 2025-04-03T18:29:33Z
A relevant question when analyzing spatial point patterns is that of spatial randomness. More specifically, before any model can be fit to a point pattern, a first step is to test the data for departures from complete spatial randomness (CSR). Traditional techniques employ distance-based or quadrat-count-based methods to test for CSR based on batched data. In this paper, we consider the practical scenario of testing for CSR when the data are available sequentially (i.e., online). We present a sequential testing methodology called the {\em PRe-process} that is based on e-values and is a fast, efficient and nonparametric method. Simulation experiments with the truth departing from CSR in two different scenarios show that the method is effective in capturing inhomogeneity over time. Two real data illustrations, considering lung cancer cases in the Chorley-Ribble area, England, from 1974 to 1983 and locations of earthquakes in the state of Oklahoma, USA, from 2000 to 2011, demonstrate the utility of the PRe-process in sequential testing of CSR.
Published: 2025-04-03T18:29:33Z
Authors: Vaidehi Dixit, Christopher K. Wikle, Scott H. Holan

http://arxiv.org/abs/2503.17598v3
Coarse-Grained Games: A Framework for Bounded Perception in Game Theory
Updated: 2025-04-03T10:36:31Z
In everyday life, we frequently make coarse-grained judgments. When we say that Olivia and Noah excel in mathematics, we disregard the specific differences in their mathematical abilities. Similarly, when we claim that a particular automobile manufacturer produces high-quality cars, we overlook the minor variations among individual vehicles. These coarse-grained assessments are distinct from erroneous or deceptive judgments, such as those resulting from student cheating or false advertising by corporations. Despite the prevalence of such judgments, little attention has been given to their underlying mathematical structure. In this paper, we introduce the concept of coarse-graining into game theory, analyzing games where players may perceive different payoffs as identical while preserving the underlying order structure. We call it a Coarse-Grained Game (CGG). This framework allows us to examine the rational inference processes that arise when players equate distinct micro-level payoffs at a macro level, and to explore how Nash equilibria are preserved or altered as a result. Our key findings suggest that CGGs possess several desirable properties that make them suitable for modeling phenomena in the social sciences.
This paper demonstrates two such applications: first, in cases of overly minor product updates, consumers may encounter an equilibrium selection problem, resulting in market behavior that is not driven by objective quality differences; second, the lemon market can be analyzed not only through objective information asymmetry but also through asymmetries in perceptual resolution or recognition ability.
Published: 2025-03-22T00:59:22Z
Comment: 49 pages
Authors: Takashi Izumo

http://arxiv.org/abs/2504.01276v1
Online Fault Detection and Classification of Chemical Process Systems Leveraging Statistical Process Control and Riemannian Geometric Analysis
Updated: 2025-04-02T01:00:36Z
In this work, we study an integrated fault detection and classification framework called FARM for fast, accurate, and robust online chemical process monitoring. The FARM framework integrates the latest advancements in statistical process control (SPC) for monitoring nonparametric and heterogeneous data streams with novel data analysis approaches based on Riemannian geometry in a hierarchical framework for online process monitoring. We conduct a systematic evaluation of the FARM monitoring framework using the Tennessee Eastman Process (TEP) dataset. Results show that FARM performs competitively against state-of-the-art process monitoring algorithms by achieving a good balance among fault detection rate (FDR), fault detection speed (FDS), and false alarm rate (FAR). Specifically, FARM achieved an average FDR of 96.97% while also outperforming benchmark methods in detecting faults previously known to be hard to detect, including Faults 3, 9 and 15, with FDRs of 97.08%, 96.30% and 95.99%, respectively. In terms of FAR, our FARM framework allows practitioners to customize their choice of FAR, thereby offering great flexibility.
Moreover, we report a significant improvement in average fault classification accuracy during online monitoring from 61% to 82% when leveraging Riemannian geometric analysis, and further to 84.5% when incorporating additional features from SPC. This illustrates the synergistic effect of integrating fault detection and classification in a holistic, hierarchical monitoring framework.
Published: 2025-04-02T01:00:36Z
Comment: Under review at Computers and Chemical Engineering
Journal reference: Computers & Chemical Engineering Volume 200, September 2025, 109177
Authors: Alireza Miraliakbar, Fangyuan Ma, Zheyu Jiang
DOI: 10.1016/j.compchemeng.2025.109177

http://arxiv.org/abs/2503.22945v1
Statistics at a Crossroads; Who is for the Challenge?
Updated: 2025-03-29T02:32:17Z
This project was sponsored by the National Science Foundation and organized by a steering committee and a group of theme leaders. The six-member steering committee, consisting of James Berger, Xuming He, David Madigan, Susan Murphy, Bin Yu, and Jon Wellner, was responsible for the overall planning of the project.
This report is designed to be accessible to the wider audience of key stakeholders in statistics and data science, including academic departments, university administration, and funding agencies. After the role and the value of Statistics and Data Science are discussed in Section 1, the report focuses on the two goals related to emerging research and data-driven challenges in applications. Section 2 identifies emerging research topics from the data challenges arising from scientific and social applications, and Section 3 discusses a number of emerging areas in foundational research. How to engage with those data-driven challenges and foster interdisciplinary collaborations is also summarized in the Executive Summary. The third goal of creating a vibrant research community and maintaining an appropriate balance is addressed in Sections 4 (Professional Culture and Community Responsibilities) and 5 (Doctoral Education).
Published: 2025-03-29T02:32:17Z
Authors: Xuming He, David Madigan, Bin Yu, Jon Wellner

http://arxiv.org/abs/2406.11590v2
Ride-sharing Determinants: Spatial and Spatio-temporal Bayesian Analysis for Chicago Service in 2022
Updated: 2025-03-27T20:09:27Z
The rapid expansion of ride-sharing services has caused significant disruptions in the transportation industry and fundamentally altered the way individuals move from one place to another. Accurate estimation of ride-sharing improves service utilization and reliability and reduces travel time and traffic congestion. In this study, we employ two Bayesian models to estimate ride-sharing demand in the 77 Chicago community areas. We consider demographic, socio-economic, and transportation factors as well as land-use characteristics as explanatory variables. Our models assume a conditional autoregressive (CAR) prior for the explanatory variables. Moreover, the Bayesian frameworks estimate both the unstructured random error and the structured errors for the spatial and the spatiotemporal correlation.
We assessed the performance of the estimated models; the residuals of the spatial regression model have no leftover spatial structure. For the spatiotemporal model, the squared correlation between actual ride-shares and the fitted values is 0.95. Our analysis revealed that the demographic factors (population size and registered crimes) positively impact the ride-sharing demand. Additionally, the ride-sharing demand increases with higher income, with an increase in the economically active proportion of the population, and with the number of residents with no cars. Moreover, the transit availability and the walkability indices are crucial determinants of ride-sharing in Chicago.
Published: 2024-06-17T14:35:59Z
Authors: Mohamed Elkhouly, Taqwa Alhadidi

http://arxiv.org/abs/2402.07029v2
Using Mathlink Cubes to Introduce Data Wrangling with Examples in R
Updated: 2025-03-21T14:01:34Z
This paper explores an innovative approach to teaching data wrangling skills to students through hands-on activities before transitioning to coding. Data wrangling, a critical aspect of data analysis, involves cleaning, transforming, and restructuring data. We introduce the use of a physical tool, mathlink cubes, to facilitate a tangible understanding of data sets. This approach helps students grasp the concepts of data wrangling before implementing them in coding languages such as R. We detail a classroom activity that includes hands-on tasks paralleling common data wrangling processes such as filtering, selecting, and mutating, followed by their coding equivalents using R's `dplyr` package.
Published: 2024-02-10T19:30:40Z
Authors: Lucy D'Agostino McGowan

http://arxiv.org/abs/2503.14839v1
Bayesian hierarchical non-stationary hybrid modeling for threshold estimation in peak over threshold approach
Updated: 2025-03-19T02:46:46Z
Extreme value theory (EVT) has been utilized to estimate crash risk from traffic conflicts with the peak over threshold approach.
However, it is challenging to determine a suitable threshold to distinguish extreme conflicts in an objective way. The subjective and arbitrary selection of the threshold in the peak over threshold approach can result in biased estimation outcomes. This study proposes a Bayesian hierarchical hybrid modeling (BHHM) framework for the threshold estimation in the peak over threshold approach. Specifically, BHHM is based on a piecewise function that models the general conflicts with a specific distribution and the extreme conflicts with a generalized Pareto distribution (GPD). The Bayesian hierarchical structure is used to combine traffic conflicts from different sites, incorporating covariates and site-specific unobserved heterogeneity. Five non-stationary BHHM models, including Normal-GPD, Cauchy-GPD, Logistic-GPD, Gamma-GPD, and Lognormal-GPD models, were developed and compared. Traditional graphical diagnostic and quantile regression approaches were also used for comparison. Traffic conflicts collected from three signalized intersections in the city of Surrey, British Columbia were used for the study. The results show that the proposed BHHM approach could estimate the threshold parameter objectively. The Lognormal-GPD model is superior to the other four BHHM models in terms of crash estimation accuracy and model fit. The crash estimates using the threshold determined by the BHHM outperform those estimated based on the graphical diagnostic and quantile regression approaches, indicating the superiority of the proposed threshold determination approach. The findings of this study contribute to enhancing the existing EVT methods by providing a threshold determination approach and by producing reliable crash estimations.
Published: 2025-03-19T02:46:46Z
Journal reference: Accident Analysis & Prevention, Volume 223, Article 108249, 2025
Authors: Quansheng Yue, Yanyong Guo, Tarek Sayed, Lai Zheng, Hao Lyu, Pan Liu
DOI: 10.1016/j.aap.2025.108249
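
To illustrate the bulk-plus-tail idea described in the last abstract above, here is a minimal sketch of a hybrid Normal-GPD density. It assumes one common construction: a normal density describes values up to a threshold u, and a generalized Pareto density, scaled by the normal mass above u, describes the exceedances. The abstract does not give the paper's exact parameterization, so the function names and parameter values below are hypothetical, and the hierarchical structure, covariates, and Bayesian estimation are left out.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a Normal(mu, sigma) distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of Normal(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def gpd_pdf(y, scale, shape):
    """Generalized Pareto density for an exceedance y >= 0."""
    if y < 0:
        return 0.0
    if abs(shape) < 1e-12:
        # Limiting case shape -> 0: exponential distribution.
        return math.exp(-y / scale) / scale
    t = 1.0 + shape * y / scale
    if t <= 0:
        # Outside the support (possible only when shape < 0).
        return 0.0
    return t ** (-1.0 / shape - 1.0) / scale


def hybrid_pdf(x, u, mu, sigma, scale, shape):
    """Assumed piecewise density: normal bulk up to u, GPD tail above u.

    The tail is weighted by the normal probability mass above u, so the
    two pieces together integrate to one.
    """
    if x <= u:
        return normal_pdf(x, mu, sigma)
    tail_mass = 1.0 - normal_cdf(u, mu, sigma)
    return tail_mass * gpd_pdf(x - u, scale, shape)


def integrate(f, a, b, n=20000):
    """Midpoint-rule integral of f over [a, b] (for checking only)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))
```

With illustrative values such as u = 1, mu = 0, sigma = 1, scale = 0.5 and shape = 0.1, integrating hybrid_pdf numerically over a wide range should give a value close to 1, which is a quick check that the two pieces form a proper density. In a threshold-estimation setting, u would be treated as an unknown parameter rather than fixed in advance.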