https://arxiv.org/api/7NSWlsmygsnqMnOyW8SLeHOPXH4
2026-06-10T20:42:00Z

http://arxiv.org/abs/2501.11813v1
Utilising Deep Learning to Elicit Expert Uncertainty
2025-01-21T01:36:12Z
Recent work [14] has introduced a method for prior elicitation that utilizes records of expert decisions to infer a prior distribution. While this method provides a promising approach to eliciting expert uncertainty, it has only been demonstrated using tabular data, which may not entirely represent the information used by experts to make decisions. In this paper, we demonstrate how analysts can adopt a deep learning approach to utilize the method proposed in [14] with the actual information experts use. We provide an overview of deep learning models that can effectively model expert decision-making to elicit distributions that capture expert uncertainty and present an example examining the risk of colon cancer to show in detail how these models can be used.
2025-01-21T01:36:12Z
Julia R. Falconer, Eibe Frank, Devon L. L. Polaschek, Chaitanya Joshi

http://arxiv.org/abs/2306.04430v2
Evaluating the impact of outcome delay on the efficiency of two-arm group-sequential trials
2025-01-20T13:47:42Z
Adaptive designs (ADs) are a broad class of trial designs that allow preplanned modifications based on patient data, providing improved efficiency and flexibility. However, a delay in observing the primary outcome variable can harm this added efficiency. In this paper, we aim to ascertain the size of such outcome delay that results in the realised efficiency gains of ADs becoming negligible compared to classical fixed sample RCTs. We measure the impact of delay by developing formulae for the number of overruns in two-arm group-sequential designs (GSDs) with normal data, assuming different recruitment models. The efficiency of a GSD is usually measured in terms of the expected sample size (ESS), with GSDs generally reducing the ESS compared to a standard RCT.
Our formulae measure the efficiency gain from a GSD, in terms of ESS reduction, that is lost due to delay. We assess whether careful choice of design (e.g., altering the spacing of the interim analyses (IAs)) can help recover the benefits of GSDs in the presence of delay. We also analyse the efficiency of GSDs with respect to time to complete the trial. Comparing the expected efficiency gains, with and without consideration of delay, it is evident GSDs suffer considerable losses due to delay. Even a small delay can have a significant impact on the trial's efficiency. In contrast, even in the presence of substantial delay, a GSD will have a smaller expected time to trial completion in comparison to a simple RCT. More efficiency is lost as the number of stages increases. The timing of IAs can also affect the efficiency of a GSD with delay. In particular, for unequally spaced IAs, conducting IAs too early in the trial can harm a design with delay.
2023-06-07T13:41:14Z
Statistics in Biopharmaceutical Research, 2025
Aritra Mukherjee, Michael J. Grayling, James M. S. Wason
10.1080/19466315.2025.2565162

http://arxiv.org/abs/2304.12482v4
Information Theory for Complex Systems Scientists
2025-01-17T14:13:28Z
In the 21st century, many of the crucial scientific and technical issues facing humanity can be understood as problems associated with understanding, modelling, and ultimately controlling complex systems: systems comprised of a large number of non-trivially interacting components whose collective behaviour can be difficult to predict. Information theory, a branch of mathematics historically associated with questions about encoding and decoding messages, has emerged as something of a lingua franca for those studying complex systems, far exceeding its original narrow domain of communication systems engineering.
In the context of complexity science, information theory provides a set of tools that allow researchers to uncover the statistical and effective dependencies between interacting components, relationships between systems and their environment, and mereological whole-part relationships, and that are sensitive to non-linearities missed by common parametric statistical models.
In this review, we aim to provide an accessible introduction to the core of modern information theory, aimed specifically at aspiring (and established) complex systems scientists. This includes standard measures, such as Shannon entropy, relative entropy, and mutual information, before building to more advanced topics, including: information dynamics, measures of statistical complexity, information decomposition, and effective network inference. In addition to detailing the formal definitions, in this review we make an effort to discuss how information theory can be interpreted and develop the intuition behind abstract concepts like "entropy," in the hope that this will enable interested readers to understand what information is, and how it is used, at a more fundamental level.
2023-04-24T22:45:28Z
Thomas F. Varley

http://arxiv.org/abs/2501.10482v1
Simulation of Random LR Fuzzy Intervals
2025-01-17T04:08:57Z
Random fuzzy variables combine the modeling of imprecision (due to their "fuzzy part") and randomness. Statistical samples of such objects are widely used, and their direct, numerically effective generation is therefore necessary. Usually, these samples consist of triangular or trapezoidal fuzzy numbers. In this paper, we describe theoretical results and simulation algorithms for another family of fuzzy numbers -- LR fuzzy numbers with interval-valued cores. Starting from a simulation perspective on piecewise linear LR fuzzy numbers with interval-valued cores, we then consider their limiting behavior.
This leads us to a numerically efficient algorithm for simulating a sample consisting of such fuzzy values.
2025-01-17T04:08:57Z
Maciej Romaniuk, Abbas Parchami, Przemysław Grzegorzewski

http://arxiv.org/abs/2501.09171v1
Generative AI Takes a Statistics Exam: A Comparison of Performance between ChatGPT3.5, ChatGPT4, and ChatGPT4o-mini
2025-01-15T21:46:01Z
Many believe that the use of generative AI as a private tutor has the potential to shrink access and achievement gaps between students and schools with abundant resources versus those with fewer resources. Shrinking the gap is possible only if paid and free versions of the platforms perform with the same accuracy. In this experiment, we investigate the performance of GPT versions 3.5, 4.0, and 4o-mini on the same 16-question statistics exam given to a class of first-year graduate students. While we do not advocate using any generative AI platform to complete an exam, the use of exam questions allows us to explore aspects of ChatGPT's responses to typical questions that students might encounter in a statistics course. Results on accuracy indicate that GPT 3.5 would fail the exam, GPT4 would perform well, and GPT4o-mini would perform somewhere in between. While we acknowledge the existence of other Generative AI/LLMs, our discussion concerns only ChatGPT because it is the most widely used platform on college campuses at this time. We further investigate differences among the AI platforms in the answers for each problem using methods developed for text analytics, such as reading level evaluation and topic modeling. Results indicate that GPT3.5 and 4o-mini are more similar to each other than either is to GPT4.
2025-01-15T21:46:01Z
24 pages, 2 figures, 3 tables. Submitted for publication August, 2024; revision submitted January 2025
J. data sci. 23(2025), no. 2, 269-286
Monnie McGee, Bivin Sadler
10.6339/25-JDS1174

http://arxiv.org/abs/2501.08404v1
Extremal events dictate population growth rate inference
2025-01-14T19:44:59Z
Recent methods have been developed to map single-cell lineage statistics to population growth. Because population growth selects for exponentially rare phenotypes, these methods inherently depend on sampling large deviations from finite data, which introduces systematic errors. A comprehensive understanding of these errors in the context of finite data remains elusive. To address this gap, we study the error in growth rate estimates across different models. We show that under the usual bias-variance decomposition, the bias can be decomposed into a finite-time bias and nonlinear averaging bias. We demonstrate that finite-time bias, which dominates at short times, can be mitigated by fitting its monotonic behavior. In contrast, at longer times, nonlinear averaging bias becomes the predominant source of error, leading to a phase transition. This transition can be understood through the Random Energy Model, a mean-field model of disordered systems, where a few lineages dominate the estimator. Applying these methods to experimental data demonstrates that correcting for biases in lineage-based approaches yields consistent results for the long-term growth rate across multiple methods and enables the reverse-engineering of dynamic models.
This new framework provides a quantitative understanding of growth rate estimators, clarifies the conditions under which they can be effectively applied to finite data, and introduces model-free approaches for studying the connections between physiology and cell growth.
2025-01-14T19:44:59Z
Trevor GrandPre, Ethan Levien, Ariel Amir

http://arxiv.org/abs/2501.08320v1
COMBO and COMMA: R packages for regression modeling and inference in the presence of misclassified binary mediator or outcome variables
2025-01-14T18:53:22Z
Misclassified binary outcome or mediator variables can cause unpredictable bias in resulting parameter estimates. As more datasets that were not originally collected for research purposes are being used for studies in the social and health sciences, the need for methods that address data quality concerns is growing. In this paper, we describe two R packages, COMBO and COMMA, that implement bias-correction methods for misclassified binary outcome and mediator variables, respectively. These likelihood-based approaches do not require gold standard measures and allow for estimation of sensitivity and specificity rates for the misclassified variable(s). In addition, these R packages automatically apply crucial label switching corrections, allowing researchers to circumvent the inherent permutation invariance of the misclassification model likelihood. We demonstrate COMBO for single-outcome cases using a study of bar exam passage. We develop and evaluate a risk prediction model based on noisy indicators in a pretrial risk assessment study to demonstrate COMBO for multi-outcome cases. In addition, we use COMMA to evaluate the mediating effect of potentially misdiagnosed gestational hypertension on the maternal ethnicity-birthweight relationship.
2025-01-14T18:53:22Z
99 pages, 7 figures
Kimberly A. Hochstedler Webb, Martin T. Wells

http://arxiv.org/abs/2501.03457v1
A Bureaucratic Theory of Statistics
2025-01-07T01:15:43Z
This commentary proposes a framework for understanding the role of statistics in policy-making, regulation, and bureaucratic systems. I introduce the concept of "ex ante policy," describing statistical rules and procedures designed before data collection to govern future actions. Through examining examples, particularly clinical trials, I explore how ex ante policy serves as a calculus of bureaucracy, providing numerical foundations for governance through clear, transparent rules. The ex ante frame obviates heated debates about inferential interpretations of probability and statistical tests, p-values, and rituals. I conclude by calling for a deeper appreciation of statistics' bureaucratic function and suggesting new directions for research in policy-oriented statistical methodology.
2025-01-07T01:15:43Z
11 pages. To appear in the journal Observational Studies
Benjamin Recht

http://arxiv.org/abs/2403.03387v2
A Systematic Literature Review of Undergraduate Data Science Education Research
2025-01-03T21:46:40Z
The presence of data science has been profound in the scientific community in almost every discipline. An important part of the data science education expansion has been at the undergraduate level. We conducted a systematic literature review to (1) portray current evidence and knowledge gaps in self-proclaimed undergraduate data science education research and (2) inform policymakers and the data science education community about what educators may encounter when searching for literature using the general keyword 'data science education.' While open-access publications that target a broader audience of data science educators and include multiple examples of data science programs and courses are a strength, significant knowledge gaps remain. The undergraduate data science literature that we identified often lacks empirical data, research questions and reproducibility.
Certain disciplines are less visible. We recommend that we (1) cherish data science as an interdisciplinary field; (2) adopt a consistent set of keywords/terminology to ensure data science education literature is easily identifiable; (3) prioritize investments in empirical studies.
2024-03-06T00:49:08Z
4 figures and 2 tables
Mine Dogucu, Sinem Demirci, Harry Bendekgey, Federica Zoe Ricci, Catalina M. Medina
10.1080/26939169.2025.2486656

http://arxiv.org/abs/2501.00997v1
Stochastic Simulation and Monte Carlo Method
2025-01-02T01:31:34Z
These lecture notes are intended to cover some introductory topics in stochastic simulation for scientific computing courses offered by the IT department at Uppsala University, as taught by the author. Basic concepts in probability theory are provided in Appendix A, which you may review before starting the upcoming sections or refer to as needed throughout the text.
2025-01-02T01:31:34Z
Davoud Mirzaei

http://arxiv.org/abs/2412.20175v1
An Undergraduate Course on the Statistical Principles of Research Study Design
2024-12-28T15:06:25Z
The undergraduate curriculum in statistics and data science is undergoing changes to accommodate new methods, newly interested students, and the changing role of statistics in society. Because of this, it is more important than ever that students understand the role of study design and how to formulate meaningful scientific and statistical research questions. While the traditional Design of Experiments course is still extremely valuable for students heading to industry and research careers, a broader study design course that incorporates survey sampling, observational studies, and the basics of causal inference with randomized experiment design is particularly useful for students with a wide range of applied interests. Here, I describe such a course at a small liberal arts college, along with ways to adapt it to meet different student and instructor backgrounds and interests.
The course serves as a valuable bridge to advanced statistical coursework, meets key statistical literacy and communication learning goals, and can be tailored to the desired level of computational and mathematical fluency. Through reading, discussing, and critiquing actual published research studies, students learn that statistics is a living discipline with real consequences and become better consumers and producers of scientific research and data-driven insights.
2024-12-28T15:06:25Z
27 pages, 3 inset boxes, two appendices (18 pages total)
Am. Stat. 79 (2025) 520-528
Lee Kennedy-Shaffer
10.1080/00031305.2025.2509664

http://arxiv.org/abs/2412.19938v1
Towards Strong AI: Transformational Beliefs and Scientific Creativity
2024-12-27T22:02:36Z
Strong artificial intelligence (AI) is envisioned to possess general cognitive abilities and scientific creativity comparable to human intelligence, encompassing both knowledge acquisition and problem-solving. While remarkable progress has been made in weak AI, the realization of strong AI remains a topic of intense debate and critical examination. In this paper, we explore pivotal innovations in the history of astronomy and physics, focusing on the discovery of Neptune and the concept of scientific revolutions as perceived by philosophers of science. Building on these insights, we introduce a simple theoretical and statistical framework of weak beliefs, termed the Transformational Belief (TB) framework, designed as a foundation for modeling scientific creativity. Through selected illustrative examples in statistical science, we demonstrate the TB framework's potential as a promising foundation for understanding, analyzing, and even fostering creativity -- paving the way toward the development of strong AI. We conclude with reflections on future research directions and potential advancements.
2024-12-27T22:02:36Z
Samuel J. Eschker, Chuanhai Liu

http://arxiv.org/abs/2412.16657v2
A Comprehensive Guide to Item Recovery Using the Multidimensional Graded Response Model in R
2024-12-24T17:28:02Z
The purpose of this study is to provide a step-by-step demonstration of item recovery for the Multidimensional Graded Response Model (MGRM) in R. Within this scope, a sample simulation design was constructed where the test lengths were set to 20 and 40, the interdimensional correlations were varied as 0.3 and 0.7, and the sample size was fixed at 2000. Parameter estimates were derived from the generated datasets for the 3-dimensional GRM, and bias and Root Mean Square Error (RMSE) values were calculated and visualized. In line with the aim of the study, R codes for all these steps were presented along with detailed explanations, enabling researchers to replicate and adapt the procedures for their own analyses. This study is expected to contribute to the literature by serving as a practical guide for implementing item recovery in the MGRM. In addition, the methods presented, including data generation, parameter estimation, and result visualization, are anticipated to benefit researchers even if they are not directly engaged in item recovery.
2024-12-21T15:00:31Z
Yesim Beril Soguksu, Ayse Bilicioglu Gunes, Hatice Gurdil

http://arxiv.org/abs/2409.01647v2
Correlation Properties in Channels with von Mises-Fisher Distribution of Scatterers
2024-12-22T01:10:23Z
This letter presents simple analytical expressions for the spatial and temporal correlation functions in channels with von Mises-Fisher (vMF) scattering. In contrast to previous results, the expressions presented here are exact and based only on elementary functions, clearly revealing the impact of the underlying parameters. The derived results are validated by comparison against numerical integration results, where an exact match is observed.
To demonstrate their utility, the presented results are used to analyze spatial correlation across different antenna array geometries and to investigate temporal correlation of a fluctuating radar signal from a moving target.
2024-09-03T06:35:59Z
Published in IEEE Wireless Communications Letters (submitted on 2024-05-14; revised and resubmitted on 2024-08-26)
IEEE Wireless Communications Letters, vol. 13, no. 12, pp. 3638-3642, Dec. 2024
Kenan Turbic, Martin Kasparick, Slawomir Stanczak
10.1109/LWC.2024.3484331

http://arxiv.org/abs/2412.10643v2
Scientific Realism vs. Anti-Realism: Toward a Common Ground
2024-12-20T14:55:54Z
The debate between scientific realism and anti-realism remains at a stalemate, making reconciliation seem hopeless. Yet, important work remains: exploring a common ground, even if only to uncover deeper points of disagreement and, ideally, to benefit both sides of the debate. I propose such a common ground. Specifically, many anti-realists, such as instrumentalists, have yet to seriously engage with Sober's call to justify their preferred version of Ockham's razor through a positive account. Meanwhile, realists face a similar challenge: providing a non-circular explanation of how their version of Ockham's razor connects to truth. The common ground I propose addresses these challenges for both sides; the key is to leverage the idea that everyone values some truths and to draw on insights from scientific fields that study scientific inference -- namely, statistics and machine learning. This common ground also isolates a distinctively epistemic root of the irreconcilability in the realism debate.
2024-12-14T02:08:38Z
Hanti Lin